Functions

@publisher(source: AzureSource | LocalFileSource | MariaDBSource | MySQLSource | OracleSource | PostgresSource | S3Source | SourcePlugin, tables: TableOutput | str | List[str], name: str = None, trigger_by: str | List[str] | None = None) → callable

Decorator that defines a function as a Tabsdata Publisher.

A Publisher is a function that reads data from an external source and publishes it as Tabsdata table(s).

Parameters:
  • source – A connector to the external source that will provide the data to publish in the Tabsdata server.

  • tables – Name(s) of the table(s) to which the data will be published. Names take the form <table> (the tables are always in the same collection as the Publisher).

  • name – (optional) The name with which the Publisher will be registered. If not provided, the Python function name will be used.

  • trigger_by – (optional) The table(s) that will cause the Publisher to execute when a new version of them is created. Names can be <collection>/<table> (when the table is in a different collection than the Publisher) or <table> (when the table is in the same collection as the Publisher). If not specified, it defaults to None.

Example:

>>> import tabsdata as td
>>>
>>> @td.publisher(
...     source=td.LocalFileSource("/opt/dropzone/users.csv"),
...     tables=["users"],
... )
... def publish_users(users: td.TableFrame) -> td.TableFrame:
...     # Publish the rows read from the file unchanged.
...     return users
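
The fallback described for the name parameter (use the Python function name when no name is given) can be illustrated with a toy decorator. This is a minimal sketch in plain Python, not Tabsdata's implementation: register and REGISTRY are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry, for illustration only; not part of Tabsdata.
REGISTRY: Dict[str, Callable] = {}


def register(name: Optional[str] = None) -> Callable[[Callable], Callable]:
    """Toy decorator: register a function under `name`, or under its own name."""

    def wrap(func: Callable) -> Callable:
        # Fall back to the Python function name when no name is provided.
        registered_name = name if name is not None else func.__name__
        REGISTRY[registered_name] = func
        return func

    return wrap


@register()
def publish_users(users):
    return users


@register(name="users_publisher")
def publish_users_v2(users):
    return users


print(sorted(REGISTRY))  # ['publish_users', 'users_publisher']
```

In this sketch, passing name decouples the registered name from the Python identifier, so the second function is registered as users_publisher rather than publish_users_v2.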

@transformer(input_tables: TableInput | str | List[str], output_tables: TableOutput | str | List[str], name: str = None, trigger_by: str | List[str] | None = '*') → callable

Decorator that defines a function as a Tabsdata Transformer.

A Transformer is a function that reads data from Tabsdata tables, processes them, and writes the result to one or more Tabsdata tables.

Parameters:
  • input_tables – Name(s) of the input table(s). Names can be <collection>/<table> (when the table is in a different collection than the Transformer) or <table> (when the table is in the same collection as the Transformer).

  • output_tables – Name(s) of the output table(s). Names take the form <table> (output tables are always in the same collection as the Transformer).

  • name – (optional) The name with which the Transformer will be registered. If not provided, the Python function name will be used.

  • trigger_by – (optional) The table(s) that will cause the Transformer to execute when a new version of them is created. Names can be <collection>/<table> (when the table is in a different collection than the Transformer) or <table> (when the table is in the same collection as the Transformer). If not specified, it defaults to all input_tables.

Example:

>>> import tabsdata as td
>>>
>>> @td.transformer(
...     input_tables=["users"],
...     output_tables=["active_users", "inactive_users"],
... )
... def split_users(
...     users: td.TableFrame
... ) -> (td.TableFrame, td.TableFrame):
...     # One returned TableFrame per entry in output_tables, in the same order.
...     active_users = users.filter(td.col("status") == "active")
...     inactive_users = users.filter(td.col("status") == "inactive")
...     return active_users, inactive_users
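
The two name forms accepted by input_tables and trigger_by can be sketched as a small resolution step. The helper qualify and the collection names below are hypothetical and not part of the Tabsdata API; the sketch only assumes that a name containing "/" has the form <collection>/<table> and that a bare name refers to the function's own collection.

```python
from typing import Tuple


def qualify(table_name: str, own_collection: str) -> Tuple[str, str]:
    """Return (collection, table) for '<collection>/<table>' or '<table>'."""
    if "/" in table_name:
        # Explicit collection: the table lives in a different collection.
        collection, table = table_name.split("/", 1)
        return collection, table
    # Bare name: the table lives in the same collection as the function.
    return own_collection, table_name


print(qualify("users", "sales"))     # ('sales', 'users')
print(qualify("hr/users", "sales"))  # ('hr', 'users')
```

Output table names never need this step, since output tables are always in the same collection as the Transformer.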

@subscriber(tables: TableInput | str | List[str], destination: AzureDestination | LocalFileDestination | MariaDBDestination | MySQLDestination | OracleDestination | PostgresDestination | S3Destination | DestinationPlugin, name: str = None, trigger_by: str | List[str] | None = '*') → callable

Decorator that defines a function as a Tabsdata Subscriber.

A Subscriber is a function that reads data from Tabsdata tables and writes it to an external system.

Parameters:
  • tables – Name(s) of the input table(s). Names can be <collection>/<table> (when the table is in a different collection than the Subscriber) or <table> (when the table is in the same collection as the Subscriber).

  • destination – A connector to the external system that will receive the data from the Tabsdata server.

  • name – (optional) The name with which the Subscriber will be registered. If not provided, the Python function name will be used.

  • trigger_by – (optional) The table(s) that will cause the Subscriber to execute when a new version of them is created. Names can be <collection>/<table> (when the table is in a different collection than the Subscriber) or <table> (when the table is in the same collection as the Subscriber). If not specified, it defaults to all tables.

Example:

>>> import tabsdata as td
>>>
>>> @td.subscriber(
...     tables=["active_users"],
...     destination=td.MariaDBDestination(
...         uri="mariadb+mysqlconnector://127.0.0.1:3307/mydb",
...         destination_table="ACTIVE_USER",
...         credentials=td.UserPasswordCredentials("admin", "supersecret"),
...     ),
... )
... def export_active_users(active_users: td.TableFrame) -> td.TableFrame:
...     # Export the table contents unchanged.
...     return active_users
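
The trigger_by defaults described for the three decorators can be sketched as a small normalization function. This is an illustration under stated assumptions, not Tabsdata's code: resolve_trigger_by is a hypothetical helper, '*' is taken to mean "every input table" as the parameter descriptions state, and None is assumed to mean that no table triggers the function, which the text above does not confirm.

```python
from typing import List, Union


def resolve_trigger_by(
    trigger_by: Union[str, List[str], None],
    input_tables: List[str],
) -> List[str]:
    """Hypothetical helper: expand a trigger_by value into a list of table names."""
    if trigger_by is None:
        # Assumption: None means no table triggers the function.
        return []
    if trigger_by == "*":
        # '*' (the Transformer and Subscriber default) means all input tables.
        return list(input_tables)
    if isinstance(trigger_by, str):
        # A single table name.
        return [trigger_by]
    # Already a list of table names.
    return list(trigger_by)


inputs = ["users", "hr/roles"]
print(resolve_trigger_by("*", inputs))       # ['users', 'hr/roles']
print(resolve_trigger_by("users", inputs))   # ['users']
print(resolve_trigger_by(None, inputs))      # []
```

Under these assumptions, a Transformer or Subscriber with the default trigger_by runs whenever any of its input tables gets a new version, while naming a single table narrows the trigger to that table.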