Destinations

class AzureDestination(
uri: str | list[str],
credentials: AzureCredentials,
format: str | FileFormat = None,
)

Bases: DestinationPlugin

Categories:

destination

Azure-file-based data outputs.

class SupportedFormats(
*values,
)

Bases: Enum

Enum for the supported formats for the AzureDestination.

avro = <class 'tabsdata._format.AvroFormat'>
csv = <class 'tabsdata._format.CSVFormat'>
ndjson = <class 'tabsdata._format.NDJSONFormat'>
parquet = <class 'tabsdata._format.ParquetFormat'>
property allow_fragments: bool

Whether to allow fragments in the output.

Type:

bool

chunk(
working_dir: str,
*results,
)

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created
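The chunk contract can be sketched in plain Python. This is a minimal illustration, not tabsdata's implementation: a real plugin sinks each polars LazyFrame to parquet without materializing it, while this stand-in treats each result as a plain list of rows and writes it to a JSON file under working_dir, returning the intermediate paths (with None results preserved positionally).

```python
import json
import os
import tempfile

def chunk_sketch(working_dir, *results):
    """Write each result to an intermediate file and return the paths.

    `results` stands in for the polars LazyFrames (or None) produced by
    the user function; here each result is a plain list of row dicts.
    """
    paths = []
    for i, result in enumerate(results):
        if result is None:
            # Preserve positions so later stages can match files to outputs.
            paths.append(None)
            continue
        path = os.path.join(working_dir, f"chunk_{i}.json")
        with open(path, "w") as f:
            json.dump(result, f)  # a real plugin would sink parquet instead
        paths.append(path)
    return paths

with tempfile.TemporaryDirectory() as tmp:
    files = chunk_sketch(tmp, [{"a": 1}], None, [{"a": 2}])
    print([p is not None for p in files])  # → [True, False, True]
```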

property credentials: AzureCredentials

The credentials required to access Azure.

Type:

AzureCredentials

property format: FileFormat

The format of the file. If not provided, it will be inferred from the file extension of the URI.

Type:

FileFormat
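The extension-based inference described above can be sketched as follows. `infer_format` is a hypothetical helper, not tabsdata's code; the mapping mirrors the SupportedFormats enum.

```python
import os

# Mirrors the SupportedFormats enum: file extension -> format name.
_EXTENSION_TO_FORMAT = {
    ".avro": "avro",
    ".csv": "csv",
    ".ndjson": "ndjson",
    ".parquet": "parquet",
}

def infer_format(uri: str) -> str:
    """Infer the file format from a URI's extension (hypothetical helper)."""
    _, ext = os.path.splitext(uri)
    try:
        return _EXTENSION_TO_FORMAT[ext.lower()]
    except KeyError:
        raise ValueError(f"cannot infer format from {uri!r}; pass format= explicitly")

print(infer_format("az://bucket/data/output.parquet"))  # → parquet
```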

stream(
working_dir: str,
*results,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property uri: str | list[str]

The URI of the files, with format 'az://path/to/files'.

Type:

str | list[str]

class BigQueryDest(
conn: BigQueryConn,
tables: TableSpec | None = None,
if_table_exists: IfTableExistStrategySpec = 'append',
schema_strategy: SchemaStrategySpec = 'update',
)

Bases: DestinationPlugin

Categories:

destination

BigQuery based data outputs. The data is first stored in parquet files in a GCS bucket, and then loaded into the BigQuery tables.

property conn: BigQueryConn
property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property schema_strategy: Literal['update', 'strict']

The strategy to follow when appending to a table with an existing schema.

Type:

str
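A sketch of how the two strategies could behave when appending to a table whose schema already exists. This is illustrative only (schemas modeled as plain dicts); BigQuery's actual schema handling is more involved.

```python
def reconcile_schema(existing: dict, incoming: dict, strategy: str) -> dict:
    """Return the schema to write with, under 'update' or 'strict' semantics.

    Schemas are modeled as {column_name: type_name} dicts for illustration.
    """
    if strategy == "strict":
        # Strict: the incoming schema must match the existing one exactly.
        if incoming != existing:
            raise ValueError("schema mismatch and schema_strategy='strict'")
        return existing
    if strategy == "update":
        # Update: keep existing columns, add any new incoming ones.
        merged = dict(existing)
        merged.update({k: v for k, v in incoming.items() if k not in existing})
        return merged
    raise ValueError(f"unknown schema_strategy {strategy!r}")

existing = {"id": "INT64", "name": "STRING"}
incoming = {"id": "INT64", "name": "STRING", "age": "INT64"}
print(sorted(reconcile_schema(existing, incoming, "update")))  # → ['age', 'id', 'name']
```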

stream(
working_dir: str,
*results: list[LazyFrame | None] | LazyFrame | None,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property tables: list[str] | None
class DatabricksDestination(
host_url: str,
token: str | Secret,
tables: list[str] | str,
volume: str,
catalog: str | None = None,
schema: str | None = None,
warehouse: str | None = None,
warehouse_id: str | None = None,
if_table_exists: Literal['append', 'replace'] = 'append',
schema_strategy: Literal['update', 'strict'] = 'update',
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

Databricks based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) List[None | str]

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property host_url: str
property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property schema_strategy: Literal['update', 'strict']

The strategy to follow when appending to a table with an existing schema.

Type:

str

stream(
working_dir: str,
*results: List[LazyFrame | None] | LazyFrame | None,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property tables: List[str]
property token: Secret
write(
files,
)

This method writes the files to Databricks. It is called from the stream method and is not intended to be called directly.

class GCSDestination(
uri: str | list[str],
credentials: GCPCredentials,
format: str | FileFormat = None,
)

Bases: DestinationPlugin

Categories:

destination

GCS-file-based data outputs.

class SupportedFormats(
*values,
)

Bases: Enum

Enum for the supported formats for the GCSDestination.

avro = <class 'tabsdata._format.AvroFormat'>
csv = <class 'tabsdata._format.CSVFormat'>
ndjson = <class 'tabsdata._format.NDJSONFormat'>
parquet = <class 'tabsdata._format.ParquetFormat'>
property allow_fragments: bool

Whether to allow fragments in the output.

Type:

bool

chunk(
working_dir: str,
*results,
)

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property credentials: GCPCredentials

The credentials required to access GCS.

Type:

GCPCredentials

property format: FileFormat

The format of the file. If not provided, it will be inferred from the file extension of the URI.

Type:

FileFormat

stream(
working_dir: str,
*results,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property uri: str | list[str]

The URI of the files, with format 'gs://path/to/files'.

Type:

str | list[str]

class LocalFileDestination(
path: str | list[str],
format: str | FileFormat = None,
)

Bases: DestinationPlugin

Categories:

destination

LocalFile-based data outputs.

class SupportedFormats(
*values,
)

Bases: Enum

Enum for the supported formats for the LocalFileDestination.

avro = <class 'tabsdata._format.AvroFormat'>
csv = <class 'tabsdata._format.CSVFormat'>
ndjson = <class 'tabsdata._format.NDJSONFormat'>
parquet = <class 'tabsdata._format.ParquetFormat'>
property allow_fragments: bool

Whether to allow fragments in the output.

Type:

bool

chunk(
working_dir: str,
*results,
)

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property format: FileFormat

The format of the file or files. If not provided, it will be inferred from the file extension in the path.

Type:

FileFormat

property path: str | list[str]

The path or paths to store the files.

Type:

str | list[str]

stream(
working_dir: str,
*results,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

class MSSQLDestination(
connection_string: str,
destination_table: str | list[str],
credentials: dict | UserPasswordCredentials | None = None,
server: str | Secret = None,
database: str | Secret = None,
driver: str | Secret = None,
if_table_exists: Literal['append', 'replace'] = 'append',
chunk_size: int = 50000,
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

MSSQL-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) list[None | str]

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property connection_string: str

Get the connection string for the database.

Returns:

The connection string.

Return type:

str

property credentials: UserPasswordCredentials | None

The credentials required to access Microsoft SQL Server. If no credentials were provided, it will return None.

Type:

UserPasswordCredentials | None

property destination_table: list[str]

Get the destination table(s) where the data will be stored.

Returns:

The destination table(s).

Return type:

list[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.
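The 'append'/'replace' semantics can be sketched against sqlite3 from the standard library. This illustrates the contract only; the real plugin targets SQL Server and does not work this way internally.

```python
import sqlite3

def write_rows(conn, table, rows, if_table_exists="append"):
    """Write rows to `table`, honoring 'append' or 'replace' semantics."""
    if if_table_exists == "replace":
        # Replace: drop any existing table and start from scratch.
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    # Append (and first write): create the table only if missing.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (a INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?)", [(r,) for r in rows])

conn = sqlite3.connect(":memory:")
write_rows(conn, "t", [1, 2])
write_rows(conn, "t", [3], if_table_exists="append")
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # → 3
write_rows(conn, "t", [9], if_table_exists="replace")
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # → 1
```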

stream(
working_dir: str,
*results: list[LazyFrame | None] | LazyFrame | None,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

write(
files,
)

This method writes the files to the database. It is called from the stream method and is not intended to be called directly.

class MariaDBDest(
conn: MariaDBConn,
destination_tables: str | list[str],
if_table_exists: Literal['append', 'replace'] = 'append',
transactional: bool = True,
chunk_size: Annotated[int, Gt(gt=0)] = 100000,
loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
)

Bases: DestinationPlugin

Categories:

destination

Destination plugin for writing data to MariaDB.

property chunk_size: int

The chunk size for writing large results in batches.
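Batched writes of this kind can be sketched with a plain generator. This is an illustration of the batching idea only; the actual loader goes through polars and SQLAlchemy.

```python
from itertools import islice

def batches(rows, chunk_size):
    """Yield successive lists of at most chunk_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield batch

# 250 rows with chunk_size=100 -> batches of 100, 100, 50.
print([len(b) for b in batches(range(250), 100)])  # → [100, 100, 50]
```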

property conn: MariaDBConn

The MariaDB connection configuration.

property destination_tables: list[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

property loader: Literal['polars_sqlalchemy']

The data processing loader to use for executing the DB write operation.

stream(
working_dir: str,
*results: list[LazyFrame | None],
)

Store the results in the MariaDB database.

Parameters:
  • working_dir – The working directory where the results are stored.

  • results – The results to store in the SQL destination.

property transactional: bool

Whether to use transactions for writing data to ensure consistency.
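The consistency guarantee can be illustrated with sqlite3 as a standard-library stand-in for MariaDB: with transactional writing, a failure mid-write rolls the whole batch back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER NOT NULL)")
conn.commit()

def transactional_write(conn, rows):
    """Write all rows or none: roll back on any failure."""
    try:
        with conn:  # sqlite3 connection context manager = one transaction
            conn.executemany("INSERT INTO t VALUES (?)", [(r,) for r in rows])
    except sqlite3.IntegrityError:
        pass  # the transaction was already rolled back by the context manager

transactional_write(conn, [1, 2, None])  # third row violates NOT NULL
print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])  # → 0
```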

class MariaDBDestination(
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

MariaDB-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) list[str | None]

Store the results in the SQL destination.

Parameters:
  • working_dir – The working directory where the results will be stored.

  • results (list[pl.LazyFrame | None]) – The results to store in the SQL destination.

property credentials: UserPasswordCredentials

The credentials required to access the MariaDB database.

Type:

UserPasswordCredentials

property destination_table: str | List[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

Type:

str | List[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property uri: str

The URI of the database where the data is going to be stored.

Type:

str

write(
files: list[str | None],
)

Given a file or a list of files, write them to the desired destination. Note: this method might materialize the data in the files it receives, so chunks should be of an appropriate size.

Parameters:

files (list[str | None]) – The file or files to be stored in the final destination.

class MongoDBDestination(
uri: str,
collections_with_ids: tuple[str, str | None] | List[tuple[str, str | None]],
credentials: UserPasswordCredentials = None,
connection_options: dict = None,
if_collection_exists: Literal['append', 'replace'] = 'append',
use_trxs: bool = False,
docs_per_trx: int = 1000,
maintain_order: bool = False,
update_existing: bool = True,
fail_on_duplicate_key: bool = True,
log_intermediate_files: bool = False,
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

MongoDB-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) List[None | List[str]]

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property collections_with_ids: List[tuple[str, str | None]]
property connection_options: dict
property credentials: UserPasswordCredentials | None
property if_collection_exists: Literal['append', 'replace']
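The collections_with_ids shape (a single (collection, id_column) pair or a list of them, with None meaning no id column) can be normalized as in this hypothetical helper, which is not tabsdata's code:

```python
def normalize_collections_with_ids(collections_with_ids):
    """Accept a single (collection, id_column) pair or a list of pairs.

    id_column may be None, meaning documents go in without an explicit
    id column mapping (hypothetical helper for illustration only).
    """
    if isinstance(collections_with_ids, tuple):
        collections_with_ids = [collections_with_ids]
    for collection, id_column in collections_with_ids:
        if not isinstance(collection, str):
            raise TypeError(f"collection name must be str, got {collection!r}")
    return list(collections_with_ids)

print(normalize_collections_with_ids(("users", "user_id")))  # → [('users', 'user_id')]
print(normalize_collections_with_ids([("users", "user_id"), ("events", None)]))
```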
stream(
working_dir: str,
*results: List[LazyFrame | None] | LazyFrame | None,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property uri: str
write(
files,
)

This method writes the files to the database. It is called from the stream method and is not intended to be called directly.

class MySQLDest(
conn: MySQLConn,
destination_tables: str | list[str],
if_table_exists: Literal['append', 'replace'] = 'append',
transactional: bool = True,
chunk_size: Annotated[int, Gt(gt=0)] = 100000,
loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
)

Bases: DestinationPlugin

Categories:

destination

Destination plugin for writing data to MySQL.

property chunk_size: int

The chunk size for writing large results in batches.

property conn: MySQLConn

The MySQL connection configuration.

property destination_tables: list[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

property loader: Literal['polars_sqlalchemy']

The data processing loader to use for executing the DB write operation.

stream(
working_dir: str,
*results: list[LazyFrame | None],
)

Store the results in the MySQL database.

Parameters:
  • working_dir – The working directory where the results are stored.

  • results – The results to store in the SQL destination.

property transactional: bool

Whether to use transactions for writing data to ensure consistency.

class MySQLDestination(
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

MySQL-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) list[str | None]

Store the results in the SQL destination.

Parameters:
  • working_dir (str) – The working directory where the results will be stored.

  • results (list[pl.LazyFrame | None]) – The results to store in the SQL destination.

property credentials: UserPasswordCredentials

The credentials required to access the MySQL database.

Type:

UserPasswordCredentials

property destination_table: str | List[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

Type:

str | List[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property uri: str

The URI of the database where the data is going to be stored.

Type:

str

write(
files: list[str | None],
)

Given a file or a list of files, write them to the desired destination. Note: this method might materialize the data in the files it receives, so chunks should be of an appropriate size.

Parameters:

files (list[str | None]) – The file or files to be stored in the final destination.

class OracleDestination(
uri: str,
destination_table: List[str] | str,
credentials: UserPasswordCredentials = None,
if_table_exists: Literal['append', 'replace'] = 'append',
)

Bases: DestinationPlugin

Categories:

destination

Oracle-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) list[str | None]

Store the results in the SQL destination.

Parameters:
  • working_dir (str) – The working directory where the results will be stored.

  • results (list[pl.LazyFrame | None]) – The results to store in the SQL destination.

property credentials: UserPasswordCredentials

The credentials required to access the Oracle database.

Type:

UserPasswordCredentials

property destination_table: str | List[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

Type:

str | List[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property uri: str

The URI of the database where the data is going to be stored.

Type:

str

write(
files: list[str | None],
)

Given a file or a list of files, write them to the desired destination. Note: this method might materialize the data in the files it receives, so chunks should be of an appropriate size.

Parameters:

files (list[str | None]) – The file or files to be stored in the final destination.

class PostgresDest(
conn: PostgresConn,
destination_tables: str | list[str],
if_table_exists: Literal['append', 'replace'] = 'append',
transactional: bool = True,
chunk_size: Annotated[int, Gt(gt=0)] = 100000,
loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
)

Bases: DestinationPlugin

Categories:

destination

Destination plugin for writing data to PostgreSQL.

property chunk_size: int

The chunk size for writing large results in batches.

property conn: PostgresConn

The Postgres connection configuration.

property destination_tables: list[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

property loader: Literal['polars_sqlalchemy']

The data processing loader to use for executing the DB write operation.

stream(
working_dir: str,
*results: list[LazyFrame | None],
)

Store the results in the PostgreSQL database.

Parameters:
  • working_dir – The working directory where the results are stored.

  • results – The results to store in the SQL destination.

property transactional: bool

Whether to use transactions for writing data to ensure consistency.

class PostgresDestination(
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

Postgres-based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) list[str | None]

Store the results in the SQL destination.

Parameters:
  • working_dir (str) – The working directory where the results will be stored.

  • results (list[pl.LazyFrame | None]) – The results to store in the SQL destination.

property credentials: UserPasswordCredentials

The credentials required to access the Postgres database.

Type:

UserPasswordCredentials

property destination_table: str | List[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

Type:

str | List[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

Type:

str

property uri: str

The URI of the database where the data is going to be stored.

Type:

str

write(
files: list[str | None],
)

Given a file or a list of files, write them to the desired destination. Note: this method might materialize the data in the files it receives, so chunks should be of an appropriate size.

Parameters:

files (list[str | None]) – The file or files to be stored in the final destination.

class S3Destination(
uri: str | list[str],
credentials: S3Credentials,
format: str | FileFormat = None,
region: str = None,
catalog: AWSGlue = None,
)

Bases: DestinationPlugin

Categories:

destination

S3-file-based data outputs.

class SupportedFormats(
*values,
)

Bases: Enum

Enum for the supported formats for the S3Destination.

avro = <class 'tabsdata._format.AvroFormat'>
csv = <class 'tabsdata._format.CSVFormat'>
ndjson = <class 'tabsdata._format.NDJSONFormat'>
parquet = <class 'tabsdata._format.ParquetFormat'>
property allow_fragments: bool

Whether to allow fragments in the output.

Type:

bool

property catalog: AWSGlue

The catalog to store the data in.

Type:

Catalog

chunk(
working_dir: str,
*results,
)

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property credentials: S3Credentials | S3AccessKeyCredentials

The credentials required to access the S3 bucket.

Type:

S3Credentials

property format: FileFormat

The format of the file. If not provided, it will be inferred from the file extension of the URI.

Type:

FileFormat

property region: str | None

The region where the S3 bucket is located.

Type:

str

stream(
working_dir: str,
*results,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

property uri: str | list[str]

The URI of the files, with format 's3://path/to/files'.

Type:

str | list[str]
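URIs of this shape (and the analogous 'az://' and 'gs://' forms above) can be split into bucket and key with the standard library. This is a generic illustration, independent of how the plugin parses them internally.

```python
from urllib.parse import urlparse

def split_object_uri(uri: str):
    """Split 's3://bucket/path/to/files' into (scheme, bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme not in {"s3", "az", "gs"}:
        raise ValueError(f"unsupported scheme in {uri!r}")
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")

print(split_object_uri("s3://my-bucket/exports/data.parquet"))
# → ('s3', 'my-bucket', 'exports/data.parquet')
```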

class SnowflakeDestination(
connection_parameters: dict,
destination_table: List[str] | str,
if_table_exists: Literal['append', 'replace'] = 'append',
stage: str | None = None,
**kwargs,
)

Bases: DestinationPlugin

Categories:

destination

Snowflake based data outputs.

chunk(
working_dir: str,
*results: LazyFrame | None,
) List[None | str]

Export the data to local parquet chunks. This method receives the resulting data from the user function and must store it in the local file system as parquet files, using working_dir. Note: this method should not materialize the data; it should only write it to the local file system.

Parameters:
  • working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

A list of the intermediate files created

property destination_table: list[str]

Get the destination table(s) where the data will be stored.

Returns:

The destination table(s).

Return type:

list[str]

property if_table_exists: Literal['append', 'replace']

The strategy to follow when the table already exists.

stream(
working_dir: str,
*results: List[LazyFrame | None] | LazyFrame | None,
)

Export the data. This method receives the resulting data from the user function and must store it in the desired location. Note: if the chunk method was invoked, this method may materialize the data of each chunk it produced, so chunks should be of an appropriate size.

Parameters:
  • working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)

  • results – The data to be exported. It is a list of polars LazyFrames or None.

Returns:

None

write(
files,
)

This method writes the files to the database. It is called from the stream method and is not intended to be called directly.

class TableOutput(
table: str | list[str],
)

Bases: DestinationPlugin

Categories:

destination

Table-based data outputs.

property table: str | list[str]

The table(s) to create. If multiple tables are provided, they must be provided as a list.

Type:

str | list[str]