Destinations
- class AzureDestination(...)
Bases:
DestinationPlugin
- Categories:
destination
Azure-file-based data outputs.
- class SupportedFormats(*values)
Bases:
Enum
Enum for the supported formats for the AzureDestination.
- avro = <class 'tabsdata._format.AvroFormat'>
- csv = <class 'tabsdata._format.CSVFormat'>
- ndjson = <class 'tabsdata._format.NDJSONFormat'>
- parquet = <class 'tabsdata._format.ParquetFormat'>
- chunk(working_dir: str, *results)
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property credentials: AzureCredentials
The credentials required to access Azure.
- Type:
AzureCredentials
- property format: FileFormat
The format of the file. If not provided, it will be inferred from the file extension of the URI.
- Type:
FileFormat
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
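All file-based destinations in this module share the chunk/stream contract described above. As a minimal sketch of that contract only (not AzureDestination's actual implementation), the following function streams each LazyFrame to a local parquet chunk without materializing it; the function name is illustrative:

```python
import os
import polars as pl

def chunk_sketch(working_dir: str, *results: pl.LazyFrame | None) -> list[str | None]:
    """Illustrative stand-in for chunk(): stream each LazyFrame to a parquet
    file under working_dir without collecting it into memory."""
    paths: list[str | None] = []
    for i, lf in enumerate(results):
        if lf is None:
            paths.append(None)  # a None result produces no file
            continue
        path = os.path.join(working_dir, f"chunk_{i}.parquet")
        lf.sink_parquet(path)  # streaming write; the frame is never materialized
        paths.append(path)
    return paths
```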
- class BigQueryDest(
- conn: BigQueryConn,
- tables: TableSpec | None = None,
- if_table_exists: IfTableExistStrategySpec = 'append',
- schema_strategy: SchemaStrategySpec = 'update',
- )
Bases:
DestinationPlugin
- Categories:
destination
BigQuery-based data outputs. The data is first stored in parquet files in a GCS bucket and then loaded into the BigQuery tables.
- property conn: BigQueryConn
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- property schema_strategy: Literal['update', 'strict']
The strategy to follow when appending to a table with an existing schema.
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
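A minimal construction sketch. How to build a BigQueryConn is not documented in this section, so conn below stands for an already-configured connection object; the keyword values are illustrative:

```python
# `conn` is a configured BigQueryConn; its construction is not shown in this reference
dest = BigQueryDest(
    conn=conn,
    if_table_exists="replace",   # drop and recreate tables that already exist
    schema_strategy="strict",    # fail on schema drift instead of updating the table
)
```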
- class DatabricksDestination(
- host_url: str,
- token: str | Secret,
- tables: list[str] | str,
- volume: str,
- catalog: str | None = None,
- schema: str | None = None,
- warehouse: str | None = None,
- warehouse_id: str | None = None,
- if_table_exists: Literal['append', 'replace'] = 'append',
- schema_strategy: Literal['update', 'strict'] = 'update',
- **kwargs,
- )
Bases:
DestinationPlugin
- Categories:
destination
Databricks based data outputs.
- chunk(working_dir: str, *results) → List[None | str]
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property host_url: str
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- property schema_strategy: Literal['update', 'strict']
The strategy to follow when appending to a table with an existing schema.
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
- property token: Secret
- write(files)
This method is used to write the files to Databricks. It is called from the stream method, and it is not intended to be called directly.
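A hedged construction sketch; the host URL, token, table name, and volume path below are placeholders, and the Unity Catalog volume path format is an assumption:

```python
dest = DatabricksDestination(
    host_url="https://<workspace>.cloud.databricks.com",
    token="dapi-...",                          # personal access token, str or Secret
    tables="main.analytics.events",            # str or list[str], one table per result
    volume="/Volumes/main/analytics/staging",  # assumed Unity Catalog volume path format
    if_table_exists="append",
    schema_strategy="update",                  # evolve the table schema when columns change
)
```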
- class GCSDestination(...)
Bases:
DestinationPlugin
- Categories:
destination
GCS-file-based data outputs.
- class SupportedFormats(*values)
Bases:
Enum
Enum for the supported formats for the GCSDestination.
- avro = <class 'tabsdata._format.AvroFormat'>
- csv = <class 'tabsdata._format.CSVFormat'>
- ndjson = <class 'tabsdata._format.NDJSONFormat'>
- parquet = <class 'tabsdata._format.ParquetFormat'>
- chunk(working_dir: str, *results)
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property credentials: GCPCredentials
The credentials required to access GCS.
- Type:
GCPCredentials
- property format: FileFormat
The format of the file. If not provided, it will be inferred from the file extension of the URI.
- Type:
FileFormat
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
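The SupportedFormats enums of the file-based destinations map format names to tabsdata format classes and can be inspected directly; a small sketch (the import path is an assumption):

```python
from tabsdata import GCSDestination  # assumed import path

# each member's value is a tabsdata format class
for fmt in GCSDestination.SupportedFormats:
    print(fmt.name, "->", fmt.value.__name__)
# avro -> AvroFormat, csv -> CSVFormat, ndjson -> NDJSONFormat, parquet -> ParquetFormat
```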
- class LocalFileDestination(...)
Bases:
DestinationPlugin
- Categories:
destination
LocalFile-based data outputs.
- class SupportedFormats(*values)
Bases:
Enum
Enum for the supported formats for the LocalFileDestination.
- avro = <class 'tabsdata._format.AvroFormat'>
- csv = <class 'tabsdata._format.CSVFormat'>
- ndjson = <class 'tabsdata._format.NDJSONFormat'>
- parquet = <class 'tabsdata._format.ParquetFormat'>
- chunk(working_dir: str, *results)
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property format: FileFormat
The format of the file or files. If not provided, it will be inferred from the file extension in the path.
- Type:
FileFormat
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
- class MSSQLDestination(
- connection_string: str,
- destination_table: str | list[str],
- credentials: dict | UserPasswordCredentials | None = None,
- server: str | Secret = None,
- database: str | Secret = None,
- driver: str | Secret = None,
- if_table_exists: Literal['append', 'replace'] = 'append',
- chunk_size: int = 50000,
- **kwargs,
- )
Bases:
DestinationPlugin
- Categories:
destination
MSSQL-based data outputs.
- chunk(working_dir: str, *results) → list[None | str]
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property connection_string: str
Get the connection string for the database.
- Returns:
The connection string.
- Return type:
str
- property credentials: UserPasswordCredentials | None
The credentials required to access Microsoft SQL Server. If no credentials were provided, it will return None.
- Type:
UserPasswordCredentials | None
- property if_table_exists: Literal['append', 'replace']
Returns the value of the if_table_exists property. This property determines what to do if the table already exists.
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
- write(files)
This method is used to write the files to the database. It is called from the stream method, and it is not intended to be called directly.
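A construction sketch; the connection string form and the UserPasswordCredentials constructor arguments are assumptions, not documented here:

```python
from tabsdata import MSSQLDestination, UserPasswordCredentials  # assumed import path

dest = MSSQLDestination(
    "mssql://my-server:1433/analytics",                       # hypothetical connection string form
    ["dbo.events", "dbo.users"],                              # one destination table per result
    credentials=UserPasswordCredentials("loader", "secret"),  # assumed constructor arguments
    if_table_exists="append",
    chunk_size=50_000,                                        # rows written per batch
)
```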
- class MariaDBDest(
- conn: MariaDBConn,
- destination_tables: str | list[str],
- if_table_exists: Literal['append', 'replace'] = 'append',
- transactional: bool = True,
- chunk_size: Annotated[int, Gt(gt=0)] = 100000,
- loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
- )
Bases:
DestinationPlugin
- Categories:
destination
Destination plugin for writing data to MariaDB.
- property chunk_size: int
The chunk size for writing large results in batches.
- property conn: MariaDBConn
The MariaDB connection configuration.
- property destination_tables: list[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- property loader: Literal['polars_sqlalchemy']
The data processing loader to use for executing the DB write operation.
- stream(working_dir: str, *results)
Store the results into the MariaDB database.
- Parameters:
working_dir – The working directory where the results are stored.
results – The results to store in the SQL destination.
- property transactional: bool
Whether to use transactions for writing data to ensure consistency.
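A minimal sketch; MariaDBConn construction is not documented in this section, so conn stands for an already-configured connection object:

```python
# `conn` is a configured MariaDBConn; its construction is not shown in this reference
dest = MariaDBDest(
    conn=conn,
    destination_tables=["events", "users"],  # one table per result
    if_table_exists="append",
    transactional=True,                      # roll the whole write back on failure
    chunk_size=100_000,                      # rows per batch handed to the loader
)
```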
- class MariaDBDestination(**kwargs)
Bases:
DestinationPlugin
- Categories:
destination
MariaDB-based data outputs.
- chunk(working_dir: str, *results) → list[str | None]
Store the results in the SQL destination.
- Parameters:
working_dir – The working directory where the results will be stored.
results (list[pl.LazyFrame | None]) – The results to store in the SQL destination.
- property credentials: UserPasswordCredentials
The credentials required to access the MariaDB database.
- property destination_table: str | List[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- class MongoDBDestination(
- uri: str,
- collections_with_ids: tuple[str, str | None] | List[tuple[str, str | None]],
- credentials: UserPasswordCredentials = None,
- connection_options: dict = None,
- if_collection_exists: Literal['append', 'replace'] = 'append',
- use_trxs: bool = False,
- docs_per_trx: int = 1000,
- maintain_order: bool = False,
- update_existing: bool = True,
- fail_on_duplicate_key: bool = True,
- log_intermediate_files: bool = False,
- **kwargs,
- )
Bases:
DestinationPlugin
- Categories:
destination
MongoDB-based data outputs.
- chunk(working_dir: str, *results) → List[None | List[str]]
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property connection_options: dict
- property credentials: UserPasswordCredentials | None
- property if_collection_exists: Literal['append', 'replace']
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
- property uri: str
- write(files)
This method is used to write the files to the database. It is called from the stream method, and it is not intended to be called directly.
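A construction sketch. The meaning of each collections_with_ids tuple (a collection name paired with an id column, or None) is inferred from the parameter name; all values are placeholders:

```python
dest = MongoDBDestination(
    "mongodb://localhost:27017",
    [("events", "event_id"), ("audit_log", None)],  # (collection, id column or None)
    if_collection_exists="append",
    use_trxs=True,         # write documents inside transactions
    docs_per_trx=1000,     # documents per transaction
    update_existing=True,  # assumed to upsert documents whose id already exists
)
```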
- class MySQLDest(
- conn: MySQLConn,
- destination_tables: str | list[str],
- if_table_exists: Literal['append', 'replace'] = 'append',
- transactional: bool = True,
- chunk_size: Annotated[int, Gt(gt=0)] = 100000,
- loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
- )
Bases:
DestinationPlugin
- Categories:
destination
Destination plugin for writing data to MySQL.
- property chunk_size: int
The chunk size for writing large results in batches.
- property conn: MySQLConn
The MySQL connection configuration.
- property destination_tables: list[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- property loader: Literal['polars_sqlalchemy']
The data processing loader to use for executing the DB write operation.
- stream(working_dir: str, *results)
Store the results into the MySQL database.
- Parameters:
working_dir – The working directory where the results are stored.
results – The results to store in the SQL destination.
- property transactional: bool
Whether to use transactions for writing data to ensure consistency.
- class MySQLDestination(**kwargs)
Bases:
DestinationPlugin
- Categories:
destination
MySQL-based data outputs.
- property credentials: UserPasswordCredentials
The credentials required to access the MySQL database.
- property destination_table: str | List[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- class OracleDestination(
- uri: str,
- destination_table: List[str] | str,
- credentials: UserPasswordCredentials = None,
- if_table_exists: Literal['append', 'replace'] = 'append',
- )
Bases:
DestinationPlugin
- Categories:
destination
Oracle-based data outputs.
- property credentials: UserPasswordCredentials
The credentials required to access the Oracle database.
- property destination_table: str | List[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
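A construction sketch; the URI form and the UserPasswordCredentials constructor arguments are assumptions:

```python
dest = OracleDestination(
    "oracle://db-host:1521/ORCLPDB1",                         # hypothetical URI form
    ["SALES", "RETURNS"],                                     # one table per result
    credentials=UserPasswordCredentials("loader", "secret"),  # assumed constructor arguments
    if_table_exists="replace",
)
```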
- class PostgresDest(
- conn: PostgresConn,
- destination_tables: str | list[str],
- if_table_exists: Literal['append', 'replace'] = 'append',
- transactional: bool = True,
- chunk_size: Annotated[int, Gt(gt=0)] = 100000,
- loader: Literal['polars_sqlalchemy'] = 'polars_sqlalchemy',
- )
Bases:
DestinationPlugin
- Categories:
destination
Destination plugin for writing data to PostgreSQL.
- property chunk_size: int
The chunk size for writing large results in batches.
- property conn: PostgresConn
The Postgres connection configuration.
- property destination_tables: list[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- property loader: Literal['polars_sqlalchemy']
The data processing loader to use for executing the DB write operation.
- stream(working_dir: str, *results)
Store the results into the PostgreSQL database.
- Parameters:
working_dir – The working directory where the results are stored.
results – The results to store in the SQL destination.
- property transactional: bool
Whether to use transactions for writing data to ensure consistency.
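A minimal sketch; PostgresConn construction is not documented in this section, so conn stands for an already-configured connection object:

```python
# `conn` is a configured PostgresConn; its construction is not shown in this reference
dest = PostgresDest(
    conn=conn,
    destination_tables="public.events",
    if_table_exists="replace",  # recreate the table on every run
    transactional=False,        # trade atomicity for throughput on large loads
    chunk_size=100_000,
)
```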
- class PostgresDestination(**kwargs)
Bases:
DestinationPlugin
- Categories:
destination
Postgres-based data outputs.
- property credentials: UserPasswordCredentials
The credentials required to access the Postgres database.
- property destination_table: str | List[str]
The table(s) to create. If multiple tables are provided, they must be provided as a list.
- property if_table_exists: Literal['append', 'replace']
The strategy to follow when the table already exists.
- class S3Destination(
- uri: str | list[str],
- credentials: S3Credentials,
- format: str | FileFormat = None,
- region: str = None,
- catalog: AWSGlue = None,
- )
Bases:
DestinationPlugin
- Categories:
destination
S3-file-based data outputs.
- class SupportedFormats(*values)
Bases:
Enum
Enum for the supported formats for the S3Destination.
- avro = <class 'tabsdata._format.AvroFormat'>
- csv = <class 'tabsdata._format.CSVFormat'>
- ndjson = <class 'tabsdata._format.NDJSONFormat'>
- parquet = <class 'tabsdata._format.ParquetFormat'>
- property catalog: AWSGlue
The catalog to store the data in.
- Type:
Catalog
- chunk(working_dir: str, *results)
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property credentials: S3Credentials | S3AccessKeyCredentials
The credentials required to access the S3 bucket.
- Type:
S3Credentials
- property format: FileFormat
The format of the file. If not provided, it will be inferred from the file extension of the URI.
- Type:
FileFormat
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
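A construction sketch; creds stands for an S3Credentials instance (construction not shown here), and the bucket path is a placeholder:

```python
dest = S3Destination(
    "s3://my-bucket/exports/events.parquet",
    credentials=creds,   # an S3Credentials instance; construction not shown in this reference
    format="parquet",    # optional; otherwise inferred from the .parquet extension
    region="us-east-1",
)
```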
- class SnowflakeDestination(
- connection_parameters: dict,
- destination_table: List[str] | str,
- if_table_exists: Literal['append', 'replace'] = 'append',
- stage: str | None = None,
- **kwargs,
- )
Bases:
DestinationPlugin
- Categories:
destination
Snowflake-based data outputs.
- chunk(working_dir: str, *results) → List[None | str]
Trigger the exporting of the data to local parquet chunks. This method will receive the resulting data from the user function and must store it in the local system as parquet files, using the working_dir. Note: This method should not materialize the data, it should only store it in the local system.
- Parameters:
working_dir (str) – The folder where any files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
A list of the intermediate files created
- property if_table_exists: Literal['append', 'replace']
Returns the value of the if_table_exists property. This property determines what to do if the table already exists.
- stream(working_dir: str, *results)
Trigger the exporting of the data. This method will receive the resulting data from the user function and must store it in the desired location. Note: this method might materialize the data provided in a single chunk generated by the chunk function if invoked, so chunks should be of an appropriate size.
- Parameters:
working_dir (str) – The folder where any intermediate files generated must be stored (this refers to temporary files that will be deleted after the execution of the plugin, not the final destination of the data)
results – The data to be exported. It is a list of polars LazyFrames or None.
- Returns:
None
- write(files)
This method is used to write the files to the database. It is called from the stream method, and it is not intended to be called directly.
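A construction sketch; the connection_parameters keys are assumed to mirror the Snowflake Python connector, and all values are placeholders:

```python
dest = SnowflakeDestination(
    connection_parameters={   # assumed to follow the Snowflake Python connector keys
        "account": "myorg-myaccount",
        "user": "loader",
        "password": "...",
        "warehouse": "LOAD_WH",
        "database": "ANALYTICS",
        "schema": "PUBLIC",
    },
    destination_table="EVENTS",
    if_table_exists="append",
    stage="@loading_stage",   # optional named stage for the intermediate parquet files
)
```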