tabsdata.DatabricksDestination
class DatabricksDestination(
    host_url: str,
    token: str | Secret,
    tables: list[str] | str,
    volume: str,
    catalog: str | None = None,
    schema: str | None = None,
    warehouse: str | None = None,
    warehouse_id: str | None = None,
    if_table_exists: Literal['append', 'replace'] = 'append',
    schema_strategy: Literal['update', 'strict'] = 'update',
    **kwargs,
)
Bases: DestinationPlugin

Databricks based data outputs.
__init__(
    host_url: str,
    token: str | Secret,
    tables: list[str] | str,
    volume: str,
    catalog: str | None = None,
    schema: str | None = None,
    warehouse: str | None = None,
    warehouse_id: str | None = None,
    if_table_exists: Literal['append', 'replace'] = 'append',
    schema_strategy: Literal['update', 'strict'] = 'update',
    **kwargs,
)
Initializes the DatabricksDestination with the configuration used to store the data.
- Parameters:
host_url – Databricks workspace URL.
token – Databricks Personal Access Token (PAT). The user owning the PAT requires the ‘USE CATALOG’, ‘USE SCHEMA’, ‘READ VOLUME’ and ‘WRITE VOLUME’ privileges.
tables – Tables to be created in Databricks. Each table name must either be fully qualified in the form ‘catalog.schema.table_name’ or have the missing parts supplied through the ‘catalog’ and ‘schema’ parameters. For example, if a table of the form ‘schema.table_name’ is provided, the ‘catalog’ parameter must also be provided; if a bare ‘table_name’ is provided, both ‘catalog’ and ‘schema’ must be provided.
volume – Name of the Databricks volume. The connector uses an existing Databricks managed volume; the files written to the volume are deleted after the connector finishes writing the data to the Databricks table.
catalog – Catalog name (optional; may be None if every table name includes the catalog name).
schema – Schema name (optional; may be None if every table name includes the schema name).
warehouse – Name of the warehouse used for uploading data to the Databricks tables (provide either this or warehouse_id). The user must have privileges to use the warehouse.
warehouse_id – Warehouse ID. Use None if the warehouse name is provided.
if_table_exists – Optional strategy to follow when the table already exists. ‘replace’ will overwrite the existing table, and ‘append’ will append to the existing data in the table. Defaults to ‘append’.
schema_strategy – Strategy to follow when appending to a table with an existing schema. ‘update’ will update the schema with any new columns present in the TableFrame. ‘strict’ will not modify the schema, and will fail on any difference. Defaults to ‘update’.
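A minimal construction sketch for illustration. The workspace URL, secret name, warehouse, catalog, schema, table, and volume names are all hypothetical placeholders, and EnvironmentSecret is assumed to be tabsdata's environment-variable secret helper; a plain string token also satisfies the signature:

import tabsdata as td

# All identifiers below are hypothetical placeholders.
destination = td.DatabricksDestination(
    host_url="https://adb-1234567890123456.7.azuredatabricks.net",
    token=td.EnvironmentSecret("DATABRICKS_PAT"),  # assumed secret helper
    tables="main.analytics.daily_orders",  # fully qualified: catalog.schema.table
    volume="staging_volume",
    warehouse="shared-sql-warehouse",  # alternatively pass warehouse_id, not both
    if_table_exists="append",
    schema_strategy="update",
)

# Equivalent configuration with a bare table name, supplying catalog and
# schema through their own parameters:
destination = td.DatabricksDestination(
    host_url="https://adb-1234567890123456.7.azuredatabricks.net",
    token=td.EnvironmentSecret("DATABRICKS_PAT"),
    tables="daily_orders",
    catalog="main",
    schema="analytics",
    volume="staging_volume",
    warehouse="shared-sql-warehouse",
)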
Methods
__init__(host_url, token, tables, volume[, ...])
    Initializes the DatabricksDestination with the configuration used to store the data.
chunk(working_dir, *results)
    Triggers the export of the data to local parquet chunks.
stream(working_dir, *results)
    Triggers the export of the data.
write(files)
    Writes the files to Databricks.
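The chunk, stream, and write methods are invoked by the tabsdata runtime rather than called directly; a destination is normally attached to a subscriber. A hedged sketch of that pattern, assuming the usual @td.subscriber decorator and hypothetical table and workspace names:

import tabsdata as td

# Hedged sketch: assumes the standard tabsdata subscriber pattern; the
# source table "orders" and all Databricks identifiers are hypothetical.
@td.subscriber(
    tables="orders",
    destination=td.DatabricksDestination(
        host_url="https://adb-1234567890123456.7.azuredatabricks.net",
        token=td.EnvironmentSecret("DATABRICKS_PAT"),
        tables="main.analytics.orders",
        volume="staging_volume",
        warehouse="shared-sql-warehouse",
    ),
)
def export_orders(tf: td.TableFrame) -> td.TableFrame:
    # The TableFrame returned here is what the runtime writes to Databricks.
    return tf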
Attributes
host_url
if_table_exists
    The strategy to follow when the table already exists.
schema_strategy
    The strategy to follow when appending to a table with an existing schema.
tables
token