tabsdata.S3Source
- class S3Source(uri: str | List[str], credentials: dict | S3Credentials, format: str | dict | FileFormat = None, initial_last_modified: str | datetime = None, region: str = None)
Bases: Input
Class for managing the configuration of S3-file-based data inputs.
- format
The format of the file. If not provided, it will be inferred from the file extension of the data.
- Type:
FileFormat
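The extension-based inference described above could look roughly like the following. This is only a sketch: `infer_format` and `SUPPORTED_FORMATS` are hypothetical names introduced for illustration, not part of tabsdata, and the library's actual inference logic may differ. The format names are the ones listed as supported for the `format` parameter.

```python
from pathlib import PurePosixPath
from typing import Optional

# Format names listed in the documentation of the `format` parameter.
SUPPORTED_FORMATS = {"csv", "parquet", "ndjson", "jsonl", "log"}


def infer_format(uri: str) -> Optional[str]:
    """Hypothetical helper: guess the file format from the URI's extension.

    Returns None when the extension is missing or is not a supported format.
    """
    suffix = PurePosixPath(uri).suffix  # e.g. ".csv", or "" if there is none
    extension = suffix.lstrip(".").lower()
    return extension if extension in SUPPORTED_FORMATS else None
```

When a file name has no usable extension, passing `format` explicitly avoids relying on inference at all.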
- uri
The URI of the files, in the format ‘s3://path/to/files’. It can be a single URI or a list of URIs.
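Because `uri` accepts either a single string or a list, code that consumes it typically normalizes both shapes to a list first. The sketch below shows one way to do that; `normalize_uris` is a hypothetical helper, and the `ValueError` it raises is a stand-in, since the checks tabsdata performs and the exception it raises for a malformed URI are not described here.

```python
from typing import List, Union


def normalize_uris(uri: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: return the URIs as a list and check the scheme."""
    # A single string becomes a one-element list; a list is copied.
    uris = [uri] if isinstance(uri, str) else list(uri)
    for item in uris:
        if not item.startswith("s3://"):
            # Assumption: reject anything that is not an 's3://' URI.
            raise ValueError(f"Expected an 's3://' URI, got: {item!r}")
    return uris
```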
- credentials
The credentials required to access the S3 bucket.
- Type:
S3Credentials
- initial_last_modified
If provided, only the files modified after this date and time will be considered.
- Type:
- __init__(uri: str | List[str], credentials: dict | S3Credentials, format: str | dict | FileFormat = None, initial_last_modified: str | datetime = None, region: str = None)
- Initializes the S3Source with the given URI and the credentials required to access the S3 bucket, and optionally a format and a date and time after which files must have been modified to be considered.
- Parameters:
uri (str | List[str]) – The URI of the files with format: ‘s3://path/to/files’. It can be a single URI or a list of URIs.
credentials (dict | S3Credentials) – The credentials required to access the S3 bucket. Can be a dictionary or an S3Credentials object.
format (str | dict | FileFormat, optional) – The format of the file. If not provided, it will be inferred from the file extension of the data. Can be a string with the format name, a FileFormat object, or a dictionary with the format name under the ‘type’ key plus any additional format-specific information. Currently supported formats are ‘csv’, ‘parquet’, ‘ndjson’, ‘jsonl’ and ‘log’.
initial_last_modified (str | datetime.datetime, optional) – If provided, only the files modified after this date and time will be considered. The date and time can be provided as a string in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601) or as a datetime object. If no timezone is provided, UTC will be assumed.
region (str, optional) – The region where the S3 bucket is located. If not provided, the default AWS region will be used.
- Raises:
InputConfigurationError
FormatConfigurationError
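The handling of `initial_last_modified` described in the parameters above (ISO 8601 string or datetime object, with UTC assumed when no timezone is given) can be sketched with the standard library alone. `normalize_last_modified` is a hypothetical helper written for illustration, not a tabsdata function, and the library's own parsing may accept a different set of inputs. Note that `datetime.fromisoformat` only accepts a subset of ISO 8601 on Python versions before 3.11 (for example, it rejects a trailing `Z`).

```python
from datetime import datetime, timezone
from typing import Union


def normalize_last_modified(value: Union[str, datetime]) -> datetime:
    """Hypothetical helper: return a timezone-aware datetime, defaulting to UTC."""
    # Strings are parsed as ISO 8601; datetime objects are used as given.
    parsed = datetime.fromisoformat(value) if isinstance(value, str) else value
    if parsed.tzinfo is None:
        # No timezone given: assume UTC, as the parameter description states.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

A naive value such as `"2024-09-05T01:01:00"` comes back tagged as UTC, while a value that already carries an offset, such as `"2024-09-05T01:01:00+02:00"`, keeps that offset.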
Methods
__init__(uri, credentials[, format, ...]) – Initializes the S3Source with the given URI and the credentials required to access the S3 bucket.
Attributes
credentials – The credentials required to access the S3 bucket.
format – The format of the file.
initial_last_modified – The date and time after which the files were modified.
region – The region where the S3 bucket is located.
uri – The URI of the files, in the format 's3://path/to/files'.