tabsdata.LocalFileSource

class LocalFileSource(
path: str | list[str],
format: str | FileFormat = None,
initial_last_modified: str | datetime = None,
)

Bases: SourcePlugin

Source plugin that reads input data from files on the local file system.

__init__(
path: str | list[str],
format: str | FileFormat = None,
initial_last_modified: str | datetime = None,
)

Initializes the LocalFileSource with the given path and, optionally, a format and a cutoff date and time. When a cutoff is given, only files modified after it are considered.

Parameters:
  • path – The path where the files can be found. It can be a single path or a list of paths.

  • format – The format of the file or files. If not provided, it is inferred from the file extension. Can be either a string naming the format or a FileFormat object. Currently supported formats are ‘csv’, ‘parquet’, ‘avro’, ‘ndjson’, ‘jsonl’ and ‘log’.

  • initial_last_modified – If provided, only the files modified after this date and time will be considered. The date and time can be provided as a string in [ISO 8601 format](https://en.wikipedia.org/wiki/ISO_8601) or as a datetime object. If no timezone is provided, UTC will be assumed.
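The handling described for initial_last_modified (ISO 8601 string or datetime, with UTC assumed when no timezone is given) can be sketched with the standard library alone. This is only an illustration of the documented behaviour, not tabsdata's implementation; the helper name normalize_last_modified is hypothetical.

```python
from datetime import datetime, timezone


def normalize_last_modified(value):
    """Hypothetical helper: return a timezone-aware datetime, or None.

    Accepts an ISO 8601 string or a datetime object. A value without
    timezone information is assumed to be in UTC, as the parameter
    description states.
    """
    if value is None:
        return None
    if isinstance(value, str):
        # fromisoformat handles the common ISO 8601 forms used here.
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # No timezone supplied: assume UTC.
        value = value.replace(tzinfo=timezone.utc)
    return value


naive = normalize_last_modified("2024-01-15T08:30:00")
aware = normalize_last_modified("2024-01-15T08:30:00+02:00")
print(naive.isoformat())
print(aware.isoformat())
```

A string that already carries an offset keeps it; only naive values are tagged as UTC.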

Raises:
  • InputConfigurationError

  • FormatConfigurationError

Methods

__init__(path[, format, initial_last_modified])

Initializes the LocalFileSource with the given path and, optionally, a format and a cutoff date and time. When a cutoff is given, only files modified after it are considered.

chunk(working_dir)

Trigger the import of the data.

stream(working_dir)

Attributes

format

The format of the file or files.
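When no format is passed, the constructor documentation says it is inferred from the file extension. A minimal sketch of that idea follows, assuming a plain suffix lookup against the supported formats listed for the format parameter. The helper infer_format is hypothetical, handles only string formats (not FileFormat objects), and uses ValueError as a stand-in for whatever error the library actually raises.

```python
from pathlib import PurePath

# Format names taken from the documented list of supported formats.
SUPPORTED_FORMATS = {"csv", "parquet", "avro", "ndjson", "jsonl", "log"}


def infer_format(path, explicit=None):
    """Hypothetical helper: pick the explicit format or fall back to the suffix."""
    if explicit is not None:
        fmt = explicit
    else:
        # ".csv" -> "csv"; a path without an extension yields "".
        fmt = PurePath(path).suffix.lstrip(".")
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        # Stand-in error type; the real library may raise something else.
        raise ValueError(f"Unsupported or undetectable format: {fmt!r}")
    return fmt


print(infer_format("/data/sales_2024.csv"))
print(infer_format("/data/events.txt", explicit="ndjson"))
```

An explicit format takes precedence over the extension, which is useful when files carry a generic suffix such as .txt.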

initial_last_modified

The cutoff date and time; only files modified after it are considered.

initial_values

Return a dictionary with the initial values to be stored after execution of the plugin.

path

The path or paths to the files to load.