tabsdata.tableframe.udf.function

class UDF( output_columns, )

Bases: ABC

columns() → list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]

on_batch( series: list[Series], ) → list[Series]

on_batch( *series: Series, ) → list[Series]

Creating UDFs:

Subclass tabsdata.tableframe.udf.function.UDF.

Implement __init__ so that it calls super().__init__(output_columns), where output_columns is a single (name, data type) tuple or a list of such tuples that defines the UDF's default output schema. Each tuple must contain a column name (a string) and a data type (a DataType).

Override exactly one of on_batch or on_element to implement the UDF logic.

on_batch must return a list of TabsData Series; on_element must return a list of supported TabsData scalar values. In both cases, the returned list must contain one entry for each column in the output_columns passed to the UDF constructor.

Using UDFs:

Instantiate the UDF subclass created as above.

Pass the instance to the TableFrame method udf().

Optionally call UDF.with_columns() to override output column names or data types after instantiation.

By default, on_batch receives a list of series. Override the signature property to return “unpacked” to receive each series as a separate argument instead.
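The batch contract described above can be illustrated with a minimal, library-free sketch. Everything in it is an assumption made for illustration: SketchBatchUDF is a hypothetical stand-in for the real UDF base class, run_batch is a hypothetical driver, plain Python lists stand in for TabsData Series, and strings stand in for DataType objects. It only shows the shape of the contract (one returned series per declared output column), not how tabsdata implements it.

```python
from abc import ABC


class SketchBatchUDF(ABC):
    """Hypothetical stand-in for the UDF base class (not the tabsdata code)."""

    def __init__(self, output_columns):
        # Accept one (name, data type) tuple or a list of such tuples.
        if isinstance(output_columns, tuple):
            output_columns = [output_columns]
        self._columns = list(output_columns)

    def columns(self):
        return list(self._columns)

    def run_batch(self, series):
        # Hypothetical driver that enforces the documented length contract.
        result = self.on_batch(series)
        if len(result) != len(self._columns):
            raise ValueError("on_batch must return one series per output column")
        return result


class SumAndDiff(SketchBatchUDF):
    def __init__(self):
        # Strings stand in for DataType objects such as Int64.
        super().__init__([("total", "Int64"), ("diff", "Int64")])

    def on_batch(self, series):
        a, b = series  # list style: all input series arrive in one list
        total = [x + y for x, y in zip(a, b)]
        diff = [x - y for x, y in zip(a, b)]
        return [total, diff]


udf = SumAndDiff()
batch_out = udf.run_batch([[1, 2, 3], [10, 20, 30]])
print(udf.columns())
print(batch_out)
```

With the real library, the subclass would extend tabsdata.tableframe.udf.function.UDF and the instance would be passed to the TableFrame method udf() rather than to a hand-written driver.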

on_element( values: list[Any], ) → list[Any]

on_element( *values: Any, ) → list[Any]

By default, on_element receives a list of values. Override the signature property to return “unpacked” to receive each value as a separate argument instead.
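As a rough illustration of the element-wise variant, the sketch below applies on_element once per row using the default list style. SketchElementUDF and run_rows are hypothetical names introduced here, tuples of plain Python values stand in for rows, and the string "String" stands in for a DataType; none of this is the tabsdata implementation.

```python
class SketchElementUDF:
    """Hypothetical stand-in for the UDF base class (not the tabsdata code)."""

    def __init__(self, output_columns):
        # Accept one (name, data type) tuple or a list of such tuples.
        if isinstance(output_columns, tuple):
            output_columns = [output_columns]
        self._columns = list(output_columns)

    def run_rows(self, rows):
        # Hypothetical driver: call on_element once per row, list style.
        results = []
        for row in rows:
            out = self.on_element(list(row))
            if len(out) != len(self._columns):
                raise ValueError("on_element must return one value per output column")
            results.append(out)
        return results


class FullName(SketchElementUDF):
    def __init__(self):
        # A single tuple declares a single output column.
        super().__init__(("full_name", "String"))

    def on_element(self, values):
        first, last = values  # list style: all values arrive in one list
        return [f"{first} {last}"]


full_names = FullName().run_rows([("Ada", "Lovelace"), ("Alan", "Turing")])
print(full_names)
```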

property signature: Literal['list', 'unpacked']

Defines how parameters are passed to on_batch and on_element methods.

Returns:
“list”: Parameters are passed as a single list (default).
“unpacked”: Each parameter is passed as a separate argument.

Return type: Literal['list', 'unpacked']

Override this property in your UDF subclass to change the parameter style.
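The effect of the signature property can be sketched as follows. SketchSignatureUDF and its call_element dispatcher are hypothetical, written only to show how the two styles differ from the point of view of the subclass author; how tabsdata actually dispatches calls is not stated in this reference.

```python
from typing import Literal


class SketchSignatureUDF:
    """Hypothetical stand-in showing the two parameter styles."""

    @property
    def signature(self) -> Literal["list", "unpacked"]:
        # Default style: parameters arrive as a single list.
        return "list"

    def call_element(self, values):
        # Hypothetical dispatcher that honours the signature property.
        if self.signature == "unpacked":
            return self.on_element(*values)
        return self.on_element(list(values))


class AddList(SketchSignatureUDF):
    def on_element(self, values):
        return [values[0] + values[1]]


class AddUnpacked(SketchSignatureUDF):
    @property
    def signature(self) -> Literal["list", "unpacked"]:
        # Overriding the property switches to one argument per value.
        return "unpacked"

    def on_element(self, a, b):
        return [a + b]


list_result = AddList().call_element((2, 3))
unpacked_result = AddUnpacked().call_element((2, 3))
print(list_result, unpacked_result)
```

Both subclasses compute the same result; only the shape of the on_element parameters changes.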

with_columns( output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]], ) → UDF

class UDFList( output_columns, )

Bases: UDF, ABC

Abstract base class for UDFs that use list-style parameter passing.

When subclassing UDFList, implement on_batch or on_element with the list-style signature:

on_batch(self, series: list[Series]) -> list[Series]
on_element(self, values: list[Any]) -> list[Any]

property signature: Literal['list', 'unpacked']

Defines how parameters are passed to on_batch and on_element methods.

Returns:
“list”: Parameters are passed as a single list (default).
“unpacked”: Each parameter is passed as a separate argument.

Return type: Literal['list', 'unpacked']

Override this property in your UDF subclass to change the parameter style.

UDFList.columns() → list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]

UDFList.on_batch( *args, ) → list[Series]

UDFList.on_element( *args, ) → list[Any]

UDFList.with_columns( output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]], ) → UDF

class UDFUnpacked( output_columns, )

Bases: UDF, ABC

Abstract base class for UDFs that use unpacked-style parameter passing.

When subclassing UDFUnpacked, implement on_batch or on_element with the unpacked-style signature:

on_batch(self, col1: Series, col2: Series, ...) -> list[Series]
on_element(self, val1: Any, val2: Any, ...) -> list[Any]
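A minimal sketch of the unpacked style for on_batch is shown below. SketchUnpackedUDF is a hypothetical stand-in for UDFUnpacked, run_batch is a hypothetical driver, plain lists stand in for Series, and the string "Float64" stands in for a DataType. The point is only that each input column becomes its own named parameter.

```python
class SketchUnpackedUDF:
    """Hypothetical stand-in for UDFUnpacked (not the tabsdata code)."""

    # Assumed here: the unpacked base class fixes the style for subclasses.
    signature = "unpacked"

    def __init__(self, output_columns):
        if isinstance(output_columns, tuple):
            output_columns = [output_columns]
        self._columns = list(output_columns)

    def run_batch(self, series):
        # Hypothetical driver: each input series is passed as its own argument.
        result = self.on_batch(*series)
        if len(result) != len(self._columns):
            raise ValueError("on_batch must return one series per output column")
        return result


class Ratio(SketchUnpackedUDF):
    def __init__(self):
        super().__init__(("ratio", "Float64"))

    def on_batch(self, numerator, denominator):
        # Named parameters replace positional indexing into a list.
        return [[n / d for n, d in zip(numerator, denominator)]]


ratio_out = Ratio().run_batch([[1.0, 9.0], [2.0, 3.0]])
print(ratio_out)
```

Named parameters tend to read better when a UDF always takes a fixed, small number of columns; the list style suits UDFs that accept a variable number of inputs.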

property signature: Literal['list', 'unpacked']

Defines how parameters are passed to on_batch and on_element methods.

Returns:
“list”: Parameters are passed as a single list (default).
“unpacked”: Each parameter is passed as a separate argument.

Return type: Literal['list', 'unpacked']

Override this property in your UDF subclass to change the parameter style.

UDFUnpacked.columns() → list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]

UDFUnpacked.on_batch( *args, ) → list[Series]

UDFUnpacked.on_element( *args, ) → list[Any]

UDFUnpacked.with_columns( output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]], ) → UDF