tabsdata.tableframe.udf.function
- class UDF(
- output_columns: list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]] | tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64],
- )
Bases: ABC
- columns() -> list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]
- on_batch(
- series: list[Series],
- ) -> list[Series]
- on_batch(
- *series: Series,
- ) -> list[Series]
Creating UDFs:
1. Subclass tabsdata.tableframe.udf.function.UDF.
2. Implement __init__ to call super().__init__(output_columns), where output_columns is a tuple or list of tuples (name, data type) specifying the UDF default output schema (column names and data types). Each tuple must contain a column name (string) and a data type (DataType).
3. Override exactly one of on_batch or on_element to implement the UDF function logic.
4. Return a list of TabsData Series (for on_batch) or TabsData supported scalars (for on_element) with the same length as specified in the output schema.
If overriding the on_batch method, the return type must be a list of TabsData Series. If overriding the on_element method, the return type must be a list of supported TabsData scalar values. For both cases, the number of elements in the returned lists must match the number of elements in the output_columns list provided to the UDF constructor.
Using UDFs:
1. Instantiate a function created as above.
2. Pass it to TableFrame method udf().
3. Optionally use UDF.output_columns() to override output column names or data types after instantiation.
By default, on_batch receives a list of series. Override the signature property to return “unpacked” to receive each series as a separate argument instead.
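The creation steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions: `SketchUDF`, `run_batch`, and `SumAndDiff` are hypothetical names, Python lists stand in for TabsData Series, and strings stand in for data types. It mirrors only the contract described here (constructor schema, one overridden method, matching output length) and is not the library's implementation.

```python
class SketchUDF:
    """Hypothetical stand-in for the UDF base class (not tabsdata code)."""

    def __init__(self, output_columns):
        # Accept a single (name, data type) tuple or a list of such tuples.
        if isinstance(output_columns, tuple):
            output_columns = [output_columns]
        self._columns = list(output_columns)

    def columns(self):
        # Return a copy of the declared output schema.
        return list(self._columns)

    def on_batch(self, series):
        raise NotImplementedError("override on_batch in a subclass")

    def run_batch(self, series):
        # Hypothetical driver: call on_batch and enforce the length rule,
        # one returned series per declared output column.
        result = self.on_batch(series)
        if len(result) != len(self._columns):
            raise ValueError(
                f"expected {len(self._columns)} output series, got {len(result)}"
            )
        return result


class SumAndDiff(SketchUDF):
    def __init__(self):
        # Declare the default output schema: two columns.
        super().__init__([("sum", "Int64"), ("diff", "Int64")])

    def on_batch(self, series):
        # List-style signature: all input series arrive in one list.
        a, b = series
        return [
            [x + y for x, y in zip(a, b)],
            [x - y for x, y in zip(a, b)],
        ]


udf = SumAndDiff()
batch_result = udf.run_batch([[1, 2, 3], [10, 20, 30]])
```

Here `batch_result` holds two lists, one per declared output column. An `on_batch` that returned a different number of lists would fail the length check in this sketch.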
- on_element(
- values: list[Any],
- ) -> list[Any]
- on_element(
- *values: Any,
- ) -> list[Any]
The creation and usage steps are the same as for on_batch above.
By default, on_element receives a list of values. Override the signature property to return “unpacked” to receive each value as a separate argument instead.
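To make the element-wise contract concrete, the following sketch calls an `on_element`-style function once per row and collects the returned scalars column by column. The driver `apply_on_element` and the function `full_name_and_length` are hypothetical, and Python lists stand in for input and output columns. How tabsdata actually drives `on_element` is not specified here.

```python
from typing import Any, Callable


def apply_on_element(
    on_element: Callable[[list[Any]], list[Any]],
    columns: list[list[Any]],
    n_outputs: int,
) -> list[list[Any]]:
    """Hypothetical driver: one on_element call per row, list-style arguments."""
    outputs: list[list[Any]] = [[] for _ in range(n_outputs)]
    for row in zip(*columns):
        result = on_element(list(row))
        # The returned list must have one scalar per declared output column.
        if len(result) != n_outputs:
            raise ValueError(f"expected {n_outputs} values, got {len(result)}")
        for out, value in zip(outputs, result):
            out.append(value)
    return outputs


def full_name_and_length(values: list[Any]) -> list[Any]:
    # Receives the row's values as a single list (the default style).
    first, last = values
    full = f"{first} {last}"
    return [full, len(full)]


element_result = apply_on_element(
    full_name_and_length,
    [["Ada", "Alan"], ["Lovelace", "Turing"]],
    n_outputs=2,
)
```

Each call returns two scalars, so `element_result` contains two output columns with one entry per input row.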
- property signature: Literal['list', 'unpacked']
Defines how parameters are passed to on_batch and on_element methods.
- Returns:
“list”: Parameters are passed as a single list (default). “unpacked”: Each parameter is passed as a separate argument.
- Return type:
Literal['list', 'unpacked']
Override this property in your UDF subclass to change the parameter style.
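The effect of the two signature values can be sketched as a small dispatch step. The classes `ListStyleSketch` and `UnpackedStyleSketch` and the helper `call_on_element` are hypothetical stand-ins written for illustration. They show only the calling convention implied by the description above, not how tabsdata dispatches internally.

```python
from typing import Any, Literal


class ListStyleSketch:
    """Hypothetical stand-in (not a tabsdata class): default list style."""

    @property
    def signature(self) -> Literal["list", "unpacked"]:
        return "list"

    def on_element(self, values: list[Any]) -> list[Any]:
        # All inputs arrive in one list.
        return [values[0] * values[1]]


class UnpackedStyleSketch(ListStyleSketch):
    @property
    def signature(self) -> Literal["list", "unpacked"]:
        # Overriding the property switches the parameter style.
        return "unpacked"

    def on_element(self, price: Any, quantity: Any) -> list[Any]:
        # Each input arrives as its own argument.
        return [price * quantity]


def call_on_element(udf: ListStyleSketch, values: list[Any]) -> list[Any]:
    # Dispatch according to the declared parameter style.
    if udf.signature == "unpacked":
        return udf.on_element(*values)
    return udf.on_element(values)


list_result = call_on_element(ListStyleSketch(), [2.5, 4])
unpacked_result = call_on_element(UnpackedStyleSketch(), [2.5, 4])
```

Both calls compute the same product. Only the way the inputs reach `on_element` differs.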
- with_columns(
- output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]],
- )
- class UDFList(
- output_columns,
- )
Abstract base class for UDFs that use list-style parameter passing.
When subclassing UDFList, implement on_batch or on_element with list signature:
on_batch(self, series: list[Series]) -> list[Series]
on_element(self, values: list[Any]) -> list[Any]
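One situation where the list signature is convenient is a function whose logic does not depend on how many inputs it receives. The sketch below is an illustration only: `RowTotalSketch` is a hypothetical plain class, not a UDFList subclass, and it shows just the shape of a list-style `on_element` body.

```python
from typing import Any


class RowTotalSketch:
    """Hypothetical example of a list-style on_element body."""

    def on_element(self, values: list[Any]) -> list[Any]:
        # The same body handles any number of inputs; None values are skipped.
        return [sum(v for v in values if v is not None)]


row_total = RowTotalSketch()
two_inputs = row_total.on_element([1, 2])
four_inputs = row_total.on_element([1, None, 3, 4])
```

The method returns a one-element list in both calls, matching a single declared output column.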
- property signature: Literal['list', 'unpacked']
Defines how parameters are passed to on_batch and on_element methods.
- Returns:
“list”: Parameters are passed as a single list (default). “unpacked”: Each parameter is passed as a separate argument.
- Return type:
Literal['list', 'unpacked']
Override this property in your UDF subclass to change the parameter style.
- UDFList.columns() -> list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]
- UDFList.on_batch(
- *args,
- UDFList.on_element(
- *args,
- UDFList.with_columns(
- output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]],
- )
- class UDFUnpacked(
- output_columns,
- )
Abstract base class for UDFs that use unpacked-style parameter passing.
When subclassing UDFUnpacked, implement on_batch or on_element with unpacked signature:
on_batch(self, col1: Series, col2: Series, ...) -> list[Series]
on_element(self, val1: Any, val2: Any, ...) -> list[Any]
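With the unpacked signature, each input can be given a descriptive parameter name, and the number of inputs is fixed by the method definition. The sketch below illustrates this with hypothetical names (`RevenueSketch`, `call_unpacked`) and Python lists standing in for Series. It is not a UDFUnpacked subclass and does not reproduce tabsdata's own call path.

```python
class RevenueSketch:
    """Hypothetical example of an unpacked-style on_batch body."""

    def on_batch(self, price, quantity):
        # Each input series is a named parameter.
        return [[p * q for p, q in zip(price, quantity)]]


def call_unpacked(udf, series):
    # Unpacked style: spread the input series into separate arguments.
    return udf.on_batch(*series)


revenue = call_unpacked(RevenueSketch(), [[2.0, 3.0], [5, 4]])

# Passing a different number of series than the method declares fails at call time.
try:
    call_unpacked(RevenueSketch(), [[1.0], [2], [3]])
    arity_error = False
except TypeError:
    arity_error = True
```

In this sketch the arity mismatch surfaces as a Python `TypeError`. How tabsdata reports such a mismatch is not covered by this page.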
- property signature: Literal['list', 'unpacked']
Defines how parameters are passed to on_batch and on_element methods.
- Returns:
“list”: Parameters are passed as a single list (default). “unpacked”: Each parameter is passed as a separate argument.
- Return type:
Literal['list', 'unpacked']
Override this property in your UDF subclass to change the parameter style.
- UDFUnpacked.columns() -> list[tuple[str, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64]]
- UDFUnpacked.on_batch(
- *args,
- UDFUnpacked.on_element(
- *args,
- UDFUnpacked.with_columns(
- output_columns: tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None] | list[tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]] | dict[int, tuple[str | None, Boolean | Categorical | Date | Datetime | Decimal | Duration | Enum | Float32 | Float64 | Int8 | Int16 | Int64 | Int32 | Int128 | Null | String | Time | UInt8 | UInt16 | UInt32 | UInt64 | None]],
- )