Filters

TableFrame.extract_as_columns(
offset: int,
length: int,
) dict[str, list[Any]]
Categories:

filters

Extract a slice of rows from the table as a column-oriented dictionary.

The result is a mapping of column names to lists of values from the selected rows.

Parameters:
  • offset (int) – The starting row index of the slice.

  • length (int) – The number of rows to include in the slice.

Returns:

A dictionary where each key is a column name, and its value is a list of values from the selected slice.

Return type:

dict[str, list[Any]]

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.extract_as_columns(offset=0, length=2)
    {
        "a": ["A", "X"],
        "b": [1, 10]
    }
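
The slice semantics described above can be modelled in plain Python. This is only a sketch and does not call tabsdata: `extract_as_columns_sketch` is a hypothetical helper, and it assumes the table is held as a dict of equal-length column lists and that `offset` and `length` select the window `[offset, offset + length)`.

```python
from typing import Any


def extract_as_columns_sketch(
    columns: dict[str, list[Any]], offset: int, length: int
) -> dict[str, list[Any]]:
    """Hypothetical stand-in for the column-oriented extraction (assumed semantics)."""
    # Take the same row window from every column.
    return {name: values[offset : offset + length] for name, values in columns.items()}


table = {"a": ["A", "X", "C", "D", "M"], "b": [1, 10, 3, 5, 9]}
result = extract_as_columns_sketch(table, offset=0, length=2)
print(result)  # {'a': ['A', 'X'], 'b': [1, 10]}
```

Each key maps to the values of one column, so all lists in the result have the same length.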
TableFrame.extract_as_rows(
offset: int,
length: int,
) list[dict[str, Any]]
Categories:

filters

Extract a slice of rows from the TableFrame as a list of dictionaries.

Each dictionary represents one row, where keys are column names and values are the corresponding cell values.

Parameters:
  • offset (int) – The starting row index of the slice.

  • length (int) – The number of rows to include in the slice.

Returns:

A list of row dictionaries.

Return type:

list[dict[str, Any]]

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.extract_as_rows(offset=0, length=2)
    [
        {"a": "A", "b": 1},
        {"a": "X", "b": 10},
    ]
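
The row-oriented and column-oriented forms carry the same data. As an illustration only (plain Python, no tabsdata calls), the hypothetical helper `columns_to_rows` below converts a column-oriented dict into the list-of-dictionaries shape described above, assuming every column has the same length.

```python
from typing import Any


def columns_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: turn a column-oriented dict into a list of row dicts."""
    names = list(columns)
    # zip(*...) walks the columns in parallel, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


as_columns = {"a": ["A", "X"], "b": [1, 10]}
as_rows = columns_to_rows(as_columns)
print(as_rows)  # [{'a': 'A', 'b': 1}, {'a': 'X', 'b': 10}]
```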
TableFrame.filter(
*predicates: td_typing.IntoExprColumn | Iterable[td_typing.IntoExprColumn] | bool | list[bool] | np.ndarray[Any, Any],
) TableFrame
Categories:

filters

Filter the TableFrame based on the given predicates.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
│ A   ┆ 100 │
│ M   ┆ 50  │
└─────┴─────┘
>>>
>>> tf.filter(td.col("a").is_in(["A", "C"]).or_(td.col("b").eq(10)))
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ A   ┆ 100 │
└─────┴─────┘
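
To make the predicate in the example easier to read, here is a plain-Python sketch of the same row test. It does not use tabsdata; the `predicate` function is a hypothetical counterpart of `is_in(["A", "C"]).or_(eq(10))`, applied to rows held as dictionaries.

```python
rows = [
    {"a": "A", "b": 1},
    {"a": "X", "b": 10},
    {"a": "C", "b": 3},
    {"a": "D", "b": 5},
    {"a": "M", "b": 9},
    {"a": "A", "b": 100},
    {"a": "M", "b": 50},
]


def predicate(row: dict) -> bool:
    # Keep the row if "a" is one of the listed values OR "b" equals 10.
    return row["a"] in {"A", "C"} or row["b"] == 10


kept = [row for row in rows if predicate(row)]
print([(row["a"], row["b"]) for row in kept])
# [('A', 1), ('X', 10), ('C', 3), ('A', 100)]
```

A row is kept when either condition holds, which is why `X, 10` survives even though `X` is not in the list.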
TableFrame.first() TableFrame
Categories:

filters

Return a TableFrame with the first row.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.first()
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
└─────┴─────┘
TableFrame.first_row(
named: bool = False,
) tuple[Any, ...] | dict[str, Any] | None
Categories:

filters

Return the first row as a tuple, or as a dictionary if named is True. Return None if the TableFrame has no rows.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ a   ┆ 1   │
│ b   ┆ 2   │
│ c   ┆ 3   │
└─────┴─────┘
>>>
>>> tf.first_row()
>>>
('a', 1)
>>>
>>> tf.first_row(named=True)
>>>
{'A': 'a', 'B': 1}
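
The three possible return shapes (tuple, dictionary, None) can be sketched in plain Python. `first_row_sketch` is a hypothetical helper, not the tabsdata implementation; it assumes the table is a dict of equal-length column lists.

```python
from typing import Any, Optional, Union


def first_row_sketch(
    columns: dict[str, list[Any]], named: bool = False
) -> Optional[Union[tuple, dict[str, Any]]]:
    """Hypothetical model of the return shapes: tuple, dict, or None."""
    names = list(columns)
    # No columns or no rows: there is no first row to return.
    if not names or not columns[names[0]]:
        return None
    values = tuple(columns[name][0] for name in names)
    return dict(zip(names, values)) if named else values


table = {"A": ["a", "b", "c"], "B": [1, 2, 3]}
print(first_row_sketch(table))              # ('a', 1)
print(first_row_sketch(table, named=True))  # {'A': 'a', 'B': 1}
print(first_row_sketch({"A": [], "B": []}))  # None
```

Values keep their column types, so the integer column yields `1`, not `'1'`.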
TableFrame.head(
n: int = 5,
) TableFrame
Categories:

filters

Return a TableFrame with the first n rows.

Parameters:

n – The number of rows to return.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.head(2)
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
└─────┴─────┘
TableFrame.last() TableFrame
Categories:

filters

Return a TableFrame with the last row.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.last()
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ M   ┆ 9   │
└─────┴─────┘
TableFrame.last_row(
named: bool = False,
) tuple[Any, ...] | dict[str, Any] | None
Categories:

filters

Return the last row as a tuple, or as a dictionary if named is True. Return None if the TableFrame has no rows.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ A   ┆ B   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ a   ┆ 1   │
│ b   ┆ 2   │
│ c   ┆ 3   │
└─────┴─────┘
>>>
>>> tf.last_row()
>>>
('c', 3)
>>>
>>> tf.last_row(named=True)
>>>
{'A': 'c', 'B': 3}
TableFrame.limit(
n: int = 5,
) TableFrame
Categories:

filters

Return a TableFrame with the first n rows. This is equivalent to head.

Parameters:

n – The number of rows to return.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.limit(2)
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
└─────┴─────┘
TableFrame.slice(
offset: int,
length: int | None = None,
) TableFrame
Categories:

filters

Return a TableFrame with a slice of the original TableFrame.

Parameters:
  • offset – Slice starting index.

  • length – The length of the slice. None means all the way to the end.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.slice(2,2)
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ C   ┆ 3   │
│ D   ┆ 5   │
└─────┴─────┘
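
The role of the two parameters, including the `None` default for length, can be sketched with a Python list. `slice_rows` is a hypothetical helper, not a tabsdata call, and it assumes a non-negative offset.

```python
from typing import Any, Optional


def slice_rows(rows: list[Any], offset: int, length: Optional[int] = None) -> list[Any]:
    """Hypothetical model of slicing rows by offset and length (offset >= 0 assumed)."""
    # length=None means "all the way to the end".
    end = None if length is None else offset + length
    return rows[offset:end]


rows = [("A", 1), ("X", 10), ("C", 3), ("D", 5), ("M", 9)]
print(slice_rows(rows, 2, 2))  # [('C', 3), ('D', 5)]
print(slice_rows(rows, 3))     # [('D', 5), ('M', 9)]
```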
TableFrame.tail(
n: int = 5,
) TableFrame
Categories:

filters

Return a TableFrame with the last n rows.

Parameters:

n – The number of rows to return.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ A   ┆ 1   │
│ X   ┆ 10  │
│ C   ┆ 3   │
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
>>>
>>> tf.tail(2)
>>>
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ str ┆ i64 │
╞═════╪═════╡
│ D   ┆ 5   │
│ M   ┆ 9   │
└─────┴─────┘
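
As a rough model, taking the last n rows amounts to starting a slice n rows before the end. The plain-Python sketch below uses a hypothetical helper `tail_rows`; it assumes n is non-negative and that asking for more rows than exist simply returns every row.

```python
from typing import Any


def tail_rows(rows: list[Any], n: int = 5) -> list[Any]:
    """Hypothetical model of taking the last n rows (n >= 0 assumed)."""
    # Clamp at 0 so n larger than the table returns all rows,
    # and n == 0 returns an empty list rather than the whole table.
    start = max(len(rows) - n, 0)
    return rows[start:]


rows = [("A", 1), ("X", 10), ("C", 3), ("D", 5), ("M", 9)]
print(tail_rows(rows, 2))  # [('D', 5), ('M', 9)]
print(tail_rows(rows, 0))  # []
```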
TableFrame.unique(
subset: td_typing.ColumnNameOrSelector | Collection[td_typing.ColumnNameOrSelector] | None = None,
*,
keep: td_typing.UniqueKeepStrategy = 'any',
maintain_order: bool = False,
) TableFrame
Categories:

filters

Deduplicate rows from the TableFrame.

Parameters:
  • subset – Columns to evaluate for duplicate values. If None, all columns are considered.

  • keep – Which row to keep from each group of duplicates: first, last, any, or none (eliminate all duplicate rows).

  • maintain_order – Preserve the order of the rows.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌──────┬──────┐
│ a    ┆ b    │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ A    ┆ 1    │
│ X    ┆ 10   │
│ C    ┆ 3    │
│ D    ┆ 5    │
│ M    ┆ 9    │
│ A    ┆ 100  │
│ M    ┆ 50   │
│ null ┆ 20   │
│ F    ┆ null │
└──────┴──────┘
>>>
>>> tf.unique("a", keep="last")
>>>
┌──────┬──────┐
│ a    ┆ b    │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ D    ┆ 5    │
│ C    ┆ 3    │
│ X    ┆ 10   │
│ A    ┆ 100  │
│ M    ┆ 50   │
│ F    ┆ null │
│ null ┆ 20   │
└──────┴──────┘
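
The difference between the keep strategies can be illustrated in plain Python. `unique_sketch` is a hypothetical helper, not the tabsdata implementation. It models only first, last, and none, deduplicates on a single column, treats null (`None`) as an ordinary value, and always preserves row order, which the real method does only when maintain_order is set.

```python
from collections import Counter
from typing import Any


def unique_sketch(
    rows: list[dict[str, Any]], key: str, keep: str = "first"
) -> list[dict[str, Any]]:
    """Hypothetical model of deduplicating on one column."""
    if keep not in {"first", "last", "none"}:
        raise ValueError(f"unsupported keep strategy in this sketch: {keep!r}")
    if keep == "none":
        # Drop every row whose key value occurs more than once.
        counts = Counter(row[key] for row in rows)
        return [row for row in rows if counts[row[key]] == 1]
    # For 'last', scan from the end so the final occurrence is seen first.
    ordered = rows if keep == "first" else rows[::-1]
    seen: set[Any] = set()
    kept = []
    for row in ordered:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    # Restore the original row order after a reversed scan.
    return kept if keep == "first" else kept[::-1]


data = [("A", 1), ("X", 10), ("C", 3), ("D", 5), ("M", 9),
        ("A", 100), ("M", 50), (None, 20), ("F", None)]
rows = [{"a": a, "b": b} for a, b in data]

last = unique_sketch(rows, "a", keep="last")
print([(r["a"], r["b"]) for r in last])
# [('X', 10), ('C', 3), ('D', 5), ('A', 100), ('M', 50), (None, 20), ('F', None)]
```

With keep="last", the rows `A, 100` and `M, 50` win over the earlier `A, 1` and `M, 9`. This is the same set of rows as in the example output above, which lists them in a different order because maintain_order is False.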
Expr.filter(
*predicates: Expr | Series | str | Iterable[Expr | Series | str],
) Expr
Categories:

filters

Apply a filter predicate to an expression.

Elements for which the predicate does not evaluate to true are discarded. Elements for which it evaluates to null are also discarded.

Useful in an aggregation expression.

Parameters:

predicates – Expression(s) that evaluate to a boolean.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌───────┬─────────┐
│ state ┆ tickets │
│ ---   ┆ ---     │
│ str   ┆ i64     │
╞═══════╪═════════╡
│ CA    ┆ 1       │
│ AL    ┆ 3       │
│ CA    ┆ 2       │
│ NY    ┆ 2       │
│ NY    ┆ 3       │
└───────┴─────────┘
>>>
>>> tf.group_by("state").agg(td.col("tickets")
...   .filter(td.col("tickets") != 2)
...   .sum().alias("sum_non_two"))
>>>
┌───────┬─────────────┐
│ state ┆ sum_non_two │
│ ---   ┆ ---         │
│ str   ┆ i64         │
╞═══════╪═════════════╡
│ AL    ┆ 3           │
│ NY    ┆ 3           │
│ CA    ┆ 1           │
└───────┴─────────────┘
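
The filtered aggregation above can be traced by hand with a plain-Python sketch. `sum_non_two` is a hypothetical helper that does not use tabsdata. It assumes a null value is represented by `None`, and it does not model what a group reports when none of its elements pass the predicate.

```python
from typing import Optional


def sum_non_two(rows: list[tuple[str, Optional[int]]]) -> dict[str, int]:
    """Hypothetical model: per-state sum of tickets, skipping 2 and null."""
    sums: dict[str, int] = {}
    for state, tickets in rows:
        # A null value makes the predicate null, so the element is discarded,
        # just like an element for which the predicate is false.
        if tickets is None or tickets == 2:
            continue
        sums[state] = sums.get(state, 0) + tickets
    return sums


rows = [("CA", 1), ("AL", 3), ("CA", 2), ("NY", 2), ("NY", 3)]
totals = sum_non_two(rows)
print(totals)  # {'CA': 1, 'AL': 3, 'NY': 3}
```

The filter is applied to the elements inside each group before the sum, so the groups themselves are not removed.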
Expr.slice(
offset: int | Expr,
length: int | Expr | None = None,
) Expr
Categories:

filters

Compute a slice of the TableFrame for the specified columns.

Parameters:
  • offset – The offset at which the slice starts.

  • length – The length of the slice.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌──────┬──────┐
│ x    ┆ y    │
│ ---  ┆ ---  │
│ f64  ┆ f64  │
╞══════╪══════╡
│ 1.0  ┆ 2.0  │
│ 2.0  ┆ 2.0  │
│ NaN  ┆ NaN  │
│ 4.0  ┆ NaN  │
│ 5.0  ┆ null │
│ null ┆ null │
└──────┴──────┘
>>>
>>> tf.select(tf.all().slice(1,2))
>>>
┌─────┬─────┐
│ x   ┆ y   │
│ --- ┆ --- │
│ f64 ┆ f64 │
╞═════╪═════╡
│ 2.0 ┆ 2.0 │
│ NaN ┆ NaN │
└─────┴─────┘