tabsdata.tableframe.lazyframe.frame.TableFrame.unique

TableFrame.unique(subset: ColumnNameOrSelector | Collection[ColumnNameOrSelector] | None = None, *, keep: UniqueKeepStrategy = 'any', maintain_order: bool = False) → TableFrame

Deduplicate rows from the TableFrame.

Parameters:
  • subset – Columns to evaluate for duplicate values. If None, all columns are considered.

  • keep – Which row to keep from each group of duplicates: 'first', 'last', 'any' (no guarantee about which one is kept), or 'none' (drop every row that has a duplicate). Defaults to 'any'.

  • maintain_order – If True, keep the rows in the same order as the input. Defaults to False, in which case the output order is not guaranteed.

Example:

>>> import tabsdata as td
>>>
>>> tf: td.TableFrame ...
>>>
┌──────┬──────┐
│ a    ┆ b    │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ A    ┆ 1    │
│ X    ┆ 10   │
│ C    ┆ 3    │
│ D    ┆ 5    │
│ M    ┆ 9    │
│ A    ┆ 100  │
│ M    ┆ 50   │
│ null ┆ 20   │
│ F    ┆ null │
└──────┴──────┘
>>>
>>> tf.unique("a", keep="last")
>>>
┌──────┬──────┐
│ a    ┆ b    │
│ ---  ┆ ---  │
│ str  ┆ i64  │
╞══════╪══════╡
│ D    ┆ 5    │
│ C    ┆ 3    │
│ X    ┆ 10   │
│ A    ┆ 100  │
│ M    ┆ 50   │
│ F    ┆ null │
│ null ┆ 20   │
└──────┴──────┘
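
The keep strategies can also be illustrated without tabsdata. The following is a minimal plain-Python sketch of the semantics described above, applied to the same rows as the example. It is not how TableFrame.unique is implemented; unique_rows, ROWS, and key_index are hypothetical names introduced only for this illustration, and the sketch always preserves input order (comparable to maintain_order=True) and treats 'any' the same as 'first' for simplicity.

```python
from collections import Counter

# Rows from the example above as plain (a, b) tuples; None stands in for null.
ROWS = [
    ("A", 1), ("X", 10), ("C", 3), ("D", 5), ("M", 9),
    ("A", 100), ("M", 50), (None, 20), ("F", None),
]


def unique_rows(rows, key_index=None, keep="any"):
    """Hypothetical helper that mimics the documented keep strategies.

    key_index is the position of the column to compare; None compares
    whole rows. Input order is always preserved in this sketch.
    """
    def key_of(row):
        return row if key_index is None else row[key_index]

    if keep == "none":
        # Drop every row whose key occurs more than once.
        counts = Counter(key_of(r) for r in rows)
        return [r for r in rows if counts[key_of(r)] == 1]

    if keep in ("first", "any"):
        # 'any' makes no promise about which row survives;
        # this sketch simply keeps the first one.
        ordered = rows
    elif keep == "last":
        # Walk backwards so the last occurrence of each key wins.
        ordered = list(reversed(rows))
    else:
        raise ValueError(f"unknown keep strategy: {keep!r}")

    seen = set()
    kept = []
    for row in ordered:
        key = key_of(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)

    # Restore the original row order after the backwards walk.
    return kept[::-1] if keep == "last" else kept


last = unique_rows(ROWS, key_index=0, keep="last")
none = unique_rows(ROWS, key_index=0, keep="none")
```

With keep="last" on column a, the sketch keeps ("A", 100) and ("M", 50), the same rows as the tf.unique("a", keep="last") output above, although in input order. With keep="none", both A rows and both M rows are dropped.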