Operations

class AllOk(
none_is_ok: bool = False,
nan_is_ok: bool = False,
*,
tags: str | list[str] | None = None,
)

Bases: BoolCriteria

Categories:

dq-operator

A boolean criterion that selects records where all boolean classifiers pass (True).

class AnyFailed(
none_is_ok: bool = False,
nan_is_ok: bool = False,
*,
tags: str | list[str] | None = None,
)

Bases: BoolCriteria

Categories:

dq-operator

A boolean criterion that selects records where at least one boolean classifier fails (False).
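The difference between the two criteria can be illustrated in plain Python. This is a minimal sketch, not the library's implementation: the functions `all_ok` and `any_failed` are hypothetical, and it assumes that `none_is_ok=True` means a missing (`None`) classifier value counts as a pass.

```python
from typing import Optional, Sequence


def _passes(value: Optional[bool], none_is_ok: bool) -> bool:
    # Assumption: None counts as a pass only when none_is_ok is set.
    if value is None:
        return none_is_ok
    return value


def all_ok(flags: Sequence[Optional[bool]], none_is_ok: bool = False) -> bool:
    """True when every boolean classifier value passes."""
    return all(_passes(v, none_is_ok) for v in flags)


def any_failed(flags: Sequence[Optional[bool]], none_is_ok: bool = False) -> bool:
    """True when at least one boolean classifier value fails."""
    return not all_ok(flags, none_is_ok)


# Each tuple holds the boolean classifier values of one record.
rows = [(True, True), (True, False), (True, None)]
print([all_ok(r) for r in rows])
print([any_failed(r) for r in rows])
print([all_ok(r, none_is_ok=True) for r in rows])
```

Under these assumptions the two criteria are complements of each other: a record is selected by exactly one of them.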

class Enrich(
to_table: str | None = None,
*,
tags: str | list[str] | None = None,
)

Bases: TagOperator

Categories:

dq-operator

An operator that adds the data quality classifier columns to the original table frame. It does not remove any rows from the original table.

property to_table: str | None

Returns the destination table that receives the enriched data quality classifier columns. If None, the original table is used.
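The effect of enrichment can be sketched with plain dictionaries. The helper `enrich` and the classifier name `amount_is_positive` are hypothetical and only illustrate the behavior described above: columns are added, and every row is kept.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def enrich(rows: List[Row], classifiers: Dict[str, Callable[[Row], bool]]) -> List[Row]:
    # Build new rows with one extra column per classifier; no row is dropped.
    return [
        {**row, **{name: fn(row) for name, fn in classifiers.items()}}
        for row in rows
    ]


rows = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": -5.0}]
enriched = enrich(rows, {"amount_is_positive": lambda r: r["amount"] > 0})
print(enriched)
```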

class Fail(
criteria: Criteria,
threshold: Threshold,
)

Bases: CriteriaOperator

Categories:

dq-operator

An operator that fails the current transaction if a data quality threshold is met.

property threshold: Threshold

Returns the threshold for this Fail operation.

class Filter(
criteria: Criteria,
to_table: str | None = None,
include_quality_columns: Literal['none', 'criteria', 'all'] = 'none',
)

Bases: CriteriaOperator

Categories:

dq-operator

An operator that filters rows from a table based on data quality criteria.

The filtered rows are removed from the table being processed. They can either be discarded or redirected to another table.

property include_quality_columns: Literal['none', 'criteria', 'all']

Returns which data quality columns to include in the table that receives the filtered-out rows.

property to_table: str | None

Returns the destination table for filtered rows, or None if they are discarded.
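The filtering behavior can be sketched as a split of the input rows. This is an illustration only: `filter_rows` is a hypothetical helper, and it assumes that the rows matching the criteria are the ones removed from the table being processed.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def filter_rows(
    rows: List[Row], matches: Callable[[Row], bool]
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (kept, filtered_out) using a criteria predicate."""
    kept: List[Row] = []
    filtered_out: List[Row] = []
    for row in rows:
        # Assumption: rows that match the criteria leave the processed table.
        (filtered_out if matches(row) else kept).append(row)
    return kept, filtered_out


rows = [{"id": 1, "ok": True}, {"id": 2, "ok": False}, {"id": 3, "ok": True}]
kept, filtered_out = filter_rows(rows, lambda r: not r["ok"])
print(kept)          # rows that stay in the processed table
print(filtered_out)  # rows to discard or redirect to another table
```

When `to_table` is None, the second list corresponds to rows that are simply discarded; otherwise it corresponds to the rows written to the destination table.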

class InBins(
column_name: str,
bins: Annotated[int, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])] | Literal['none', 'nan', 'underflow', 'overflow'] | Annotated[list[Annotated[int, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])] | Literal['none', 'nan', 'underflow', 'overflow']], Strict],
)

Bases: CategoryCriteria

Categories:

dq-operator

A category criterion that selects records where the value in the specified column falls into one of the given bins.

class NotInBins(
column_name: str,
bins: Annotated[int, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])] | Literal['none', 'nan', 'underflow', 'overflow'] | Annotated[list[Annotated[int, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])] | Literal['none', 'nan', 'underflow', 'overflow']], Strict],
)

Bases: CategoryCriteria

Categories:

dq-operator

A category criterion that selects records where the value in the specified column does not fall into any of the given bins.
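The two category criteria can be sketched as set membership tests. This is a sketch under stated assumptions, not the library's code: `in_bins` and `not_in_bins` are hypothetical helpers, and each record is assumed to already carry a category that is either an integer bin index or one of the special labels 'none', 'nan', 'underflow', 'overflow'.

```python
from typing import List, Set, Union

Bin = Union[int, str]  # an integer bin index or a special label


def _as_set(bins: Union[Bin, List[Bin]]) -> Set[Bin]:
    # The bins argument accepts either a single bin or a list of bins.
    return set(bins) if isinstance(bins, list) else {bins}


def in_bins(category: Bin, bins: Union[Bin, List[Bin]]) -> bool:
    """True when the category is one of the given bins."""
    return category in _as_set(bins)


def not_in_bins(category: Bin, bins: Union[Bin, List[Bin]]) -> bool:
    """True when the category is none of the given bins."""
    return not in_bins(category, bins)


categories = [0, 3, "nan", "overflow"]
selected = [c for c in categories if in_bins(c, [0, "nan"])]
rejected = [c for c in categories if not_in_bins(c, [0, "nan"])]
print(selected)
print(rejected)
```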

class PercentThreshold(
percent: Annotated[int, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])] | Annotated[float, Strict(strict=True), FieldInfo(annotation=NoneType, required=True, metadata=[Ge(ge=0), Le(le=100)])],
)

Bases: Threshold

Categories:

dq-operator

A threshold based on a percentage of total rows.

property percent: float

Returns the percentage for the threshold.
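A percentage threshold check, as used by an operator such as Fail, can be sketched as follows. The function `threshold_met` is hypothetical, and two points are assumptions rather than documented behavior: the threshold counts as met when the share of matching rows is greater than or equal to `percent`, and an empty table never meets it.

```python
def threshold_met(matching_rows: int, total_rows: int, percent: float) -> bool:
    """Return True when the share of matching rows reaches the threshold."""
    if not 0 <= percent <= 100:
        # Mirrors the documented 0 to 100 range of the percent parameter.
        raise ValueError("percent must be between 0 and 100")
    if total_rows == 0:
        # Assumption: an empty table never meets the threshold.
        return False
    return 100.0 * matching_rows / total_rows >= percent


print(threshold_met(5, 100, 5))
print(threshold_met(4, 100, 5))
```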

class Select(
criteria: Criteria,
to_table: str,
include_quality_columns: Literal['none', 'criteria', 'all'] = 'none',
)

Bases: CriteriaOperator

Categories:

dq-operator

An operator that selects rows based on data quality criteria and writes them to another table.

This operation does not modify the original table.

property include_quality_columns: Literal['none', 'criteria', 'all']

Returns which data quality columns to include in the table that receives the selected rows.

property to_table: str

Returns the destination table for the selected rows.
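Selection can be sketched as copying the matching rows while leaving the source untouched. `select_rows` is a hypothetical helper used only to illustrate the behavior described above.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def select_rows(rows: List[Row], matches: Callable[[Row], bool]) -> List[Row]:
    """Return copies of the rows that match the criteria; the input is not modified."""
    return [dict(row) for row in rows if matches(row)]


original = [{"id": 1, "ok": True}, {"id": 2, "ok": False}]
selected = select_rows(original, lambda r: not r["ok"])
print(selected)  # rows destined for the table named by to_table
print(original)  # the source rows, unchanged
```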

class Summary(
table: str | None = None,
*,
tags: str | list[str] | None = None,
)

Bases: TagOperator

Categories:

dq-operator

An operator that generates a data quality summary report for a table.

property table: str | None

Returns the name of the data quality summary table.