openclean.profiling.constraints.ucc module

Base classes for unique column combination (UCC) discovery. UCCs are a prerequisite for unique constraints and keys.

class openclean.profiling.constraints.ucc.UniqueColumnCombinationFinder

Bases: object

Interface for operators that discover unique column combinations in a given data frame.

abstract run(df: pandas.core.frame.DataFrame) → List[Union[int, str, List[Union[str, int]]]]

Run the implemented unique column combination discovery algorithm on the given data frame. Returns a list of all discovered unique column sets.

Parameters

df (pd.DataFrame) – Input data frame.

Return type

list
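
Example

The following is a minimal sketch of how the interface might be implemented, not the algorithm shipped with openclean. To keep it runnable without openclean or pandas installed, it defines a local stand-in base class (UCCFinderSketch, a hypothetical name) that mirrors the documented abstract run() method; in real code you would presumably subclass UniqueColumnCombinationFinder instead. The helper minimal_uccs and the class NaiveUCCFinder are likewise illustrative names. The search is a brute-force enumeration that assumes hashable cell values and reports only minimal combinations.

```python
from __future__ import annotations

from abc import ABCMeta, abstractmethod
from itertools import combinations
from typing import TYPE_CHECKING, Hashable, Iterable, List, Sequence

if TYPE_CHECKING:  # pandas is only needed for the type hints below
    import pandas as pd


class UCCFinderSketch(metaclass=ABCMeta):
    """Local stand-in for the documented interface (hypothetical name)."""

    @abstractmethod
    def run(self, df: "pd.DataFrame") -> list:
        """Return a list of discovered unique column sets."""
        raise NotImplementedError()


def minimal_uccs(
    columns: Sequence[Hashable], rows: Iterable[Sequence[Hashable]]
) -> List[list]:
    """Brute-force search for minimal unique column combinations.

    Assumes every cell value is hashable. The cost grows exponentially
    with the number of columns, so this is only suitable for small inputs.
    """
    rows = list(rows)
    found: List[tuple] = []
    # Try smaller combinations first so that supersets can be pruned.
    for size in range(1, len(columns) + 1):
        for combo in combinations(range(len(columns)), size):
            # A superset of a known UCC is unique too, but not minimal.
            if any(set(known) <= set(combo) for known in found):
                continue
            projected = [tuple(row[i] for i in combo) for row in rows]
            # The combination is unique if no projected row repeats.
            if len(set(projected)) == len(projected):
                found.append(combo)
    return [[columns[i] for i in combo] for combo in found]


class NaiveUCCFinder(UCCFinderSketch):
    """Illustrative finder that delegates to the brute-force helper."""

    def run(self, df: "pd.DataFrame") -> list:
        # itertuples(index=False, name=None) yields plain tuples per row.
        return minimal_uccs(
            list(df.columns), df.itertuples(index=False, name=None)
        )
```

With pandas available, calling NaiveUCCFinder().run(df) on a data frame returns each discovered combination as a list of column names, which is one of the shapes permitted by the documented return type. Practical UCC discovery algorithms prune the search space far more aggressively than this sketch does.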