openclean.profiling.anomalies.base module
Abstract base class for anomaly and outlier detection operators.
- class openclean.profiling.anomalies.base.AnomalyDetector
Bases: openclean.profiling.base.DistinctSetProfiler
Interface for generic anomaly and outlier detectors. Each implementation should take a stream of distinct values (e.g., from a column in a data frame or a metadata object) as input and return a list of values that were identified as outliers.
- find(values: Union[Iterable[Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]], collections.Counter]) → List[Union[Dict, int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]]
Identify the values in a given set that are classified as outliers or anomalies. Returns a list of the identified values.
- Parameters
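values (iterable of values or collections.Counter) – Input values, given either as an iterable of scalar values or tuples, or as a counter over the distinct values.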
- Return type
list
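The contract described above can be illustrated with a minimal, self-contained sketch. It does not import openclean and is not the library's implementation: the class name `PredicateDetector` and its `is_valid` argument are hypothetical, chosen only to show a `find` method that accepts either an iterable or a `collections.Counter` and returns a list of the flagged values. A real detector would presumably subclass `AnomalyDetector` instead.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, List, Union


class PredicateDetector:
    """Hypothetical stand-in for an anomaly detector (not part of openclean).

    A value is treated as an outlier if the supplied predicate rejects it.
    """

    def __init__(self, is_valid: Callable[[Hashable], bool]):
        self.is_valid = is_valid

    def find(self, values: Union[Iterable[Hashable], Counter]) -> List[Hashable]:
        # Iterating a Counter yields its distinct keys; for any other
        # iterable, dict.fromkeys removes duplicates and keeps first-seen order.
        distinct = dict.fromkeys(values)
        return [value for value in distinct if not self.is_valid(value)]


# Flag ZIP-code-like strings that contain non-digit characters.
detector = PredicateDetector(is_valid=lambda v: isinstance(v, str) and v.isdigit())

from_list = detector.find(["10001", "10002", "1OOO3", "10001"])
from_counter = detector.find(Counter(["10001", "10002", "1OOO3", "10001"]))
```

Both calls return the same list because the sketch only looks at the distinct values; a frequency-based detector could instead use the counts stored in the `Counter`.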