openclean.function.value.aggregate module

Value function that selects a value from a given list based on a given aggregator.

class openclean.function.value.aggregate.Longest(tiebreaker: Optional[openclean.function.value.base.ValueFunction] = None)

Bases: openclean.function.value.aggregate.ValueAggregator

Aggregator that selects the longest value from a given list of values.

class openclean.function.value.aggregate.Max

Bases: openclean.function.value.aggregate.ValueAggregator

Aggregator that selects the maximum value from a given list of values.

class openclean.function.value.aggregate.Min

Bases: openclean.function.value.aggregate.ValueAggregator

Aggregator that selects the minimum value from a given list of values.

class openclean.function.value.aggregate.Shortest(tiebreaker: Optional[openclean.function.value.base.ValueFunction] = None)

Bases: openclean.function.value.aggregate.ValueAggregator

Aggregator that selects the shortest value from a given list of values.

class openclean.function.value.aggregate.ValueAggregator(aggr: Callable, feature: Optional[Union[Callable, openclean.function.value.base.ValueFunction]] = None, tiebreaker: Optional[openclean.function.value.base.ValueFunction] = None)

Bases: openclean.function.value.base.UnpreparedFunction

Value function that selects a value from a list based on a given aggregation function. It passes the list of values to the aggregator and returns a constant value function with the aggregator result. A feature generator can optionally be applied to the given values before the aggregator is applied. If a feature function is given, the value that is associated with the selected feature is returned (and not the feature value itself).
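
The difference between returning the feature and returning the value associated with it can be illustrated with a minimal sketch in plain Python. It does not use the openclean classes; select_by_feature is a hypothetical helper introduced only for illustration, and it ignores ties by returning the first matching value.

```python
from typing import Any, Callable, List, Optional


def select_by_feature(
    values: List[Any],
    aggr: Callable[[List[Any]], Any],
    feature: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Hypothetical helper (not part of openclean): pick a value via aggr.

    Without a feature function the aggregator is applied to the values
    directly. With a feature function the aggregator is applied to the
    features, and the original value behind the selected feature is returned.
    """
    if feature is None:
        return aggr(values)
    features = [feature(v) for v in values]
    selected = aggr(features)
    # Map the selected feature back to the first original value that has it.
    return values[features.index(selected)]


# The feature is the string length, but the string itself is returned.
print(select_by_feature(["a", "abc", "ab"], max, len))  # abc
# The feature is the absolute value, but the signed value is returned.
print(select_by_feature([-5, 2, 3], max, abs))  # -5
```

Under this reading, Longest and Shortest roughly correspond to max and min applied to a length feature, while Max and Min apply the aggregator to the values directly.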

prepare(values: List[Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]]) → openclean.function.value.base.ValueFunction

Evaluate the aggregation function on the given list of values. Returns a constant function for the selected value.

If the feature generator is set, a feature is first generated for each value in the given list, and the set of original values is maintained for each feature value. The aggregator is applied to the list of generated feature values, and a constant function for the original value that is associated with the selected feature is returned.

If more than one value is associated with the selected feature, the tiebreaker function is evaluated on all associated values. If no tiebreaker was specified, a ValueError is raised.

Parameters

values (list) – List of scalar values or tuples of scalar values.

Return type

openclean.function.value.base.ConstantValue
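
The tie handling described for prepare can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: prepare_sketch is a hypothetical name, the tiebreaker is modeled as a plain callable that receives the list of tied values, and the selected value is returned directly rather than wrapped in a constant value function.

```python
from typing import Any, Callable, Dict, List, Optional


def prepare_sketch(
    values: List[Any],
    aggr: Callable[[List[Any]], Any],
    feature: Optional[Callable[[Any], Any]] = None,
    tiebreaker: Optional[Callable[[List[Any]], Any]] = None,
) -> Any:
    """Hypothetical sketch of the selection logic (not the openclean code)."""
    if feature is None:
        return aggr(values)
    # Group the distinct original values by their feature value.
    groups: Dict[Any, List[Any]] = {}
    for v in values:
        candidates = groups.setdefault(feature(v), [])
        if v not in candidates:
            candidates.append(v)
    # Apply the aggregator to the features, then look up the original values.
    selected = aggr(list(groups))
    tied = groups[selected]
    if len(tied) == 1:
        return tied[0]
    if tiebreaker is None:
        raise ValueError("multiple values share the selected feature")
    # Assumption: the tiebreaker picks one value from the tied candidates.
    return tiebreaker(tied)


# 'ab' and 'cd' both have the maximum length, so a tiebreaker is needed.
print(prepare_sketch(["ab", "cd", "e"], max, len, tiebreaker=min))  # ab
```

Calling the same function without a tiebreaker on ["ab", "cd", "e"] raises a ValueError in this sketch, matching the behavior described above.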