openclean.embedding.feature.length module

Feature function that computes normalized length for string representations of values in a list.

class openclean.embedding.feature.length.NormalizedLength(normalizer=None)

Bases: openclean.function.value.base.ValueFunction

Value function that computes a normalized length for the string representation of values in a given list.
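As an illustration of what a "normalized length" can look like, here is a minimal sketch that does not use openclean. It assumes min-max normalization over the string lengths; the class takes its normalizer as a parameter, so the actual scaling may differ. The helper name `normalized_lengths` is hypothetical.

```python
def normalized_lengths(values):
    """Min-max normalize the lengths of the string representations.

    Assumption: min-max scaling to [0, 1]; the library's normalizer
    is configurable and may behave differently.
    """
    # Length of the string representation, so non-string values work too.
    lengths = [len(str(v)) for v in values]
    lo, hi = min(lengths), max(lengths)
    if hi == lo:
        # All values have the same length; avoid division by zero.
        return [0.0 for _ in lengths]
    return [(n - lo) / (hi - lo) for n in lengths]


# String lengths are 1, 3 and 5 (the integer 12345 is measured as "12345").
result = normalized_lengths(["a", "abc", 12345])
```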

eval(value)

Return the normalized length for the string representation of the given value.

Parameters

value (scalar or tuple) – Value from the list that was used to prepare the function.

Return type

float

is_prepared()

Check whether the function has been prepared. The object still requires preparation while the normalization function is None.

Return type

bool

prepare(values)

Compute the length of the string representation of each value to be used as the feature function. Then initialize the normalization function using the list of value lengths.

Parameters

values (list) – List of values in the stream.

Return type

openclean.embedding.feature.length.NormalizedLength
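The prepare-then-eval lifecycle described above can be sketched with a stand-in class. This is not the openclean implementation: `LengthFeatureSketch` is a hypothetical name, the normalizer is assumed to be a plain callable built from the minimum and maximum lengths, and `prepare` here returns the same object, whereas the library may return a new instance.

```python
class LengthFeatureSketch:
    """Hypothetical stand-in mirroring the documented method names."""

    def __init__(self, normalizer=None):
        # Assumption: the normalizer is a callable mapping a length to a float.
        self.normalizer = normalizer

    def is_prepared(self):
        # Prepared once a normalization function is available.
        return self.normalizer is not None

    def prepare(self, values):
        # Collect string lengths and build a min-max normalizer from them.
        lengths = [len(str(v)) for v in values]
        lo, hi = min(lengths), max(lengths)
        span = hi - lo
        self.normalizer = lambda n: (n - lo) / span if span else 0.0
        return self

    def eval(self, value):
        if not self.is_prepared():
            raise RuntimeError("call prepare() before eval()")
        return self.normalizer(len(str(value)))


feature = LengthFeatureSketch()
was_prepared = feature.is_prepared()             # no normalizer yet
feature = feature.prepare(["NY", "Boston", "LA"])  # lengths 2, 6, 2
scores = [feature.eval(v) for v in ["NY", "Boston"]]
```

In this sketch, calling `eval` before `prepare` raises an error. How the library handles that case is not stated in this reference.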