openclean.function.value.domain module

Collection of scalar predicates that test domain membership.

class openclean.function.value.domain.BestMatch(matcher: openclean.function.matching.base.StringMatcher)

Bases: openclean.function.value.base.PreparedFunction

Value function that returns, for a given value, the best match in a controlled vocabulary.

eval(value)

Return the best matching value for a given query in the associated vocabulary. If the query term is in the vocabulary, it is returned as the result. If the term is not in the vocabulary, the best matching values are found using the associated vocabulary matcher. If no best match is found, or if multiple best matches are found, a ValueError is raised. Otherwise, the best matching value is returned as the function result.

Parameters

value (scalar) – Scalar value for which the best matching value in the associated vocabulary is computed.

Return type

scalar

Raises

ValueError
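
The lookup rules described for eval can be illustrated with a minimal, standalone sketch. It does not use openclean: the function name best_match, the cutoff parameter, and the use of difflib.SequenceMatcher as a stand-in for the vocabulary matcher are all assumptions made for illustration only.

```python
import difflib


def best_match(value, vocabulary, cutoff=0.6):
    """Hypothetical stand-in for the behavior described above.

    Returns `value` if it is in the vocabulary, otherwise the unique
    highest-scoring term. Raises ValueError if there is no candidate
    or if several candidates share the top score.
    """
    # Exact vocabulary members are returned unchanged.
    if value in vocabulary:
        return value
    # Score every term; difflib is an assumed substitute for the matcher.
    scored = [
        (difflib.SequenceMatcher(None, value, term).ratio(), term)
        for term in vocabulary
    ]
    candidates = [(score, term) for score, term in scored if score >= cutoff]
    if not candidates:
        raise ValueError("no match for {!r}".format(value))
    top_score = max(score for score, _ in candidates)
    best = [term for score, term in candidates if score == top_score]
    if len(best) > 1:
        raise ValueError("ambiguous matches for {!r}: {}".format(value, best))
    return best[0]


boroughs = ["Brooklyn", "Bronx", "Queens"]
print(best_match("Brookyn", boroughs))
```

Raising on ties, rather than picking an arbitrary winner, mirrors the documented contract that multiple best matches result in a ValueError.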

class openclean.function.value.domain.IsInDomain(domain, ignore_case=False, negated=False)

Bases: openclean.function.value.base.PreparedFunction

Callable function that wraps a list of values for containment checking. If the ignore case flag is True, all string values in the domain are converted to lower case.

eval(value)

Test if a given value is a member of the domain of known values.

Parameters

value (scalar) – Scalar value that is tested for being a domain member.

Return type

bool
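
As a rough illustration of the containment check, the following standalone sketch mimics the documented parameters without importing openclean. The helper names make_domain_check and lower_if_str are hypothetical. Two behaviors are assumptions not stated in the text above: that the tested value is also lower-cased when ignore_case is True, and that negated inverts the result.

```python
def lower_if_str(value):
    """Lower-case strings; leave every other value unchanged (assumption)."""
    return value.lower() if isinstance(value, str) else value


def make_domain_check(domain, ignore_case=False, negated=False):
    """Build a predicate that tests membership in `domain`."""
    # Store the domain as a set so that each lookup is a hash lookup.
    if ignore_case:
        known = {lower_if_str(v) for v in domain}
    else:
        known = set(domain)

    def check(value):
        # Assumed: the query is normalized the same way as the domain.
        if ignore_case:
            value = lower_if_str(value)
        # Assumed: `negated` flips the membership result.
        return (value in known) != negated

    return check


is_borough = make_domain_check(["Bronx", "Queens"], ignore_case=True)
print(is_borough("BRONX"), is_borough("Boston"))
```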

class openclean.function.value.domain.IsNotInDomain(domain, ignore_case=False)

Bases: openclean.function.value.domain.IsInDomain

Callable that wraps a list of values that define the domain of valid values. The list is used to identify those values that do not belong to the domain. If the ignore case flag is True, all string values in the domain are converted to lower case.
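
A typical use of such a predicate is to flag values that fall outside a list of valid values. The sketch below shows that idea in plain Python. The function values_outside_domain is hypothetical and is not part of openclean. It assumes that both the domain and the tested values are lower-cased when ignore_case is True.

```python
def values_outside_domain(values, domain, ignore_case=False):
    """Return the items of `values` that are not members of `domain`."""

    def normalize(v):
        # Only strings are lower-cased; other values pass through (assumption).
        return v.lower() if ignore_case and isinstance(v, str) else v

    known = {normalize(v) for v in domain}
    return [v for v in values if normalize(v) not in known]


states = ["NY", "CA"]
print(values_outside_domain(["NY", "ny", "XX"], states, ignore_case=True))
```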

openclean.function.value.domain.to_lower(value)

Convert a given value to lower case. Handles the case where the value is a list or tuple.

Parameters

value (string, list, or tuple) – Value that is transformed to lower case.

Return type

string, list, or tuple
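
The conversion can be sketched as follows. This is an illustrative reimplementation, not the library's code. The name to_lower_sketch is hypothetical. The handling of non-string elements, which are passed through unchanged, is an assumption that the description above does not confirm.

```python
def to_lower_sketch(value):
    """Lower-case a string, or each string inside a list or tuple."""
    if isinstance(value, str):
        return value.lower()
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type from the converted elements.
        return type(value)(to_lower_sketch(v) for v in value)
    # Assumed: values that are not strings, lists, or tuples are unchanged.
    return value


print(to_lower_sketch("ABC"))
print(to_lower_sketch(("NY", "US")))
```

Preserving the container type keeps tuples hashable, which matters if the converted values are stored in a set for membership tests.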