openclean.function.token.filter module

Collection of functions to filter (remove) tokens from given token lists.

class openclean.function.token.filter.FirstLastFilter

Bases: openclean.function.token.base.TokenTransformer

Return a list that only contains the first and last element in a token list.

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Return a list that contains the first and last element from the input list. If the input is empty, the result is empty as well.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
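The first-and-last behavior can be illustrated with a minimal plain-Python sketch that does not import openclean. The function name `first_last` is hypothetical and plain strings stand in for Token objects. How a single-token list is handled is not stated in this reference, so the sketch assumes such a list is returned unchanged.

```python
from typing import List


def first_last(tokens: List[str]) -> List[str]:
    """Keep only the first and last element of a token list."""
    # Assumption: a list with fewer than two tokens is returned as a copy,
    # so an empty input yields an empty result.
    if len(tokens) < 2:
        return list(tokens)
    return [tokens[0], tokens[-1]]


print(first_last(["5th", "Ave", "New", "York"]))  # ['5th', 'York']
print(first_last([]))                             # []
```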

class openclean.function.token.filter.MinMaxFilter

Bases: openclean.function.token.base.TokenTransformerPipeline

Filter that returns the minimum and maximum token in a given list. This filter is implemented as a pipeline that first sorts the tokens and then returns the first and last token from the sorted list.
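The sort-then-select pipeline described here might look roughly like the following standalone sketch. It does not use openclean; `min_max` is a hypothetical name, plain strings stand in for Token objects, and Python's default string ordering is an assumption (the sort order used by the real pipeline is not specified in this reference).

```python
from typing import List


def min_max(tokens: List[str]) -> List[str]:
    """Return the smallest and largest token from the input list."""
    # Step 1: sort the tokens (assumed: default lexicographic order).
    ordered = sorted(tokens)
    # Step 2: keep the first and last element of the sorted list.
    # Assumption: fewer than two tokens are returned as they are.
    if len(ordered) < 2:
        return ordered
    return [ordered[0], ordered[-1]]


print(min_max(["pear", "apple", "zebra", "mango"]))  # ['apple', 'zebra']
```

Composing two small steps keeps each one reusable on its own, which is presumably why the filter is built as a pipeline rather than a single scan.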

class openclean.function.token.filter.RepeatedTokenFilter

Bases: openclean.function.token.base.TokenTransformer

Remove consecutive identical tokens in a given sequence.

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Returns a list where no two consecutive tokens are identical.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
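As an illustration of removing consecutive duplicates, the sketch below collapses each run of equal values into a single entry using `itertools.groupby`. It is independent of openclean: `drop_repeats` is a hypothetical name, and comparing tokens by plain string equality is an assumption about what "identical" means for Token objects.

```python
from itertools import groupby
from typing import List


def drop_repeats(tokens: List[str]) -> List[str]:
    """Collapse every run of equal, adjacent tokens into one token."""
    # groupby yields one (key, group) pair per run of equal adjacent items,
    # so keeping only the keys removes the consecutive repeats.
    return [key for key, _ in groupby(tokens)]


print(drop_repeats(["a", "a", "b", "a", "a", "c"]))  # ['a', 'b', 'a', 'c']
```

Note that only adjacent repeats are removed; the second run of `"a"` survives because a `"b"` separates it from the first.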

class openclean.function.token.filter.TokenFilter(predicate: openclean.function.value.base.ValueFunction)

Bases: openclean.function.token.base.TokenTransformer

Filter tokens based on a given predicate.

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Returns a list that contains only those tokens that satisfy the filter condition defined by the associated predicate.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
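Predicate-based filtering can be sketched in plain Python as follows. This is not the openclean implementation: `filter_tokens` is a hypothetical name, plain strings stand in for Token objects, and an ordinary callable is used in place of a ValueFunction (how a ValueFunction is evaluated is not shown in this reference).

```python
from typing import Callable, List


def filter_tokens(tokens: List[str], predicate: Callable[[str], bool]) -> List[str]:
    """Keep only the tokens for which the predicate returns True."""
    return [token for token in tokens if predicate(token)]


# Keep only tokens that consist entirely of digits.
print(filter_tokens(["12", "Main", "St", "7"], str.isdigit))  # ['12', '7']
```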

class openclean.function.token.filter.TokenTypeFilter(types: Set[str], negated: Optional[bool] = False)

Bases: openclean.function.token.base.TokenTransformer

Filter tokens in a given list by their type.

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Returns a list that contains only those tokens whose type satisfies the filter condition defined by the given set of types and the negated flag.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
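A type-based filter could be sketched as below. Everything here is illustrative and independent of openclean: `TypedToken`, `filter_by_type`, and the type labels `"ALPHA"` and `"NUMERIC"` are hypothetical. The meaning of `negated` is also an assumption: the sketch keeps tokens whose type is in `types` by default and drops them when `negated` is True.

```python
from typing import List, NamedTuple, Set


class TypedToken(NamedTuple):
    """Hypothetical stand-in for a token that carries a type label."""
    value: str
    type: str


def filter_by_type(
    tokens: List[TypedToken], types: Set[str], negated: bool = False
) -> List[TypedToken]:
    """Filter tokens by membership of their type in the given set."""
    # Assumption: negated=False keeps matching types, negated=True drops them.
    return [t for t in tokens if (t.type in types) != negated]


tokens = [
    TypedToken("12", "NUMERIC"),
    TypedToken("Main", "ALPHA"),
    TypedToken("St", "ALPHA"),
]
print([t.value for t in filter_by_type(tokens, {"ALPHA"})])                # ['Main', 'St']
print([t.value for t in filter_by_type(tokens, {"ALPHA"}, negated=True)])  # ['12']
```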