openclean.function.token.convert module

Converters for tokens that change the value of a token and/or the token type.

class openclean.function.token.convert.TokenConverter

Bases: openclean.function.token.base.TokenTransformer

Interface for token converters that change token values and/or token types. The converter interface consists of two methods: the contains method checks whether the converter accepts a given token for conversion, and the convert method converts the token if it is accepted by the converter.

abstract contains(token: openclean.function.token.base.Token) → bool

Test if the converter contains a conversion rule for the given token.

Parameters

token (openclean.function.token.base.Token) – Token that is tested for acceptance by this converter.

Return type

bool

abstract convert(token: openclean.function.token.base.Token) → openclean.function.token.base.Token

Convert the given token according to the conversion rules that are implemented by the converter.

Returns a modified token.

Parameters

token (openclean.function.token.base.Token) – Token that is converted.

Return type

openclean.function.token.base.Token

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Convert accepted tokens in a given list of tokens.

For each token in the given list, the token is converted if the converter accepts it. Otherwise, the original token is added to the resulting token list unchanged.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
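
The contract described above (contains decides, convert rewrites, transform applies both to a list) can be illustrated with a minimal, self-contained sketch. It does not import openclean: SimpleToken, SketchConverter, and UpperCaseAlpha are hypothetical stand-ins invented for this example, and the real Token class may carry different attributes.

```python
from abc import ABC, abstractmethod
from typing import List, NamedTuple


class SimpleToken(NamedTuple):
    """Hypothetical stand-in for a token: a value plus a type label."""
    value: str
    token_type: str


class SketchConverter(ABC):
    """Hypothetical re-creation of the contains/convert/transform contract."""

    @abstractmethod
    def contains(self, token: SimpleToken) -> bool:
        """Return True if this converter accepts the token."""

    @abstractmethod
    def convert(self, token: SimpleToken) -> SimpleToken:
        """Return a modified copy of an accepted token."""

    def transform(self, tokens: List[SimpleToken]) -> List[SimpleToken]:
        # Convert accepted tokens; pass all other tokens through unchanged.
        return [self.convert(t) if self.contains(t) else t for t in tokens]


class UpperCaseAlpha(SketchConverter):
    """Assumed example rule: upper-case purely alphabetic tokens."""

    def contains(self, token: SimpleToken) -> bool:
        return token.value.isalpha()

    def convert(self, token: SimpleToken) -> SimpleToken:
        return SimpleToken(token.value.upper(), "ALPHA")


tokens = [SimpleToken("main", "ANY"), SimpleToken("42", "ANY")]
result = UpperCaseAlpha().transform(tokens)
```

In this sketch only the first token is rewritten; the numeric token is not accepted and is therefore copied to the result as is.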

class openclean.function.token.convert.TokenListConverter(converters: List[openclean.function.token.convert.TokenConverter])

Bases: openclean.function.token.base.TokenTransformer

Converter for a list of tokens. Implements the token transformer mixin interface. Uses a list of converters to convert tokens in a given list. The first converter that accepts a token in the list is used to transform the token.

transform(tokens: List[openclean.function.token.base.Token]) → List[openclean.function.token.base.Token]

Transform a list of tokens.

For each token in the given list, the converters passed at initialization are tried in the given order. The first converter that accepts the token is used to convert it. If no converter accepts the token, it is added to the result without changes.

Parameters

tokens (list of openclean.function.token.base.Token) – List of string tokens.

Return type

list of openclean.function.token.base.Token
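
The first-match behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: SimpleToken and transform_first_match are hypothetical names, and each converter is modeled as a plain (contains, convert) pair of functions rather than a TokenConverter instance.

```python
from typing import Callable, List, NamedTuple, Tuple


class SimpleToken(NamedTuple):
    """Hypothetical stand-in for a token: a value plus a type label."""
    value: str
    token_type: str


# Each hypothetical converter is a (contains, convert) pair of functions.
Converter = Tuple[
    Callable[[SimpleToken], bool],
    Callable[[SimpleToken], SimpleToken],
]


def transform_first_match(
    tokens: List[SimpleToken], converters: List[Converter]
) -> List[SimpleToken]:
    result = []
    for token in tokens:
        for contains, convert in converters:
            if contains(token):
                # Only the first accepting converter is applied.
                result.append(convert(token))
                break
        else:
            # No converter accepted the token: keep it unchanged.
            result.append(token)
    return result


digits = (lambda t: t.value.isdigit(), lambda t: SimpleToken(t.value, "NUM"))
alnum = (lambda t: t.value.isalnum(), lambda t: SimpleToken(t.value.lower(), "ALNUM"))

tokens = [SimpleToken("12", "ANY"), SimpleToken("Ave", "ANY"), SimpleToken("#", "ANY")]
result = transform_first_match(tokens, [digits, alnum])
```

The order of the converter list matters here: "12" is accepted by both rules, but because the digit rule comes first it is the one applied.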

class openclean.function.token.convert.TokenMapper(label: str, lookup: Union[Dict, Set])

Bases: openclean.function.token.convert.TokenConverter

Converter for tokens that uses a lookup table to map a given token to a new token value and a new token type. This class is used, for example, to standardize tokens for a semantic type.

contains(token: openclean.function.token.base.Token) → bool

Test if the given token is contained in the lookup table.

Parameters

token (openclean.function.token.base.Token) – Token that is tested for acceptance by this converter.

Return type

bool

convert(token: openclean.function.token.base.Token) → openclean.function.token.base.Token

Replace the given token with the respective value in the lookup table and the converter token type.

Returns a modified token.

Parameters

token (openclean.function.token.base.Token) – Token that is converted.

Return type

openclean.function.token.base.Token
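
A lookup-based converter of this kind might look like the sketch below. SimpleToken and SketchMapper are hypothetical stand-ins, and the handling of a set lookup (each member maps to itself, so only the token type changes) is an assumption made for this example; the documentation above does not state how a Set is interpreted.

```python
from typing import Dict, NamedTuple, Set, Union


class SimpleToken(NamedTuple):
    """Hypothetical stand-in for a token: a value plus a type label."""
    value: str
    token_type: str


class SketchMapper:
    """Hypothetical lookup-based converter, modeled on the description above."""

    def __init__(self, label: str, lookup: Union[Dict[str, str], Set[str]]):
        self.label = label
        # Assumption: a set is treated as an identity mapping.
        self.lookup = lookup if isinstance(lookup, dict) else {v: v for v in lookup}

    def contains(self, token: SimpleToken) -> bool:
        return token.value in self.lookup

    def convert(self, token: SimpleToken) -> SimpleToken:
        # Replace the value via the lookup table and assign the mapper's label.
        return SimpleToken(self.lookup[token.value], self.label)


# Dictionary lookup: standardize abbreviations and relabel the token.
mapper = SketchMapper("STREET_TYPE", {"ave": "avenue", "st": "street"})
token = SimpleToken("ave", "ANY")
converted = mapper.convert(token) if mapper.contains(token) else token

# Set lookup: the value stays the same, only the type label changes.
labeler = SketchMapper("STREET_TYPE", {"avenue", "street"})
relabeled = labeler.convert(SimpleToken("street", "ANY"))
```

Calling contains before convert mirrors the two-step interface of the converter; in this sketch, convert raises a KeyError for a token that is not in the lookup table.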