openclean.function.value.phonetic module

String matchers that use phonetic algorithms to transform input strings into normalized phonetic encodings, which are then compared for equality using the exact matcher.

class openclean.function.value.phonetic.Metaphone

Bases: openclean.function.value.phonetic.PhoneticMatcher

String matcher that uses the Metaphone algorithm (included in the jellyfish library) to encode each string before comparing the resulting codes.

class openclean.function.value.phonetic.NYSIIS

Bases: openclean.function.value.phonetic.PhoneticMatcher

String matcher that uses the NYSIIS algorithm (developed by the New York State Identification and Intelligence System; included in the jellyfish library) to encode each string before comparing the resulting codes.

class openclean.function.value.phonetic.PhoneticMatcher(encoder: Callable)

Bases: openclean.function.matching.base.ExactSimilarity, openclean.function.value.base.PreparedFunction

String matcher for phonetic algorithms. Extends exact string similarity using a callable that implements a phonetic encoding algorithm.

eval(value: str) → str

Returns the phonetic encoding of the given value, as computed by the encoding algorithm associated with the matcher.

Parameters

value (string) – Value that is being encoded.

Return type

string
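
The idea of wrapping an encoder callable and comparing encodings for equality can be sketched as follows. This is a minimal illustration only, not openclean's implementation: the class PhoneticMatcherSketch, its is_match method, and the toy_encoder function are hypothetical names introduced here, and the toy encoder is a stand-in rather than a real phonetic algorithm.

```python
from typing import Callable


class PhoneticMatcherSketch:
    """Hypothetical stand-in for illustration; not openclean's PhoneticMatcher."""

    def __init__(self, encoder: Callable[[str], str]):
        # Any callable that maps a string to its encoding can be plugged in.
        self.encoder = encoder

    def eval(self, value: str) -> str:
        # Return the encoding of the value under the associated encoder.
        return self.encoder(value)

    def is_match(self, val_1: str, val_2: str) -> bool:
        # Two values match when their encodings are exactly equal.
        return self.eval(val_1) == self.eval(val_2)


def toy_encoder(value: str) -> str:
    """Toy encoding (assumption): uppercase, keep the first character,
    and drop all later vowels. Not a real phonetic algorithm."""
    text = value.upper()
    return text[:1] + "".join(ch for ch in text[1:] if ch not in "AEIOU")


matcher = PhoneticMatcherSketch(toy_encoder)
print(matcher.eval("Smith"))               # SMTH
print(matcher.is_match("Smith", "Smoth"))  # True
print(matcher.is_match("Smith", "Jones"))  # False
```

In practice the encoder would be a real phonetic function, for example one of the encoders from the jellyfish library that the subclasses in this module rely on.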

class openclean.function.value.phonetic.Soundex

Bases: openclean.function.value.phonetic.PhoneticMatcher

String matcher using Soundex encoding for a pair of strings. Uses the soundex function to encode each string as a four-character code (a letter followed by three digits) before comparing the codes.
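
To make the shape of a Soundex code concrete, the following is a rough pure-Python sketch of the commonly described American Soundex rules. The function soundex_sketch is a hypothetical helper written for this illustration; it is not the function this class calls, and a library implementation may treat edge cases (empty strings, non-ASCII letters) differently.

```python
# Consonant groups and their Soundex digits; vowels, Y, H and W have no digit.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex_sketch(value: str) -> str:
    """Hypothetical helper: encode a string as a letter plus three digits."""
    # Keep only ASCII letters (an assumption made for this sketch).
    letters = [ch for ch in value.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal digits.
            continue
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # Vowels reset prev, so a repeated digit after a vowel is kept.
        prev = digit
    # Pad with zeros or truncate to exactly four characters.
    return (code + "000")[:4]


print(soundex_sketch("Robert"))    # R163
print(soundex_sketch("Rupert"))    # R163
print(soundex_sketch("Ashcraft"))  # A261
```

"Robert" and "Rupert" receive the same code, which is why a matcher comparing codes rather than raw strings treats them as equal.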