openclean.function.value.mapping module
The mapping operator returns a dictionary that maps the original values in one or more data frame columns to the results of applying a given value function to them.
Lookup functions represent mappings using dictionaries.
- class openclean.function.value.mapping.Lookup(mapping: Dict, raise_error: Optional[bool] = False, default: Optional[Union[Callable, int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]] = None, as_string: Optional[bool] = False)
Bases:
openclean.function.value.base.PreparedFunction
Dictionary lookup function. Uses a mapping dictionary to convert given input values to their pre-defined targets.
- eval(value: Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]) Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]
Return the defined target value for a given lookup value.
- Parameters
value (scalar) – Scalar value in a data stream.
- Return type
any
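The constructor arguments are not described above, so the following is only a minimal sketch of how a dictionary lookup with these options might behave. It is a plain-Python stand-in, not the library's implementation: `lookup_sketch` is a hypothetical name, and the handling of `raise_error`, `default`, and `as_string` is assumed from the parameter names alone.

```python
from typing import Any, Callable, Dict, Union


def lookup_sketch(
    mapping: Dict[Any, Any],
    raise_error: bool = False,
    default: Union[Callable[[Any], Any], Any, None] = None,
    as_string: bool = False,
) -> Callable[[Any], Any]:
    """Hypothetical stand-in for Lookup; behavior below is assumed."""

    def eval_value(value: Any) -> Any:
        # Assumption: as_string converts the input to a string before lookup.
        key = str(value) if as_string else value
        if key in mapping:
            return mapping[key]
        # Assumption: raise_error turns a missing key into an exception.
        if raise_error:
            raise KeyError(value)
        # Assumption: a callable default is applied to the input value.
        if callable(default):
            return default(value)
        # Assumption: otherwise the default itself is returned.
        return default

    return eval_value


to_code = lookup_sketch({"Brooklyn": "BK", "Queens": "QN"}, default="??")
codes = [to_code(v) for v in ["Brooklyn", "Bronx"]]
```

Here `codes` holds the mapped value for "Brooklyn" and the fallback for "Bronx", which has no entry in the dictionary.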
- class openclean.function.value.mapping.Standardize(mapping: Dict)
Bases:
openclean.function.value.base.PreparedFunction
Use a mapping dictionary to standardize values. For a given value, if a mapping is defined in the dictionary the mapped value is returned. For all other values the original value is returned.
- eval(value: Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]) Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]
Return the defined target value for a given lookup value. If the given value is not included in the standardization mapping it will be returned as is.
- Parameters
value (scalar) – Scalar value in a data stream.
- Return type
any
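The pass-through rule described for Standardize can be illustrated in plain Python. This is a sketch of the documented behavior only; `standardize_sketch` is a hypothetical helper and does not use the library class.

```python
from typing import Any, Dict


def standardize_sketch(mapping: Dict[Any, Any], value: Any) -> Any:
    # Mapped values are replaced; every other value is returned as is.
    return mapping.get(value, value)


variants = {"NYC": "New York", "N.Y.": "New York"}
cleaned = [standardize_sketch(variants, v) for v in ["NYC", "Boston", "N.Y."]]
```

"Boston" has no entry in `variants`, so it stays unchanged while both spellings of New York collapse to one value.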
- openclean.function.value.mapping.mapping(df: pandas.core.frame.DataFrame, columns: Union[int, str, List[Union[str, int]]], func: Union[Callable, openclean.function.value.base.ValueFunction]) Dict
Get the mapping of values that are modified by a given value function.
- Parameters
df (pandas.DataFrame) – Input data frame.
columns (int, string, or list(int or string)) – Single column, or list of columns, given as index positions or column names.
func (callable or openclean.function.value.base.ValueFunction) – Callable or value function that accepts a single value as the argument.
- Return type
dict
- Raises
ValueError –
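As a rough illustration of the kind of dictionary this function returns, the sketch below applies a function to each distinct value of a single column. It uses a plain list in place of a data frame column, and `mapping_sketch` is a hypothetical name. Whether the library keeps entries for values that the function leaves unchanged is not stated above; this sketch keeps them.

```python
from typing import Any, Callable, Dict, Iterable


def mapping_sketch(
    values: Iterable[Any], func: Callable[[Any], Any]
) -> Dict[Any, Any]:
    """Map each distinct input value to func(value)."""
    result: Dict[Any, Any] = {}
    for value in values:
        # Evaluate the function once per distinct value.
        if value not in result:
            result[value] = func(value)
    return result


column = ["nyc", "NYC", "nyc", "Boston"]
upper_map = mapping_sketch(column, str.upper)
```

The resulting dictionary has one key per distinct value in `column`, which makes it easy to inspect what a value function would change before applying it.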
- openclean.function.value.mapping.replace(predicate: Callable, value: Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]) openclean.function.value.cond.ConditionalStatement
Return an instance of the Replace class for the given arguments.
- Parameters
predicate (callable) – Predicate that is evaluated on input values.
value (scalar or tuple) – Replacement value for inputs that satisfy the predicate.
- Return type
openclean.function.value.mapping.Replace
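Conditional replacement of this kind can be sketched with a closure. This is an illustration, not the library's ConditionalStatement or Replace class: `replace_sketch` is a hypothetical name, and returning non-matching inputs unchanged is an assumption that the text above does not confirm.

```python
from typing import Any, Callable


def replace_sketch(
    predicate: Callable[[Any], bool], value: Any
) -> Callable[[Any], Any]:
    """Build a function that swaps in `value` where the predicate holds."""

    def apply(item: Any) -> Any:
        # Assumption: inputs that fail the predicate pass through unchanged.
        return value if predicate(item) else item

    return apply


fill_missing = replace_sketch(lambda v: v == "", "N/A")
filled = [fill_missing(v) for v in ["a", "", "b"]]
```

Only the empty string satisfies the predicate, so it is the only element replaced in `filled`.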