openclean.function.eval.random module

Evaluation function that generates random numbers.

class openclean.function.eval.random.Rand(seed: Optional[int] = None)

Bases: openclean.function.eval.base.EvalFunction

Evaluation function that returns a random number in the interval [0, 1). It can be used, for example, to randomly select rows in a data frame by comparing each random value against a probability threshold.
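The threshold idea can be illustrated without openclean. The following is a minimal sketch, not the library's implementation: it uses only the Python standard library, a plain list of tuples stands in for the data frame, and the helper name sample_rows is hypothetical.

```python
import random


def sample_rows(rows, threshold, seed=None):
    """Keep each row with probability `threshold` (hypothetical helper)."""
    # A dedicated generator makes the selection reproducible for a given seed.
    rng = random.Random(seed)
    # random() returns a float in [0, 1); a row is kept when its draw
    # falls below the threshold.
    return [row for row in rows if rng.random() < threshold]


# A list of tuples stands in for the rows of a data frame.
rows = [(i, f"name-{i}") for i in range(1000)]
sample = sample_rows(rows, threshold=0.1, seed=42)
```

With a threshold of 0.1, roughly one row in ten is expected to be kept. Because every draw is strictly below 1.0, a threshold of 1.0 keeps all rows and a threshold of 0.0 keeps none.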

eval(df: pandas.core.frame.DataFrame) List[Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]]

Return a list of random numbers in the interval [0, 1).

Parameters

df (pd.DataFrame) – Pandas data frame.

Return type

list
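As a rough illustration of the shape of the result, the sketch below produces a list of random values in [0, 1). It assumes that one value is generated per row and that the seed is used to initialize the generator; neither detail is confirmed here. The function name random_values is hypothetical and is not part of openclean.

```python
import random


def random_values(row_count, seed=None):
    """Return `row_count` random floats in [0, 1) (hypothetical helper)."""
    # Assumption: seeding a generator makes repeated runs reproducible.
    rng = random.Random(seed)
    return [rng.random() for _ in range(row_count)]


values = random_values(5, seed=7)
```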

prepare(columns: List[Union[str, histore.document.schema.Column]]) Callable[[List[Union[int, float, str, datetime.datetime]]], Union[int, float, str, datetime.datetime, Tuple[Union[int, float, str, datetime.datetime]]]]

The prepare method returns a callable that returns a random number for every input row.

Parameters

columns (list of string) – List of column names in the schema of the data stream.

Return type

openclean.data.stream.base.StreamFunction
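The stream-oriented pattern described for prepare can be sketched as a closure: a factory builds the random generator once and returns a callable that is then invoked for each row. This is an illustrative stand-in written with the standard library only, not openclean's code; make_stream_function and stream_function are hypothetical names.

```python
import random


def make_stream_function(seed=None):
    """Return a callable that maps any row to a random float in [0, 1)."""
    # The generator is created once and shared by all calls to the closure.
    rng = random.Random(seed)

    def stream_function(row):
        # The row content is ignored; each call just draws the next number.
        return rng.random()

    return stream_function


func = make_stream_function(seed=3)
# Rows in a stream are represented here as plain lists of cell values.
stream = [["alice", 30], ["bob", 25], ["carol", 41]]
results = [func(row) for row in stream]
```

Building the callable once up front means per-row work is limited to a single draw, which suits processing rows one at a time.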