openclean.engine.action module

Handles for operators that have been applied to a dataset. Handles are stored in a log (provenance record) for the dataset. Each operator handle contains all the information that is necessary to reapply the operator to a dataset version.

class openclean.engine.action.CommitOp

Bases: openclean.engine.action.OpHandle

Handle for a user commit operation.

to_eval() openclean.function.eval.base.EvalFunction

The commit operator cannot be converted to an evaluation function. Attempting to do so raises a runtime error.

Return type

openclean.function.eval.base.EvalFunction
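
The behavior described for this method can be illustrated with a small stand-alone sketch. The classes below (SketchCommitOp, SketchUpdateOp) and the helper replayable are hypothetical names written for this example and are not the openclean implementation. They assume only what the text states: calling to_eval() on a commit handle raises a runtime error. A plain Python callable stands in for an EvalFunction.

```python
class SketchCommitOp:
    """Hypothetical stand-in for a commit handle (not the openclean class)."""

    optype = "commit"

    def to_eval(self):
        # Mirrors the documented behavior: conversion is not supported.
        raise RuntimeError("cannot convert commit operator to an evaluation function")


class SketchUpdateOp:
    """Hypothetical stand-in for a handle that can be re-applied."""

    optype = "update"

    def to_eval(self):
        # A plain callable stands in for an EvalFunction in this sketch.
        return lambda value: str(value).upper()


def replayable(handles):
    """Return (optype, function) pairs for the handles that convert cleanly."""
    result = []
    for handle in handles:
        try:
            result.append((handle.optype, handle.to_eval()))
        except RuntimeError:
            # Non-convertible handles are skipped instead of aborting the loop.
            continue
    return result


log = [SketchUpdateOp(), SketchCommitOp(), SketchUpdateOp()]
converted = replayable(log)
```

A caller that walks a provenance log may prefer to check the operator type up front instead of catching the exception; the try/except form is used here only because it relies on nothing beyond the documented error.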

class openclean.engine.action.InsertOp(schema: List[Union[str, histore.document.schema.Column]], names: Union[str, List[str]], pos: Optional[int] = None, values: Optional[Union[int, float, str, datetime.datetime, openclean.engine.object.function.FunctionHandle]] = None, args: Optional[Dict] = None, sources: Optional[Union[int, str, List[Union[str, int]]]] = None)

Bases: openclean.engine.action.OpHandle

Handle for an insert operation.

property names: Union[str, List[str]]

Synonym for accessing the columns (which are the names of the inserted columns for an inscol operator).

Return type

string or list of string
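
As a rough illustration of what a synonym property of this kind looks like, the sketch below stores the inserted column names under a generic columns attribute and exposes them again as names. SketchInsertOp is a hypothetical class for this example; the actual attribute layout inside openclean is an assumption and may differ.

```python
from typing import List, Union


class SketchInsertOp:
    """Hypothetical stand-in for an insert handle (not the openclean class)."""

    def __init__(self, names: Union[str, List[str]]):
        # Assumption: the names are kept under the generic 'columns' attribute.
        self.columns = names

    @property
    def names(self) -> Union[str, List[str]]:
        # Read-only alias that returns the same object as 'columns'.
        return self.columns


op = SketchInsertOp(names=["first_name", "last_name"])
```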

to_dict() Dict

Get a dictionary serialization for the handle.

Return type

dict

to_eval() openclean.function.eval.base.EvalFunction

Get an evaluation function instance that can be used to re-apply the represented operation on a dataset version.

Return type

openclean.function.eval.base.EvalFunction

class openclean.engine.action.LoadOp

Bases: openclean.engine.action.OpHandle

Handle for a load operation.

to_eval() openclean.function.eval.base.EvalFunction

The load operator cannot be converted to an evaluation function. Attempting to do so raises a runtime error.

Return type

openclean.function.eval.base.EvalFunction

class openclean.engine.action.OpHandle(optype: str, schema: Optional[List[Union[str, histore.document.schema.Column]]] = None, columns: Optional[Union[int, str, List[Union[str, int]]]] = None, func: Optional[Union[int, float, str, datetime.datetime, openclean.engine.object.function.FunctionHandle]] = None, args: Optional[Dict] = None, sources: Optional[Union[int, str, List[Union[str, int]]]] = None)

Bases: openclean.data.archive.base.ActionHandle

The operator handle defines the interface for entries in the provenance log of a dataset. The defined methods are used to store the handle and to re-apply the operation using an evaluation function that is generated from the operator metadata.

property is_insert: bool

True if the operator type is ‘inscol’.

Return type

bool

property is_update: bool

True if the operator type is ‘update’.

Return type

bool

to_dict() Dict

Get a dictionary serialization for the handle.

Return type

dict

abstract to_eval() openclean.function.eval.base.EvalFunction

Get an evaluation function instance that can be used to re-apply the represented operation on a dataset version.

Return type

openclean.function.eval.base.EvalFunction
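
The shape of this interface (two type-check properties, a dictionary serialization, and an abstract conversion method) can be sketched in a few lines of standard-library Python. Everything below is a simplified, hypothetical analogue: SketchOpHandle and SketchUpperOp are illustrative names, the dictionary keys are assumptions, and a plain callable stands in for an EvalFunction. Only the operator type strings 'inscol' and 'update' are taken from the documentation.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class SketchOpHandle(ABC):
    """Hypothetical, simplified analogue of an operator handle."""

    def __init__(self, optype: str, columns: Optional[List[str]] = None,
                 args: Optional[Dict] = None):
        self.optype = optype
        self.columns = columns
        self.args = args

    @property
    def is_insert(self) -> bool:
        # True if the operator type is 'inscol'.
        return self.optype == "inscol"

    @property
    def is_update(self) -> bool:
        # True if the operator type is 'update'.
        return self.optype == "update"

    def to_dict(self) -> Dict:
        # Key names are assumptions made for this sketch.
        doc = {"optype": self.optype}
        if self.columns is not None:
            doc["columns"] = self.columns
        if self.args is not None:
            doc["args"] = self.args
        return doc

    @abstractmethod
    def to_eval(self) -> Callable:
        """Return a callable that re-applies the operation."""


class SketchUpperOp(SketchOpHandle):
    """Concrete example: an update that upper-cases a value."""

    def __init__(self, columns: List[str]):
        super().__init__(optype="update", columns=columns)

    def to_eval(self) -> Callable:
        return lambda value: str(value).upper()


op = SketchUpperOp(columns=["city"])
```

Because to_eval() is abstract, the base class in this sketch cannot be instantiated directly; each concrete operation decides whether it returns a function or, as with the commit, load, and sample handles, raises an error.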

class openclean.engine.action.SampleOp(args: Optional[Dict] = None)

Bases: openclean.engine.action.OpHandle

Handle for a dataset sample operation.

to_eval() openclean.function.eval.base.EvalFunction

The sample operator cannot be converted to an evaluation function. Attempting to do so raises a runtime error.

Return type

openclean.function.eval.base.EvalFunction

class openclean.engine.action.UpdateOp(schema: List[Union[str, histore.document.schema.Column]], columns: Union[int, str, List[Union[str, int]]], func: openclean.engine.object.function.FunctionHandle, args: Optional[Dict] = None, sources: Optional[Union[int, str, List[Union[str, int]]]] = None)

Bases: openclean.engine.action.OpHandle

Handle for an update operation.

to_eval() openclean.function.eval.base.EvalFunction

Get an evaluation function instance that can be used to re-apply the represented operation on a dataset version.

Return type

openclean.function.eval.base.EvalFunction