openclean.engine.log module
Log of actions that defines the history of a dataset.
- class openclean.engine.log.LogEntry(descriptor: Dict, action: Optional[openclean.engine.action.OpHandle] = None, version: Optional[int] = None)
Bases: object
Entry in an operation log for a dataset. Each entry maintains information about a committed or uncommitted snapshot of a dataset. Each log entry is associated with a unique identifier (UUID) and a descriptor for the action that created the snapshot.
For uncommitted snapshots, the handle for the action that created the snapshot is maintained together with the snapshot's version identifier in the data store for the dataset sample.
- action: Optional[openclean.engine.action.OpHandle] = None
- descriptor: Dict
- version: Optional[int] = None
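The relationship between the three fields can be illustrated with a small stand-in. This is a minimal sketch, not the openclean class: `SketchLogEntry`, its `identifier` field, and the `is_committed` property are hypothetical names, a plain string stands in for an `OpHandle`, and the rule that a committed entry carries no action handle is an assumption drawn from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional
from uuid import uuid4


@dataclass
class SketchLogEntry:
    """Hypothetical stand-in for a log entry; not the openclean implementation."""

    descriptor: Dict
    # Stand-in for an OpHandle; only kept for uncommitted snapshots.
    action: Optional[Any] = None
    # Version identifier in the data store for the dataset sample.
    version: Optional[int] = None
    # Unique identifier generated per entry (field name is an assumption).
    identifier: str = field(default_factory=lambda: uuid4().hex)

    @property
    def is_committed(self) -> bool:
        # Assumption: an entry without an action handle describes a
        # committed snapshot.
        return self.action is None


committed = SketchLogEntry(descriptor={"name": "load"})
uncommitted = SketchLogEntry(
    descriptor={"name": "strip"}, action="strip-handle", version=1
)
```

Here `committed.is_committed` is true and `uncommitted.is_committed` is false, and each entry receives its own identifier.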
- class openclean.engine.log.OperationLog(snapshots: List[histore.archive.snapshot.Snapshot])
Bases: object
The operation log maintains a list of entries containing provenance information for each snapshot of a dataset. Snapshots of a dataset are either committed, i.e., persisted with the data store that manages the full dataset, or uncommitted, i.e., persisted only with the data store for a dataset sample but not with the one for the full dataset.
- add(version: int, action: openclean.engine.action.OpHandle)
Append a record to the log.
- Parameters
version (int) – Dataset snapshot version identifier.
action (openclean.engine.action.OpHandle) – Handle for the operation that created the dataset snapshot.
- last_version() → int
Get version identifier of the last entry in the log.
- Return type
int
- truncate(pos: int)
Remove all log entries starting at the given index.
- Parameters
pos (int) – List position from which all entries in the log are removed. The entry at this position is removed as well.
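The semantics of add, last_version, and truncate described above can be sketched with a plain Python list. This is a simplified illustration under stated assumptions, not the openclean implementation: `SketchOperationLog` and its `entries` attribute are hypothetical, entries are stored as `(version, action)` tuples, strings stand in for `OpHandle` objects, and the constructor does not take histore snapshots.

```python
from typing import Any, List, Tuple


class SketchOperationLog:
    """Hypothetical, list-backed stand-in for an operation log."""

    def __init__(self) -> None:
        # Each entry is a (version, action) pair; a real log keeps richer
        # entry objects.
        self.entries: List[Tuple[int, Any]] = []

    def add(self, version: int, action: Any) -> None:
        # Append a record to the end of the log.
        self.entries.append((version, action))

    def last_version(self) -> int:
        # Version identifier of the last entry in the log.
        return self.entries[-1][0]

    def truncate(self, pos: int) -> None:
        # Remove the entry at pos and every entry after it.
        del self.entries[pos:]


log = SketchOperationLog()
log.add(version=0, action="load")
log.add(version=1, action="strip")
log.add(version=2, action="upper")

log.truncate(1)  # drops the entries for versions 1 and 2
```

After the call to `truncate(1)` only the first entry remains, so `log.last_version()` returns 0. Note that this sketch raises an `IndexError` when `last_version` is called on an empty log; how the real class handles that case is not stated in this reference.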