openclean.data.metadata.mem module
Implementation of the metadata store class that maintains metadata information about dataset snapshots in main memory.
- openclean.data.metadata.mem.KEY(column_id: Optional[int] = None, row_id: Optional[int] = None) str
Get the unique key for an identifiable object.
- Parameters
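column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).
row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).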
- Return type
string
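The following is a minimal sketch of how such a key could be derived from the two optional identifiers. The function name make_key and the key format ("ds", "C…", "R…") are assumptions made for illustration; the actual format returned by KEY is not specified here.

```python
from typing import Optional


def make_key(column_id: Optional[int] = None, row_id: Optional[int] = None) -> str:
    """Hypothetical stand-in for KEY; the key format is an assumption."""
    if column_id is None and row_id is None:
        # Neither identifier given: the object is the full dataset.
        return "ds"
    if row_id is None:
        # Only a column identifier: the object is a column.
        return f"C{column_id}"
    if column_id is None:
        # Only a row identifier: the object is a row.
        return f"R{row_id}"
    # Both identifiers given: the object is a single cell.
    return f"C{column_id}R{row_id}"
```

Whatever the concrete format, the property that matters is that each combination of column_id and row_id maps to a distinct string.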
- class openclean.data.metadata.mem.VolatileMetadataStore
Bases: openclean.data.metadata.base.MetadataStore
Metadata store that maintains annotations for a dataset snapshot in main memory. Metadata is not persisted in any other form and is therefore lost when the metadata store object is destroyed.
- read(column_id: Optional[int] = None, row_id: Optional[int] = None) Dict
Read the annotation dictionary for the specified object.
- Parameters
column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).
row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).
- Return type
dict
- write(doc: Dict, column_id: Optional[int] = None, row_id: Optional[int] = None)
Write the annotation dictionary for the specified object.
- Parameters
doc (dict) – Annotation dictionary that is being written to the store.
column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).
row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).
- Return type
dict
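The read and write semantics described above can be sketched with a plain dictionary. This is a self-contained illustration, not the openclean implementation: the class name InMemoryAnnotations is hypothetical, and returning an empty dictionary for an object without annotations is an assumption.

```python
from typing import Dict, Optional, Tuple


class InMemoryAnnotations:
    """Hypothetical stand-in for a volatile metadata store."""

    def __init__(self) -> None:
        # Annotations are keyed by the (column_id, row_id) pair.
        self._docs: Dict[Tuple[Optional[int], Optional[int]], Dict] = {}

    def read(self, column_id: Optional[int] = None, row_id: Optional[int] = None) -> Dict:
        # Assumption: an object without annotations yields an empty dict.
        return self._docs.get((column_id, row_id), {})

    def write(self, doc: Dict, column_id: Optional[int] = None, row_id: Optional[int] = None) -> None:
        # Replace any annotation dictionary stored for the same object.
        self._docs[(column_id, row_id)] = doc


store = InMemoryAnnotations()
store.write({"type": "int"}, column_id=0)   # annotate column 0
store.write({"source": "upload"})           # annotate the full dataset
column_doc = store.read(column_id=0)
dataset_doc = store.read()
missing_doc = store.read(row_id=7)          # nothing was written for row 7
```

Because the dictionary lives only on the object, every annotation disappears once store is garbage collected, which is what makes such a store volatile.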
- class openclean.data.metadata.mem.VolatileMetadataStoreFactory
Bases: openclean.data.metadata.base.MetadataStoreFactory
Factory pattern for volatile metadata stores. Maintains the created metadata stores for each version in memory.
- get_store(version: int) openclean.data.metadata.mem.VolatileMetadataStore
Get the metadata store for the dataset snapshot with the given version identifier.
- Parameters
version (int) – Unique version identifier.
- Return type
openclean.data.metadata.mem.VolatileMetadataStore
- rollback(version: int)
Remove metadata for all dataset versions that come after the given rollback version.
- Parameters
version (int) – Unique identifier of the rollback version.
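A minimal sketch of the factory behavior follows. The class names SketchStore and SketchStoreFactory are hypothetical. Two assumptions are made: a store is created on first access to a version, and "after" means a numerically greater version identifier.

```python
from typing import Dict


class SketchStore:
    """Hypothetical placeholder for a per-version metadata store."""

    def __init__(self) -> None:
        self.annotations: Dict[str, Dict] = {}


class SketchStoreFactory:
    """Hypothetical factory that keeps one store per version in memory."""

    def __init__(self) -> None:
        self._stores: Dict[int, SketchStore] = {}

    def get_store(self, version: int) -> SketchStore:
        # Assumption: create the store the first time a version is requested.
        if version not in self._stores:
            self._stores[version] = SketchStore()
        return self._stores[version]

    def rollback(self, version: int) -> None:
        # Assumption: drop every store whose version is greater than the
        # rollback version; the rollback version itself is kept.
        self._stores = {v: s for v, s in self._stores.items() if v <= version}


factory = SketchStoreFactory()
factory.get_store(1).annotations["ds"] = {"note": "first snapshot"}
factory.get_store(2).annotations["ds"] = {"note": "second snapshot"}
factory.rollback(1)
kept = factory.get_store(1).annotations
recreated = factory.get_store(2).annotations  # fresh, empty store after rollback
```

Repeated calls to get_store with the same version return the same object, so annotations written through one reference are visible through the other until a rollback removes that version.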