openclean.data.metadata.mem module

Implementation of the metadata store class that maintains metadata for dataset snapshots in main memory.

openclean.data.metadata.mem.KEY(column_id: Optional[int] = None, row_id: Optional[int] = None) → str

Get the unique key for an identifiable object.

Parameters
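  • column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).

  • row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).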

Return type

string
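
This page does not state the format of the returned key. As a minimal sketch only, the hypothetical function make_key below shows one way to derive a distinct key for a dataset, a column, a row, or a cell from the two optional identifiers. The function name and the key strings are assumptions made for illustration, not the library's actual output.

```python
from typing import Optional


def make_key(column_id: Optional[int] = None, row_id: Optional[int] = None) -> str:
    """Hypothetical stand-in for KEY; the key strings are illustrative only."""
    if column_id is None and row_id is None:
        # Neither identifier given: the object is the full dataset.
        return "ds"
    if row_id is None:
        # Only a column identifier given: the object is a column.
        return f"col_{column_id}"
    if column_id is None:
        # Only a row identifier given: the object is a row.
        return f"row_{row_id}"
    # Both identifiers given: the object is a single cell.
    return f"cell_{column_id}_{row_id}"
```

Whatever the concrete format, the property that matters is that the four kinds of object never share a key.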

class openclean.data.metadata.mem.VolatileMetadataStore

Bases: openclean.data.metadata.base.MetadataStore

Metadata store that maintains annotations for a dataset snapshot in main memory. Metadata is not persisted in any other form and is therefore lost when the metadata store object is destroyed.

read(column_id: Optional[int] = None, row_id: Optional[int] = None) → Dict

Read the annotation dictionary for the specified object.

Parameters
  • column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).

  • row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).

Return type

dict

write(doc: Dict, column_id: Optional[int] = None, row_id: Optional[int] = None)

Write the annotation dictionary for the specified object.

Parameters
  • doc (dict) – Annotation dictionary to store for the specified object.

  • column_id (int, default=None) – Column identifier for the referenced object (None for rows or full datasets).

  • row_id (int, default=None) – Row identifier for the referenced object (None for columns or full datasets).

Return type

dict
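
The read and write contract described above can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the openclean implementation: the class name InMemoryStore is hypothetical, annotations are keyed by the (column_id, row_id) pair, and returning an empty dictionary for an object without annotations is an assumption.

```python
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Hypothetical stand-in for a volatile metadata store."""

    def __init__(self) -> None:
        # Annotations live only in this dictionary and vanish with the object.
        self._docs: Dict[Tuple[Optional[int], Optional[int]], Dict] = {}

    def read(self, column_id: Optional[int] = None, row_id: Optional[int] = None) -> Dict:
        # Assumption: objects without annotations yield an empty dictionary.
        return dict(self._docs.get((column_id, row_id), {}))

    def write(self, doc: Dict, column_id: Optional[int] = None, row_id: Optional[int] = None) -> None:
        # Store a copy so later changes to the caller's dict do not leak in.
        self._docs[(column_id, row_id)] = dict(doc)


store = InMemoryStore()
store.write({"type": "int"}, column_id=1)        # annotate a column
store.write({"source": "upload"})                # annotate the full dataset
column_doc = store.read(column_id=1)
dataset_doc = store.read()
missing_doc = store.read(row_id=7)               # nothing written for this row
```

Leaving both identifiers as None addresses the dataset as a whole, which mirrors the parameter descriptions for read and write.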

class openclean.data.metadata.mem.VolatileMetadataStoreFactory

Bases: openclean.data.metadata.base.MetadataStoreFactory

Factory pattern for volatile metadata stores. Maintains the created metadata stores for each version in memory.

get_store(version: int) → openclean.data.metadata.mem.VolatileMetadataStore

Get the metadata store for the dataset snapshot with the given version identifier.

Parameters

version (int) – Unique version identifier.

Return type

openclean.data.metadata.mem.VolatileMetadataStore

rollback(version: int)

Remove metadata for all dataset versions that come after the given rollback version. Metadata for the rollback version itself is kept.

Parameters

version (int) – Unique identifier of the rollback version.
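
The factory behaviour described above (one store per version, and a rollback that discards later versions) can be sketched as follows. The class SketchStoreFactory is hypothetical, each store is represented by a plain dict for brevity, and creating a store on first access is an assumption rather than documented behaviour.

```python
from typing import Dict


class SketchStoreFactory:
    """Hypothetical stand-in for a volatile metadata store factory."""

    def __init__(self) -> None:
        # Maps a version identifier to its in-memory store (a plain dict here).
        self._stores: Dict[int, dict] = {}

    def get_store(self, version: int) -> dict:
        # Assumption: a store is created on first access and reused afterwards.
        return self._stores.setdefault(version, {})

    def rollback(self, version: int) -> None:
        # Keep the rollback version and everything before it; drop the rest.
        self._stores = {v: s for v, s in self._stores.items() if v <= version}


factory = SketchStoreFactory()
for v in (0, 1, 2):
    factory.get_store(v)["note"] = f"v{v}"

factory.rollback(1)
remaining = sorted(factory._stores)
```

After the rollback to version 1 in this sketch, only versions 0 and 1 remain; requesting version 2 again would produce a fresh, empty store.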