openclean.operator.stream.processor module
Operators in a data stream pipeline represent the actors in the definition of the pipeline. Each operator provides the functionality (factory pattern) to instantiate the operator for a given data stream before a data streaming pipeline is executed.
- class openclean.operator.stream.processor.StreamProcessor
Bases:
object
Stream processors represent definitions of actors in a data stream processing pipeline. Each operator implements the open method to create a prepared instance (consumer) for the operator that is used in a stream processing pipeline to filter, manipulate or profile data stream rows.
- abstract open(schema: List[Union[str, histore.document.schema.Column]]) openclean.operator.stream.consumer.StreamConsumer
Factory pattern for stream consumer. Returns an instance of the stream consumer that corresponds to the action that is defined by the stream processor.
- Parameters
schema (list of string) – List of column names in the data stream schema.
- Return type