openclean.data package

Subpackages

  • openclean.data.archive package
    • Submodules
      • openclean.data.archive.base module
      • openclean.data.archive.cache module
      • openclean.data.archive.histore module
  • openclean.data.metadata package
    • Submodules
      • openclean.data.metadata.base module
      • openclean.data.metadata.fs module
      • openclean.data.metadata.mem module
  • openclean.data.source package
    • Submodules
      • openclean.data.source.socrata module
  • openclean.data.store package
    • Submodules
      • openclean.data.store.base module
      • openclean.data.store.fs module
      • openclean.data.store.mem module
  • openclean.data.stream package
    • Submodules
      • openclean.data.stream.base module
      • openclean.data.stream.csv module
      • openclean.data.stream.df module

Submodules

  • openclean.data.groupby module
  • openclean.data.load module
  • openclean.data.mapping module
  • openclean.data.refdata module
  • openclean.data.schema module
  • openclean.data.sequence module
  • openclean.data.serialize module
  • openclean.data.types module
  • openclean.data.util module

© Copyright 2021, New York University. Revision 177a2804.
