openclean.data.source.socrata module
Data repository for accessing datasets via the Socrata Open Data API.
- class openclean.data.source.socrata.SODADataset(doc: Dict, app_token: Optional[str] = None)
Bases: refdata.base.DatasetDescriptor
Handle for a SODA dataset.
- load() → pandas.core.frame.DataFrame
Download the dataset as a pandas data frame.
- Return type
pd.DataFrame
- write(file: IO)
Write the dataset to the given file. The output is a tab-delimited text file with the column names on the first line.
- Parameters
file (file object) – File-like object that provides a write method.
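The output layout described for write() can be illustrated with a small standalone sketch. It does not call openclean and is not the library's implementation: the helper write_tab_delimited and the sample rows are hypothetical, and the choice of "\n" as line terminator is an assumption.

```python
import csv
import io


def write_tab_delimited(columns, rows, file):
    """Write a header line followed by one tab-separated line per row.

    Hypothetical helper that only mimics the documented output layout.
    """
    writer = csv.writer(file, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)  # column names form the first line
    writer.writerows(rows)    # one output line per data row


# Any object that provides a write() method can serve as the target.
buffer = io.StringIO()
write_tab_delimited(["id", "name"], [[1, "Bronx"], [2, "Queens"]], buffer)
print(buffer.getvalue())
```

An in-memory buffer is used here only to keep the sketch free of file system side effects; an open text file would work the same way.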
- class openclean.data.source.socrata.Socrata(app_token: Optional[str] = None)
Bases: refdata.base.Descriptor
Repository handle for the Socrata Open Data API.
- catalog(domain: Optional[str] = None) → Iterable[openclean.data.source.socrata.SODADataset]
Generator for a listing of all datasets that are available from the repository. Provides the option to filter datasets by their domain.
- Parameters
domain (string, optional=None) – Optional domain name filter for returned datasets.
- Return type
iterable of openclean.data.source.socrata.SODADataset
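Because catalog() is a generator over all available datasets, it may be useful to take only the first few handles instead of materializing the full listing. The following is a minimal sketch under that assumption. The helper first_datasets and its limit parameter are hypothetical; it relies only on the documented catalog(domain=...) call, and the domain name in the usage comment is an example value.

```python
from itertools import islice


def first_datasets(repo, domain=None, limit=5):
    """Return at most `limit` handles from repo.catalog(domain=domain)."""
    # islice stops pulling from the generator once `limit` items are read,
    # so the remainder of the catalog is never requested.
    return list(islice(repo.catalog(domain=domain), limit))


# Usage against the real repository (requires network access):
#   from openclean.data.source.socrata import Socrata
#   handles = first_datasets(Socrata(), domain="data.cityofnewyork.us", limit=3)
```

Passing the repository as an argument keeps the helper independent of how the Socrata handle was created, for example with or without an app token.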
- dataset(identifier: str) → openclean.data.source.socrata.SODADataset
Get the handle for the dataset with the given identifier.
- Parameters
identifier (string) – Unique dataset identifier.
- Return type
openclean.data.source.socrata.SODADataset
- domains(filter: Optional[str] = None) → List[Tuple[str, str]]
Get a list of domain names that are available from the Socrata Open Data API. Returns a list of tuples, each containing the catalog URL and the domain name.
If the domain filter is given, only the domain that matches the filter is returned.
- Return type
list of tuples of string and string
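The tuples returned by domains() can be reduced to a plain list of names. This sketch assumes each tuple is ordered as (catalog URL, domain name), following the description above. The helper domain_names is hypothetical and uses only the documented domains(filter=...) call.

```python
def domain_names(repo, name=None):
    """Return the sorted domain names reported by repo.domains()."""
    # Assumption: each entry is a (catalog URL, domain name) tuple.
    return sorted(domain for _url, domain in repo.domains(filter=name))


# Usage against the real repository (requires network access):
#   from openclean.data.source.socrata import Socrata
#   names = domain_names(Socrata())
```

The helper's own parameter is called name so that the built-in filter function is not shadowed inside it.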