openclean.operator.transform.sort module

Data frame transformation operator for sorting by data frame columns.

class openclean.operator.transform.sort.Sort(columns, reversed=None)

Bases: openclean.operator.base.DataFrameTransformer

Sort operator for data frames. Sorts a data frame by one or more columns. The sort order can be specified separately for each column.

transform(df)

Return a data frame that contains all rows and columns of the given input data frame, with the rows sorted by the specified sort columns.

Raises a ValueError if the list of columns contains an item that cannot be matched to a column in the given data frame.

Parameters

df (pandas.DataFrame) – Input data frame.

Return type

pandas.DataFrame
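
The per-column sort order described for this operator can be illustrated with plain pandas. This is a minimal sketch rather than the operator's actual implementation: the sample data is made up, and mapping each `reversed` flag to the negation of a pandas `ascending` flag is an assumption about how the two relate.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "city": ["Boston", "Austin", "Boston", "Austin"],
    "year": [2019, 2020, 2021, 2018],
})

# One flag per sort column: city in regular order, year reversed.
# Assumption: reversed=True corresponds to ascending=False in pandas.
reversed_flags = [False, True]

result = df.sort_values(
    by=["city", "year"],
    ascending=[not flag for flag in reversed_flags],
)

# All rows are kept; only their order changes.
print(result)
```

Within each city the rows appear from the latest year to the earliest, while the cities themselves are in alphabetical order.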

openclean.operator.transform.sort.order_by(df, columns, reversed=None)

Sort operator for data frames. Sort columns are referenced by their name or index position.

Parameters
  • df (pandas.DataFrame) – Input data frame.

  • columns (int, string, or list(int or string)) – Single column or list of column index positions or column names.

  • reversed (list(bool), default=None) – Specifies, for each sort column, whether the sort order is reversed. If given, the length of this list must match the length of the columns list.

Return type

pandas.DataFrame

Raises

ValueError
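
The argument rules above (columns given by name or index position, one `reversed` flag per sort column, a ValueError for anything that cannot be matched) can be sketched in pure Python. The helper `resolve_sort_columns` is hypothetical and not part of openclean. It only shows one way such arguments might be validated and turned into column names plus ascending flags.

```python
from typing import List, Optional, Sequence, Tuple, Union

ColumnRef = Union[int, str]


def resolve_sort_columns(
    column_names: Sequence[str],
    columns: Union[ColumnRef, List[ColumnRef]],
    reversed: Optional[List[bool]] = None,  # name mirrors the documented parameter
) -> Tuple[List[str], List[bool]]:
    """Hypothetical helper: map column references to names and ascending flags."""
    # Accept a single column reference as well as a list of references.
    refs = columns if isinstance(columns, list) else [columns]
    if reversed is None:
        reversed = [False] * len(refs)
    if len(reversed) != len(refs):
        raise ValueError("reversed must have one entry per sort column")
    names = []
    for ref in refs:
        if isinstance(ref, int) and not isinstance(ref, bool):
            # Integer references are treated as index positions.
            if not 0 <= ref < len(column_names):
                raise ValueError(f"unknown column index {ref}")
            names.append(column_names[ref])
        elif ref in column_names:
            names.append(ref)
        else:
            raise ValueError(f"unknown column {ref!r}")
    # Assumption: a reversed column is sorted in descending order.
    ascending = [not flag for flag in reversed]
    return names, ascending


names, ascending = resolve_sort_columns(
    ["city", "year", "population"], ["city", 1], [False, True]
)
print(names, ascending)
```

The two returned lists have the shape expected by the `by` and `ascending` arguments of `pandas.DataFrame.sort_values`.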