Releases: unionai-oss/pandera

MultiIndex checks, head/tail/sample validation, DataFrame-checks

14 Aug 01:07

Release Notes

  • Added support for MultiIndex column and index validation
  • DataFrameSchema can validate the head, tail, or a random sample of a dataframe
  • Checks and Hypothesis checks now support dataframe-level (wide) data validation

v0.1.4

10 Jun 13:56
988b14b

Add testing support for Python 3.7 and improve documentation

Pandera v0.1.3

10 Jun 03:29
6cfa2a7

This release adds a few nifty features to pandera, special thanks to @mastersplinter and @ralbertazzi:

  • We now have official documentation! Thanks to @mastersplinter on the work here.
  • the Check class now has a groupby argument, which enables the user to assert properties on subsets of the Column of interest. This opens up the possibility to compare the values or aggregates of values of subsets of a column #42.
  • the introduction of hypothesis tests through the Hypothesis class, which is a subclass of the Check class. This enables the user to run hypothesis tests on their dataframe as part of a DataFrameSchema definition. Refer to the documentation for more info #43.
  • Columns now have a required argument (default = True), where required=False means that the column is optional #23.
  • SeriesSchemaBase now has an allow_duplicates argument (default = True) #24
  • add informative errors to check_input and check_output decorators 902f199
  • DataFrameSchema(..., strict=True) means that all columns in the dataframe need to be specified in the schema columns. #34
  • improved error messaging in general.
  • improved CI (codecoverage).

Improve error reporting, add coerce option

29 Dec 19:35

This release adds two new features to pandera.

Improved error reporting

Now failure cases in column checks are displayed in a much more compact format,
where the failure cases, the index of the dataframe where those failures occur, and the
count of failure cases are shown to the user, e.g.

# failure cases:
#              index  count
# failure_case
# foo1           [0]      1
# foo2           [1]      1
# foo3           [2]      1

Coerce option in DataFrameSchema and Column

Now the user can coerce the dataframe when calling schema.validate so that
the columns are cast into the expected data-type before performing Checks.

New DataFrameSchema API

16 Dec 04:05

Release Notes

  • Major change: This release changes the API of the DataFrameSchema object.
    Instead of passing a list of Columns, you now pass a dictionary where the keys are column names
    and the values are Column objects. This makes the API feel a lot more familiar for pandas users, who may
    often define DataFrames in a similar way (see README for details).
  • renamed Validator to Check for brevity and clarity (accordingly renamed validator_{input, output}
    to check_{input, output}).
  • created convenience variables for PandasDtype so they can be accessed in pandera namespace:
    Bool, DateTime, Category, Float, Int, Object, String, Timedelta

bugfix: string-format datatype correctly checks type

10 Dec 21:54

pip dist bugfix

13 Nov 13:36
0.0.4

bump version 0.0.4

Update the schema Index API

11 Nov 17:24
0.0.3

bump version 0.0.3

alpha release

10 Nov 21:51

Initial release of pandera. The API is likely to change.