Releases: unionai-oss/pandera
MultiIndex checks, head/tail/sample validation, DataFrame-checks
Release Notes
- Added support for MultiIndex column and index validation
- DataFrameSchema can validate the head, tail, or a random sample of a dataframe
- Check and Hypothesis checks now support dataframe-level (wide) data validation
v0.1.4
Add testing support for python 3.7, improved documentation
Pandera v0.1.3
This release adds a few nifty features to pandera, with special thanks to @mastersplinter and @ralbertazzi:
- We now have official documentation! Thanks to @mastersplinter on the work here.
- The Check class now has a groupby argument, which enables the user to assert properties on subsets of the Column of interest. This opens up the possibility of comparing the values, or aggregates of values, of subsets of a column #42.
- Introduced hypothesis tests through the Hypothesis class, which is a subclass of the Check class. This enables the user to run hypothesis tests on their dataframe as part of a DataFrameSchema definition. Refer to the documentation for more info #43.
- Columns now have a required argument (default = True), where required=False means that the column is optional #23.
- SeriesSchemaBase now has an allow_duplicates argument (default = True) #24.
- Added informative errors to the check_input and check_output decorators 902f199.
- DataFrameSchema(..., strict=True) means that all columns in the dataframe need to be specified in the schema columns. #34
- Improved error messaging in general.
- Improved CI (code coverage).
Improve error reporting, add coerce option
This release adds two new features to pandera.
Improved error reporting
Now failure cases in column checks are displayed in a much more compact format,
where the failure cases, the index of the dataframe where those failures occur, and the
count of failure cases are shown to the user, e.g.
# failure cases:
#              index  count
# failure_case
# foo1           [0]      1
# foo2           [1]      1
# foo3           [2]      1
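To make the layout of that report concrete, here is a pandas-only sketch that builds a table of the same shape: one row per distinct failing value, with the index labels where it occurs and how often. This illustrates the format and is not pandera's internal code; the series and the failing condition are hypothetical.

```python
import pandas as pd

# Hypothetical column in which every value starting with "foo" fails a check.
values = pd.Series(["foo1", "foo2", "foo3", "bar"])
failures = values[values.str.startswith("foo")]

# Group by the failing value, then collect the index labels and the count.
report = (
    failures.rename("failure_case")
    .reset_index()
    .groupby("failure_case")["index"]
    .agg(index=list, count="count")
)
print(report)
```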
Coerce option in DataFrameSchema and Column
Now the user can coerce the dataframe when calling schema.validate so that
the columns are cast into the expected data type before performing Checks.
New DataFrameSchema API
Release Notes
- Major change: this release changes the API of the DataFrameSchema object. Instead of passing a list of Columns, you now pass a dictionary where the keys are column names and the values are Column objects. This makes the API feel a lot more familiar for pandas users, who may often define DataFrames in a similar way (see README for details).
- Renamed Validator to Check for brevity and clarity (accordingly renamed validator_{input, output} to check_{input, output}).
- Created convenience variables for PandasDtype so they can be accessed in the pandera namespace: Bool, DateTime, Category, Float, Int, Object, String, Timedelta.
bugfix: string-format datatype correctly checks type
0.0.5 bump to version 0.0.5
pip dist bugfix
0.0.4 bump version 0.0.4
Update the schema Index API
0.0.3 bump version 0.0.3
alpha release
initial release of pandera. API likely to change