Releases: unionai-oss/pandera
v0.14.3: Fix regression with reporting missing columns
What's Changed
- fix regression: all missing columns should be reported by @cosmicBboy in #1117
Full Changelog: v0.14.2...v0.14.3
v0.14.2: docs fix - don't build pdf or epub versions
v0.14.1: add multimethod dependency
What's Changed
- multimethod is a runtime dependency by @timkpaine in #1112
New Contributors
- @timkpaine made their first contribution in #1112
Full Changelog: v0.14.0...v0.14.1
v0.14.0: ✍️ Pandera Internals Rewrite [phase 1]
⭐️ Highlights
The main highlight of this release is that phase 1 of the Pandera internals re-write is complete 🎉🚀! This is a backwards-compatible re-write (unit tests FTW 😅) that should just work with your existing pandera code. Please submit bug reports if you encounter any regressions that weren't covered by the current test suite.
PRs #913, #1109, and #1110 address #381 and essentially decouple pandas-specific logic from the pandera schema specification. In summary:
- The pandera schema specifications are defined in pandera.api, containing:
  - schema base classes in pandera.api.base
  - pandera schema classes in pandera.api.pandas
  - the global check and hypothesis namespace in pandera.api.checks.Check and pandera.api.hypotheses.Hypothesis
  - decorators are provided in pandera.api.extensions to be able to register builtin and custom checks/hypotheses
- The pandera backend validation logic is defined in pandera.backends, containing:
  - backend base classes in pandera.backends.base
  - pandas-specific backend validators in pandera.backends.pandas
Now, all pandas-specific logic is isolated to specific modules, so support for additional non-pandas-compliant schema specifications and their associated backends can be implemented either as 1st-party-maintained libraries (see issues for supporting polars and ibis) or as 3rd-party libraries.
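To make the specification/backend split concrete, here is a minimal, framework-free sketch of the general pattern. It is not pandera's implementation: ColumnSpec, ToySchema, BACKENDS, register_backend, and get_backend are all hypothetical names invented for illustration, and the "backend" validates a plain dict of lists rather than a pandas object.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# NOTE: every name in this block is hypothetical. It only illustrates the idea
# of separating a schema specification from its validation backend; it is not
# pandera's actual API.


@dataclass
class ColumnSpec:
    """Framework-agnostic description of a single column."""
    name: str
    check: Callable[[Any], bool]


@dataclass
class ToySchema:
    """Schema specification: stores column specs, holds no validation logic."""
    columns: List[ColumnSpec] = field(default_factory=list)

    def validate(self, data: Any) -> List[str]:
        # Delegate to whichever backend is registered for this container type.
        backend = get_backend(type(data))
        return backend(self, data)


# Registry mapping a data container type to its validation function.
BACKENDS: Dict[type, Callable[[ToySchema, Any], List[str]]] = {}


def register_backend(data_type: type):
    """Decorator that registers a validation function for a container type."""
    def decorator(fn):
        BACKENDS[data_type] = fn
        return fn
    return decorator


def get_backend(data_type: type):
    """Return the first backend whose registered type matches data_type."""
    for registered_type, backend in BACKENDS.items():
        if issubclass(data_type, registered_type):
            return backend
    raise TypeError(f"no backend registered for {data_type.__name__}")


@register_backend(dict)
def validate_dict_of_lists(schema: ToySchema, data: dict) -> List[str]:
    """Backend for a dict-of-lists container: {"column": [values, ...]}."""
    errors = []
    for col in schema.columns:
        if col.name not in data:
            errors.append(f"missing column: {col.name}")
            continue
        bad = [v for v in data[col.name] if not col.check(v)]
        if bad:
            errors.append(f"column {col.name}: failed values {bad}")
    return errors


schema = ToySchema([
    ColumnSpec("age", lambda v: isinstance(v, int) and v >= 0),
    ColumnSpec("name", lambda v: isinstance(v, str)),
])

# The schema knows nothing about dicts; the registered backend does the work.
errors = schema.validate({"age": [31, -4]})
print(errors)
```

In this sketch, supporting another container type would only require registering another function with register_backend; the ToySchema class itself would not change. That is the property the rewrite is described as aiming for.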
🛣 Rewrite Roadmap
The bulk of the re-write is complete; however, there are still some outstanding items:
- Write validation backends for the existing pandas-like frameworks (dask, pyspark.pandas, modin). This may lead to refactoring some of the abstractions that came out of the rewrite.
- Write an alpha version of the pandera-ibis package, which will create a schema specification and validation backends for ibis data structures (see issue #1105)
- Document the process of writing your own 3rd party libraries based on pandera for any arbitrary statistical data container.
What's Changed
- Bugfix/996: strict="filter" doesn't work on spark dataframes by @nwoodbury in #1001
- unpin pandas-stubs version by @williamjamir in #1000
- add PR messages, DCO to contributing guide by @cosmicBboy in #1006
- Turn failure-cases to string to avoid hashing unhashable objects by @a-recknagel in #1014
- do not require coerce==True for PydanticModels by @the-matt-morris in #1011
- Schema Model manipulation docs by @a-recknagel in #1012
- Fix handling of decimals with scale=0 by @a-recknagel in #1010
- Add Union support to check_types: Bugfix/977 by @kr-hansen in #995
- Bugfix/997 by @joepatol in #1017
- update mypy plugin and tests by @cosmicBboy in #1007
- fix issue where @check_types-decorated function is an iterable by @cosmicBboy in #1022
- fix mypy extra unit tests, pin pandas-stubs for dev env by @cosmicBboy in #1056
- Feature/511: Copy columns in DataFrameSchema.__init__() by @NickCrews in #1055
- Unpinning ray from requirements-dev.txt by @erichamers in #1052
- core and backend pandera API internals rewrite by @cosmicBboy in #913
- Small fix to example by @brl0 in #1083
- Coerce dt indexes and series by @cristianmatache in #1057
- correctly type-check strings by @cosmicBboy in #1106
- fix lazy validation issue with regex columns if no column found by @cosmicBboy in #1107
- fix(dtypes.py): correction at function is_numeric docstring by @HenriqueAJNB in #1100
- internals rewrite: clean up checks and hypothesis functionality by @cosmicBboy in #1109
- rename pandera.core to pandera.api by @cosmicBboy in #1110
New Contributors
- @nwoodbury made their first contribution in #1001
- @williamjamir made their first contribution in #1000
- @kr-hansen made their first contribution in #995
- @joepatol made their first contribution in #1017
- @erichamers made their first contribution in #1052
- @brl0 made their first contribution in #1083
- @HenriqueAJNB made their first contribution in #1100
Full Changelog: v0.13.4...v0.14.0
Release 0.13.4: JSON serialization and bugfixes
What's Changed
- make GenericDtype TypeVar bound union of supported types by @cosmicBboy in #960
- Use Self in type hints for DataFrameSchema by @NickCrews in #961
- CI-improvements by @NickCrews in #962
- move jupyterlite_sphinx to pip deps by @cosmicBboy in #973
- Fix custom check extensions doc by @a-recknagel in #975
- Raise error on typo in extensions.register_check_method's statistics argument by @a-recknagel in #985
- Add class name for methods in validation error message (#980) by @Drumbits in #982
- check_output works with schema where coerce=True by @cosmicBboy in #979
- make df strategy less complex by @cosmicBboy in #989
- fix tz deprecation by @cristianmatache in #972
- De/serialize data frame schema to/from JSON by @KiaXdice in #924
- fix to_script output by @cosmicBboy in #994
New Contributors
- @a-recknagel made their first contribution in #975
- @Drumbits made their first contribution in #982
- @KiaXdice made their first contribution in #924
Contributors
Shoutout to all the contributors of this release 🎉
Full Changelog: v0.13.3...v0.13.4
Beta Release: v0.13.4b0
Beta Release v0.13.4b0
Release 0.13.3: Fix Decimal Type
What's Changed
- Fix decimal by @cosmicBboy in #956
- add date and decimal types to typing.common module by @cosmicBboy in #958
Full Changelog: v0.13.2...v0.13.3
Release 0.13.2: Fix modin tests
Release 0.13.1: Bugfix on "Try Pandera" Jupyterlite Deployment
Release 0.13.0: Option to Report All Errors, "Try Pandera" with Jupyterlite
Highlights ⭐️
- try pandera: add jupyterlite notebooks, add support for py3.7 (#951) @cosmicBboy
- Feature/922 add other ways to report unique errors as an argument (#914) @ng-henry
What's Changed 📈
- Bugfix/910: Support ordered=True in yaml schemas (#943) @dstumpy
- docs: Fix typo in pyspark.rst (#948) @smoothml
- update rename_columns not to error on {key: key, ...} rename_dict (#941) @hsorsky
- Fix #937: Handle empty MultiIndex validation (#938) @davidandreoletti
- Fix infer_schema for 'empty' dataframes (#944) @tpvasconcelos
- Bugfix/Fix with_pydantic mypy error (#934) @brrm
- Updating Fugue section docs (#927) @kvnkho
New Contributors 🎉
Shout out to all the first-time contributors!
Full Changelog: v0.12.0...v0.13.0