
How do I validate a value in a dataframe which is dependent on other value in that specific row? #714

Answered by cosmicBboy
vovavili asked this question in Q&A

Hi @vovavili

Depending on which API you're using, you can check out the wide checks for the object-based API or dataframe checks for the class-based API.

Note: the code snippets below aren't tested, but they should point you in the right direction.

Class-based API:

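import pandera as pa
from pandera.typing import Series

# Newer pandera releases name this base class pa.DataFrameModel.
class Schema(pa.SchemaModel):
    Name: Series[str]
    Salary: Series[int]
    Department: Series[str]
    Mandatory: Series[str]

    @pa.dataframe_check
    def rob_aviation_check(cls, df) -> Series[bool]:
        # Parenthesize each comparison: `&` binds tighter than `==`.
        is_rob_in_aviation = (df["Name"] == "Rob") & (df["Department"] == "Aviation")
        # Only the matching rows are checked for Salary >= 5000.
        return df.loc[is_rob_in_aviation, "Salary"] >= 5000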
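
The row-dependent rule itself can be tried out in plain pandas before wiring it into a schema. The sketch below is one way to write it, not necessarily how the answer intended: it returns one boolean per row for the whole frame, so rows the rule does not apply to simply pass. The helper name `rob_aviation_ok` and the sample data are made up for illustration.

```python
import pandas as pd


def rob_aviation_ok(df: pd.DataFrame) -> pd.Series:
    """Return one boolean per row: True when the row satisfies the rule."""
    # Each comparison needs its own parentheses because `&` binds
    # tighter than `==` in Python.
    applies = (df["Name"] == "Rob") & (df["Department"] == "Aviation")
    # Rows the rule does not apply to pass; the others need Salary >= 5000.
    return ~applies | (df["Salary"] >= 5000)


# Hypothetical sample data, only for illustration.
df = pd.DataFrame(
    {
        "Name": ["Rob", "Rob", "Ann"],
        "Salary": [6000, 4000, 1000],
        "Department": ["Aviation", "Aviation", "Aviation"],
    }
)

result = rob_aviation_ok(df)
print(result.tolist())  # [True, False, True]
```

Returning a result aligned with the full index makes it easy to see which rows failed, since the second row (Rob, Aviation, 4000) is the only `False`.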

Object-based API:

schema = pa.DataFrameSchema(
    columns={
        "Name": pa.Co…

Answer selected by vovavili