Hi @EvavW
You can apply a check to all columns by passing a regex pattern to the check decorator (`@pa.check(".*", regex=True)`). Another trick is to access the series's name to look up the allowed values. Here is how I would go about it:

    from typing import Optional

    import pandas as pd
    import pandera as pa
    from pandera.typing import Series


    class Screen(pa.SchemaModel):
        # You don't need a Config, private attributes cannot be fields.
        _code_book = {
            "sex": {0: "Male", 1: "Female"},
            "chldbear": {0: "Unable", 1: "Able"},
        }

        sex: Series[int]
        chldbear: Optional[Series[int]]
        age: Optional[Series[int]]

        # ".*" with regex=True applies the check to every column
        @pa.check(".*", regex=True)
        @classmethod
        def check_codes(cls, series: Series) -> bool:
            col_name = series.name
            # look up allowed values if any
            codes = cls._code_book.get(col_name, {}).keys()
            if codes:
                return series.isin(codes)
            return True  # nothing to check


    df = pd.DataFrame({"sex": [1], "chldbear": [9], "age": [18]})
    Screen.validate(df)  # fails: 9 is not an allowed code for "chldbear"

Note that in your example,
I'm not sure I understand that part. Did I answer all your questions?
-
I'd like to write a special version of the `isin` check that can take a dictionary of the possible int values for a field, along with the string values for each enum. I would want to be able to apply this check to multiple fields in a schema, without defining it separately for each. It may look roughly like:
In this example I could define `int_isin` generally and then use it for any `Field`. Is this possible currently? If you know of a better way to reference Enums with both int and string values that would eliminate the need for this, I would be happy to hear about that as well!
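On the last point about enums that carry both an int and a string value, one option that might fit is the standard library's `enum.IntEnum`. The sketch below is only an illustration: `Sex`, `ChildBearing`, `CODE_BOOK`, `allowed_codes` and `label` are hypothetical names, and the member values are assumed from the code book discussed in this thread.

```python
from enum import IntEnum


class Sex(IntEnum):
    MALE = 0
    FEMALE = 1


class ChildBearing(IntEnum):
    UNABLE = 0
    ABLE = 1


# Hypothetical mapping from column name to the enum that describes its codes.
CODE_BOOK = {"sex": Sex, "chldbear": ChildBearing}


def allowed_codes(column):
    """Return the set of valid int codes for a column, or None if it has no enum."""
    enum_cls = CODE_BOOK.get(column)
    if enum_cls is None:
        return None
    return {member.value for member in enum_cls}


def label(column, code):
    """Translate an int code into the name of its enum member."""
    return CODE_BOOK[column](code).name
```

Inside a column-wide check, the set returned by `allowed_codes(series.name)` could presumably be passed to `series.isin(...)`, with `None` meaning there is nothing to check for that column.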