Parquet: don't truncate min/max statistics for float16 and decimal when writing file #5075
Labels
enhancement
parquet
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
See discussion:
#5003 (comment)
#5003 added support for the float16 type, which is a logical type on top of the fixed len byte array physical type.
When writing statistics, truncation can occur for binary physical type:
arrow-rs/parquet/src/column/writer/mod.rs
Lines 634 to 643 in 7ba36b0
This might be troublesome for the f16 type if the `column_index_truncate_length` config is set to 1: a truncated f16 would no longer represent the min and max correctly, because f16 has a sort order different from that of a fixed len byte array.
Describe the solution you'd like
Ignore truncation for f16 when writing min/max statistics
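To illustrate why skipping truncation matters here, the following is a minimal, standalone sketch and not the writer's actual code. It assumes f16 values are stored as 2 little-endian bytes (as the Parquet FLOAT16 logical type specifies) and models truncation as simply keeping a byte prefix. The helpers `f16_bits_to_f32`, `f16_le_bytes`, and `truncate_prefix` are hypothetical names introduced only for this example.

```rust
/// Hypothetical helper: decode an IEEE 754 binary16 bit pattern into an f32.
/// Not part of the parquet crate; written out so the sketch has no dependencies.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits >> 15 == 1 { -1.0f32 } else { 1.0 };
    let exp = ((bits >> 10) & 0x1f) as i32;
    let frac = (bits & 0x3ff) as f32;
    match exp {
        // Subnormal: no implicit leading 1.
        0 => sign * frac * 2f32.powi(-24),
        // All-ones exponent: infinity or NaN.
        0x1f => if frac == 0.0 { sign * f32::INFINITY } else { f32::NAN },
        // Normal: implicit leading 1, exponent bias 15.
        _ => sign * (1.0 + frac / 1024.0) * 2f32.powi(exp - 15),
    }
}

/// Hypothetical helper: the 2-byte little-endian layout assumed for an f16
/// stored in a fixed len byte array.
fn f16_le_bytes(bits: u16) -> [u8; 2] {
    bits.to_le_bytes()
}

/// Hypothetical, simplified stand-in for statistics truncation:
/// keep at most `len` leading bytes.
fn truncate_prefix(bytes: &[u8], len: usize) -> &[u8] {
    &bytes[..len.min(bytes.len())]
}
```

With these helpers, 1.0 (`0x3C00`) and 2.0 (`0x4000`) both truncate to the single byte `[0x00]` when the length is 1, so the truncated min and max are indistinguishable and no longer decode as an f16 at all. Byte order also disagrees with numeric order: 1.5 (`0x3E00`) is numerically greater than `0x3C01`, yet its bytes `[0x00, 0x3E]` compare less than `[0x01, 0x3C]`.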
Describe alternatives you've considered
Additional context
Do we need to worry about this for other logical types based on binary physical types, e.g. decimal?