-
Notifications
You must be signed in to change notification settings - Fork 935
fix: Panic in pretty_format function when displaying DurationSecondsA… #7534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…rray with i64::MIN / i64::MAX
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @zhuqi-lucas for this contribution ❤️
@@ -215,28 +215,28 @@ pub(crate) fn split_second(v: i64, base: i64) -> (i64, u32) { | |||
(v.div_euclid(base), v.rem_euclid(base) as u32) | |||
} | |||
|
|||
/// converts a `i64` representing a `duration(s)` to [`Duration`] | |||
/// converts a `i64` representing a `duration(s)` to [`Option<Duration>`] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these methods are part of the public API : https://docs.rs/arrow/latest/arrow/array/temporal_conversions/fn.duration_s_to_duration.html
Thus we can't change them until the next major release (July 2025)
Option 1:
We could leave the original signatures and deprecate the two that are actually fallible per https://github.com/apache/arrow-rs?tab=readme-ov-file#deprecation-guidelines
And we can then add new functions like try_
for the two fallible conversions:
#[inline]
#[deprecated(since = "55.2.0", note = "Use `try_duration_s_to_duration` instead")]
pub fn duration_s_to_duration(v: i64) -> Duration {
Duration::try_seconds(v).unwrap()
}
#[inline]
pub fn try_duration_s_to_duration(v: i64) -> Option<Duration> {
Duration::try_seconds(v)
}
And similarly for milliseconds
Option 2: Reexport Duration
Since these are thin wrappers over Chrono anyways, we could simply re-export Duration 🤔
// Reexport Duration for use by downstream libraries
pub use chrono::Duration;
#[inline]
#[deprecated(since = "55.2.0", note = "Use `Duration::try_seconds` instead")]
pub fn duration_s_to_duration(v: i64) -> Duration {
Duration::try_seconds(v).unwrap()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @alamb for good suggestions! Addressed the option1 in latest PR, i think it's more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @zhuqi-lucas -- I have a suggestion here zhuqi-lucas#2
I think it would be nice to avoid having to use try_duration_us_to_duration
and handle a None
when that function can not return None. Let me know what you think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @alamb , i tried this way, but it will make the following function failed with match fmt with Some():
macro_rules! duration_display {
($convert:ident, $t:ty, $scale:tt) => {
impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
type State = DurationFormat;
fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
Ok(options.duration_format)
}
fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
let v = self.value(idx);
match fmt {
DurationFormat::ISO8601 => match $convert(v) {
Some(td) => write!(f, "{}", td)?,
None => write!(f, "<invalid>")?,
},
DurationFormat::Pretty => match $convert(v) {
Some(_) => duration_fmt!(f, v, $scale)?,
None => write!(f, "<invalid>")?,
},
}
Ok(())
}
}
};
}
The original one will always pass option:
duration_display!(try_duration_s_to_duration, DurationSecondType, 0);
duration_display!(try_duration_ms_to_duration, DurationMillisecondType, 3);
duration_display!(try_duration_us_to_duration, DurationMicrosecondType, 6);
duration_display!(try_duration_ns_to_duration, DurationNanosecondType, 9);
Or, we can make the macro support both cases, if we want to remove option for those don't need:
macro_rules! duration_display {
//–– converters that return Option<Duration> ––
($convert:ident, $t:ty, $scale:tt, option) => {
impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
type State = DurationFormat;
fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
Ok(options.duration_format)
}
fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
let v = self.value(idx);
match fmt {
DurationFormat::ISO8601 => {
if let Some(td) = $convert(v) {
write!(f, "{}", td)?;
} else {
write!(f, "<invalid>")?;
}
}
DurationFormat::Pretty => {
if $convert(v).is_some() {
duration_fmt!(f, v, $scale)?;
} else {
write!(f, "<invalid>")?;
}
}
}
Ok(())
}
}
};
//–– converters that return plain Duration ––
($convert:ident, $t:ty, $scale:tt) => {
impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
type State = DurationFormat;
fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
Ok(options.duration_format)
}
fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
let v = self.value(idx);
match fmt {
DurationFormat::ISO8601 => {
write!(f, "{}", $convert(v))?;
}
DurationFormat::Pretty => {
duration_fmt!(f, v, $scale)?;
}
}
Ok(())
}
}
};
}
And call it by:
// those four lines at the top of your file become:
// these two return Option<Duration>:
duration_display!(try_duration_s_to_duration, DurationSecondType, 0, option);
duration_display!(try_duration_ms_to_duration, DurationMillisecondType, 3, option);
// these two return Duration:
duration_display!(duration_us_to_duration, DurationMicrosecondType, 6);
duration_display!(duration_ns_to_duration, DurationNanosecondType, 9);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, we can make the macro support both cases, if we want to remove option for those don't need:
This is a great idea. I have tried it in :zhuqi-lucas#3
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
arrow-cast/Cargo.toml
Outdated
@@ -37,6 +37,7 @@ all-features = true | |||
|
|||
[features] | |||
prettyprint = ["comfy-table"] | |||
default = ["prettyprint"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could just put the tests in the prettyprint module and avoid having to change any feature flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in latest PR, thanks @alamb .
arrow-cast/src/display.rs
Outdated
let array: ArrayRef = Arc::new(arr); | ||
|
||
// Pretty formatting | ||
let opts = FormatOptions::default().with_null("<NULL>"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps these tests can be put into the pretty
module directly which would mean we don't have to adjust feature flags
arrow-cast/src/display.rs
Outdated
assert_eq!(pretty.matches("<invalid>").count(), 2); | ||
assert!(pretty.contains("0 days 1 hours 1 mins 1 secs")); | ||
assert!(pretty.contains("<NULL>")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be easier to validate this test if you just compared the output entirely
Perhaps you can follow the model of the existing pretty print tests:
arrow-rs/arrow-cast/src/pretty.rs
Lines 459 to 471 in 5b44d67
let table = pretty_format_batches(&[batch]).unwrap().to_string(); | |
let expected = vec![ | |
"+--------------------------------------------+", | |
"| d1 |", | |
"+--------------------------------------------+", | |
"| 68656c6c6f |", | |
"| |", | |
"| 6c6f6e676572207468616e203132206279746573 |", | |
"| 616e6f74686572207468616e203132206279746573 |", | |
"| |", | |
"| 736d616c6c |", | |
"+--------------------------------------------+", | |
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in latest PR, thanks @alamb!
Thank you @alamb for review and good suggestions! Addressed in latest PR. |
Avoid API changes for infallible functions, try 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much @zhuqi-lucas
@@ -1186,4 +1186,56 @@ mod tests { | |||
let actual: Vec<&str> = batch.lines().collect(); | |||
assert_eq!(expected_table, actual, "Actual result:\n{batch}"); | |||
} | |||
|
|||
#[test] | |||
fn duration_pretty_and_iso_extremes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing that would be good would be to extend these tests to cover the other duration TimeUnits (like milliseconds, microseconds and nanoseconds)
However I don't think it is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @alamb , I think it's a good point, because it seems no other testing cover it, i will create a follow-up PR to improve the test.
I'll plan to merge this tomorrow unless there are any concerns with |
…rray with i64::MIN / i64::MAX
Which issue does this PR close?
Closes #7533
Rationale for this change
Avoid panic when printing invalid durations
Also added testing
What changes are included in this PR?
Option
insteadSupport panic handling and print invalid result.
Are there any user-facing changes?
Support panic handling and print invalid result.