fix: Panic in pretty_format function when displaying DurationSecondsA… #7534

Merged
merged 6 commits into from
May 26, 2025
20 changes: 17 additions & 3 deletions arrow-array/src/temporal_conversions.rs
@@ -217,14 +217,28 @@ pub(crate) fn split_second(v: i64, base: i64) -> (i64, u32) {

/// converts a `i64` representing a `duration(s)` to [`Duration`]
#[inline]
#[deprecated(since = "55.2.0", note = "Use `try_duration_s_to_duration` instead")]
pub fn duration_s_to_duration(v: i64) -> Duration {
    Duration::try_seconds(v).unwrap()
}

/// converts a `i64` representing a `duration(s)` to [`Option<Duration>`]
Contributor

I think these methods are part of the public API: https://docs.rs/arrow/latest/arrow/array/temporal_conversions/fn.duration_s_to_duration.html

So we can't change their signatures until the next major release (July 2025).

Option 1:

We could leave the original signatures in place and deprecate the two that are actually fallible, per https://github.com/apache/arrow-rs?tab=readme-ov-file#deprecation-guidelines

We can then add new try_-prefixed functions for the two fallible conversions:

#[inline]
#[deprecated(since = "55.2.0", note = "Use `try_duration_s_to_duration` instead")]
pub fn duration_s_to_duration(v: i64) -> Duration {
    Duration::try_seconds(v).unwrap()
}

#[inline]
pub fn try_duration_s_to_duration(v: i64) -> Option<Duration> {
    Duration::try_seconds(v)
}

The same applies to the milliseconds conversion.

Option 2: Re-export Duration

Since these are thin wrappers over chrono anyway, we could simply re-export Duration 🤔

// Reexport Duration for use by downstream libraries
pub use chrono::Duration;

#[inline]
#[deprecated(since = "55.2.0", note = "Use `Duration::try_seconds` instead")]
pub fn duration_s_to_duration(v: i64) -> Duration {
    Duration::try_seconds(v).unwrap()
}

Contributor Author

Thank you @alamb for the good suggestions! I have implemented Option 1 in the latest changes; I think it is clearer.

Contributor

Thank you @zhuqi-lucas -- I have a suggestion here: zhuqi-lucas#2

I think it would be nice to avoid calling try_duration_us_to_duration and handling a None when that conversion can never return None. Let me know what you think.
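
A minimal sketch of why only the seconds and milliseconds conversions can fail, assuming chrono's documented Duration range of plus or minus i64::MAX milliseconds. The function names below are hypothetical stand-ins for illustration, not chrono or arrow APIs:

```rust
/// Assumed bound: chrono documents its `Duration` range as +/- `i64::MAX` milliseconds.
const MAX_MILLIS: i64 = i64::MAX;

/// Hypothetical stand-in for `Duration::try_seconds`: `None` when out of range.
fn seconds_in_range(v: i64) -> Option<i64> {
    let limit = MAX_MILLIS / 1_000;
    if (-limit..=limit).contains(&v) {
        Some(v)
    } else {
        None
    }
}

/// Hypothetical stand-in for `Duration::try_milliseconds`: only `i64::MIN` is rejected.
fn millis_in_range(v: i64) -> Option<i64> {
    if v >= -MAX_MILLIS {
        Some(v)
    } else {
        None
    }
}

/// Any `i64` microsecond count fits: dividing by 1_000 always lands inside the bound.
fn micros_in_range(v: i64) -> bool {
    let millis = v / 1_000;
    (-MAX_MILLIS..=MAX_MILLIS).contains(&millis)
}
```

Under that assumption, microsecond and nanosecond inputs always fit, so an Option return type for those two conversions adds a None branch that cannot be reached.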

Contributor Author

Thank you @alamb. I tried this approach, but it breaks the following macro, which matches the converter's result against Some(..):

macro_rules! duration_display {
    ($convert:ident, $t:ty, $scale:tt) => {
        impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
            type State = DurationFormat;

            fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
                Ok(options.duration_format)
            }

            fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
                let v = self.value(idx);
                match fmt {
                    DurationFormat::ISO8601 => match $convert(v) {
                        Some(td) => write!(f, "{}", td)?,
                        None => write!(f, "<invalid>")?,
                    },
                    DurationFormat::Pretty => match $convert(v) {
                        Some(_) => duration_fmt!(f, v, $scale)?,
                        None => write!(f, "<invalid>")?,
                    },
                }
                Ok(())
            }
        }
    };
}

The original invocations always pass a converter that returns an Option:

duration_display!(try_duration_s_to_duration, DurationSecondType, 0);
duration_display!(try_duration_ms_to_duration, DurationMillisecondType, 3);
duration_display!(try_duration_us_to_duration, DurationMicrosecondType, 6);
duration_display!(try_duration_ns_to_duration, DurationNanosecondType, 9);

Alternatively, we can make the macro support both cases, if we want to drop the Option for the converters that don't need it:

macro_rules! duration_display {
    //–– converters that return Option<Duration> ––
    ($convert:ident, $t:ty, $scale:tt, option) => {
        impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
            type State = DurationFormat;

            fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
                Ok(options.duration_format)
            }

            fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
                let v = self.value(idx);
                match fmt {
                    DurationFormat::ISO8601 => {
                        if let Some(td) = $convert(v) {
                            write!(f, "{}", td)?;
                        } else {
                            write!(f, "<invalid>")?;
                        }
                    }
                    DurationFormat::Pretty => {
                        if $convert(v).is_some() {
                            duration_fmt!(f, v, $scale)?;
                        } else {
                            write!(f, "<invalid>")?;
                        }
                    }
                }
                Ok(())
            }
        }
    };

    //–– converters that return plain Duration ––
    ($convert:ident, $t:ty, $scale:tt) => {
        impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
            type State = DurationFormat;

            fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
                Ok(options.duration_format)
            }

            fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
                let v = self.value(idx);
                match fmt {
                    DurationFormat::ISO8601 => {
                        write!(f, "{}", $convert(v))?;
                    }
                    DurationFormat::Pretty => {
                        duration_fmt!(f, v, $scale)?;
                    }
                }
                Ok(())
            }
        }
    };
}

And invoke it like this:

// those four lines at the top of your file become:

// these two return Option<Duration>:
duration_display!(try_duration_s_to_duration, DurationSecondType, 0, option);
duration_display!(try_duration_ms_to_duration, DurationMillisecondType, 3, option);

// these two return Duration:
duration_display!(duration_us_to_duration, DurationMicrosecondType, 6);
duration_display!(duration_ns_to_duration, DurationNanosecondType, 9);
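
The two-arm pattern can also be tried in isolation. This std-only sketch uses a hypothetical Render trait and hypothetical converters (try_seconds_to_millis, micros_to_millis) in place of DisplayIndexState and the arrow conversion functions; the trailing option token selects the arm for fallible converters:

```rust
/// Hypothetical trait standing in for `DisplayIndexState`.
trait Render {
    fn render(&self) -> String;
}

struct Seconds(i64);
struct Micros(i64);

/// Hypothetical fallible converter: overflows for large inputs.
fn try_seconds_to_millis(v: i64) -> Option<i64> {
    v.checked_mul(1_000)
}

/// Hypothetical infallible converter.
fn micros_to_millis(v: i64) -> i64 {
    v / 1_000
}

macro_rules! render_impl {
    // Converters that return `Option`: selected by the trailing `option` token.
    ($convert:ident, $t:ty, option) => {
        impl Render for $t {
            fn render(&self) -> String {
                match $convert(self.0) {
                    Some(ms) => format!("{} ms", ms),
                    None => "<invalid>".to_string(),
                }
            }
        }
    };
    // Converters that return a plain value.
    ($convert:ident, $t:ty) => {
        impl Render for $t {
            fn render(&self) -> String {
                format!("{} ms", $convert(self.0))
            }
        }
    };
}

render_impl!(try_seconds_to_millis, Seconds, option);
render_impl!(micros_to_millis, Micros);
```

The arm with the extra token is listed first so that a four-argument invocation is matched before the three-argument arm is tried.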

Contributor

> Or, we can make the macro support both cases, if we want to remove option for those don't need:

This is a great idea. I have tried it in zhuqi-lucas#3


#[inline]
pub fn try_duration_s_to_duration(v: i64) -> Option<Duration> {
    Duration::try_seconds(v)
}

/// converts a `i64` representing a `duration(ms)` to [`Duration`]
#[inline]
#[deprecated(since = "55.2.0", note = "Use `try_duration_ms_to_duration` instead")]
pub fn duration_ms_to_duration(v: i64) -> Duration {
+    Duration::try_milliseconds(v).unwrap()
-    Duration::try_seconds(v).unwrap()
}

/// converts a `i64` representing a `duration(ms)` to [`Option<Duration>`]
#[inline]
pub fn try_duration_ms_to_duration(v: i64) -> Option<Duration> {
    Duration::try_milliseconds(v)
}

/// converts a `i64` representing a `duration(us)` to [`Duration`]
@@ -296,8 +310,8 @@ pub fn as_time<T: ArrowPrimitiveType>(v: i64) -> Option<NaiveTime> {
pub fn as_duration<T: ArrowPrimitiveType>(v: i64) -> Option<Duration> {
    match T::DATA_TYPE {
        DataType::Duration(unit) => match unit {
-            TimeUnit::Second => Some(duration_s_to_duration(v)),
-            TimeUnit::Millisecond => Some(duration_ms_to_duration(v)),
+            TimeUnit::Second => try_duration_s_to_duration(v),
+            TimeUnit::Millisecond => try_duration_ms_to_duration(v),
            TimeUnit::Microsecond => Some(duration_us_to_duration(v)),
            TimeUnit::Nanosecond => Some(duration_ns_to_duration(v)),
        },
41 changes: 37 additions & 4 deletions arrow-cast/src/display.rs
@@ -590,6 +590,12 @@ temporal_display!(time32ms_to_time, time_format, Time32MillisecondType);
temporal_display!(time64us_to_time, time_format, Time64MicrosecondType);
temporal_display!(time64ns_to_time, time_format, Time64NanosecondType);

/// Derive [`DisplayIndexState`] for `PrimitiveArray<$t>`
///
/// Arguments
/// * `$convert` - function to convert the value to an `Duration`
/// * `$t` - [`ArrowPrimitiveType`] of the array
/// * `$scale` - scale of the duration (passed to `duration_fmt`)
macro_rules! duration_display {
    ($convert:ident, $t:ty, $scale:tt) => {
        impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
@@ -611,6 +617,34 @@ macro_rules! duration_display {
    };
}

/// Similar to [`duration_display`] but `$convert` returns an `Option`
macro_rules! duration_option_display {
    ($convert:ident, $t:ty, $scale:tt) => {
        impl<'a> DisplayIndexState<'a> for &'a PrimitiveArray<$t> {
            type State = DurationFormat;

            fn prepare(&self, options: &FormatOptions<'a>) -> Result<Self::State, ArrowError> {
                Ok(options.duration_format)
            }

            fn write(&self, fmt: &Self::State, idx: usize, f: &mut dyn Write) -> FormatResult {
                let v = self.value(idx);
                match fmt {
                    DurationFormat::ISO8601 => match $convert(v) {
                        Some(td) => write!(f, "{}", td)?,
                        None => write!(f, "<invalid>")?,
                    },
                    DurationFormat::Pretty => match $convert(v) {
                        Some(_) => duration_fmt!(f, v, $scale)?,
                        None => write!(f, "<invalid>")?,
                    },
                }
                Ok(())
            }
        }
    };
}

macro_rules! duration_fmt {
    ($f:ident, $v:expr, 0) => {{
        let secs = $v;
@@ -657,8 +691,8 @@ macro_rules! duration_fmt {
    }};
}

-duration_display!(duration_s_to_duration, DurationSecondType, 0);
-duration_display!(duration_ms_to_duration, DurationMillisecondType, 3);
+duration_option_display!(try_duration_s_to_duration, DurationSecondType, 0);
+duration_option_display!(try_duration_ms_to_duration, DurationMillisecondType, 3);
duration_display!(duration_us_to_duration, DurationMicrosecondType, 6);
duration_display!(duration_ns_to_duration, DurationNanosecondType, 9);

@@ -1071,9 +1105,8 @@ pub fn lexical_to_string<N: lexical_core::ToLexical>(n: N) -> String {

#[cfg(test)]
mod tests {
-    use arrow_array::builder::StringRunBuilder;
-
    use super::*;
+    use arrow_array::builder::StringRunBuilder;

    /// Test to verify options can be constant. See #4580
    const TEST_CONST_OPTIONS: FormatOptions<'static> = FormatOptions::new()
54 changes: 53 additions & 1 deletion arrow-cast/src/pretty.rs
@@ -221,7 +221,7 @@ mod tests {
    use arrow_buffer::{IntervalDayTime, IntervalMonthDayNano, ScalarBuffer};
    use arrow_schema::*;

-    use crate::display::array_value_to_string;
+    use crate::display::{array_value_to_string, DurationFormat};

    use super::*;

@@ -1186,4 +1186,56 @@ mod tests {
        let actual: Vec<&str> = batch.lines().collect();
        assert_eq!(expected_table, actual, "Actual result:\n{batch}");
    }

    #[test]
    fn duration_pretty_and_iso_extremes() {
Contributor

It would also be good to extend these tests to cover the other duration TimeUnits (milliseconds, microseconds, and nanoseconds).

However, I don't think it is required.

Contributor Author

Thank you @alamb, that is a good point; it seems no other test covers it. I will create a follow-up PR to improve the tests.

        // Build [MIN, MAX, 3661, NULL]
        let arr = DurationSecondArray::from(vec![Some(i64::MIN), Some(i64::MAX), Some(3661), None]);
        let array: ArrayRef = Arc::new(arr);

        // Pretty formatting
        let opts = FormatOptions::default().with_null("null");
        let opts = opts.with_duration_format(DurationFormat::Pretty);
        let pretty = pretty_format_columns_with_options("pretty", &[array.clone()], &opts)
            .unwrap()
            .to_string();

        // Expected output
        let expected_pretty = vec![
            "+------------------------------+",
            "| pretty                       |",
            "+------------------------------+",
            "| <invalid>                    |",
            "| <invalid>                    |",
            "| 0 days 1 hours 1 mins 1 secs |",
            "| null                         |",
            "+------------------------------+",
        ];

        let actual: Vec<&str> = pretty.lines().collect();
        assert_eq!(expected_pretty, actual, "Actual result:\n{pretty}");

        // ISO8601 formatting
        let opts_iso = FormatOptions::default()
            .with_null("null")
            .with_duration_format(DurationFormat::ISO8601);
        let iso = pretty_format_columns_with_options("iso", &[array], &opts_iso)
            .unwrap()
            .to_string();

        // Expected output
        let expected_iso = vec![
            "+-----------+",
            "| iso       |",
            "+-----------+",
            "| <invalid> |",
            "| <invalid> |",
            "| PT3661S   |",
            "| null      |",
            "+-----------+",
        ];

        let actual: Vec<&str> = iso.lines().collect();
        assert_eq!(expected_iso, actual, "Actual result:\n{iso}");
    }
}
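
For reference, the behavior this test pins down for second-resolution values can be sketched without arrow. The sketch assumes chrono's documented bound for Duration::try_seconds (plus or minus i64::MAX / 1000) and assumes the seconds arm of duration_fmt splits the value into days, hours, minutes, and seconds, as the expected output in the test suggests. pretty_seconds is a hypothetical name, not a function in this PR:

```rust
/// Hypothetical, std-only model of the Pretty format for a seconds value.
/// Out-of-range inputs render as `<invalid>` instead of panicking.
fn pretty_seconds(v: i64) -> String {
    // Assumed bound, mirroring chrono's documented range for `try_seconds`.
    let limit = i64::MAX / 1_000;
    if v < -limit || v > limit {
        return "<invalid>".to_string();
    }

    // Split the total into whole days, hours, minutes, and leftover seconds.
    let mins = v / 60;
    let hours = mins / 60;
    let days = hours / 24;

    let secs = v - mins * 60;
    let mins = mins - hours * 60;
    let hours = hours - days * 24;

    format!("{} days {} hours {} mins {} secs", days, hours, mins, secs)
}
```

The three non-null values in the test (i64::MIN, i64::MAX, and 3661) exercise both branches of this sketch.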