-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storing open-ness as a type parameter makes Postgres parsing type unstable #208
Comments
FWIW, we do dispatch on those parameters in a few places, but rewriting that logic with an explicit condition or by passing those as arguments to subfunction could be a reasonable solution. https://github.com/invenia/Intervals.jl/blob/master/src/interval.jl#L190 This would also make it easier to break up Intervals.jl into a set of IntervalSet.jl extensions as discussed here. |
@willtebbutt has a prototype for not doing this. |
Yup. It's on a WIP MR to an internal package but there's not much code, so here's a snippet for reference: struct DynamicInterval{T}
infimum::T
supremum::T
lhs_is_open::Bool
rhs_is_open::Bool
end
function Base.issubset(x::DynamicInterval, y::DynamicInterval)
lower_is_in = if x.lhs_is_open
x.infimum >= y.infimum
else
y.lhs_is_open ? (x.infimum > y.infimum) : (x.infimum >= y.infimum)
end
upper_is_in = if x.rhs_is_open
x.supremum <= y.supremum
else
y.rhs_is_open ? (x.supremum < y.supremum) : (x.supremum <= y.supremum)
end
return lower_is_in && upper_is_in
end Performance of construction of arrays of intervals with mixed open-ness / closed-ness and using BenchmarkTools
_opens = [rand([true, false]) for _ in 1:1_000_000]
_closeds = [rand([true, false]) for _ in 1:1_000_000]
@benchmark [DynamicInterval(0.0, 1.0, o, c) for(o, c) in zip($_opens, $_closeds)]
BenchmarkTools.Trial: 794 samples with 1 evaluation.
Range (min … max): 3.330 ms … 14.258 ms ┊ GC (min … max): 0.00% … 6.20%
Time (median): 3.416 ms ┊ GC (median): 0.00%
Time (mean ± σ): 6.290 ms ± 4.255 ms ┊ GC (mean ± σ): 7.31% ± 14.75%
█ ▄ ▁▃▂ ▄
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄███▅▆▄▁▄█ ▇
3.33 ms Histogram: log(frequency) by time 14.1 ms <
Memory estimate: 22.89 MiB, allocs estimate: 2.
xs = [DynamicInterval(0.0, 1.0, rand([true, false]), rand([true, false])) for _ in 1:1_000_000]
x = DynamicInterval(0.0, 1.0, false, false)
@benchmark map(Base.Fix1(issubset, $x), $xs)
BenchmarkTools.Trial: 2046 samples with 1 evaluation.
Range (min … max): 2.130 ms … 3.905 ms ┊ GC (min … max): 0.00% … 13.12%
Time (median): 2.389 ms ┊ GC (median): 0.00%
Time (mean ± σ): 2.436 ms ± 203.115 μs ┊ GC (mean ± σ): 0.49% ± 2.54%
▁▂▅▆▅█▇▆▅▃▄▂▁
▂▂▃▄▅█████████████▇▇▅▅▅▄▃▃▄▂▃▂▃▂▁▂▂▂▂▂▂▂▃▂▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▁▂ ▄
2.13 ms Histogram: frequency by time 3.36 ms <
Memory estimate: 976.67 KiB, allocs estimate: 2. Constrast this with: using Intervals
_opens_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
_closed_typed = [rand([Open, Closed]) for _ in 1:1_000_000]
@benchmark [Interval{o, c}(0.0, 1.0) for (o, c) in zip($_opens_typed, $_closed_typed)]
BenchmarkTools.Trial: 3 samples with 1 evaluation.
Range (min … max): 1.727 s … 2.053 s ┊ GC (min … max): 13.63% … 22.27%
Time (median): 1.938 s ┊ GC (median): 18.28%
Time (mean ± σ): 1.906 s ± 165.230 ms ┊ GC (mean ± σ): 18.31% ± 4.32%
█ █ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
1.73 s Histogram: frequency by time 2.05 s <
Memory estimate: 640.87 MiB, allocs estimate: 12000018.
xs_typed = [Interval{rand([Open, Closed]), rand([Open, Closed])}(0.0, 1.0) for _ in 1:1_000_000]
x_typed = Interval{Closed, Closed}(0.0, 1.0)
@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed)
BenchmarkTools.Trial: 85 samples with 1 evaluation.
Range (min … max): 52.160 ms … 72.713 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 59.560 ms ┊ GC (median): 10.26%
Time (mean ± σ): 59.213 ms ± 4.144 ms ┊ GC (mean ± σ): 6.27% ± 4.96%
█ ▂▂ ▂ ▂▂ ▂ ▂
▄▄█▆▄█▄▄█▄▆▆▄▄▁▆▄██▁▆█████▆███▄█▆▆▄▁▁▄▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
52.2 ms Histogram: frequency by time 72.6 ms <
Memory estimate: 31.47 MiB, allocs estimate: 1000002. As you would hope, if you're working with a vector of intervals with fixed open/closed-ness, then the performance is better than the dynamic intervals (presumably just because it's able to eliminate a bunch of branching at compile time), but only by a factor of 3: vs = [rand() for _ in 1:1_000_000]
@benchmark [Interval{Closed, Closed}(v - 0.5, 5.0) for v in $vs]
BenchmarkTools.Trial: 2499 samples with 1 evaluation.
Range (min … max): 1.762 ms … 9.897 ms ┊ GC (min … max): 0.00% … 9.05%
Time (median): 1.834 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.995 ms ± 505.555 μs ┊ GC (mean ± σ): 7.18% ± 11.85%
▂█▆
▂▅████▅▃▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▃▄▄▄▃▃▂ ▃
1.76 ms Histogram: frequency by time 2.76 ms <
Memory estimate: 15.26 MiB, allocs estimate: 2.
# change end-points so that the comparison isn't entirely vaccuous
xs_typed_fixed = [Interval{Closed, Closed}(rand() - 0.5, 5.0) for _ in 1:1_000_000]
@benchmark map(Base.Fix1(issubset, $x_typed), $xs_typed_fixed)
BenchmarkTools.Trial: 6879 samples with 1 evaluation.
Range (min … max): 685.808 μs … 2.068 ms ┊ GC (min … max): 0.00% … 40.18%
Time (median): 702.151 μs ┊ GC (median): 0.00%
Time (mean ± σ): 721.751 μs ± 116.212 μs ┊ GC (mean ± σ): 1.76% ± 6.55%
██▅▃▁ ▁
█████▇▇▅▄▃▁▅▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▃▇▇▅▃▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▆█ █
686 μs Histogram: log(frequency) by time 1.53 ms <
Memory estimate: 976.67 KiB, allocs estimate: 2. These kind of performance profiles suggest that you probably want dynamic intervals if you either have mixed open / closed-ness, and presumably it's going to be useful in cases where you don't know whether you have constant open / closed-ness and therefore can't just make a collection of things up front. From the above, I would characterise the gains to be had my knowing the open / closed-ness up front as "modest", but definitely something you want to have if you can. |
Okay, so my understanding as it relates to JuliaMath/IntervalSets.jl#67 is that we probably want:
|
Postgres has a
tzrange
type, which LibPQ.jl returns as anInterval[ZonedDateTime}
No openness type params, so it is type unstablBut it doesn't know if it is open or closed because LibPQ stores intervals that may be open or closed in the same postgre column type. (including mixing them).
Given we interact with intervals from prosgre databases all the time.
and we dispatch on this type param very rarely if ever,
I wonder if it wouldn;t be better stored as a field rather than a type param.
However this would be a breaking change
The text was updated successfully, but these errors were encountered: