-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nicer representations of intervals #25
Comments
@devmotion @torfjelde what do you think of this? It would change how |
An alternative would be to use an julia> summarize(x)
SummaryStats
mean std hdi94 mcse_mean mcse_std ess_tail ess_bulk rhat
1 0.499 0.291 0.0379 .. 0.975 0.0047 0.0020 3993 3878 1.00
2 0.500 0.287 0.0521 .. 0.991 0.0046 0.0021 3926 3897 1.00
3 0.491 0.289 0.0197 .. 0.948 0.0046 0.0020 4171 3820 1.00 On the upside, the type specifies whether the bounds are inclusive (they are), and the visualization is equally compact. Also, the next release of IntervalSets contains a plotting recipe for the intervals. On the downside, the use of ellipses could look like columns were truncated from the middle like pandas does. We could change the representation to use square brackets, but that might make the user thing it's a length 2 vector. |
Generally, I strive for avoiding unnecessary dependencies but I think I'd prefer both the type representation as a |
Alignment is tricky here. For intervals we would probably align on the |
Ah I see, then it was just a coincidence since all numbers were positive? In any case, I still prefer |
Yeah, here's kind of the worst-case scenario (parameter scales differ by 3 or more orders of magnitude): julia> x = ar1(0.5, sqrt(1-0.5^2), 200, 4, 5) .* reshape(exp10.(1:5), 1, 1, :) .+ reshape(1:5, 1, 1, :);
julia> summarize(x)
SummaryStats
mean std hdi94 mcse_mean mcse_std ess_tail ess_bulk rhat
1 0.1 10.1 -20.6 .. 17.3 0.55 0.30 493 343 1.00
2 -0.0009 94 -179 .. 171 5.2 2.9 556 319 1.01
3 -1 9.8e+02 -1.75e+03 .. 1.88e+03 59 28 580 277 1.01
4 -1.e+02 1.e+04 -1.82e+04 .. 1.84e+04 6.0e+02 5.6e+02 346 265 1.02
5 4.e+03 1.03e+05 -1.74e+05 .. 2.04e+05 6.5e+03 3.0e+03 560 250 1.03
Okay! Unless @torfjelde disagrees, I'll go with that. |
@devmotion @torfjelde When a user specifies julia> summarize(x, default_summary_stats()...)
SummaryStats
mean std hdi_94% mcse_mean mcse_std ess_tail ess_bulk rhat
1 1.1 0.98 -0.808 .. 2.83 0.053 0.030 506 341 1.00
2 2.1 1.9 -1.35 .. 5.79 0.12 0.057 448 266 1.01
3 3.1 2.8 -1.93 .. 8.50 0.17 0.091 507 276 1.01
4 3.8 4.1 -4.45 .. 11.1 0.23 0.13 472 315 1.00
5 5.2 4.9 -3.54 .. 15.1 0.29 0.18 412 279 1.01 For HDI vs for ETI julia> summarize(x, default_summary_stats(median)...)
SummaryStats
median mad eti_94% mcse_median ess_tail ess_median rhat
1 1.1 0.990 -0.725 .. 2.94 0.064 506 354 1.00
2 2.1 1.83 -1.43 .. 5.78 0.13 448 316 1.01
3 3.0 2.79 -1.96 .. 8.46 0.18 507 438 1.01
4 3.8 4.27 -3.96 .. 11.7 0.37 472 372 1.00
5 5.1 4.80 -4.55 .. 14.5 0.36 412 351 1.01 OR julia> summarize(x, default_summary_stats(median)...)
SummaryStats
median mad q3% q97% mcse_median ess_tail ess_median rhat
1 1.1 0.990 -0.725 2.94 0.064 506 354 1.00
2 2.1 1.83 -1.43 5.78 0.13 448 316 1.01
3 3.0 2.79 -1.96 8.46 0.18 507 438 1.01
4 3.8 4.27 -3.96 11.7 0.37 472 372 1.00
5 5.1 4.80 -4.55 14.5 0.36 412 351 1.01 |
I'm leaning towards the interval representation: Partly for consistency, and partly because I think the main interest is the interval, not the 3rd and 97th percentile. |
Sorry, just saw this. But yes, all of this seems very sensible:) Deffo a nice change! |
As pointed out in arviz-devs/arviz#2306, HDI is different from ETI in that for HDI it doesn't make sense to assign a percentage to the bounds as we currently do. The bounds themselves are not interesting; only the whole interval is interesting. By contrast, we could either represent an ETI with a percentage describing the probability in that interval or with percentages for each bound. The bounds themselves are interesting.
Furthermore, the name ETI is perhaps not so well known. Since horizontal space is precious for showing summary stats in the REPL, I propose we clarify the HDI bound with no additional horizontal space taken up with the following change:
First, we store the HDI as a tuple and drop the unnecessary
%
and_
from the column name, producing:Second, we replace
eti_XXX%
withqXXX
, similar to the posterior package. Whileq
is very short, I think it's understandable what these mean, and we could easily see this generalizing to include more than just the 2 quantiles:The text was updated successfully, but these errors were encountered: