AsIs sometimes not preserved when computing geom parameters #5863

Open
yjunechoe opened this issue Apr 25, 2024 · 4 comments · May be fixed by #6170

@yjunechoe
Contributor

yjunechoe commented Apr 25, 2024

At least in GeomBar (and possibly elsewhere), when an <AsIs> aesthetic for bar height is parameterized, the <AsIs> class is sometimes dropped.

library(ggplot2)

p <- data.frame(x = 1:2, y = c(0.5, 2)) |>
  ggplot(aes(x, I(y))) +
  geom_col()
p

image

Here, ymax is stripped of <AsIs>:

tibble::as_tibble(
  layer_data(p)[, c("ymin", "y", "ymax")]
)
#> # A tibble: 2 × 3
#>       ymin        y  ymax
#>   <I<dbl>> <I<dbl>> <dbl>
#> 1        0      0.5   0.5
#> 2        0      2     2

This also has consequences for training of the scales [1]. In the plot, the height of the shortest bar is determined only by expansion:

ggplot_build(p)$layout$get_scales(1)$y
#> <ScaleContinuousPosition>
#>  Range:   0.5 --    2
#>  Limits:  0.5 --    2

Expected plot of bar heights rendered in npc:

image

Footnotes

  1. I'm actually torn on whether this part is also a bug. Technically, the bars are following the baseline-at-0 constraint (just that 0 is now interpreted in npc, which is meaningless). But maybe in this case GeomBar should override the baseline of the bars to always be on data scale (probably hard)? Or if users really want that for whatever reason, they could just add ylim(0, NA) I suppose
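
The ylim(0, NA) idea from the footnote could be tried roughly as follows. This is only a sketch of that workaround, reusing the reprex data; the name p_zero is made up for illustration, and nothing here settles what the bars ought to look like.

```r
library(ggplot2)

# Same plot as in the reprex above
p <- data.frame(x = 1:2, y = c(0.5, 2)) |>
  ggplot(aes(x, I(y))) +
  geom_col()

# Pin the lower y limit to 0 and leave the upper limit to be computed
p_zero <- p + ylim(0, NA)
p_zero
```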

@teunbrand
Collaborator

Thanks June!

I think the following things are happening.

  1. position_stack() is doing a computation on ymax that drops the class. The data come out the correct way from GeomBar$setup_data and position = "identity" preserves the class.
  2. The scales are supposed to ignore <AsIs> class vectors, as that is the main mechanism through which this works. The y scale gets trained to c(0.5, 2) because it only observes the plain ymax returned from the position adjustment. I don't think the scale range should include 0 in this case, as the y variables are <AsIs>. Theoretically, the y scale shouldn't even be populated in this case as it is supposed to ignore all y variables.
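
As a minimal base R sketch of how a computation can drop the class: subsetting an <AsIs> vector keeps the class, while combining it with c() or converting it with as.numeric() returns a plain vector. This illustrates general R behaviour only; it is not a claim about what position_stack() does internally.

```r
# An <AsIs> vector, like the one produced by aes(x, I(y))
y <- I(c(0.5, 2))

# Subsetting dispatches to `[.AsIs`, which re-wraps the result in I()
keeps_class <- inherits(y[1], "AsIs")

# c() and as.numeric() strip attributes, so the class is lost
after_c <- inherits(c(0, y), "AsIs")
after_as_numeric <- inherits(as.numeric(y), "AsIs")

print(c(subset = keeps_class, c = after_c, as.numeric = after_as_numeric))
```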

The following is what I think the plot should yield. The clipping/margins are added for clarity.

library(ggplot2)

data.frame(x = 1:2, y = c(0.5, 2)) |>
  ggplot(aes(x, I(y))) +
  geom_col(position = "identity") +
  coord_cartesian(clip = "off") +
  theme(plot.margin = margin(200, 5, 5, 5))

Note that the y-scale is unpopulated because it hasn't observed any of the variables:

layer_scales()$y$is_empty()
#> [1] TRUE

Created on 2024-04-25 with reprex v2.1.0

I think all of this brings us to the following question: should position adjustments attempt to preserve any <AsIs> variables?
While I'm still undecided, I'm leaning towards 'no' as these are designed to operate in data-space and mixing data-space and panel-space in these computations is prone to unexpected results (better to not promise anything, than promising and not delivering).

@yjunechoe
Contributor Author

yjunechoe commented Apr 25, 2024

"better to not promise anything, than promising and not delivering"

Well put - that's the conclusion that I'm circling back to as well. Maybe once I() gets more widely used people will start developing stronger intuitions about what they expect here, but as it stands I'm now less sure about the "expected output" I posed originally.

Maybe a better way to frame the issue is whether ggplot should signal any messages or warnings if the user accidentally mixes data-space and panel-space? Because my surprise with the reprex is more so the fact that it's not obvious from the user's side that they're mixing data-space and panel-space - the code reads like it should plot the y only in panel-space (setting aside the issue of whatever that should mean for GeomBar) but Position introduces data-space positioning internally and causes the mixing.
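
In the absence of built-in warnings, a user-side check along these lines might help spot the mixing. This is a sketch only: asis_columns() is a hypothetical helper, not part of ggplot2, and the data frame is a hand-built stand-in for the layer data shown earlier in the thread.

```r
# Hypothetical helper (not part of ggplot2): report which columns of a
# data frame still carry the <AsIs> class.
asis_columns <- function(data) {
  vapply(data, inherits, logical(1), what = "AsIs")
}

# Stand-in for the layer data shown earlier in this thread:
# ymin and y are <AsIs>, ymax is a plain double.
built <- data.frame(
  ymin = I(c(0, 0)),
  y    = I(c(0.5, 2)),
  ymax = c(0.5, 2)
)

flags <- asis_columns(built)
print(flags)

# A mix of TRUE and FALSE among related aesthetics hints that panel-space
# and data-space values are being combined.
mixed <- any(flags) && !all(flags)
print(mixed)
```

With a real plot, the same check could be run on layer_data(p)[, c("ymin", "y", "ymax")].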

@teunbrand
Collaborator

Yeah I agree that such warnings would be nice, but by my estimation there are a lot of places where aesthetics are combined into new ones, which would mean a lot of checks scattered around the codebase. In the case of position adjustments specifically, it isn't standardised anywhere which aesthetics they read or write, so it'd be hard to do systematically. Perhaps the least intrusive way out is simply to document the use of I() as 'at your own risk' and point out potential interactions with stats and position adjustments that may behave unexpectedly.

@yjunechoe
Contributor Author

Got it - that sounds completely reasonable! I'm content with just the fact that this is clarified for me - I'll let you make the call for whether this also warrants an entry in the docs (and feel free to close this as complete).

@teunbrand teunbrand linked a pull request Oct 29, 2024 that will close this issue