-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add .by
to fill()
#1572
Add .by
to fill()
#1572
Conversation
R/fill.R
Outdated
fill.data.frame <- function(data, | ||
..., | ||
.direction = c("down", "up", "downup", "updown"), | ||
.by = NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting .by
in the .data.frame
method rather than the generic to try and be as least disruptive as possible for S3 method authors like dbplyr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: We did add .by
to the nest()
generic, so maybe we should look at this a little more.
Maybe it isn't as disruptive as we thought due to the presence of the ...
in the generic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to my old research on this, it doesn't look like we will cause a warning for other S3 method authors if we add .by
to the generic (this is probably why we could do it with nest()
too):
They will just have to opt in to support of .by
on their own over time or catch .by
and error if its not NULL
if they don't support it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've gone back to .by
in the generic of fill()
based on the above analysis. That means that for dbplyr we will see something like this where .by
is captured in the ...
of the fill.tbl_lazy()
method until dbplyr updates its S3 method to handle .by
(possibly by just erroring).
By some weird combination of events this passes through dbplyr without error, I've opened tidyverse/dbplyr#1536
df <- data.frame(x = c(1, 2), y = c(2, 2), z = c(1, NA))
df_sqlite <- tbl_lazy(df, con = simulate_sqlite())
df_sqlite |> window_order(x) |> fill(z, .by = y)
<SQL>
SELECT `x`, `y`, MAX(`z`) OVER (PARTITION BY `..dbplyr_partition_1`) AS `z`
FROM (
SELECT
`df`.*,
SUM(CASE WHEN ((`z` IS NULL)) THEN 0 ELSE 1 END) OVER (ORDER BY `x` ROWS UNBOUNDED PRECEDING) AS `..dbplyr_partition_1`,
SUM(CASE WHEN ((`.by` IS NULL)) THEN 0 ELSE 1 END) OVER (ORDER BY `x` ROWS UNBOUNDED PRECEDING) AS `..dbplyr_partition_2`
FROM `df`
) AS `q01`
dplyr::mutate_at(data, .vars = dplyr::vars(any_of(vars)), .funs = fn) | ||
dplyr::mutate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also switching away from mutate_at()
here
dplyr::mutate( | ||
data, | ||
dplyr::across(any_of(vars), .fns = fn), | ||
.by = {{ .by }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing straight through to dplyr means that if .by
is wrong for some reason, then the error call shows dplyr::mutate()
rather than fill()
. I think that is worth it to not have to fiddle with .by
in any way, because dplyr has some pretty unique handling for it. (See the tests for more info)
test_that("errors on named `...` inputs", { | ||
df <- tibble(x = c(1, NA, 2)) | ||
|
||
expect_snapshot(error = TRUE, { | ||
fill(df, fooy = x) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noted we didn't have a test for this and since we are touching check_dots_unnamed()
here we may as well add one
@olivroy thanks a lot! I'll take over from here. Leaving some comments for another colleague to review |
…epeat the error message from mutate() for a better stack trace.
Thanks @olivroy! |
Addresses part of #1439.