count() does not work, n() does #613

Closed
szarnyasg opened this issue Feb 14, 2025 · 3 comments

@szarnyasg

I found that the following script fails:

library("duckplyr")

iris |>
    group_by(Species) |>
    summarize(num_observations = count()) |>
    collect()
The duckplyr package is configured to fall back to dplyr when it encounters an incompatibility. Fallback events can be collected and
uploaded for analysis to guide future development. By default, data will be collected but no data will be uploaded.
ℹ Automatic fallback uploading is not controlled and therefore disabled, see `?duckplyr::fallback()`.
✔ Number of reports ready for upload: 5.
→ Review with `duckplyr::fallback_review()`, upload with `duckplyr::fallback_upload()`.
ℹ Configure automatic uploading with `duckplyr::fallback_config()`.
✔ Overwriting dplyr methods with duckplyr methods.
ℹ Turn off with `duckplyr::methods_restore()`.
Error in `summarize()`:
ℹ In argument: `num_observations = count()`.
ℹ In group 1: `Species = setosa`.
Caused by error in `UseMethod()`:
! no applicable method for 'count' applied to an object of class "NULL"
Run `rlang::last_trace()` to see where the error occurred.

Replacing count() with n() makes it work:

library("duckplyr")

iris |>
    group_by(Species) |>
    summarize(num_observations = n()) |>
    collect()
# A tibble: 3 × 2
  Species    num_observations
  <fct>                 <int>
1 setosa                   50
2 versicolor               50
3 virginica                50
@krlmlr
Member

krlmlr commented Feb 14, 2025

Thanks. I also see:

options(conflicts.policy = list(warn = FALSE))
library(dplyr)

iris |>
  group_by(Species) |>
  summarize(num_observations = count()) |>
  collect()
#> Error in `summarize()`:
#> ℹ In argument: `num_observations = count()`.
#> ℹ In group 1: `Species = setosa`.
#> Caused by error in `UseMethod()`:
#> ! no applicable method for 'count' applied to an object of class "NULL"

Created on 2025-02-14 with reprex v2.1.1

count() is a verb to be used instead of summarize(), not a function to call inside it; n() is the helper that works inside summarize(). But we don't support factors yet:

options(conflicts.policy = list(warn = FALSE))
library(dplyr)

iris |>
  duckplyr::as_duckdb_tibble(prudence = "stingy") |>
  count(Species) |>
  collect()
#> Error in `duckdb_rel_from_df()` at duckplyr/R/duckplyr_df.R:20:5:
#> ! Can't convert columns of class <factor> to relational. Affected
#>   column: `Species`.

Created on 2025-02-14 with reprex v2.1.1

(Need a better error message here too.)

This will work and use DuckDB:

options(conflicts.policy = list(warn = FALSE))
library(dplyr)

iris |>
  mutate(Species = as.character(Species)) |>
  duckplyr::as_duckdb_tibble(prudence = "stingy") |>
  count(Species) |>
  collect()
#> # A tibble: 3 × 2
#>   Species        n
#>   <chr>      <int>
#> 1 setosa        50
#> 2 versicolor    50
#> 3 virginica     50

Created on 2025-02-14 with reprex v2.1.1
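
As a rough sketch of how the workaround above might generalize, the snippet below converts every factor column to character in one step and then checks that count() used as a verb agrees with the group_by() + summarize(n()) spelling. It uses plain dplyr only; `factors_to_character()` is a hypothetical helper name, not part of dplyr or duckplyr. The `duckplyr::as_duckdb_tibble(prudence = "stingy")` step from the example above would presumably slot in right after the conversion.

```r
library(dplyr)

# Hypothetical helper (an assumption, not a dplyr or duckplyr function):
# turn every factor column into a character column.
factors_to_character <- function(df) {
  df |>
    mutate(across(where(is.factor), as.character))
}

iris_chr <- factors_to_character(iris)

# count() used as a verb on the data frame ...
by_count <- iris_chr |>
  count(Species, name = "num_observations")

# ... and the equivalent group_by() + summarize(n()) spelling.
by_summarize <- iris_chr |>
  group_by(Species) |>
  summarize(num_observations = n())

# Both spellings should report the same per-group counts.
stopifnot(identical(by_count$num_observations, by_summarize$num_observations))
```

Converting all factor columns at once avoids naming each one, at the cost of losing factor level order in the result.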

@krlmlr
Member

krlmlr commented Feb 14, 2025

Better error message with #614.

@szarnyasg
Author

Great, thanks!

krlmlr closed this as completed Feb 21, 2025