-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can't build new SQL string without dataset in connection #434
Comments
But it does seem like an inconsistency as in #433 Oh, and the SQL assembled by |
Digging through the call stack when calling Line 47 in 64db635
Which you can see is trying to access the connection dataset. In your case it is NULL. function (con, sql)
{
if (dbplyr::is.sql(sql)) {
ds <- bq_dataset(con@project, con@dataset)
fields <- bq_query_fields(sql, con@billing, default_dataset = ds)
}
else {
tb <- as_bq_table(con, sql)
fields <- bq_table_fields(tb)
}
vapply(fields, "[[", "name", FUN.VALUE = character(1))
} |
EXACTLY. But the dataset is specified when I access the table I really should have included something in the original reprex. Notice the portion below that I've marked out. The query as generated by library(magrittr)
library(tidyverse)
library(dbplyr)
#>
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#>
#> ident, sql
library(bigrquery)
library(DBI)
library(RPostgres)
library(RPostgreSQL)
bq_deauth()
bq_auth(email="[email protected]")
getCount = function(bq_table)
{
### extract connection
conn = bq_table$src$con
print(conn)
### build sql string
sql_string = paste0(
"SELECT COUNT(*) FROM (",
dbplyr::sql_render(bq_table),
")"
) %>%
str_replace("\n", "")
print(sql_string)
### build query object
sql_query = dbplyr::sql(sql_string)
print(sql_query)
new_bq_table = tbl(conn, sql_query)
}
conn = dbConnect(
bigquery(),
project = "proj1-305622"
)
# df = tibble(
# A = sample(1:10, 5),
# B = letters[1:5]
# )
#
# dbWriteTable(
# conn,
# name = "dataset.df",
# value = df1,
# overwrite = T
# )
df_bq = tbl(conn, "dataset.df")
######### works here ################
df_bq %>% count() %>% show_query()
#> <SQL>
#> SELECT count(*) AS `n`
#> FROM `dataset.df`
df_bq %>% count()
#> # Source: lazy query [?? x 1]
#> # Database: BigQueryConnection
#> n
#> <int>
#> 1 5
############################################
df_bq %>% getCount()
#> <BigQueryConnection>
#> Billing: proj1-305622
#> [1] "SELECT COUNT(*) FROM (SELECT *FROM `dataset.df`)"
#> <SQL> SELECT COUNT(*) FROM (SELECT *FROM `dataset.df`)
#> Error: dataset is not a string (a length one character vector).
conn = dbConnect(
bigquery(),
project = "proj1-305622",
dataset = "dataset"
)
df_bq = tbl(conn, "dataset.df")
df_bq %>% getCount()
#> <BigQueryConnection>
#> Dataset: proj1-305622.dataset
#> Billing: proj1-305622
#> [1] "SELECT COUNT(*) FROM (SELECT *FROM `dataset.df`)"
#> <SQL> SELECT COUNT(*) FROM (SELECT *FROM `dataset.df`) Created on 2021-02-26 by the reprex package (v1.0.0) |
Here is putting it another way--querying with the connection and a string works using
|
Is there a workaround? I'm running into the same issue. I didn't add the dataset parameter because I wanted to join across projects. Upon trying to use a SQL expression in tbl, I get the same error about the dataset not being a string (because it wasn't provided in the connection). |
I think this is a simpler reprex, but it hits #540, so that will need to be resolved first. library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
library(bigrquery)
options(gargle_oauth_email = TRUE)
ds <- bq_test_dataset()
#> ℹ The bigrquery package is using a cached token for '[email protected]'.
con <- DBI::dbConnect(
bigquery(),
project = bq_test_project(),
dataset = ds$dataset
)
df <- tibble(A = sample(1:10, 5), B = letters[1:5])
DBI::dbWriteTable(con, "df", df)
getCount <- function(bq_table) {
sql <- paste0("SELECT COUNT(*) FROM (", dbplyr::sql_render(bq_table), ")")
tbl(bq_table$src$con, dbplyr::sql(sql))
}
getCount(tbl(con, "df"))
#> Warning: <BigQueryConnection> uses an old dbplyr interface
#> ℹ Please install a newer version of the package or contact the maintainer
#> This warning is displayed once every 8 hours.
#> Error in curl::curl_fetch_memory(url, handle = handle): URL rejected: Malformed input to a URL function Created on 2023-11-03 with reprex v2.0.2 |
This now seems to work: library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
library(bigrquery)
options(gargle_oauth_email = TRUE)
ds <- bq_test_dataset()
#> ℹ The bigrquery package is using a cached token for '[email protected]'.
con <- DBI::dbConnect(
bigquery(),
project = bq_test_project(),
dataset = ds$dataset
)
df <- tibble(A = sample(1:10, 5), B = letters[1:5])
DBI::dbWriteTable(con, "df", df)
getCount <- function(bq_table) {
sql <- paste0("SELECT COUNT(*) FROM (", dbplyr::sql_render(bq_table), ")")
tbl(bq_table$src$con, dbplyr::sql(sql))
}
getCount(tbl(con, "df"))
#> # Source: SQL [1 x 1]
#> # Database: BigQueryConnection
#> f0_
#> <int>
#> 1 5 Created on 2023-11-07 with reprex v2.0.2.9000 |
Cool! I'll check it out. I believe I came to this problem trying to write a |
I want to write custom functions to generate queries. In general usage, I have had no problems setting up connections with only the project and then accessing a table with the dataset name:
I've never had any problems with this in general use, even with cross-source tables.
However, when building up queries from strings, the dataset needs to be in the connection. Reprex below.
Created on 2021-02-23 by the reprex package (v1.0.0)
The text was updated successfully, but these errors were encountered: