
Support dbWriteTable() #94

Open
blairj09 opened this issue Dec 16, 2023 · 2 comments

@blairj09

Currently, attempting to write local data to Databricks with `DBI::dbWriteTable()` fails:

> library(sparklyr)
> sc <- spark_connect(method = "databricks_connect", cluster_id = "*************")
! Changing host URL to: ****************
  Set `host_sanitize = FALSE` in `spark_connect()` to avoid changing it
✔ Retrieving info for cluster:'*************' [313ms]
✔ Using the 'r-sparklyr-databricks-14.0' Python environment 
  Path: /home/james/.virtualenvs/r-sparklyr-databricks-14.0/bin/python
✔ Connecting to 'Test Cluster' (DBR '14.0') [470ms]
> DBI::dbWriteTable(sc, "demos.testing.foo", mtcars, overwrite = TRUE)
Error in UseMethod("invoke") : 
  no applicable method for 'invoke' applied to an object of class "list"
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DBI_1.1.3             pysparklyr_0.1.2.9000 sparklyr_1.8.4       

loaded via a namespace (and not attached):
 [1] Matrix_1.6-1.1    jsonlite_1.8.7    dplyr_1.1.3       compiler_4.3.1    tidyselect_1.2.0 
 [6] Rcpp_1.0.11       parallel_4.3.1    tidyr_1.3.0       png_0.1-8         uuid_1.1-1       
[11] yaml_2.3.7        reticulate_1.34.0 lattice_0.21-9    R6_2.5.1          generics_0.1.3   
[16] curl_5.1.0        httr2_0.2.3       knitr_1.44        tibble_3.2.1      openssl_2.1.1    
[21] pillar_1.9.0      rlang_1.1.1       utf8_1.2.3        xfun_0.40         config_0.3.2     
[26] fs_1.6.3          cli_3.6.1         magrittr_2.0.3    ps_1.7.5          grid_4.3.1       
[31] processx_3.8.2    rstudioapi_0.15.0 dbplyr_2.3.4      rappdirs_0.3.3    askpass_1.2.0    
[36] lifecycle_1.0.3   vctrs_0.6.3       glue_1.6.2        fansi_1.0.4       purrr_1.0.2      
[41] httr_1.4.7        tools_4.3.1       pkgconfig_2.0.3  
@edgararuiz
Collaborator

This will require an entirely new DBI back-end for pysparklyr objects, which is not something I'd like to start this close to release time.

@tnederlof

This request keeps coming up for users at a customer. Since so many of them are accustomed to `dbWriteTable()`, supporting it would really help them onboard onto clusters from Workbench.

In the meantime, I suggested they do something like the following (copy the local data frame into Spark with `copy_to()`, then persist it with `spark_write_table()`):

library(sparklyr)
library(dplyr)

# A small example data frame (note: rep(1, 5), not rep(1, 5, 1),
# which would truncate the vector to length 1)
random_df <- tibble::tibble("A" = rep(1, 5), "B" = rep(1, 5))

# Copy the local data frame into Spark first...
spark_tbl_random_df <- copy_to(sc, random_df, "spark_random_df")

# ...then persist it as a table; I() keeps the qualified
# catalog.schema.table name from being quoted
spark_tbl_random_df %>%
  spark_write_table(
    name = I("demo.default.random_df"),
    mode = "overwrite"
  )
