
Support dbWriteTable() #94

Open
blairj09 opened this issue Dec 16, 2023 · 2 comments

@blairj09

Currently, attempting to write local data to Databricks with `DBI::dbWriteTable()` fails:

> library(sparklyr)
> sc <- spark_connect(method = "databricks_connect", cluster_id = "*************")
! Changing host URL to: ****************
  Set `host_sanitize = FALSE` in `spark_connect()` to avoid changing it
✔ Retrieving info for cluster:'*************' [313ms]
✔ Using the 'r-sparklyr-databricks-14.0' Python environment 
  Path: /home/james/.virtualenvs/r-sparklyr-databricks-14.0/bin/python
✔ Connecting to 'Test Cluster' (DBR '14.0') [470ms]
> DBI::dbWriteTable(sc, "demos.testing.foo", mtcars, overwrite = TRUE)
Error in UseMethod("invoke") : 
  no applicable method for 'invoke' applied to an object of class "list"
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DBI_1.1.3             pysparklyr_0.1.2.9000 sparklyr_1.8.4       

loaded via a namespace (and not attached):
 [1] Matrix_1.6-1.1    jsonlite_1.8.7    dplyr_1.1.3       compiler_4.3.1    tidyselect_1.2.0 
 [6] Rcpp_1.0.11       parallel_4.3.1    tidyr_1.3.0       png_0.1-8         uuid_1.1-1       
[11] yaml_2.3.7        reticulate_1.34.0 lattice_0.21-9    R6_2.5.1          generics_0.1.3   
[16] curl_5.1.0        httr2_0.2.3       knitr_1.44        tibble_3.2.1      openssl_2.1.1    
[21] pillar_1.9.0      rlang_1.1.1       utf8_1.2.3        xfun_0.40         config_0.3.2     
[26] fs_1.6.3          cli_3.6.1         magrittr_2.0.3    ps_1.7.5          grid_4.3.1       
[31] processx_3.8.2    rstudioapi_0.15.0 dbplyr_2.3.4      rappdirs_0.3.3    askpass_1.2.0    
[36] lifecycle_1.0.3   vctrs_0.6.3       glue_1.6.2        fansi_1.0.4       purrr_1.0.2      
[41] httr_1.4.7        tools_4.3.1       pkgconfig_2.0.3  
@edgararuiz
Collaborator

This will require an entirely new DBI back-end for pysparklyr objects, which is not something I'd like to start this close to release time.

@tnederlof

This request keeps coming up for users at a customer. Since so many of them are accustomed to `dbWriteTable()`, supporting it would really help them onboard onto clusters from Workbench.

In the meantime, I suggested they do something like the following (copy the local data frame into Spark with `copy_to()`, then persist it with `spark_write_table()`):

library(sparklyr)
library(dplyr)

# A small example data frame (note: rep(1, 5), not rep(1, 5, 1),
# which would truncate the vector to length 1)
random_df <- tibble::tibble("A" = rep(1, 5), "B" = rep(1, 5))

# Copy the local data frame into Spark first...
spark_tbl_random_df <- copy_to(sc, random_df, "spark_random_df")

# ...then persist it as a table; I() keeps the qualified
# catalog.schema.table name from being quoted
spark_tbl_random_df %>%
  spark_write_table(
    name = I("demo.default.random_df"),
    mode = "overwrite"
  )
