luz::lr_finder() Segmentation fault #43

Open
agorelick opened this issue Nov 15, 2024 · 2 comments

Comments

@agorelick

Hi, I am running through your Getting Started example and I get a segfault at the luz::lr_finder() step. I have verified that torch and luz work for non-tft models, so I think the issue is within the tft package. I am running inside a conda environment with CUDA 11.7. Have you ever encountered this issue? Any suggestions would be extremely appreciated! Thanks.

Here is my sessionInfo:

Platform: x86_64-conda-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS/LAPACK: /home/alex/miniconda3/envs/timeseries/lib/libopenblasp-r0.3.28.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] luz_0.4.0          torch_0.13.0       tft_0.0.0.9000     yardstick_1.3.1   
 [5] workflowsets_1.1.0 workflows_1.1.4    tune_1.2.1         tidyr_1.3.1       
 [9] tibble_3.2.1       rsample_1.2.1      recipes_1.1.0      purrr_1.0.2       
[13] parsnip_1.2.1      modeldata_1.4.0    infer_1.0.7        ggplot2_3.5.1     
[17] dplyr_1.1.4        dials_1.3.0        scales_1.3.0       broom_1.0.7       
[21] tidymodels_1.2.0  

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1    timeDate_4041.110   digest_0.6.37      
 [4] rpart_4.1.23        timechange_0.3.0    lifecycle_1.0.4    
 [7] survival_3.7-0      processx_3.8.4      magrittr_2.0.3     
[10] compiler_4.4.1      progress_1.2.3      rlang_1.1.4        
[13] tools_4.4.1         utf8_1.2.4          data.table_1.15.4  
[16] prettyunits_1.2.0   bit_4.5.0           DiceDesign_1.10    
[19] withr_3.0.2         nnet_7.3-19         grid_4.4.1         
[22] fansi_1.0.6         colorspace_2.1-1    future_1.34.0      
[25] globals_0.16.3      iterators_1.0.14    MASS_7.3-60.0.1    
[28] zeallot_0.1.0       cli_3.6.3           crayon_1.5.3       
[31] generics_0.1.3      rstudioapi_0.17.1   future.apply_1.11.2
[34] splines_4.4.1       parallel_4.4.1      coro_1.1.0         
[37] vctrs_0.6.5         hardhat_1.4.0       Matrix_1.6-5       
[40] callr_3.7.6         hms_1.1.3           bit64_4.5.2        
[43] listenv_0.9.1       foreach_1.5.2       gower_1.0.1        
[46] glue_1.8.0          parallelly_1.39.0   codetools_0.2-20   
[49] ps_1.8.1            lubridate_1.9.3     gtable_0.3.6       
[52] munsell_0.5.1       GPfit_1.0-8         furrr_0.3.1        
[55] pillar_1.9.0        ipred_0.9-15        lava_1.8.0         
[58] R6_2.5.1            lhs_1.2.0           lattice_0.22-6     
[61] backports_1.5.0     class_7.3-22        Rcpp_1.0.13-1      
[64] prodlim_2024.06.25  fs_1.6.5            pkgconfig_2.0.3
@cregouby
Collaborator

cregouby commented Nov 16, 2024

Hello @agorelick

There is some homework and cleanup to do on {tft} these days, but rerunning luz::lr_finder() from the Getting Started article did not raise any particular issue. R is known to have drawbacks when running inside conda. Is there a way for you to run {tft} on a native stack (i.e. without conda)?
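
As a quick way to tell whether the R in use comes from conda or from a native installation, something like the following shell sketch may help. It only compares the location of `Rscript` on `PATH` with the `CONDA_PREFIX` environment variable that conda sets on activation. The helper name `is_under_prefix` is made up for illustration and is not part of conda, R, or {tft}.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a path lives under a given prefix.
# Both arguments are plain strings; conda itself is never called.
is_under_prefix() {
  candidate=$1
  env_prefix=$2
  # An empty prefix means no conda environment is active.
  [ -n "$env_prefix" ] || return 1
  case "$candidate" in
    "$env_prefix"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Locate Rscript on PATH (empty string if it is not installed).
rscript_path=$(command -v Rscript 2>/dev/null || true)

if [ -z "$rscript_path" ]; then
  echo "Rscript not found on PATH"
elif is_under_prefix "$rscript_path" "${CONDA_PREFIX:-}"; then
  echo "Rscript comes from the active conda env: $rscript_path"
else
  echo "Rscript is outside any active conda env: $rscript_path"
fi
```

If the first branch of the final check fires, the session is still using the conda build of R, and the suggestion above (trying a system-wide R) has not really been tested yet.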

Also, are there any messages around the segfault that could help the investigation? A hint of out-of-memory, for example?
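
One possible way to collect such messages is to run the failing script non-interactively, keep its output in a log, and look at the exit status. On Linux shells a process ended by a signal typically reports 128 plus the signal number, so 139 points to SIGSEGV and 137 to SIGKILL (which is what the kernel OOM killer sends). The sketch below is only an illustration; `describe_exit` and the script name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map a shell exit status to a short hint
# about how the process ended.
describe_exit() {
  case "$1" in
    0)   echo "clean exit" ;;
    137) echo "killed by SIGKILL (possibly the kernel OOM killer)" ;;
    139) echo "killed by SIGSEGV (segmentation fault)" ;;
    *)   echo "exit status $1" ;;
  esac
}

# Usage sketch (lr_finder_repro.R is a placeholder name):
#   status=0
#   Rscript lr_finder_repro.R > run.log 2>&1 || status=$?
#   describe_exit "$status"
#   tail -n 30 run.log   # last lines printed before the crash
```

The tail of the log plus the exit status is usually enough to distinguish a genuine segmentation fault from a process that was killed for using too much memory.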

Hope it helps.

@agorelick
Author

Thank you for the reply! I think the issue was running it within conda, as you suggested. In a global R installation the code ran perfectly! Actually, the separate torch/luz R script that I had successfully run in conda also failed with a segfault once I increased the neural network's complexity.

Thanks for your help! I don't think I would have tried a system-wide R installation without your suggestion and would probably have given up on using this package!
