hdcuremodels #690

Open
20 tasks
kelliejarcher opened this issue Feb 26, 2025 · 2 comments

Comments

@kelliejarcher

Submitting Author Name: Kellie J. Archer
Submitting Author Github Handle: @kelliejarcher
Repository: https://github.com/kelliejarcher/hdcuremodels
Submission type: Pre-submission
Language: en


  • Paste the full DESCRIPTION file inside a code block below:
Package: hdcuremodels
Title: Penalized Mixture Cure Models for High-Dimensional Data
Version: 0.0.2
Date: 2025-02-26
Authors@R: 
    c(person("Han", "Fu", role = "aut"), person(c("Kellie J."), "Archer", email=
    "[email protected]", role = c("aut","cre"), comment = c(ORCID="0000-0003-1555-5781")))
Description: Provides functions for fitting various penalized parametric and semi-parametric mixture cure models with different penalty functions, testing for a significant cure fraction, and testing for sufficient follow-up as described in Fu et al (2022)<doi:10.1002/sim.9513> and Archer et al (2024)<doi:10.1186/s13045-024-01553-6>. False discovery rate controlled variable selection is provided using model-X knock-offs. 
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (>= 4.2.0)
Imports: doParallel,
         flexsurv,
         flexsurvcure,
         foreach,
         ggplot2,
         ggpubr,
         glmnet,
         knockoff,
         mvnfast,
         parallel,
         plyr,
         methods,
         survival
Roxygen: list(markdown = TRUE, roclets = c ("namespace", "rd", "srr::srr_stats_roclet"))
RoxygenNote: 7.3.2
Suggests: 
    knitr,
    rmarkdown,
    roxygen2
VignetteBuilder: knitr
LazyData: true

Scope

  • Please indicate which category or categories from our package fit policies or statistical package categories this package falls under. (Please check one or more appropriate boxes below):

    Data Lifecycle Packages

    • data retrieval
    • data extraction
    • data munging
    • data deposition
    • data validation and testing
    • workflow automation
    • version control
    • citation management and bibliometrics
    • scientific software wrappers
    • field and lab reproducibility tools
    • database software bindings
    • geospatial data
    • text analysis

    Statistical Packages

    • Bayesian and Monte Carlo Routines
    • Dimensionality Reduction, Clustering, and Unsupervised Learning
    • Machine Learning
    • [X ] Regression and Supervised Learning
    • Exploratory Data Analysis (EDA) and Summary Statistics
    • Spatial Analyses
    • Time Series Analyses
    • Probability Distributions
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

  • If submitting a statistical package, have you already incorporated documentation of standards into your code via the srr package?

Yes, I tried to add notations via the srr package. This is all very new so advice would be most welcome.

  • Who is the target audience and what are scientific applications of this package?

Analysts interested in modeling a time-to-event outcome when a subset of patients experience long-term survival or cure. This package permits fitting penalized mixture cure models so the functions can handle modeling the time-to-event outcome when the covariate/predictor space is high-dimensional.

None of the existing packages that fit mixture cure models (MCMs) are capable of handling high-dimensional datasets. Only penPHcure includes a LASSO penalty to perform variable selection for scenarios when the sample size exceeds the number of predictors. Other R packages that can be used for fitting MCMs include:

  • cuRe (Jakobsen, 2023) can be used to fit parametric MCMs on a relative survival scale;
  • CureDepCens (Schneider and Grandemagne dos Santos, 2023) can be used to fit a piecewise exponential or Weibull model with dependent censoring;
  • curephEM (Hou and Ren, 2024) can be used to fit a MCM where the latency is modeled using a Cox PH model;
  • flexsurvcure (Amdahl, 2022) can be used to fit parametric mixture and non-mixture cure models;
  • geecure (Niu and Peng, 2018) can be used to fit a marginal MCM for clustered survival data;
  • GORCure (Zhou et al, 2017) can be used to fit a generalized odds rate MCM with interval-censored data;
  • mixcure (Peng, 2020) can be used to fit non-parametric, parametric, and semiparametric MCMs;
  • npcure (López-de-Ullibarri and López-Cheda, 2020) can be used to non-parametrically estimate incidence and latency;
  • npcurePK (Safari et al, 2023) can be used to non-parametrically estimate incidence and latency when cure is partially observed;
  • penPHcure (Beretta and Heuchenne, 2019) can be used to fit semi-parametric PH MCMs with time-varying covariates; and
  • smcure (Cai et al, 2022) can be used to fit semi-parametric (PH and AFT) MCMs.

Not applicable.

  • Any other questions or issues we should be aware of?:

I am very inexperienced using GitHub and have not used it for collaboration before. I did post a version of this package on CRAN last June and more recently learned about rOpenSci, so the GitHub version is my initial attempt to adhere to your standards. I have not submitted a peer-reviewed manuscript yet as I would prefer to have an rOpenSci review first. Also, when I ran pkgcheck and tried to look at the summary, I received a message that read, "Error: No GNU global installation found." and was unclear how to proceed.

@mpadge
Member

mpadge commented Feb 27, 2025

Hi @kelliejarcher, and thank you for your pre-submission. No worries about inexperience - rOpenSci strives to be as welcoming and inclusive as possible, to always answer any questions you might have, and to help you along the way. As this is a pre-submission inquiry, feel free to ask questions here, or alternatively open issues in your own repository, cross-link them here by pasting the url for this issue in the comment, and ping me there. (Note that we try to keep the full submissions as "clean" as possible to help focus on reviews, while pre-submissions are the place for more general questions and discussions.)

Specific responses to your questions:

  • The "Error: No GNU global installation found" message appears because {pkgcheck}, and the {pkgstats} package it uses to analyse packages, require a couple of system libraries, including "GNU global". If you're on Linux or macOS, installation is generally simple: use your standard package manager (apt-get, homebrew, or whatever) to install global. If you're on Windows, it's trickier, but you could start with the links in the {pkgstats} installation vignette.
  • The {srr} question sounds more general, and is maybe best moved to a specific issue within your repo? If you ping me there, I'll happily help further.
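
As a quick way to confirm whether the first point applies on a given machine, the following is a minimal sketch in base R that asks whether the GNU global executables are visible on the PATH. The helper name `find_missing_tools` is hypothetical and is not part of {pkgcheck} or {pkgstats}; the sketch also assumes that GNU global installs its usual `global` and `gtags` executables, and it does not replicate whatever check {pkgcheck} itself performs.

```r
# Hypothetical helper (not part of {pkgcheck} or {pkgstats}):
# return the names of any command-line tools that R cannot find on the PATH.
find_missing_tools <- function(tools) {
  paths <- Sys.which(tools)      # named character vector; "" where a tool is not found
  names(paths)[!nzchar(paths)]   # keep only the names with an empty path
}

# Assumption: GNU global provides the `global` and `gtags` executables.
missing <- find_missing_tools(c("global", "gtags"))

if (length(missing) > 0) {
  message("Not found on PATH: ", paste(missing, collapse = ", "))
} else {
  message("GNU global appears to be installed.")
}
```

If anything is reported as missing, installing global through the system package manager and restarting R before re-running pkgcheck would be the natural next step.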

More generally, your package definitely looks like a good fit for statistical software review, and definitely within the category you've already indicated. Looking forward to working towards a full submission!

@kelliejarcher
Author

kelliejarcher commented Feb 27, 2025 via email
