stacks 0.2.3
While stacks 0.2.3 is a minor release, it includes a number of significant user experience improvements. This release adds an option to significantly reduce runtime for prediction blending, makes errors and warnings more informative, and greatly reduces the size of reloaded model objects in memory.
Regarding that first point, take a look at how adjusting the `times` argument in `blend_predictions` drastically affects its runtime:
library(tidymodels)
library(modeldata)
# using a version of the package where `times` is a param
library(stacks)
data("lending_club")
set.seed(1)
lending_club <- sample_n(lending_club, 1000)
folds <- vfold_cv(lending_club, v = 5)
lr_mod <-
  linear_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet") %>%
  workflow(
    preprocessor = funded_amnt ~ int_rate + total_bal_il,
    spec = .
  ) %>%
  tune_grid(
    resamples = folds,
    control = control_stack_grid(),
    grid = 10
  )
system.time(
  stacks() %>%
    add_candidates(lr_mod) %>%
    blend_predictions(times = 25)
)
#>    user  system elapsed
#>  10.280   0.112  10.550
system.time(
  stacks() %>%
    add_candidates(lr_mod) %>%
    blend_predictions(times = 10)
)
#>    user  system elapsed
#>   4.424   0.050   4.554
system.time(
  stacks() %>%
    add_candidates(lr_mod) %>%
    blend_predictions(times = 4)
)
#>    user  system elapsed
#>   2.158   0.018   2.194
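Per the changelog, `times` is passed on to `rsample::bootstraps` when fitting stacking coefficients. As a rough sketch of what that argument controls on its own, the snippet below calls `bootstraps()` directly. The built-in `mtcars` data and the object names are placeholders chosen for illustration; this is not the data stack that `blend_predictions` builds internally.

```r
library(rsample)

set.seed(1)

# `mtcars` is only a stand-in dataset for illustration
boots_default <- bootstraps(mtcars)            # default: times = 25
boots_small   <- bootstraps(mtcars, times = 4)

# each row of the result is one bootstrap resample
nrow(boots_default)
#> [1] 25
nrow(boots_small)
#> [1] 4
```

Fewer resamples means fewer fits while blending, which is roughly in line with the timings above. The likely trade-off is a noisier estimate of the stacking coefficients, so very small values are probably best kept for quick iteration.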
Related to the second point, there are several different degrees and varieties of tuning "failure" that result in stacks tripping up during model stacking. The package now inspects its inputs more closely and may give you a heads up when you might run into issues later on. Look out for warnings like:
#> Warning message:
#> The inputted `candidates` argument `my_tuning_results` generated notes during tuning/resampling.
#> Model stacking may fail due to these issues; see `?collect_notes` if so.
And finally, related to model stack object size, compare the output of `butcher::weigh` on a model stack like the one from the reprex above (after saving and reloading), before and after this release:
weigh(lr_stack_before)
#> # A tibble: 374 × 2
#> object size
#> <chr> <dbl>
#> 1 coefs.preproc.terms 2.64
#> 2 coefs.fit.call 2.64
#> 3 coefs.spec.eng_args.lower.limits 2.64
#> 4 coefs.spec.method.fit.args.lower.limits 2.64
#> 5 coefs.spec.method.pred.numeric.post 1.76
#> 6 member_fits.lr_mod_1_3.fit.fit.spec.method.pred.numeric.post 1.76
#> 7 member_fits.lr_mod_1_1.fit.fit.spec.method.pred.numeric.post 1.76
#> 8 member_fits.lr_mod_1_3.pre.actions.formula.blueprint.mold.process 0.0172
#> 9 member_fits.lr_mod_1_3.pre.mold.blueprint.mold.process 0.0172
#> 10 member_fits.lr_mod_1_1.pre.actions.formula.blueprint.mold.process 0.0172
#> # … with 364 more rows
weigh(lr_stack_after)
#> # A tibble: 374 × 2
#> object size
#> <chr> <dbl>
#> 1 coefs.preproc.terms 24.7
#> 2 coefs.fit.call 24.7
#> 3 coefs.spec.eng_args.lower.limits 24.7
#> 4 coefs.spec.method.fit.args.lower.limits 24.7
#> 5 coefs.spec.method.pred.numeric.post 1.79
#> 6 member_fits.lr_mod_1_3.fit.fit.spec.method.pred.numeric.post 1.79
#> 7 member_fits.lr_mod_1_1.fit.fit.spec.method.pred.numeric.post 1.79
#> 8 member_fits.lr_mod_1_3.pre.actions.formula.blueprint.mold.process 0.0172
#> 9 member_fits.lr_mod_1_3.pre.mold.blueprint.mold.process 0.0172
#> 10 member_fits.lr_mod_1_1.pre.actions.formula.blueprint.mold.process 0.0172
#> # … with 364 more rows
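To run a similar check on your own objects, one approach is to round-trip the object through an `.rds` file and weigh both copies. The sketch below uses a plain `lm` fit as a stand-in for a fitted model stack, so that it runs without any tuning. The `fit`, `path`, and `sizes_*` names are hypothetical, and the sizes will not resemble the ones shown above.

```r
library(butcher)

# `fit` is a stand-in for a fitted model stack
fit <- lm(mpg ~ wt + hp, data = mtcars)

# save and reload, as you would with a model stack written to file
path <- tempfile(fileext = ".rds")
saveRDS(fit, path)
fit_reloaded <- readRDS(path)

# per-component sizes (in MB by default) for each copy
sizes_original <- weigh(fit)
sizes_reloaded <- weigh(fit_reloaded)

head(sizes_original)
head(sizes_reloaded)
```

Comparing the two tibbles side by side shows whether any component grows after reloading, which is the kind of inflation addressed in #116.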
Read more about these changes and their implementations at the issues linked below.🐧
Changelog
- Addressed deprecation warning in `add_candidates` (#99).
- Improved clarity of warnings/errors related to failed hyperparameter tuning and resample fitting (#110).
- Reduced model stack object size and fixed bug where object size of model stack inflated drastically after saving to file (#116). Also, regenerated example objects with this change--saved model objects may need to be regenerated in order to interface with newer versions of the package.
- Introduced a `times` argument to `blend_predictions` that is passed on to `rsample::bootstraps` when fitting stacking coefficients. Reducing this argument from its default (`25`) greatly reduces the run time of `blend_predictions` (#94).
- The package will now load packages necessary for model fitting at `fit_members()`, if available, and fail informatively if not (#118).
- Fixed bug where meta-learner tuning would fail with outcome names and levels including the string `"class"` (#125).
- The package will now warn when unused dots are passed to any of the core functions (#127).