Commit 9c28a4f

topepo, EmilHvitfeldt, and github-actions[bot] authored
various updated prior to tune 2.0 (#1092)
* update urls
* doMC -> future
* doMC -> future
* updated example objects
* remove unused functions (and their direct tests)
* A note about require in the extract files
* polish news
* added cran skips for long running examples
* Apply suggestions from code review

  Co-authored-by: Emil Hvitfeldt <[email protected]>
* updates based on reviewer comments
* not used
* pad zeros in iterative search .configs
* a few more foreach references
* run inst/test_objects.R
* reorder column names
* remove Rout file
* air format
* Apply suggestions from code review

  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: Emil Hvitfeldt <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 4add897 commit 9c28a4f

42 files changed, +540 -1099 lines changed

NEWS.md

Lines changed: 16 additions & 8 deletions
@@ -2,29 +2,37 @@
 
 ## Changes to `tune_grid()`.
 
-* A major rewrite and refactor of the underlying code that runs `tune_grid()` was made.
+* A major rewrite/refactor of the underlying code that runs `tune_grid()`. This was an upgrade to add postprocessing and to modernize our parallel processing infrastructure.
 
-* The pattern of `.config` values has changed from `Preprocessor{num}_Model{num}` to `pre{num}_mod{num}_post{num}`.
+* The pattern of `.config` values has changed.
+
+  - For grid search, it changes from `Preprocessor{num}_Model{num}` to `pre{num}_mod{num}_post{num}`. The numbers include a zero when that element was static. For example, a value of `pre0_mod3_post4` means no preprocessors were tuned and the model and postprocessor(s) had at least three and four candidates, respectively.
+  - For iterative search, the pattern is now `iter{num}` instead of `Iter{num}` and the numbers are now zero padded to sort better. For example, if there are between 10 and 99 iterations, the first `.config` value is now `iter01` instead of `Iter1`.
 
 * The package will now log a backtrace for errors and warnings that occur during tuning. When a tuning process encounters issues, see the new `trace` column in the `collect_notes(.Last.tune.result)` output to find precisely where the error occurred (#873).
 
-* Post-processing: new `schedule_grid()` for scheduling a grid including post-processing (#988).
+* Postprocessors can now be tuned. Currently, we support the tailor package.
 
-## Other Changes
+## Parallel Processing
 
 * Introduced support for parallel processing with mirai in addition to the currently supported framework future. See `?parallelism` to learn more (#1028).
 
 * Sequential and parallel processing all use the same L'Ecuyer-CMRG seeds (conditional on `parallel_over`) (#1033).
 
-* `int_pctl()` now includes an option (`keep_replicates`) to retain the individual bootstrap estimates. It also processes the resamples more efficiently (#1000).
-
-* A `min_grid()` method was added for `proportional_hazards` models so that their submodels are processed appropriately.
-
 ## Breaking Changes
 
 * The `foreach` package is no longer supported. Instead, use the future or mirai packages.
+
 * The parallel backend(s) and the methods of constructing seeds for workers have changed. There will be a lack of reproducibility between objects created in this version of tune and previous versions.
 
+## Other Changes
+
+* `int_pctl()` now includes an option (`keep_replicates`) to retain the individual bootstrap estimates. It also processes the resamples more efficiently (#1000).
+
+* A `min_grid()` method was added for `proportional_hazards` models so that their submodels are processed appropriately.
+
+* Post-processing: new `schedule_grid()` for scheduling a grid including post-processing (#988).
+
 # tune 1.3.0
 
 * The package will now warn when parallel processing has been enabled with foreach but not with future. See [`?parallelism`](https://tune.tidymodels.org/dev/reference/parallelism.html) to learn more about transitioning your code to future (#878, #866). The next version of tune will move to a pure future implementation.
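
To illustrate the grid-search `.config` pattern described in the NEWS entry for this commit, here is a minimal base R sketch that splits a label such as `pre0_mod3_post4` into its three counters. The helper `parse_config()` is hypothetical and not part of tune; it assumes labels follow `pre{num}_mod{num}_post{num}` exactly and returns `NA` for anything else, such as iterative-search labels.

```r
# Hypothetical helper (not part of tune): split grid-search .config labels
# of the form pre{num}_mod{num}_post{num} into integer columns.
parse_config <- function(config) {
  pattern <- "^pre([0-9]+)_mod([0-9]+)_post([0-9]+)$"
  pieces <- regmatches(config, regexec(pattern, config))
  # A successful match holds the full string plus three captured groups;
  # labels that do not match produce a row of NA values.
  to_int <- function(m) {
    if (length(m) == 4) as.integer(m[2:4]) else rep(NA_integer_, 3)
  }
  out <- t(vapply(pieces, to_int, integer(3)))
  colnames(out) <- c("pre", "mod", "post")
  as.data.frame(out)
}

configs <- c("pre0_mod3_post4", "pre0_mod1_post0", "iter01")
parsed <- parse_config(configs)
parsed
```

A zero in a column marks a stage that was static (not tuned), which matches the description of `pre0_mod3_post4` in the NEWS text.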

R/compute_metrics.R

Lines changed: 2 additions & 2 deletions
@@ -147,9 +147,9 @@ compute_metrics.tune_results <- function(
   # nest by resample id
   nest_cols <- "id"
 
-  if ("Iter1" %in% mtrcs$.config) {
+  # Convert the iterative .configs into numbers
+  if (any(grepl("^[iI]ter", mtrcs$.config))) {
     mtrcs$.iter <- .config_to_.iter(mtrcs$.config)
-
     nest_cols <- c(nest_cols, ".iter")
   }
 
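
The reason for replacing the exact `"Iter1"` test can be seen in a small base R sketch. The two wrapper functions below are illustrative names only; their bodies are the old and new conditions from this hunk, applied to made-up `.config` vectors.

```r
# Made-up .config vectors in the old and new naming styles
old_style <- c("Preprocessor1_Model1", "Iter1", "Iter2")
new_style <- c("pre0_mod1_post0", "iter01", "iter02")

# Previous condition: only the exact label "Iter1" signals iterative results
old_check <- function(config) "Iter1" %in% config

# Updated condition: any label that starts with "iter" or "Iter"
new_check <- function(config) any(grepl("^[iI]ter", config))

old_check(old_style) # TRUE
old_check(new_style) # FALSE: zero-padded labels never equal "Iter1"
new_check(old_style) # TRUE
new_check(new_style) # TRUE
```

The case-insensitive first letter keeps the check working for results created with earlier versions of tune as well as the new zero-padded labels.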

R/fit_best.R

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@
 #'
 #' @inheritSection last_fit See also
 #'
-#' @examplesIf tune:::should_run_examples() & rlang::is_installed("modeldata")
+#' @examplesIf tune:::should_run_examples() && rlang::is_installed("modeldata") && !tune:::is_cran_check()
 #' library(recipes)
 #' library(rsample)
 #' library(parsnip)

R/grid_helpers.R

Lines changed: 0 additions & 9 deletions
@@ -99,15 +99,6 @@ forge_from_workflow <- function(new_data, workflow) {
   forged
 }
 
-mold_has_case_weights <- function(mold) {
-  roles <- mold$extras$roles
-  no_extras <- is.null(roles)
-  if (no_extras) {
-    return(FALSE)
-  }
-  any(names(roles) == "case_weights")
-}
-
 get_metrics_by <- function(metric_set) {
   metrics <- attr(metric_set, "metrics")
   metrics_by <- purrr::map(metrics, attr, "by")

R/loop_over_all_stages-helpers.R

Lines changed: 0 additions & 15 deletions
@@ -370,21 +370,6 @@ replace_reserve_rows <- function(iter, chunk) {
   start_loc:end_loc
 }
 
-update_reserve <- function(reserve, iter, predictions, grid_size) {
-  grid_size <- min(1, grid_size)
-  pred_size <- nrow(predictions)
-
-  if (is.null(reserve)) {
-    reserve <- initialize_pred_reserve(predictions, grid_size)
-  } else {
-    if (tibble::is_tibble(predictions)) {
-      predictions <- dplyr::as_tibble(predictions)
-    }
-  }
-  reserve[replace_reserve_rows(iter, pred_size), ] <- predictions
-  reserve
-}
-
 # ------------------------------------------------------------------------------
 # Add .config to grid
 

R/parallel.R

Lines changed: 0 additions & 10 deletions
@@ -483,16 +483,6 @@ pctl_call <- function(framework, args = list()) {
 
 # ------------------------------------------------------------------------------
 
-warn_foreach_deprecation <- function() {
-  cli::cli_warn(c(
-    "!" = "{.pkg tune} detected a parallel backend registered with \\
-      foreach but no backend registered with future.",
-    "i" = "Support for parallel processing with foreach was \\
-      soft-deprecated in {.pkg tune} 1.2.1.",
-    "i" = "See {.help [?parallelism](tune::parallelism)} to learn more."
-  ))
-}
-
 manange_global_limit <- function(min = 1e9) {
   currrent_value <- getOption("future.globals.maxSize")
   if (is.null(currrent_value)) {

R/pull.R

Lines changed: 0 additions & 73 deletions
@@ -5,79 +5,6 @@ extract_details <- function(object, extractor) {
   extractor(object)
 }
 
-# ------------------------------------------------------------------------------
-
-# Grab the new results, make sure that they align row-wise with the rsample
-# object and then bind columns
-pulley <- function(resamples, res, col, order) {
-  if (all(purrr::map_lgl(res, inherits, "simpleError"))) {
-    res <-
-      resamples |>
-      mutate(col = purrr::map(splits, \(x) NULL)) |>
-      setNames(c(names(resamples), col))
-    return(res)
-  }
-
-  all_null <- all(purrr::map_lgl(res, is.null))
-
-  id_cols <- grep("^id", names(resamples), value = TRUE)
-
-  resamples <- vctrs::vec_slice(resamples, order)
-
-  pulled_vals <- purrr::map(res, \(.x) .x[[col]]) |> purrr::list_rbind()
-
-  if (nrow(pulled_vals) == 0) {
-    res <-
-      resamples |>
-      mutate(col = purrr::map(splits, \(x) NULL)) |>
-      setNames(c(names(resamples), col))
-    return(res)
-  }
-
-  pulled_vals <- tidyr::nest(pulled_vals, data = -starts_with("id"))
-  names(pulled_vals)[ncol(pulled_vals)] <- col
-
-  res <- new_bare_tibble(resamples)
-  res <- full_join(res, pulled_vals, by = id_cols)
-  res <- reup_rs(resamples, res)
-  res
-}
-
-maybe_repair <- function(x) {
-  not_null <- !purrr::map_lgl(x, is.null)
-  is_tibb <- purrr::map_lgl(x, tibble::is_tibble)
-  ok <- not_null & is_tibb
-  if (!any(ok)) {
-    return(x)
-  }
-
-  good_val <- which(ok)[1]
-  template <- x[[good_val]][0, ]
-
-  insert_val <- function(x, y) {
-    if (is.null(x)) {
-      x <- y
-    }
-    x
-  }
-
-  x <- purrr::map(x, insert_val, y = template)
-  x
-}
-
-ensure_tibble <- function(x) {
-  if (is.null(x)) {
-    res <- tibble::new_tibble(list(.notes = character(0)), nrow = 0)
-  } else {
-    res <- tibble::new_tibble(list(.notes = x), nrow = length(x))
-  }
-  res
-}
-
-append_outcome_names <- function(all_outcome_names, outcome_names) {
-  c(all_outcome_names, list(outcome_names))
-}
-
 #' Convenience functions to extract model
 #'
 #' `r lifecycle::badge("soft-deprecated")`

R/tune_bayes.R

Lines changed: 18 additions & 4 deletions
@@ -61,6 +61,15 @@
 #' results. For good results, the number of initial values should be more than
 #' the number of parameters being optimized.
 #'
+#' The tuning parameter combinations that were tested are called _candidates_.
+#' Each candidate has a unique `.config` value that, for the initial grid search,
+#' has the pattern `pre{num}_mod{num}_post{num}`. The numbers include a zero
+#' when that element was static. For example, a value of `pre0_mod3_post4` means
+#' no preprocessors were tuned and the model and postprocessor(s) had at least
+#' three and four candidates, respectively. The iterative part of the
+#' search uses the pattern `iter{num}`. In each case, the numbers are
+#' zero-padded to enable proper sorting.
+#'
 #' @section Parameter Ranges and Values:
 #'
 #' In some cases, the tuning parameter values depend on the dimensions of the
@@ -135,7 +144,7 @@
 #' calculated for every value of `eval_time` but the _first_ evaluation time
 #' given by the user (e.g., `eval_time[1]`) is used to guide the optimization.
 #'
-#' @examplesIf tune:::should_run_examples(suggests = "kernlab")
+#' @examplesIf tune:::should_run_examples(suggests = "kernlab") && !tune:::is_cran_check()
 #' library(recipes)
 #' library(rsample)
 #' library(parsnip)
@@ -290,6 +299,11 @@ tune_bayes_workflow <- function(
   rset_info <- pull_rset_attributes(resamples)
 
   check_iter(iter, call = call)
+  if (iter > 0) {
+    iter_chr <- recipes::names0(iter, "iter")
+  } else {
+    iter_chr <- "iter0"
+  }
 
   metrics <- check_metrics_arg(metrics, object, call = call)
   opt_metric <- first_metric(metrics)
@@ -488,20 +502,20 @@ tune_bayes_workflow <- function(
         tmp_res[[".metrics"]] <- purrr::map(
           tmp_res[[".metrics"]],
           dplyr::mutate,
-          .config = paste0("Iter", i)
+          .config = iter_chr[i]
         )
         if (control$save_pred) {
           tmp_res[[".predictions"]] <- purrr::map(
             tmp_res[[".predictions"]],
             dplyr::mutate,
-            .config = paste0("Iter", i)
+            .config = iter_chr[i]
           )
         }
         if (".extracts" %in% names(tmp_res)) {
           tmp_res[[".extracts"]] <- purrr::map(
             tmp_res[[".extracts"]],
             dplyr::mutate,
-            .config = paste0("Iter", i)
+            .config = iter_chr[i]
           )
         }
         unsummarized <- dplyr::bind_rows(
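
The effect of the zero-padded labels on sorting can be sketched in base R. The function `make_iter_labels()` below is a hypothetical stand-in written for illustration; the commit itself builds the labels with `recipes::names0(iter, "iter")`, and the base R padding here is only assumed to produce equivalent labels.

```r
# Hypothetical stand-in for the label construction in this hunk; the commit
# itself calls recipes::names0(iter, "iter").
make_iter_labels <- function(iter) {
  if (iter > 0) {
    # Pad every number to the width of the largest iteration number
    width <- nchar(format(iter, scientific = FALSE))
    paste0("iter", formatC(seq_len(iter), width = width, flag = "0"))
  } else {
    "iter0"
  }
}

labels <- make_iter_labels(12)
labels[c(1, 12)]

# Unpadded labels sort as text, so "Iter10" lands before "Iter2"
old_labels <- paste0("Iter", 1:12)
sort(old_labels, method = "radix")[1:3]

# Padded labels keep their numeric order when sorted as text
identical(sort(labels, method = "radix"), labels)
```

`method = "radix"` is used only to make the comparison independent of the session locale.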

R/tune_grid.R

Lines changed: 13 additions & 1 deletion
@@ -77,6 +77,14 @@
 #' change the values. This updated parameter set can be passed to the function
 #' via the `param_info` argument.
 #'
+#' The rows of the grid are called tuning parameter _candidates_. Each
+#' candidate has a unique `.config` value that, for grid search, has the
+#' pattern `pre{num}_mod{num}_post{num}`. The numbers include a zero when that
+#' element was static. For example, a value of `pre0_mod3_post4` means no
+#' preprocessors were tuned and the model and postprocessor(s) had at least
+#' three and four candidates, respectively. Also, the numbers are zero-padded
+#' to enable proper sorting.
+#'
 #' @section Performance Metrics:
 #'
 #' To use your own performance metrics, the [yardstick::metric_set()] function
@@ -160,10 +168,14 @@
 #' sub-models so that, in these cases, not every row in the tuning parameter
 #' grid has a separate R object associated with it.
 #'
+#' Finally, it is a good idea to include calls to [require()] for packages that
+#' are used in the function. This helps prevent failures when using parallel
+#' processing.
+#'
 #' @template case-weights
 #' @template censored-regression
 #'
-#' @examplesIf tune:::should_run_examples(suggests = "kernlab") & rlang::is_installed("splines2")
+#' @examplesIf tune:::should_run_examples(suggests = "kernlab") & rlang::is_installed("splines2") && !tune:::is_cran_check()
 #' library(recipes)
 #' library(rsample)
 #' library(parsnip)
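
The new documentation note about `require()` can be illustrated with a small sketch of an extraction function. Everything here is an assumption made for illustration: `get_coefs()` is a hypothetical name, the stats package stands in for whichever package the function really needs, and a plain `lm()` fit stands in for the fitted workflow that tune would normally pass.

```r
# Hypothetical extraction function. With tune, `x` would typically be a fitted
# workflow; a plain lm fit stands in here so the sketch runs with base R only.
get_coefs <- function(x) {
  # Attach what the function needs on whichever worker process runs it.
  # stats is a placeholder for the package actually required.
  require(stats, quietly = TRUE)
  coef(x)
}

fit <- lm(mpg ~ wt, data = mtcars)
get_coefs(fit)
```

In a tuning run, a function like this would presumably be supplied through the `extract` argument of the control object, for example `control_grid(extract = get_coefs)`.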

R/utils.R

Lines changed: 2 additions & 2 deletions
@@ -86,8 +86,8 @@ new_bare_tibble <- function(x, ..., class = character()) {
 # `.iter = 0`.
 .config_to_.iter <- function(.config) {
   .iter <- .config
-  nonzero <- grepl("Iter", .iter)
-  .iter <- ifelse(nonzero, gsub("Iter", "", .iter), "0")
+  nonzero <- grepl("^[iI]ter", .iter)
+  .iter <- ifelse(nonzero, gsub("^[iI]ter", "", .iter), "0")
   .iter <- as.numeric(.iter)
   .iter
 }
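
As a quick check of the updated conversion, the sketch below copies the new logic into a standalone function and applies it to a mix of label styles. The name `config_to_iter()` and the input vector are illustrative only; the function body mirrors the added lines of this hunk.

```r
# Local copy of the updated logic, renamed to avoid clashing with the
# internal .config_to_.iter() helper in tune
config_to_iter <- function(config) {
  is_iter <- grepl("^[iI]ter", config)
  # Strip the prefix from iterative labels; everything else maps to "0"
  iter <- ifelse(is_iter, gsub("^[iI]ter", "", config), "0")
  as.numeric(iter)
}

config_to_iter(c("pre0_mod1_post0", "iter01", "iter10", "Iter3"))
# returns 0 1 10 3
```

Grid-search candidates map to iteration 0, and `as.numeric()` drops the leading zeros, so both the old `Iter{num}` and the new zero-padded `iter{num}` labels give the same iteration numbers.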
