Switch from GPfit to GauPro for Gaussian Processes #1104

topepo · 2025-10-17T16:00:31Z

The GPfit maintainer contacted me a few months ago about the potential for that package to fall off CRAN. This change helps from a sustainability point of view, but also adds some new features.

Fit diagnostics are computed and reported. If the fit quality is poor, an "uncertainty sample" that is furthest away from the existing data is used as the new candidate.
The GP no longer uses binary indicators for qualitative predictors. Instead, a "categorical kernel" is used for those parameter columns. Fewer starting values are required with this change.
For numeric predictors, the Matern 3/2 kernel is always used.
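
For readers unfamiliar with the terms above, here is a minimal base R sketch of the Matern 3/2 covariance function and of one plausible reading of "furthest away from the existing data" (a maximin-distance rule). The function names `matern32` and `farthest_candidate`, and the arguments `length_scale` and `sigma2`, are hypothetical and chosen for illustration; this is not the code used in the PR or in GauPro.

```r
# Matern 3/2 covariance between two numeric points x1 and x2.
# `length_scale` and `sigma2` are illustrative hyperparameter names.
matern32 <- function(x1, x2, length_scale = 1, sigma2 = 1) {
  r <- sqrt(sum((x1 - x2)^2))          # Euclidean distance
  z <- sqrt(3) * r / length_scale
  sigma2 * (1 + z) * exp(-z)
}

# Return the row index of the candidate whose nearest existing point is
# farthest away. This maximin rule is an assumption about what
# "furthest away" means, not the PR's implementation.
farthest_candidate <- function(candidates, existing) {
  min_dist <- apply(candidates, 1, function(cand) {
    # t(existing) has one column per existing point; subtracting `cand`
    # recycles down each column, giving per-point differences.
    min(sqrt(colSums((t(existing) - cand)^2)))
  })
  which.max(min_dist)
}

existing <- rbind(c(0, 0), c(1, 1))
candidates <- rbind(c(0.5, 0.5), c(3, 3), c(1, 0))
farthest_candidate(candidates, existing)
```

In this toy example the second candidate, `c(3, 3)`, is selected because it is the farthest from both existing points.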

I've tested this extensively on the analyses in aml4td and TMwR and get very similar results.

topepo · 2025-10-17T16:38:52Z

inst/test_objects.R

@@ -1,4 +1,17 @@
-library(tidymodels)
+library(broom)


Note: we were getting warnings with the old tests that:

namespace 'workflowsets' is not available and has been replaced

Not loading tidymodels solves this. The same warning comes up in the tests for #1007.

EmilHvitfeldt

very excited to see these changes!

EmilHvitfeldt · 2025-10-18T00:49:30Z

R/checks.R

-        "There are {cli::qty(diff)}{?as many/more} tuning parameters
-          {cli::qty(diff)}{?as/than} there are initial points.


Why are these lines being deleted?

if that is intentional then we also don't need cli::pluralize()

We have this warning now, which I find more informative. I don't think that we need both.

msg <- cli::format_inline( "The Gaussian process model is being fit using {num_pred} feature{?s} but only has {num_cand} data point{?s} to do so. This may cause errors or a poor model fit." )
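
As a quick illustration of how the pluralization markers in that message behave, the sketch below wraps the same template in a hypothetical helper, `gp_size_warning()`. The helper name is made up for this example; only `cli::format_inline()` and the message text come from the thread.

```r
# Hypothetical wrapper around the message quoted above.
# Each {?s} marker is pluralized based on the quantity interpolated before it.
gp_size_warning <- function(num_pred, num_cand) {
  cli::format_inline(
    "The Gaussian process model is being fit using {num_pred} feature{?s} ",
    "but only has {num_cand} data point{?s} to do so. ",
    "This may cause errors or a poor model fit."
  )
}

msg_one <- gp_size_warning(num_pred = 1, num_cand = 3)
msg_many <- gp_size_warning(num_pred = 5, num_cand = 1)
cat(msg_one, msg_many, sep = "\n")
```

With one feature the message should read "feature" and with several it should read "features"; the same applies to "data point".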

EmilHvitfeldt · 2025-10-18T00:54:10Z

R/tune_bayes.R

    dplyr::filter(.metric == metrics) |>
    dplyr::filter(!is.na(mean))

+  # TODO a lot of slice_min/slice_max can be used now


This should be filed as an issue if you are not planning on dealing with it in this PR.
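
For context on the TODO, here is a small sketch of the kind of simplification `dplyr::slice_min()` allows. The data frame and its column values are made up for illustration and are not taken from `R/tune_bayes.R`.

```r
# Toy metric summary; the values are invented for this example.
results <- data.frame(
  .config = c("a", "b", "c", "d"),
  .metric = "rmse",
  mean = c(1.2, NA, 0.9, 1.5)
)

# Older pattern: sort, then take the first row.
best_old <- results |>
  dplyr::filter(.metric == "rmse") |>
  dplyr::filter(!is.na(mean)) |>
  dplyr::arrange(mean) |>
  dplyr::slice(1)

# Same result with slice_min(); with_ties = FALSE keeps exactly one row.
best_new <- results |>
  dplyr::filter(.metric == "rmse") |>
  dplyr::filter(!is.na(mean)) |>
  dplyr::slice_min(mean, n = 1, with_ties = FALSE)
```

For metrics that should be maximized, `dplyr::slice_max()` plays the same role.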

EmilHvitfeldt · 2025-10-18T01:21:13Z

R/gp_helpers.R

+  }
+
+  withr::with_seed(
+    114,


why are we hardcoding this specific seed?

That is a leftover from development. I've taken it out.
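
A short sketch of why a hardcoded seed matters here: code run inside `withr::with_seed()` ignores whatever seed the user set beforehand, so results no longer vary with the user's random number stream. The variable names are illustrative only.

```r
# With a fixed seed, the draw is identical no matter what the user set.
set.seed(1)
fixed_a <- withr::with_seed(114, stats::runif(1))
set.seed(2)
fixed_b <- withr::with_seed(114, stats::runif(1))
identical(fixed_a, fixed_b)

# Using the existing random number stream, the user's seed controls the draw.
set.seed(1)
stream_a <- stats::runif(1)
set.seed(2)
stream_b <- stats::runif(1)
identical(stream_a, stream_b)
```

The first comparison is `TRUE` and the second is `FALSE`, which is the behavior difference between the hardcoded seed and the later "use existing random number stream" commit.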

R/gp_helpers.R

Co-authored-by: Emil Hvitfeldt <[email protected]>

topepo added 30 commits March 15, 2025 09:28

initial code for GauPro

5550e86

update GP check

6bad583

more helpers

77ccb3b

logging and seeds

8ac5024

update messages

42e967f

accommodations for qualitative tuning parameters

5ec06f7

note about left-over warnings

86226bd

add some notes

7699de2

small updates

aaff649

switch gp packages

911afb5

global variable from mutate

8eed207

notes and remove old functions

3dfc61e

Merge branch 'main' into gau-pro

0378e57

add GauPro

1dff5a2

more updates

1f92a1e

remove old functions

0808c18

air format

83c3959

simplify note

368d7ab

write more details; rename some columns names for GP function

bdb2115

clean up some notes

eb087e0

update news

ef13682

recreate test objects with new GP code

68720ec

update tests

da11756

redoc

f11eab7

update words

2863caf

base pipe

4cefced

avoid loading workflowsets in test objects

b826bc1

missing o=symbol names

7306248

merge on updated names

117ec01

capture output

5d0ed03

topepo added 7 commits October 16, 2025 15:16

update bayes objects with wide grids

5a0ea4f

remove the use of previous GP fit code

c1fc879

remove some minor warnings when the occur

4680c3d

capture output

07742ae

test files names based on R file names

ed6f0c2

Merge branch 'main' into gau-pro

78f6660

fix news entry

f23c913

topepo mentioned this pull request Oct 17, 2025

Add Fold Weights for Variable Resample Weighting #1007

Merged

topepo commented Oct 17, 2025

skip for seemingly unrelated package based on GHA results

940f490

topepo requested a review from EmilHvitfeldt October 17, 2025 20:38

topepo marked this pull request as ready for review October 17, 2025 20:39

EmilHvitfeldt approved these changes Oct 18, 2025

topepo and others added 6 commits October 18, 2025 09:35

Apply suggestions from code review

15ff8c4

Co-authored-by: Emil Hvitfeldt <[email protected]>

use existing random number stream

3e0913c

no longer used

752700c

update snapshots

70d694a

Merge branch 'main' into gau-pro

aab3dfa

typo

198590f

topepo merged commit a1c5d52 into main Oct 22, 2025
6 checks passed

topepo deleted the gau-pro branch October 22, 2025 19:09
