@@ -196,174 +196,6 @@ end
196196 end
197197end
198198
199- @testitem "search with parametric template expressions" tags = [:part1] begin
200- #! format: off
201- # literate_begin file="src/examples/template_parametric_expression.md"
202- #=
203- # Parametrized Template Expressions
204-
205- Template expressions in SymbolicRegression.jl can include parametric forms: expressions with tunable constants
206- that are optimized during the search. This can even include parameters that take a different value for each class in the data.
207-
208- In this tutorial, we'll demonstrate how to use parametric template expressions to learn a model where:
209-
210- - Some constants are shared across all data points
211- - Other constants vary by class
212- - The structure combines known forms (like cosine) with unknown sub-expressions
213-
214- =#
215-
216- using SymbolicRegression
217- using Random: MersenneTwister, randn, rand
218- using MLJBase: machine, fit!, predict, report
219-
220- #=
221- ## The Model Structure
222-
223- We'll work with a model that combines:
224- - A cosine term with class-specific phase shifts
225- - A polynomial term
226- - Global scaling parameters
227-
228- Specifically, let's say that our true model has the form:
229-
230- ```math
231- y = A \cos(f(x_2) + \Delta_c) + g(x_1) - B
232- ```
233-
234- where:
235- - ``A`` is a global amplitude (same for all classes)
236- - ``\Delta_c`` is a phase shift that depends on the class label
237- - ``f(x_2)`` is some function of ``x_2`` (in our case, just ``x_2``)
238- - ``g(x_1)`` is some function of ``x_1`` (in our case, ``x_1^2``)
239- - ``B`` is a global offset
240-
241- We'll generate synthetic data where:
242- - ``A = 2.0`` (amplitude)
243- - ``\Delta_1 = 0.1`` (phase shift for class 1)
244- - ``\Delta_2 = 1.5`` (phase shift for class 2)
245- - ``B = 2.0`` (offset)
246- =#
247-
248- ## Set random seed for reproducibility
249- rng = MersenneTwister(0)
250-
251- ## Number of data points
252- n = 200
253-
254- ## Generate random features
255- x1 = randn(rng, n) # feature 1
256- x2 = randn(rng, n) # feature 2
257- class = rand(rng, 1:2, n) # class labels 1 or 2
258-
259- ## Define the true parameters
260- Δ_phase = [0.1, 1.5] # phase shifts for classes 1 and 2
261- A = 2.0 # amplitude
262- B = 2.0 # offset
263-
264- ## Add some noise
265- eps = randn(rng, n) * 1e-5
266-
267- ## Generate targets using the true underlying function
268- y = [
269-     A * cos(x2[i] + Δ_phase[class[i]]) + x1[i]^2 - B
270-     for i in 1:n
271- ]
272- y .+= eps
273-
274- #=
275- ## Defining the Template
276-
277- Now we'll use the `@template_spec` macro to encode this structure, which will create
278- a `TemplateExpressionSpec` object.
279- =#
280-
281- ## Define the template structure with sub-expressions f and g
282- template = @template_spec(
283-     expressions=(f, g),
284-     parameters=(p1=2, p2=2)
285- ) do x1, x2, class
286-     return p1[1] * cos(f(x2) + p2[class]) + g(x1) - p1[2]
287- end
288-
289- #=
290- Let's break down this template:
291- - We declared two sub-expressions: `f` and `g` that we want to learn
292- - By calling `f(x2)` and `g(x1)`, the forward pass constrains each sub-expression
293- to take exactly one input argument.
294- - We declared two parameter vectors: `p1` (length 2) and `p2` (length 2)
295- - The template combines these components as:
296- - `p1[1]` is the amplitude (global parameter)
297- - `cos(f(x2) + p2[class])` adds a class-specific phase shift via `p2[class]`
298- - `g(x1)` represents (we hope) the quadratic term
299- - `p1[2]` is the global offset
300-
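To make the structure concrete, here is a plain-Julia sketch of the function the template encodes, with the hypothetical choices `f(x) = x` and `g(x) = x^2` substituted for the unknown sub-expressions (the search itself does not assume these forms):

```julia
## Hypothetical hand-written version of the template, assuming f(x) = x and g(x) = x^2.
## p1 holds the two global parameters; p2 holds one phase shift per class.
function template_by_hand(x1, x2, class; p1, p2)
    return p1[1] * cos(x2 + p2[class]) + x1^2 - p1[2]
end

## Evaluate at a single point for class 2, using the true parameter values:
template_by_hand(0.5, 1.0, 2; p1=[2.0, 2.0], p2=[0.1, 1.5])
```

The search replaces `x2` and `x1^2` here with evolved expressions, and fits `p1` and `p2` numerically.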
301- Now we'll set up an SRRegressor with our template:
302- =#
303-
304- model = SRRegressor(
305-     binary_operators=(+, -, *, /),
306-     niterations=300,
307-     populations=8,
308-     maxsize=20,
309-     expression_spec=template,
310-     early_stop_condition=(loss, complexity) -> loss < 1e-5 && complexity < 10, #src
311- )
312-
313- ## Package data up for MLJ
314- X = (; x1, x2, class)
315- mach = machine(model, X, y)
316-
317- #=
318- At this point, you would run:
319- ```julia
320- fit!(mach)
321- ```
322-
323- which will evolve expressions following our template structure. The final result is accessible with:
324- ```julia
325- report(mach)
326- ```
327- which returns a named tuple of the fitted results, including the `.equations` field containing
328- the `TemplateExpression` objects that dominated the Pareto front.
329-
330- ## Interpreting Results
331-
332- After training, you can inspect the expressions found:
333- ```julia
334- r = report(mach)
335- best_expr = r.equations[r.best_idx]
336- ```
337-
338- You can also extract the individual sub-expressions (stored as `ComposableExpression` objects):
339- ```julia
340- inner_exprs = get_contents(best_expr)
341- metadata = get_metadata(best_expr)
342- ```
343-
344- The learned expression should closely match our true generating function:
345- - `f(x2)` should be approximately `x2` (note it will show up as `x1` in the raw contents, but this simply is a relative indexing of its arguments!)
346- - `g(x1)` should be approximately `x1^2`
347- - The parameters should be close to their true values:
348- - `p1[1] ≈ 2.0` (amplitude)
349- - `p1[2] ≈ 2.0` (offset)
350- - `p2[1] ≈ 0.1 mod 2π` (phase shift for class 1)
351- - `p2[2] ≈ 1.5 mod 2π` (phase shift for class 2)
352-
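Since the phase shifts enter only through `cos`, a fitted value is meaningful only modulo 2π. A sketch of how one might compare fitted phases against the truth, with `p2_fitted` standing in for hypothetical values read out of the report:

```julia
## Wrap the angle difference into (-π, π] so that, e.g., 0.1 and 0.1 + 2π compare as equal.
phase_close(a, b; atol=1e-2) = abs(rem(a - b, 2π, RoundNearest)) < atol

p2_fitted = [0.1 + 2π, 1.5 - 2π]  # hypothetical fitted values, offset by full turns
phase_close(p2_fitted[1], 0.1) && phase_close(p2_fitted[2], 1.5)
```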
353- You can use the learned expression to make predictions either with `predict(mach, X)`,
354- or by calling `best_expr(X_raw)` directly (note that `X_raw` must be a matrix of shape
355- `(d, n)`, where `n` is the number of samples and `d` is the number of features).
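A quick sketch of building such a matrix from length-`n` column vectors like the ones above (sizes here are hypothetical, and whether the class labels are passed as a plain feature row depends on the expression's calling convention):

```julia
## Features arrive as length-n vectors; the expression expects a (d, n) matrix,
## so stack them row-wise, converting class labels to the feature element type.
x1, x2, class = randn(5), randn(5), rand(1:2, 5)
X_raw = vcat(x1', x2', Float64.(class)')
size(X_raw)  # (3, 5)
```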
356- =#
357-
358- # literate_end
359- #! format: on
360-
361- fit!(mach)
362-
363- num_exprs = length(report(mach).equations)
364- @test sum(abs2, predict(mach, (data=X, idx=num_exprs)) .- y) / n < 1e-5
365- end
366-
367199@testitem "Preallocated copying with parameters" tags = [:part2] begin
368200 using SymbolicRegression
369201 using Random: MersenneTwister