Add AdaptiveSampling and AdaptiveDiscretization for one-dimensional parametric geometries #1234

sschuldenzucker · 2025-07-19T14:59:50Z

This uses the algorithm from here: https://github.com/sschuldenzucker/parametricadaptivesampling.jl/

Tested with ParametricCurve and BezierCurve.

github-actions · 2025-07-19T15:02:32Z

Benchmark Results (Julia vlts)

Time benchmarks

	master	`b16ae95`...	master / `b16ae95`...
clipping/SutherlandHodgman	3.76 ± 0.2 μs	3.86 ± 1.6 μs	0.974 ± 0.41
discretization/simplexify	25.5 ± 1.4 ms	25.9 ± 1.3 ms	0.983 ± 0.075
intersection/ray-triangle	0.05 ± 0 μs	0.05 ± 0.001 μs	1 ± 0.02
sideof/ring/large	6.53 ± 0.01 μs	6.53 ± 0.01 μs	1 ± 0.0022
sideof/ring/small	0.05 ± 0 μs	0.05 ± 0.01 μs	1 ± 0.2
winding/mesh	0.0411 ± 0.0015 s	0.041 ± 0.0016 s	1 ± 0.052
time_to_load	1.45 ± 0.0026 s	1.47 ± 0.0021 s	0.99 ± 0.0023

Memory benchmarks

	master	`b16ae95`...	master / `b16ae95`...
clipping/SutherlandHodgman	0.053 k allocs: 4.97 kB	0.053 k allocs: 4.97 kB	1
discretization/simplexify	0.226 M allocs: 21.8 MB	0.226 M allocs: 21.8 MB	1
intersection/ray-triangle	0 allocs: 0 B	0 allocs: 0 B
sideof/ring/large	0 allocs: 0 B	0 allocs: 0 B
sideof/ring/small	0 allocs: 0 B	0 allocs: 0 B
winding/mesh	0.231 M allocs: 23 MB	0.231 M allocs: 23 MB	1
time_to_load	0.153 k allocs: 14.5 kB	0.153 k allocs: 14.5 kB	1

github-actions · 2025-07-19T15:02:53Z

Benchmark Results (Julia v1)

Time benchmarks

	master	`b16ae95`...	master / `b16ae95`...
clipping/SutherlandHodgman	3.33 ± 0.21 μs	3.39 ± 0.22 μs	0.983 ± 0.089
discretization/simplexify	24.4 ± 3.2 ms	24.7 ± 3.4 ms	0.985 ± 0.19
intersection/ray-triangle	0.07 ± 0.009 μs	0.06 ± 0.001 μs	1.17 ± 0.15
sideof/ring/large	6.77 ± 0.089 μs	6.78 ± 0.02 μs	0.999 ± 0.013
sideof/ring/small	0.06 ± 0.009 μs	0.06 ± 0.01 μs	1 ± 0.22
winding/mesh	0.0412 ± 0.0023 s	0.0423 ± 0.0025 s	0.975 ± 0.08
time_to_load	1.49 ± 0.0075 s	1.51 ± 0.016 s	0.992 ± 0.012

Memory benchmarks

	master	`b16ae95`...	master / `b16ae95`...
clipping/SutherlandHodgman	0.053 k allocs: 4.83 kB	0.053 k allocs: 4.83 kB	1
discretization/simplexify	0.324 M allocs: 21.8 MB	0.324 M allocs: 21.8 MB	1
intersection/ray-triangle	0 allocs: 0 B	0 allocs: 0 B
sideof/ring/large	0 allocs: 0 B	0 allocs: 0 B
sideof/ring/small	0 allocs: 0 B	0 allocs: 0 B
winding/mesh	0.329 M allocs: 22.9 MB	0.329 M allocs: 22.9 MB	1
time_to_load	0.159 k allocs: 11.2 kB	0.159 k allocs: 11.2 kB	1

codecov · 2025-07-19T15:10:24Z

Codecov Report

Attention: Patch coverage is 97.05882% with 2 lines in your changes missing coverage. Please review.

Project coverage is 87.62%. Comparing base (9218d68) to head (b16ae95).

Files with missing lines	Patch %	Lines
src/sampling/adaptive.jl	96.77%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1234      +/-   ##
==========================================
- Coverage   87.89%   87.62%   -0.27%     
==========================================
  Files         196      198       +2     
  Lines        6189     6257      +68     
==========================================
+ Hits         5440     5483      +43     
- Misses        749      774      +25

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

sschuldenzucker · 2025-07-19T17:21:16Z

Huh the 32 bit tests seem to fail for subtle numerical deviations

souma4 · 2025-07-19T18:37:26Z

So I'll jump in with an unofficial review on the simple-to-fix/broad stuff before Julio and Josh since I have the time. The code conventions are that variable names should be short, descriptive, and not use underscores unless it is an internal, private function, in which case the underscore starts the function name (i.e _foo()). I also wonder, with how many SOMEDAYS there are, if this should be more of a running PR to add in those further features.

Other's can chime in on this, but we are planning on adding BinaryTrees.jl to this package, so that might be a way of creating your sorted sequence (currently BinaryHeap) without adding an extra datastructure dependency in the long term.

As for the FP32, you concrete type Float64 in your BinaryHeap which is promoting results to Float64, but then testing it against Float32, I think swapping those Float64's into T (the numtype) should resolve the FP32 test failures.

juliohm · 2025-07-19T19:52:11Z

💯% agree with @souma4 's review.

Try to adjust the code for a second round of review @sschuldenzucker. There are code style adjustments and dependency adjustments to make. If all you need is a BinaryTree, then BinaryTrees.jl is a better (more lightweight) dependency. DataStructures.jl has too much in it.

juliohm · 2025-07-19T19:57:03Z

Adaptive sampling/discretization will be super useful for visualization and other purposes. Depending on the performance of the implementation compared with RegularSampling and RegularDiscretization we could even consider it as the default discretization method for visualization of curved 1D geometries (e.g., BezierCurve).

souma4 · 2025-07-19T20:46:48Z

At some point I'll write up an update to the contribution document to give a push for what the style of this code base is, but scouring old messages and stuff I notice I get these. I don't see all of them from my quick review, but good to have them in one place.

short descriptive variable names without underscores (unless it's a private function, which is an underscore up front)
maintain variable naming patterns throughout function barriers. (i.e. if you create points p..., the functions that use those should have the argument p).
Try to keep variable names consistent with the code base, too. So T is for numtype. \scrL is for lentype. etc.
Structs, even private ones, don't have preceding underscores.
minimize parameterization of structs. (I think you do very well with this, but I tend to over parameterize as a way to reason about code, so good to mention just in case).
Concrete types should be kept to a minimum. Parametric types tend to handle it better.
When in doubt, look at the GeoStats.jl stack to see what patterns you see and try to shoot for those.

juliohm · 2025-07-19T22:59:47Z

Thanks for putting this list together @souma4 ! 🫶

I would also add that we try to follow the notation from scientific publications when that is possible. That way new contributors can read the paper and the implementation side by side to propose improvements more easily.

sschuldenzucker · 2025-07-21T22:26:20Z

Hi!

Removed underscores from variable names and hopefully made them a bit more readable.
Did my best, though sometimes this hits a generality bound (e.g., called function accepts any callable f and is called with a geometry geom for f)
Did my best.
Already ok.
Already ok, though I'm wondering if this leads to type instability.
Done, replaces this one mention of Float64 with the error type E and it seems fine now.

On DataStructures.jl vs BinaryTrees.jl: This only implements binary search (AVL) trees but I need heaps. Note that we already depend on DataStructures indirectly (Manifest.toml did not change) and I generally don't see any problem with pulling in some very standard data structures, either.

As for the SOMEDAYs: The ones that are not in the docstring are mostly about code quality; happy for any pointers.

For the ones in the docstring, these seem to sit at various levels of difficulty. Here's an annotated list. Would love to do more but there seem to be some tricky parts:

Only geometries with one-dimensional parameter domains are supported. It would be cool to implement higher dimensions. We then probably want a discretization, not just sampling. - Somewhat hard. The main ingredient would be a way some machienary to split simplices into sub-simplices (or polygons into sub-polygons) and that does the required accounting for me. Do you have that?
Infinite or open t ranges where we would adaptively sample larger/smaller (towards infinity or the edges) t values. This may be an instance of a more general hook into refinement. - AFAIU we don't have the geometries for this right now? ParametrizedCurve only supports closed t ranges right?
Some way of detecting and avoiding polar points. Basically a way to say "it's not useful to sample this part, better to cut it out". Not clear how we'd do this in a robust way without affecting some use cases negatively. - There seems to be a fundamental trade-off between detecting polar points and not excluding extreme non-polar regions. Seems tricky to get right generically, though there certainly are some methods.
Optional penalty for recursion depth (or something like this) to avoid excessive concentration at polar points if maxPoints gets exhausted. This would have to be configured by the user with knowledge of the function, can be detrimental otherwise. Unclear if we want this. - We may just not need this; if we do, a simple feature to add. Would implement the rest first and see if we need this then.
Option to split into more than two parts per recursion step, or to split not into halves but by some other proportions. This could help when details are missed by the 2-split recursion (see Caveats). - I wouldn't split into more than two parts but we can choose the split point dynamically based on the derivative (see below).
Point density target in value space. Could help in situations where the midway point is interpolated well linearly, but there is additional variation at some other point. - Should be reasonably easy to add but not sure if needed / actually desired.
Option to drop points that are ultimately not needed if the range expands during refinement (may reduce number of points, might be useful for some later computation steps). - Easy to add, may not be needed.
A way to understand that the range may actually be the wrong thing, e.g., when we use
aspect_ratio=1 in the plot and the plot therefore creates additional space, or some xlim or ylim. Hard to do generically. Maybe make an option to pass the plot range explicitly. - This sounds like a PITA to do; we could have an easy aspect_ratio option if it's really desired for the specific use case above.
Is there some smartness we can do if the (second) derivative of f is available? Choosing the splitting point better than at the interval midpoint seems attractive. - This is probably either easy OR not useful ;should be easy to figure out. How would we pipe the derivative through the geometry infra though? Maybe the right abstract concept is "curvature" here?

juliohm · 2025-07-22T10:56:19Z

Note that we already depend on DataStructures indirectly

That is really unfortunate. DataStructures.jl is one of those packages that provides various incomplete implementations without a single active maintainer. I've encountered so many issues in the past that I realized it would be safer to just not use it at all.

It seems that the dependency is coming from another poorly maintained package, which is doubly unfortunate:

(jl_YsCiCv) pkg> why DataStructures
  Meshes → StatsBase → DataStructures
  Meshes → StatsBase → SortingAlgorithms → DataStructures

Our ultimate goal is to remove StatsBase.jl from the list of dependencies. It dates back to the origins of Julia when the Statistics stdlib was part of the Julia distribution and people were trying to add more stats functionality independently of the Julia release process.

juliohm

Thank you @sschuldenzucker for attempting your first contribution! 🎉

We still have work to do in order to improve the quality of the code and adhere to the project standards. Please feel free to ask any question as you are working on the review comments.

juliohm · 2025-07-22T10:59:09Z

docs/src/algorithms/discretization.md

+crv = ParametrizedCurve((t -> Point((sin(t * 2π) * t, cos(t * 2π) * t))), (0, 1))
+method = AdaptiveDiscretization(; min_points=10.0, max_points=100.0)
+mesh = discretize(crv, method)
+
+viz(mesh, showsegments = true)


Please rename crv to curve. Also, the keyword arguments have underscores, which is not adherent to our code style.

juliohm · 2025-07-22T10:59:50Z

docs/src/algorithms/sampling.md

+# Sample between 10 and 100 points
+sampler = AdaptiveSampling(; min_points=10.0, max_points=100.0)
+crv = ParametrizedCurve((t -> Point((sin(t * 2π) * t, cos(t * 2π) * t))), (0, 1))
+points  = sample(crv, sampler) |> collect
+
+viz(points)


Same suggestions I shared in the other discretization example.

juliohm · 2025-07-22T11:00:59Z

src/sampling/adaptive.jl

+using DataStructures: BinaryHeap
+using LinearAlgebra: norm
+import IterTools


The only place in the source code where we import modules is the main Meshes.jl file. That way we don't loose track of current imports and avoid naming conflicts.

juliohm · 2025-07-22T11:02:27Z

src/sampling/adaptive.jl

+This is *only* implemented for geometries with a one-dimensional parameter domain (i.e., realistically, [`ParametrizedCurve`](@ref) or [`BezierCurve`](@ref))!
+


Suggested change

This is *only* implemented for geometries with a one-dimensional parameter domain (i.e., realistically, [`ParametrizedCurve`](@ref) or [`BezierCurve`](@ref))!

No need to make this information available in the docstring. At some point in the future, we hope to fill in the gaps.

juliohm · 2025-07-22T11:04:32Z

src/sampling/adaptive.jl

+* `tol` - Tolerance for the linear interpolation error. The meaning depends on `errfun`.
+* `errfun` - Function to compute the interpolation error. The default is [`errRelativeBBox`](@ref) = the l-infinity distance component-wise relative to (an estimate of) the range of the bounding box; this is a good default for plotting. If provided, it must satisfy `errfun(::ParametrizedSegment, ::MutableBox)::E`
+* `minPoints` - Number of points sampled uniformly initially.
+* `maxPoints` - Maximum number of points to sample. We do our best to, if this is hit, get rid of the worst approximation errors first.


Please revise the options as they follow a camel case notation, not supported by our code style. Also, it is worth thinking which of these options will be changed in practice. Do you expect end-users to change the errfun?

juliohm · 2025-07-22T11:15:44Z

src/sampling/adaptive.jl

+- Is there some smartness we can do if the (second) derivative of f is available? Choosing the splitting point better than at the interval midpoint seems attractive.
+"""
+struct AdaptiveSampling{E} <: ContinuousSamplingMethod
+    tol::E


Are you assuming that the tolerance is expressed as a unitless floating point number or as a tolerance with length units? We usually adopt the latter in our methods because they are more useful in applications. A user might want to sample the curve until the deviation/error is smaller than 1mm for example.

juliohm · 2025-07-22T11:15:58Z

src/sampling/adaptive.jl

+    minPoints::Int
+    maxPoints::Int


Not compliant with our code style.

juliohm · 2025-07-22T11:17:01Z

src/sampling/adaptive.jl

+struct ParametrizedSegment{T,V}
+    t1::T
+    v1::V
+    tMid::T
+    vMid::V
+    t2::T
+    v2::V
+end


Why not use the Segment type? It is parametrized already.

juliohm · 2025-07-22T11:17:32Z

src/sampling/adaptive.jl

+"Like `Box` but mutable."
+mutable struct MutableBox{V}
+    # Implicit: All lengths are equal
+    mins::Vector{V}
+    maxs::Vector{V}
+end


I don't think mutability is enough justification to create another box type.

juliohm · 2025-07-22T11:19:21Z

test/discretization.jl

+  min_points = 5
+  max_points = 10


Not compliant with code style.

juliohm · 2025-08-01T12:20:55Z

ping @sschuldenzucker

sschuldenzucker added 8 commits July 19, 2025 16:58

add AdaptiveSampling

0c026b3

AdaptiveSampling: docs, rename, parameter ergonomics

e0feb98

docs

8780fde

AdaptiveSampling tests

fe3c791

AdaptiveSampling docs

18d89d0

improve tests a bit more

6926343

add AdaptiveDiscretization

101fbc4

docs for AdaptiveDiscretization

c4353d2

sschuldenzucker added 2 commits July 21, 2025 23:09

fix Float32 tests: Use correct TV type

09a4b53

AdaptiveSampling: Rename some variables and functions, and format

b16ae95

juliohm requested changes Jul 22, 2025

View reviewed changes

		This is only implemented for geometries with a one-dimensional parameter domain (i.e., realistically, [`ParametrizedCurve`](@ref) or [`BezierCurve`](@ref))!

Add AdaptiveSampling and AdaptiveDiscretization for one-dimensional parametric geometries #1234

Are you sure you want to change the base?

Add AdaptiveSampling and AdaptiveDiscretization for one-dimensional parametric geometries #1234

Uh oh!

Conversation

sschuldenzucker commented Jul 19, 2025

Uh oh!

github-actions bot commented Jul 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results (Julia vlts)

Uh oh!

github-actions bot commented Jul 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results (Julia v1)

Uh oh!

codecov bot commented Jul 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

sschuldenzucker commented Jul 19, 2025

Uh oh!

souma4 commented Jul 19, 2025

Uh oh!

juliohm commented Jul 19, 2025

Uh oh!

juliohm commented Jul 19, 2025

Uh oh!

souma4 commented Jul 19, 2025

Uh oh!

juliohm commented Jul 19, 2025

Uh oh!

sschuldenzucker commented Jul 21, 2025

Uh oh!

juliohm commented Jul 22, 2025

Uh oh!

juliohm left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

juliohm commented Aug 1, 2025

Uh oh!

Uh oh!

github-actions bot commented Jul 19, 2025 •

edited

Loading

github-actions bot commented Jul 19, 2025 •

edited

Loading

codecov bot commented Jul 19, 2025 •

edited

Loading