Skip to content

Add AdaptiveSampling and AdaptiveDiscretization for one-dimensional parametric geometries #1234

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

sschuldenzucker
Copy link

This uses the algorithm from here: https://github.com/sschuldenzucker/parametricadaptivesampling.jl/

Tested with ParametricCurve and BezierCurve.

Copy link
Contributor

github-actions bot commented Jul 19, 2025

Benchmark Results (Julia vlts)

Time benchmarks
master b16ae95... master / b16ae95...
clipping/SutherlandHodgman 3.76 ± 0.2 μs 3.86 ± 1.6 μs 0.974 ± 0.41
discretization/simplexify 25.5 ± 1.4 ms 25.9 ± 1.3 ms 0.983 ± 0.075
intersection/ray-triangle 0.05 ± 0 μs 0.05 ± 0.001 μs 1 ± 0.02
sideof/ring/large 6.53 ± 0.01 μs 6.53 ± 0.01 μs 1 ± 0.0022
sideof/ring/small 0.05 ± 0 μs 0.05 ± 0.01 μs 1 ± 0.2
winding/mesh 0.0411 ± 0.0015 s 0.041 ± 0.0016 s 1 ± 0.052
time_to_load 1.45 ± 0.0026 s 1.47 ± 0.0021 s 0.99 ± 0.0023
Memory benchmarks
master b16ae95... master / b16ae95...
clipping/SutherlandHodgman 0.053 k allocs: 4.97 kB 0.053 k allocs: 4.97 kB 1
discretization/simplexify 0.226 M allocs: 21.8 MB 0.226 M allocs: 21.8 MB 1
intersection/ray-triangle 0 allocs: 0 B 0 allocs: 0 B
sideof/ring/large 0 allocs: 0 B 0 allocs: 0 B
sideof/ring/small 0 allocs: 0 B 0 allocs: 0 B
winding/mesh 0.231 M allocs: 23 MB 0.231 M allocs: 23 MB 1
time_to_load 0.153 k allocs: 14.5 kB 0.153 k allocs: 14.5 kB 1

Copy link
Contributor

github-actions bot commented Jul 19, 2025

Benchmark Results (Julia v1)

Time benchmarks
master b16ae95... master / b16ae95...
clipping/SutherlandHodgman 3.33 ± 0.21 μs 3.39 ± 0.22 μs 0.983 ± 0.089
discretization/simplexify 24.4 ± 3.2 ms 24.7 ± 3.4 ms 0.985 ± 0.19
intersection/ray-triangle 0.07 ± 0.009 μs 0.06 ± 0.001 μs 1.17 ± 0.15
sideof/ring/large 6.77 ± 0.089 μs 6.78 ± 0.02 μs 0.999 ± 0.013
sideof/ring/small 0.06 ± 0.009 μs 0.06 ± 0.01 μs 1 ± 0.22
winding/mesh 0.0412 ± 0.0023 s 0.0423 ± 0.0025 s 0.975 ± 0.08
time_to_load 1.49 ± 0.0075 s 1.51 ± 0.016 s 0.992 ± 0.012
Memory benchmarks
master b16ae95... master / b16ae95...
clipping/SutherlandHodgman 0.053 k allocs: 4.83 kB 0.053 k allocs: 4.83 kB 1
discretization/simplexify 0.324 M allocs: 21.8 MB 0.324 M allocs: 21.8 MB 1
intersection/ray-triangle 0 allocs: 0 B 0 allocs: 0 B
sideof/ring/large 0 allocs: 0 B 0 allocs: 0 B
sideof/ring/small 0 allocs: 0 B 0 allocs: 0 B
winding/mesh 0.329 M allocs: 22.9 MB 0.329 M allocs: 22.9 MB 1
time_to_load 0.159 k allocs: 11.2 kB 0.159 k allocs: 11.2 kB 1

Copy link

codecov bot commented Jul 19, 2025

Codecov Report

Attention: Patch coverage is 97.05882% with 2 lines in your changes missing coverage. Please review.

Project coverage is 87.62%. Comparing base (9218d68) to head (b16ae95).

Files with missing lines Patch % Lines
src/sampling/adaptive.jl 96.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1234      +/-   ##
==========================================
- Coverage   87.89%   87.62%   -0.27%     
==========================================
  Files         196      198       +2     
  Lines        6189     6257      +68     
==========================================
+ Hits         5440     5483      +43     
- Misses        749      774      +25     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sschuldenzucker
Copy link
Author

Huh the 32 bit tests seem to fail for subtle numerical deviations

@souma4
Copy link
Contributor

souma4 commented Jul 19, 2025

So I'll jump in with an unofficial review on the simple-to-fix/broad stuff before Julio and Josh since I have the time. The code conventions are that variable names should be short, descriptive, and not use underscores unless it is an internal, private function, in which case the underscore starts the function name (i.e _foo()). I also wonder, with how many SOMEDAYS there are, if this should be more of a running PR to add in those further features.

Other's can chime in on this, but we are planning on adding BinaryTrees.jl to this package, so that might be a way of creating your sorted sequence (currently BinaryHeap) without adding an extra datastructure dependency in the long term.

As for the FP32, you concrete type Float64 in your BinaryHeap which is promoting results to Float64, but then testing it against Float32, I think swapping those Float64's into T (the numtype) should resolve the FP32 test failures.

@juliohm
Copy link
Member

juliohm commented Jul 19, 2025

💯% agree with @souma4 's review.

Try to adjust the code for a second round of review @sschuldenzucker. There are code style adjustments and dependency adjustments to make. If all you need is a BinaryTree, then BinaryTrees.jl is a better (more lightweight) dependency. DataStructures.jl has too much in it.

@juliohm
Copy link
Member

juliohm commented Jul 19, 2025

Adaptive sampling/discretization will be super useful for visualization and other purposes. Depending on the performance of the implementation compared with RegularSampling and RegularDiscretization we could even consider it as the default discretization method for visualization of curved 1D geometries (e.g., BezierCurve).

@souma4
Copy link
Contributor

souma4 commented Jul 19, 2025

At some point I'll write up an update to the contribution document to give a push for what the style of this code base is, but scouring old messages and stuff I notice I get these. I don't see all of them from my quick review, but good to have them in one place.

  1. short descriptive variable names without underscores (unless it's a private function, which is an underscore up front)
  2. maintain variable naming patterns throughout function barriers. (i.e. if you create points p..., the functions that use those should have the argument p).
  3. Try to keep variable names consistent with the code base, too. So T is for numtype. \scrL is for lentype. etc.
  4. Structs, even private ones, don't have preceding underscores.
  5. minimize parameterization of structs. (I think you do very well with this, but I tend to over parameterize as a way to reason about code, so good to mention just in case).
  6. Concrete types should be kept to a minimum. Parametric types tend to handle it better.
  7. When in doubt, look at the GeoStats.jl stack to see what patterns you see and try to shoot for those.

@juliohm
Copy link
Member

juliohm commented Jul 19, 2025

Thanks for putting this list together @souma4 ! 🫶

I would also add that we try to follow the notation from scientific publications when that is possible. That way new contributors can read the paper and the implementation side by side to propose improvements more easily.

@sschuldenzucker
Copy link
Author

Hi!

  1. Removed underscores from variable names and hopefully made them a bit more readable.
  2. Did my best, though sometimes this hits a generality bound (e.g., called function accepts any callable f and is called with a geometry geom for f)
  3. Did my best.
  4. Already ok.
  5. Already ok, though I'm wondering if this leads to type instability.
  6. Done, replaces this one mention of Float64 with the error type E and it seems fine now.

On DataStructures.jl vs BinaryTrees.jl: This only implements binary search (AVL) trees but I need heaps. Note that we already depend on DataStructures indirectly (Manifest.toml did not change) and I generally don't see any problem with pulling in some very standard data structures, either.

As for the SOMEDAYs: The ones that are not in the docstring are mostly about code quality; happy for any pointers.

For the ones in the docstring, these seem to sit at various levels of difficulty. Here's an annotated list. Would love to do more but there seem to be some tricky parts:

  1. Only geometries with one-dimensional parameter domains are supported. It would be cool to implement higher dimensions. We then probably want a discretization, not just sampling. - Somewhat hard. The main ingredient would be a way some machienary to split simplices into sub-simplices (or polygons into sub-polygons) and that does the required accounting for me. Do you have that?
  2. Infinite or open t ranges where we would adaptively sample larger/smaller (towards infinity or the edges) t values. This may be an instance of a more general hook into refinement. - AFAIU we don't have the geometries for this right now? ParametrizedCurve only supports closed t ranges right?
  3. Some way of detecting and avoiding polar points. Basically a way to say "it's not useful to sample this part, better to cut it out". Not clear how we'd do this in a robust way without affecting some use cases negatively. - There seems to be a fundamental trade-off between detecting polar points and not excluding extreme non-polar regions. Seems tricky to get right generically, though there certainly are some methods.
  4. Optional penalty for recursion depth (or something like this) to avoid excessive concentration at polar points if maxPoints gets exhausted. This would have to be configured by the user with knowledge of the function, can be detrimental otherwise. Unclear if we want this. - We may just not need this; if we do, a simple feature to add. Would implement the rest first and see if we need this then.
  5. Option to split into more than two parts per recursion step, or to split not into halves but by some other proportions. This could help when details are missed by the 2-split recursion (see Caveats). - I wouldn't split into more than two parts but we can choose the split point dynamically based on the derivative (see below).
  6. Point density target in value space. Could help in situations where the midway point is interpolated well linearly, but there is additional variation at some other point. - Should be reasonably easy to add but not sure if needed / actually desired.
  7. Option to drop points that are ultimately not needed if the range expands during refinement (may reduce number of points, might be useful for some later computation steps). - Easy to add, may not be needed.
  8. A way to understand that the range may actually be the wrong thing, e.g., when we use
  9. aspect_ratio=1 in the plot and the plot therefore creates additional space, or some xlim or ylim. Hard to do generically. Maybe make an option to pass the plot range explicitly. - This sounds like a PITA to do; we could have an easy aspect_ratio option if it's really desired for the specific use case above.
  10. Is there some smartness we can do if the (second) derivative of f is available? Choosing the splitting point better than at the interval midpoint seems attractive. - This is probably either easy OR not useful ;should be easy to figure out. How would we pipe the derivative through the geometry infra though? Maybe the right abstract concept is "curvature" here?

@juliohm
Copy link
Member

juliohm commented Jul 22, 2025

Note that we already depend on DataStructures indirectly

That is really unfortunate. DataStructures.jl is one of those packages that provides various incomplete implementations without a single active maintainer. I've encountered so many issues in the past that I realized it would be safer to just not use it at all.

It seems that the dependency is coming from another poorly maintained package, which is doubly unfortunate:

(jl_YsCiCv) pkg> why DataStructures
  Meshes  StatsBase  DataStructures
  Meshes  StatsBase  SortingAlgorithms  DataStructures

Our ultimate goal is to remove StatsBase.jl from the list of dependencies. It dates back to the origins of Julia when the Statistics stdlib was part of the Julia distribution and people were trying to add more stats functionality independently of the Julia release process.

Copy link
Member

@juliohm juliohm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @sschuldenzucker for attempting your first contribution! 🎉

We still have work to do in order to improve the quality of the code and adhere to the project standards. Please feel free to ask any question as you are working on the review comments.

Comment on lines +167 to +171
crv = ParametrizedCurve((t -> Point((sin(t * 2π) * t, cos(t * 2π) * t))), (0, 1))
method = AdaptiveDiscretization(; min_points=10.0, max_points=100.0)
mesh = discretize(crv, method)

viz(mesh, showsegments = true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename crv to curve. Also, the keyword arguments have underscores, which is not adherent to our code style.

Comment on lines +140 to +145
# Sample between 10 and 100 points
sampler = AdaptiveSampling(; min_points=10.0, max_points=100.0)
crv = ParametrizedCurve((t -> Point((sin(t * 2π) * t, cos(t * 2π) * t))), (0, 1))
points = sample(crv, sampler) |> collect

viz(points)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same suggestions I shared in the other discretization example.

Comment on lines +5 to +7
using DataStructures: BinaryHeap
using LinearAlgebra: norm
import IterTools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only place in the source code where we import modules is the main Meshes.jl file. That way we don't loose track of current imports and avoid naming conflicts.

Comment on lines +14 to +15
This is *only* implemented for geometries with a one-dimensional parameter domain (i.e., realistically, [`ParametrizedCurve`](@ref) or [`BezierCurve`](@ref))!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This is *only* implemented for geometries with a one-dimensional parameter domain (i.e., realistically, [`ParametrizedCurve`](@ref) or [`BezierCurve`](@ref))!

No need to make this information available in the docstring. At some point in the future, we hope to fill in the gaps.

Comment on lines +18 to +21
* `tol` - Tolerance for the linear interpolation error. The meaning depends on `errfun`.
* `errfun` - Function to compute the interpolation error. The default is [`errRelativeBBox`](@ref) = the l-infinity distance component-wise relative to (an estimate of) the range of the bounding box; this is a good default for plotting. If provided, it must satisfy `errfun(::ParametrizedSegment, ::MutableBox)::E`
* `minPoints` - Number of points sampled uniformly initially.
* `maxPoints` - Maximum number of points to sample. We do our best to, if this is hit, get rid of the worst approximation errors first.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revise the options as they follow a camel case notation, not supported by our code style. Also, it is worth thinking which of these options will be changed in practice. Do you expect end-users to change the errfun?

- Is there some smartness we can do if the (second) derivative of f is available? Choosing the splitting point better than at the interval midpoint seems attractive.
"""
struct AdaptiveSampling{E} <: ContinuousSamplingMethod
tol::E
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you assuming that the tolerance is expressed as a unitless floating point number or as a tolerance with length units? We usually adopt the latter in our methods because they are more useful in applications. A user might want to sample the curve until the deviation/error is smaller than 1mm for example.

Comment on lines +52 to +53
minPoints::Int
maxPoints::Int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not compliant with our code style.

Comment on lines +78 to +85
struct ParametrizedSegment{T,V}
t1::T
v1::V
tMid::T
vMid::V
t2::T
v2::V
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the Segment type? It is parametrized already.

Comment on lines +97 to +102
"Like `Box` but mutable."
mutable struct MutableBox{V}
# Implicit: All lengths are equal
mins::Vector{V}
maxs::Vector{V}
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think mutability is enough justification to create another box type.

Comment on lines +550 to +551
min_points = 5
max_points = 10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not compliant with code style.

@juliohm
Copy link
Member

juliohm commented Aug 1, 2025

ping @sschuldenzucker

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants