Skip to content

replace current_span_length in favor of parameters_changed #706

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 25 commits into
base: master
Choose a base branch
from

Conversation

dvdsk
Copy link
Collaborator

@dvdsk dvdsk commented Feb 24, 2025

implements #694

currently draft, due to high number of non trivial changes this PR also aims to dramatically extend the test suite of rodio. Hopefully that will catch most bugs.

@roderickvd I could use some help fixing the decoders, they now just have todo!() as parameters_changed implementation. Let me know if you want to work on that. The branch is under the rodio repo so you can just pull it down and push your work to it.

status (ill edit this along the way)

  • trivial sources implemented (empty, sine etc)
  • normal sources implemented (skip etc)
  • extra tests for normal sources
  • complex sources implemented (queue and friends)
  • extra tests for complex sources
  • decoder sources implemented
  • extra tests for decoder sources (unsure if needed?)

@dvdsk dvdsk changed the title Parameters changed replace current_span_length in favor of parameters_changed Feb 24, 2025
@dvdsk dvdsk force-pushed the parameters_changed branch 3 times, most recently from 4678bb1 to 8204220 Compare February 25, 2025 10:47
@roderickvd
Copy link
Collaborator

Sure thing. Don't have much time this week, will look into it after.

@dvdsk dvdsk force-pushed the parameters_changed branch 2 times, most recently from af46244 to 481eed9 Compare February 26, 2025 18:02
dvdsk added a commit that referenced this pull request Feb 27, 2025
@dvdsk dvdsk force-pushed the parameters_changed branch from f1cf00e to 8c8df80 Compare February 27, 2025 13:38
@dvdsk
Copy link
Collaborator Author

dvdsk commented Feb 28, 2025

Note: I split of the NonZero change into its own PR, please review that first then ill rebase this on it.

@dvdsk dvdsk force-pushed the parameters_changed branch from b28c09b to 24932ce Compare February 28, 2025 17:00
@dvdsk
Copy link
Collaborator Author

dvdsk commented Feb 28, 2025

first benchmark data is in. Key takeaways:

  • resampler is 1.3x slower. (needs to be looked into)
  • high_pass is 3.8x faster 🥳
  • reverb is 1.5x slower, probably a bad buffered implementation. (needs to be investigated)

I'd not expect those changes to be honest. Lets hope we can speed things up otherwise we have some hard choices to make. Lets not dwell on it, I have spend literally zero time on optimization and I have not written this with performance in mind (yet).

Current master

Timer precision: 20 ns
conversions     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ from_sample                │               │               │               │         │
   ├─ f32       564.8 µs      │ 1.377 ms      │ 737 µs        │ 761.8 µs      │ 100     │ 100
   ├─ i16       254.5 µs      │ 360.5 µs      │ 255.9 µs      │ 257.5 µs      │ 100     │ 100
   ╰─ u16       380 µs        │ 479.2 µs      │ 382.2 µs      │ 383.9 µs      │ 100     │ 100

     Running benches/effects.rs (target/release/deps/effects-b27671fcc5098510)
Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  3.681 ms      │ 4.98 ms       │ 3.704 ms      │ 3.74 ms       │ 100     │ 100
├─ amplify      382.1 µs      │ 419.6 µs      │ 382.2 µs      │ 384.5 µs      │ 100     │ 100
├─ fade_out     1.142 ms      │ 1.161 ms      │ 1.148 ms      │ 1.148 ms      │ 100     │ 100
├─ high_pass    6.656 ms      │ 7.304 ms      │ 6.675 ms      │ 6.685 ms      │ 100     │ 100
╰─ reverb       8.281 ms      │ 8.422 ms      │ 8.322 ms      │ 8.326 ms      │ 100     │ 100

     Running benches/pipeline.rs (target/release/deps/pipeline-bb11d51b2daeec9e)
Timer precision: 30 ns
pipeline  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ long   26.06 ms      │ 27.09 ms      │ 26.15 ms      │ 26.19 ms      │ 100     │ 100
╰─ short  3.788 ms      │ 3.856 ms      │ 3.839 ms      │ 3.835 ms      │ 100     │ 100

     Running benches/resampler.rs (target/release/deps/resampler-9a89b0f001b3c0cb)
Timer precision: 20 ns
resampler         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ no_resampling  2.762 ms      │ 2.991 ms      │ 2.799 ms      │ 2.803 ms      │ 100     │ 100
╰─ resample_to                  │               │               │               │         │
   ├─ 8000        2.739 ms      │ 3.65 ms       │ 2.761 ms      │ 2.769 ms      │ 100     │ 100
   ├─ 11025       3.216 ms      │ 3.324 ms      │ 3.234 ms      │ 3.235 ms      │ 100     │ 100
   ├─ 16000       3.801 ms      │ 3.95 ms       │ 3.829 ms      │ 3.83 ms       │ 100     │ 100
   ├─ 22050       4.628 ms      │ 4.816 ms      │ 4.795 ms      │ 4.774 ms      │ 100     │ 100
   ├─ 44100       2.769 ms      │ 2.901 ms      │ 2.792 ms      │ 2.793 ms      │ 100     │ 100
   ├─ 48000       7.135 ms      │ 7.357 ms      │ 7.23 ms       │ 7.224 ms      │ 100     │ 100
   ├─ 88200       10.82 ms      │ 11.01 ms      │ 10.9 ms       │ 10.9 ms       │ 100     │ 100
   ├─ 96000       12.2 ms       │ 12.62 ms      │ 12.35 ms      │ 12.36 ms      │ 100     │ 100
   ├─ 176400      20.32 ms      │ 20.92 ms      │ 20.52 ms      │ 20.55 ms      │ 100     │ 100
   ├─ 192000      22.71 ms      │ 23.12 ms      │ 22.88 ms      │ 22.89 ms      │ 100     │ 100
   ├─ 352800      38.86 ms      │ 39.6 ms       │ 38.92 ms      │ 38.94 ms      │ 100     │ 100
   ╰─ 384000      45 ms         │ 47.65 ms      │ 45.07 ms      │ 45.15 ms      │ 100     │ 100

This PR

Timer precision: 20 ns
conversions     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ from_sample                │               │               │               │         │
   ├─ f32       753.3 µs      │ 969 µs        │ 919.8 µs      │ 920.8 µs      │ 100     │ 100
   ├─ i16       255 µs        │ 373.2 µs      │ 255.3 µs      │ 258.4 µs      │ 100     │ 100
   ╰─ u16       380 µs        │ 390.1 µs      │ 380.2 µs      │ 381.7 µs      │ 100     │ 100

     Running benches/effects.rs (target/release/deps/effects-4d762cfcb96f6e43)
Timer precision: 20 ns
effects         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ agc_enabled  3.695 ms      │ 4.135 ms      │ 3.699 ms      │ 3.709 ms      │ 100     │ 100
├─ amplify      380 µs        │ 398.3 µs      │ 380.4 µs      │ 382.5 µs      │ 100     │ 100
├─ fade_out     1.141 ms      │ 1.224 ms      │ 1.142 ms      │ 1.146 ms      │ 100     │ 100
├─ high_pass    1.711 ms      │ 1.733 ms      │ 1.714 ms      │ 1.715 ms      │ 100     │ 100
╰─ reverb       12.65 ms      │ 13.03 ms      │ 12.68 ms      │ 12.69 ms      │ 100     │ 100

     Running benches/pipeline.rs (target/release/deps/pipeline-113220290e55aa6e)
Timer precision: 20 ns
pipeline  fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ long   30.83 ms      │ 34.99 ms      │ 30.87 ms      │ 30.92 ms      │ 100     │ 100
╰─ short  1.711 ms      │ 1.73 ms       │ 1.714 ms      │ 1.714 ms      │ 100     │ 100

     Running benches/resampler.rs (target/release/deps/resampler-6724b60d4573a4a8)
Timer precision: 20 ns
resampler         fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ no_resampling  3.819 ms      │ 4.011 ms      │ 3.83 ms       │ 3.833 ms      │ 100     │ 100
╰─ resample_to                  │               │               │               │         │
   ├─ 8000        3.196 ms      │ 3.71 ms       │ 3.584 ms      │ 3.552 ms      │ 100     │ 100
   ├─ 11025       3.244 ms      │ 6.276 ms      │ 3.63 ms       │ 3.651 ms      │ 100     │ 100
   ├─ 16000       4.125 ms      │ 4.529 ms      │ 4.479 ms      │ 4.474 ms      │ 100     │ 100
   ├─ 22050       4.709 ms      │ 4.764 ms      │ 4.721 ms      │ 4.726 ms      │ 100     │ 100
   ├─ 44100       3.816 ms      │ 3.936 ms      │ 3.832 ms      │ 3.869 ms      │ 100     │ 100
   ├─ 48000       8.168 ms      │ 8.327 ms      │ 8.277 ms      │ 8.275 ms      │ 100     │ 100
   ├─ 88200       11.43 ms      │ 12.03 ms      │ 11.49 ms      │ 11.56 ms      │ 100     │ 100
   ├─ 96000       13.98 ms      │ 14.23 ms      │ 14.16 ms      │ 14.17 ms      │ 100     │ 100
   ├─ 176400      21.14 ms      │ 22.38 ms      │ 21.29 ms      │ 21.35 ms      │ 100     │ 100
   ├─ 192000      25.14 ms      │ 25.84 ms      │ 25.43 ms      │ 25.47 ms      │ 100     │ 100
   ├─ 352800      41.62 ms      │ 43.17 ms      │ 42.3 ms       │ 42.25 ms      │ 100     │ 100
   ╰─ 384000      46.84 ms      │ 67.99 ms      │ 48.41 ms      │ 48.47 ms      │ 100     │ 100

dvdsk added 17 commits March 1, 2025 12:40
All other sources have been marked as TODO using `todo!()`. This is such
that the test suite can run. New tests need to be made for all non
trivial implementations (those that do not simply forward the call to
`parameters_changed`)

- trivial implementations: TestSource, SamplesBuffer, chirp, delay,
  empty, empty_callback, mix, sawtooth, signal_generator, sine, square,
  triangle, zero, static_buffer
- implemented: blt, buffered, position, repeat, skip, take_duration,
  uniform
- marked as todo: all decoders, Mixer, Queue, source/from_iter

Added some sources to make the implementation easier:
- TakeSpan, takes up to a span of samples
- TakeSamples, takes n samples or slightly less. Always ends frame
  aligned
- Peekable, similar to Iterator::peekable.
also adds a new mod test support with a special test source that
can be used to test span boundary/parameters_changed handling.

adds two extra tests for `skip_duration` testing
 - parameters_changing during skip
 - ending on frame boundary
The removal of factories (like `source::amplify(unamplified, 1.2)` ->
`unamplified.amplify(1.2)`) was already planned and is tracked in issue
622. Its not done now. I am working through it as I add tests for all
the non-trivial source's.
This also improves the `TestSpan` in tests/test_support to prevent
expected total duration not matching `sample_rate` * `channels` *
`sample_count`
fix peekable & peekable tests + improve TestSource

performance: remove `if` from PeekableSource::next
add tests for buffered, fix buffered `parameters_changed`
dvdsk added 3 commits March 1, 2025 12:40
I ran into a lot of bugs while adding tests that had to do with channel
being set to zero somewhere. While this change makes the API slightly
less easy to use it prevents very hard to debug crashes/underflows etc.

Performance might drop in decoders, the current implementation makes the
bound check every time `channels` is called which is once per span. This
could be cached to alleviate that.
@dvdsk
Copy link
Collaborator Author

dvdsk commented Mar 1, 2025

Semantics of channels, sample_rate & parameters_changed

Current situation: greedy

Semantics are best explained with an example:

  • next -> (sample at sample_rate S1 and channel count C1)
  • parameters_changed -> false
  • next -> (sample at sample_rate S1 and channel count C1)
  • parameters_changed -> true
  • next -> (sample at sample_rate S2 and channel count C2)
  • parameters_changed -> false
  • next -> (sample at sample_rate S2 and channel count C2)
  • parameters_changed -> false
  • next -> (sample at sample_rate S2 and channel count C2)

In words, before you call next you get that the parameters are going to change. Said differently you get whether the next sample has a different channel count or sample rate then the previous sample.

Pro:

  • Sources that need to handle channel count and sample rate changes are simpler.
  • Similar semantics to current_span_length, you know ahead of time when to change.

Cons:

  • Ever source that can cause sample_rate or channel_count to change gets rather complex, needing to know ahead of time if that is going to happen.
  • Seems slower to me because of all the buffering

Alternative: lazy

Semantics again explained with an example:

  • next -> (sample at sample_rate S1 and channel count C1)
  • parameters_changed -> false
  • next -> (sample at sample_rate S1 and channel count C1)
  • parameters_changed -> false
  • next -> (sample at sample_rate S2 and channel count C2)
  • parameters_changed -> true
  • next -> (sample at sample_rate S2 and channel count C2)
  • parameters_changed -> false
  • next -> (sample at sample_rate S2 and channel count C2)

In words, after you call next you learn whether the sample you just got has a different sample rate or channel count then the samples before.

Pro:

  • Ever source that can cause sample_rate or channel_count get simpler when they cause a change they can flip a bool, reset it after the next sample has been emitted (consumer calls next).
  • Similar semantics to current_span_length, you know ahead of time when to change.
  • Sources that need to handle channel count and sample rate changes are simpler.

Cons:

  • Larger difference to how things are now
  • I just wrote 2.5k lines with the approach above and I do not want to change them 😢

Note that sources that need to know if the sample_rate or channel_count changes do not need to buffer across calls to next. They:

  • call next
  • call parameters_changed
  • realize they did
  • query the new parameters using channel_count and sample_rate
  • handle the sample together with the new parameters

Interrupted because:

 ### Semantics of channels, sample_rate & parameters_changed

 ## Current situation: greedy

Semantics are best explained with an example:
- `next` -> (sample at sample_rate _S1_ and channel count _C1_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S1_ and channel count _C1_)
- `parameters_changed` -> true
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)

In words, before you call next you get that the parameters are going to change. Said differently you get whether the next sample has a different channel count or sample rate then the previous sample.

**Pro:**
- Sources that need to handle *channel count* and *sample rate* changes are simpler.
- Similar semantics to `current_span_length`, you know ahead of time when to change.

**Cons:**
- Ever source that can cause sample_rate or channel_count gets rather complex, needing to know ahead of time if that is going to happen.
- Seems slower to me because of all the `buffering`

 ## Alternative: lazy

Semantics again explained with an example:
- `next` -> (sample at sample_rate _S1_ and channel count _C1_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S1_ and channel count _C1_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)
- `parameters_changed` -> true
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)
- `parameters_changed` -> false
- `next` -> (sample at sample_rate _S2_ and channel count _C2_)

In words, after you call next you learn whether the sample you just got has a different sample rate or channel count then the samples before.

**Pro:**
- Ever source that can cause sample_rate or channel_count get simpler when they cause a change they can flip a `bool`, reset it after the next sample has been emitted (consumer calls `next`).
- Similar semantics to `current_span_length`, you know ahead of time when to change.
- Sources that need to handle *channel count* and *sample rate* changes are simpler.

**Cons:**
- Larger difference to how things are now
- I just wrote 2.5k lines with the approach above and I do not want to change them 😢

Note that sources that need to know if the sample_rate or channel_count changes do not need to buffer across calls to `next`. They:
- call next
- call parameters_changed
- realize they did
- query the new parameters using `channel_count` and `sample_rate`
- handle the sample together with the new parameters
@dvdsk
Copy link
Collaborator Author

dvdsk commented Mar 1, 2025

EDIT: might not do the thing below but go for the greedy approach anyway, see: #706 (comment)

changing semantics for lazy option above, todo list:

  • buffered
  • channel_volume
  • crossfade
  • delay
  • done
  • empty
  • empty_callback
  • fadein
  • fadeout
  • from_factory
  • from_iter
  • linear_ramp
  • pausable
  • peekable
  • position
  • repeat
  • skip_duration
  • skippable
  • stoppable
  • take_duration
  • take_samples
  • take_span*, removed: impossible to implement with lazy*
  • uniform
  • queue

@dvdsk
Copy link
Collaborator Author

dvdsk commented Mar 1, 2025

yet another idea: have both styles of parameters_changed available (compiler can probably get rid of those not used as dead code?)

@PetrGlad
Copy link
Collaborator

PetrGlad commented Mar 2, 2025

Our linear resampler might be even broken. There are some comments that suggest it was not completely polished. It used numerator/divisor calculation which now may probably be more straightforward. I would leave this to a separate task. After this change is made.

@PetrGlad
Copy link
Collaborator

PetrGlad commented Mar 2, 2025

Regarding that greedy/lazy discussion. I do not really see how flagging before next frame may cause problems. I would like to keep changes to in-between frames only still. Sources that change number parameters themselves have all information about the change and should have no problems flagging it. The ones that do not change parameters or do not depend on them can just relay call to parameters_changed.
The sources that combine inputs should process whole frames and then check whether a change is needed.
Can you give an example how this approach may affect to buffering? If whole frame has to be processed it should be read completely anyway, if samples are independent, just loop for number of samples and check then.

With that lazy option, flagging after reading next sample and releasing is depending on next() call is confusing. Next source would have to read the sample, keep it somewhere then read new parameters than maybe re-allocate its buffers and then put that sample as the first one... On other words it is not clear to me which problem it is supposed to solve.

@dvdsk
Copy link
Collaborator Author

dvdsk commented Mar 3, 2025

I do not really see how flagging before next frame may cause problems.

It does not cause problems but it may require a more complex implementation. Honestly both approaches are complex and it seems I have made a mistake making it worse. Ill get back on that later. I have been designing #712 yesterday and want to try working on that a while. Then we could compare performance and code complexity. Getting back to your question, lets say you are the queue source:

  • you have to provide the correct parameters for the next sample
  • at any point the current source may end, you do not know when

Thus you need to peek one sample ahead so you know if your current sample is actually the last. This kind of needing to peek one ahead happens a lot, I wrote a PeekingSource. To handle a span of one sample PeekingSource needs to when it reads ahead track: parameter changes, sample rate and channel count.

You might have already seen the mistake, we can just forbid spans of length one. They make no sense anyway, why would you want 1 sample (at 44_100 samples a second!) to have a different channel count? So peeking is simple again, removing this complexity. Quite a few sources need to know if 'the end is coming' so this unneeded complexity was duplicated at a few places. That is what prompted me to look at alternatives.

I am still working on #712, I still think that is valuable and if its partially implemented we can compare its performance to this PR.

@dvdsk dvdsk mentioned this pull request May 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants