Description
I have had some time and have been pouring it into rodio lately. Having now written a lot of span/parameter_change related code, I have some opinions :) I have paused progress on the parameters_changed() PR while working out whether an alternative (see below) works better.
A lot of performance is being lost by dealing with spans in various sources. Let's take buffered as an example. Since we have to notify any consumer at the right moment of a sample rate/channel count change, we end up having to keep track of a lot.
This is (just a part of) the Buffered::next() method I have been working on this morning:
if self.samples_index < self.shared.samples_in_memory.len() {
    let sample = self.shared.samples_in_memory[self.samples_index];
    if self.samples_index == self.next_parameter_change_at {
        let new_params = &self.shared.parameters[self.parameter_changes_index];
        self.sample_rate = new_params.sample_rate;
        self.channel_count = new_params.channel_count;
    }
    // on the sample after the one where we flagged a parameter change
    if self.samples_index > self.next_parameter_change_at {
        self.next_parameter_change_at =
            if self.parameter_changes_index >= self.shared.parameters.len() {
                usize::MAX
            } else {
                self.shared.parameters[self.parameter_changes_index].index
            };
    }
    self.samples_index += 1;
    return Some(sample);
}
Now if we instead got rid of spans entirely (*1), this (part of the) method would become trivial:
if self.samples_index < self.shared.samples_in_memory.len() {
    let sample = self.shared.samples_in_memory[self.samples_index];
    self.samples_index += 1;
    return Some(sample);
}
Alternative: no spans
Get rid of spans. Source will no longer have the member functions channels() & sample_rate(); instead it gets set_channel_count() & set_sample_rate(). The consumer (OutputStream) uses those to communicate the picked sample rate and channel count to the edges of the audio tree.
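A minimal sketch of what pushing parameters down the tree could look like. The trait name SpanlessSource, the f32 sample type and the fields of Square and Mixer are all assumptions made for illustration; none of this is rodio's actual API.

```rust
/// Hypothetical span-less source: parameters are pushed down by the consumer
/// instead of being reported upwards per span.
trait SpanlessSource: Iterator<Item = f32> {
    fn set_sample_rate(&mut self, sample_rate: u32);
    fn set_channel_count(&mut self, channel_count: u16);
}

/// A generator at the edge of the tree simply adopts the rate it is given.
struct Square {
    frequency: f32,
    sample_rate: u32,
    channel_count: u16,
    sample_index: u64,
}

impl Iterator for Square {
    type Item = f32;
    fn next(&mut self) -> Option<f32> {
        // Every channel of a frame carries the same value.
        // Assumes channel_count and sample_rate are non-zero.
        let frame = self.sample_index / self.channel_count as u64;
        self.sample_index += 1;
        let phase = (frame as f32 * self.frequency / self.sample_rate as f32).fract();
        Some(if phase < 0.5 { 1.0 } else { -1.0 })
    }
}

impl SpanlessSource for Square {
    fn set_sample_rate(&mut self, sample_rate: u32) {
        self.sample_rate = sample_rate;
    }
    fn set_channel_count(&mut self, channel_count: u16) {
        self.channel_count = channel_count;
    }
}

/// An inner node does no bookkeeping; it only forwards the parameters.
struct Mixer {
    inputs: Vec<Box<dyn SpanlessSource>>,
}

impl Iterator for Mixer {
    type Item = f32;
    fn next(&mut self) -> Option<f32> {
        // Sum one sample from every input; end once all inputs have ended.
        let mut sum: Option<f32> = None;
        for input in &mut self.inputs {
            if let Some(sample) = input.next() {
                sum = Some(sum.unwrap_or(0.0) + sample);
            }
        }
        sum
    }
}

impl SpanlessSource for Mixer {
    fn set_sample_rate(&mut self, sample_rate: u32) {
        for input in &mut self.inputs {
            input.set_sample_rate(sample_rate);
        }
    }
    fn set_channel_count(&mut self, channel_count: u16) {
        for input in &mut self.inputs {
            input.set_channel_count(channel_count);
        }
    }
}
```

In this sketch the output would call set_sample_rate() and set_channel_count() once on the root, and next() never has to check for parameter changes.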
Schematic example
       OutputStream @ sr1,chs 4
                   |
                 mixer
                /     \
        low_pass       mixer
         /            /     \_______
    mixer         queue              \
   /     \          |             convertor
Square   Noise  convertor             |
@44.1/4 @44.1/4     |              decoder
                 decoder           @48.0/1
                 @96.1/2
Here OutputStream picks a sample rate and channel count, then forwards them to the edges. Generators like Square & Noise use that sample rate, while a resampler is inserted between every variable sample rate edge and the tree. As an optimization, the optimal target parameters might be picked from those available by inspecting the edges of the tree. For that we could add the functions preferred_sample_rate() & preferred_channel_count().
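One way the output could use the preferred rates, sketched under assumptions: the function name pick_target_sample_rate and the policy (take the supported rate that most edges prefer, breaking ties towards the higher rate) are illustrative choices, not something this proposal fixes.

```rust
use std::collections::HashMap;

/// Picks the device-supported rate preferred by the most edges, so that as
/// few convertors as possible have to resample. Ties go to the higher rate.
/// Returns None if the device supports no rate at all.
fn pick_target_sample_rate(preferred: &[u32], supported: &[u32]) -> Option<u32> {
    // Count how many edges prefer each rate.
    let mut votes: HashMap<u32, usize> = HashMap::new();
    for rate in preferred {
        *votes.entry(*rate).or_insert(0) += 1;
    }
    supported
        .iter()
        .copied()
        .max_by_key(|rate| (votes.get(rate).copied().unwrap_or(0), *rate))
}
```

The same shape would work for preferred_channel_count(). A real policy might also weigh resampling cost per edge, which this sketch ignores.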
Types
We could introduce a new trait SourceEdge that is the current Source. A SourceEdge would have a member convertor() that transforms it into a Source. An alternative would be integrating the convertor into each decoder. This probably has some performance advantages.
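A rough sketch of how the two traits could relate. Everything here is hypothetical: the marker trait Source, the Convertor struct, the FakeDecoder test edge and the f32 sample type. The conversion itself is a mono sample-and-hold placeholder standing in for a real resampler, and it assumes the edge keeps a fixed native rate.

```rust
/// Hypothetical marker for sources that already run at the tree's fixed rate.
trait Source: Iterator<Item = f32> {}

/// Hypothetical: roughly what rodio's current Source would become.
trait SourceEdge: Iterator<Item = f32> + Sized {
    /// Native rate of the edge, e.g. as read from the decoded file.
    fn sample_rate(&self) -> u32;

    /// Wraps the edge so that it yields samples at `target_sample_rate`.
    fn convertor(self, target_sample_rate: u32) -> Convertor<Self> {
        Convertor { edge: self, target_sample_rate, consumed: 0, produced: 0, held: None }
    }
}

/// Mono sample-and-hold rate convertor; a placeholder for a real resampler.
struct Convertor<E: SourceEdge> {
    edge: E,
    target_sample_rate: u32, // assumed non-zero
    consumed: u64,           // input samples pulled so far
    produced: u64,           // output samples emitted so far
    held: Option<f32>,       // most recently pulled input sample
}

impl<E: SourceEdge> Iterator for Convertor<E> {
    type Item = f32;
    fn next(&mut self) -> Option<f32> {
        let from = self.edge.sample_rate() as u64;
        let to = self.target_sample_rate as u64;
        // Index of the input sample that covers this output sample.
        let wanted = self.produced * from / to;
        while self.consumed <= wanted {
            self.held = Some(self.edge.next()?);
            self.consumed += 1;
        }
        self.produced += 1;
        self.held
    }
}

impl<E: SourceEdge> Source for Convertor<E> {}

/// Stand-in edge used only to exercise the sketch.
struct FakeDecoder {
    samples: std::vec::IntoIter<f32>,
    sample_rate: u32,
}

impl Iterator for FakeDecoder {
    type Item = f32;
    fn next(&mut self) -> Option<f32> {
        self.samples.next()
    }
}

impl SourceEdge for FakeDecoder {
    fn sample_rate(&self) -> u32 {
        self.sample_rate
    }
}
```

With this split, mixers and effects would only accept Source, so the type system forces a convertor() call on every edge before it enters the tree.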
Many decoders at the same sample rate
Consider five decoders that all need to be mixed: four of them at a sample rate of 44.1 kHz and one at 96.1 kHz, with the output setting the sample rate for the tree to 48.0 kHz. Keep in mind that in this new span-less rodio the mixer does not do any conversions, so each of the five decoders gets resampled separately. In the current version of rodio we could build the tree such that we would mix the four matching decoders and only then resample. Given that hi-fi resampling is performance intensive, that would speed things up a lot.
I can imagine this is a common scenario in game audio: the decoder at 96.1 kHz could be a microphone, for example, and the other decoders sound effects. Can we still optimize this? Maybe, if we introduce subtrees each at a fixed sample_rate with conversions in between, as seen below. This is something the user would need to set up themselves.
                           OutputStream @ sr1,chs 4
                                       |
                                     mixer
                                    /     \
                              convertor    \
                                  |         \
    --------------------------- mixer        \_____
    |          |          |          |              \
convertor  convertor  convertor  convertor           \
    |          |          |          |            convertor
 decoder    decoder    decoder    decoder             |
  @44.1      @44.1      @44.1      @44.1           decoder
                                                    @96.1
Note that we still need convertors in between in case the decoders change their sample rate. Those could skip the actual resampling when the sample rate does match.
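The pass-through case could look roughly like this. The function convert_chunk and its chunk-based mono interface are assumptions for illustration; the linear interpolation stands in for whatever resampler would really be used and does not interpolate across chunk boundaries.

```rust
use std::borrow::Cow;

/// Converts one mono chunk from `from` Hz to `to` Hz (both assumed non-zero).
/// When the rates match, the input is handed back untouched and nothing is
/// copied or resampled.
fn convert_chunk(input: &[f32], from: u32, to: u32) -> Cow<'_, [f32]> {
    if from == to || input.is_empty() {
        return Cow::Borrowed(input);
    }
    let out_len = (input.len() as u64 * to as u64 / from as u64) as usize;
    let step = from as f64 / to as f64; // input samples per output sample
    let output: Vec<f32> = (0..out_len)
        .map(|n| {
            let pos = n as f64 * step;
            let i = (pos as usize).min(input.len() - 1);
            let frac = (pos - i as f64) as f32;
            let a = input[i];
            // Past the last sample, hold its value.
            let b = input[(i + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect();
    Cow::Owned(output)
}
```

Because the check happens per chunk, a decoder that switches to the tree's rate mid-stream would stop paying for resampling from its next chunk onwards.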
Re-negotiation
This is an advanced feature; I am describing it to see how far we can get regarding performance in a span-less rodio.
The edges of the tree could vote to renegotiate. They would increase a shared AtomicIsize if they want to renegotiate and decrease it if they do not. This would be useful in case there are mixed sources that move from differing sample rates to the same one. The outcome of the vote would be determined by the nearest mixer which has a convertor after it (we might want to make that a separate thing, converting_mixer or something like that).
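A small sketch of the voting counter. The type RenegotiationVote and its methods are hypothetical, and two details are my assumptions rather than part of the proposal: a positive tally means "renegotiate", and reading the outcome resets the tally.

```rust
use std::sync::atomic::{AtomicIsize, Ordering};
use std::sync::Arc;

/// Shared tally; every edge and the converting mixer hold a clone.
#[derive(Clone, Default)]
struct RenegotiationVote {
    tally: Arc<AtomicIsize>,
}

impl RenegotiationVote {
    /// Called by an edge that would benefit from a new target rate.
    fn vote_for(&self) {
        self.tally.fetch_add(1, Ordering::Relaxed);
    }

    /// Called by an edge that is happy with the current target rate.
    fn vote_against(&self) {
        self.tally.fetch_sub(1, Ordering::Relaxed);
    }

    /// Called by the mixer: true if more edges voted for than against.
    /// Resets the tally so the next round starts from zero.
    fn take_outcome(&self) -> bool {
        self.tally.swap(0, Ordering::Relaxed) > 0
    }
}
```

Relaxed ordering is used because the counter only carries the vote itself and does not guard other data; whether that holds in a real implementation depends on what else the mixer reads afterwards.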
Example
                           OutputStream @ sr1,chs 4
                                       |
                                     mixer
                                    /     \
                              convertor    \
                                  |         \
    --------------------------- mixer        \_____
    |          |          |          |              \
convertor  convertor  convertor  convertor           \
    |          |          |          |            convertor
 decoder    decoder    decoder    decoder             |
  @42.0      @22.0      @22.0      @96.1           decoder
 ->44.1     ->44.1     ->44.1     ->44.1            @96.1
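Here the four decoders mixed on the left start at different sample rates. The first negotiation picks 22.0 kHz as the target sample rate for the mixer. Once all four decoders have moved to 44.1 kHz, a renegotiation could pick 44.1 kHz instead, so that none of the four needs resampling before the mixer.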
Speed source needs an extra resampling step
An elegant thing about the current rodio version is that changing audio speed is simply changing the reported sample rate. So if an audio tree speeds up audio and then slows it down, that only takes effect when resampling just before the OutputStream. That is quite optimal.
If we make the sample rate in the tree constant, speed no longer works. This is not a large problem, as we can adapt speed to work again by resampling during the speed step. A more refined speed source would work without resampling, using an FFT so that it does not increase/decrease pitch like the current one does.
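To make the "resample during the speed step" idea concrete, here is a sketch of a speed adaptor that keeps the sample rate constant and instead steps through its input faster or slower. The struct Speed shown here is hypothetical and mono-only, uses plain linear interpolation, and changes pitch just like the current approach.

```rust
use std::iter::Fuse;

/// Plays `input` at `factor` times its normal speed without touching the
/// sample rate: each output sample advances `factor` input samples.
struct Speed<I: Iterator<Item = f32>> {
    input: Fuse<I>,
    factor: f64,
    fraction: f64,          // position between `current` and `following`
    current: Option<f32>,   // input sample at or before the read position
    following: Option<f32>, // input sample after `current`
}

impl<I: Iterator<Item = f32>> Speed<I> {
    fn new(input: I, factor: f64) -> Self {
        assert!(factor > 0.0, "speed factor must be positive");
        let mut input = input.fuse();
        let current = input.next();
        let following = input.next();
        Speed { input, factor, fraction: 0.0, current, following }
    }
}

impl<I: Iterator<Item = f32>> Iterator for Speed<I> {
    type Item = f32;
    fn next(&mut self) -> Option<f32> {
        // Skip the whole input samples that the previous step moved past.
        while self.fraction >= 1.0 {
            self.current = self.following.take();
            self.following = self.input.next();
            self.fraction -= 1.0;
        }
        let a = self.current?;
        // At the very end there is nothing to interpolate towards.
        let b = self.following.unwrap_or(a);
        let out = a + (b - a) * self.fraction as f32;
        self.fraction += self.factor;
        Some(out)
    }
}
```

Unlike the current design, a speed-up followed by a slow-down would resample twice here, which is the cost this section describes.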
*1: @PetrGlad brought this up recently and I have seen the suggestion in an old
comment by an earlier rodio maintainer. I dismissed it at the time as too big a
change/impossible.