Decoder builder with additional settings #697

roderickvd · 2025-02-04T22:24:04Z

This PR started with the intention to make the Symphonia decoder more robust and expose more of its functionality to users. The changes are implemented through a builder pattern while maintaining compatibility with the existing API.

Along the way, opportunities are taken to refactor the existing codebase.

Commits so far focus on:

Adding a builder pattern for configuring decoders
Improving ReadSeekSource to better handle seeking and duration calculations
Making seeking behavior configurable (coarse vs accurate seeking)
Exposing gapless playback configuration
Adding comprehensive documentation

Fixes:

Supersedes:

Fix Symphonia seek for a duration with a fractional element of 0 #687

gulis1 · 2025-02-05T20:03:39Z

The builder looks great to me, thanks for the work!

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

roderickvd · 2025-02-05T22:26:29Z

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

Yes, I've got a with_file cooked up that I'll push hopefully tomorrow.

roderickvd · 2025-02-06T21:30:53Z

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

So I was trying different things, Decoder::with_file, Decoder::from_file. With 558e613 you can now do:

let file = File::open("audio.mp3").unwrap();
let decoder = Decoder::try_from(file).unwrap();

Which wraps it in a BufReader and sets byte_len when available. What do you guys think about that? Nice and ergonomic or hiding too much from the user?

Old API remains unchanged:

let file = File::open("audio.mp3").unwrap();
let decoder = Decoder::new(file).unwrap(); // just as a reader

Or for advanced usage, use the builder:

let file = File::open("audio.mp3").unwrap();
let len = file.metadata().unwrap().len();
let decoder = DecoderBuilder::new()
    .with_data(BufReader::new(file))
    .with_byte_len(len)
    .build()
    .unwrap();

roderickvd · 2025-02-07T08:40:16Z

Probably should re-add Decoder::builder as a shorthand to DecoderBuilder::new.

src/decoder/mod.rs

dvdsk · 2025-02-07T11:39:47Z

What should we do about other decoders? I think its out of the scope of this PR to check what features they could support and implement them. We should still communicate to users that the chose feature set is not supported.

Maybe decoder creation should return a Error feature not supported? (and we add a bold warning to the builder options that could cause that). The defaults should probably be picked such that they do not produce that error.

src/decoder/symphonia.rs

dvdsk

reviewed up to a4db631

roderickvd · 2025-02-07T23:15:03Z

What should we do about other decoders? I think its out of the scope of this PR to check what features they could support and implement them. We should still communicate to users that the chose feature set is not supported.

Maybe decoder creation should return a Error feature not supported? (and we add a bold warning to the builder options that could cause that). The defaults should probably be picked such that they do not produce that error.

Today, without this PR, the Symphonia decoder is gapless (trims) and has seeking functionality in a lot of cases.
The other decoders do not without much of a hint to the user. (yeah, Err when attempting to seek)

Settings::default in this PR shares with current behavior that gapless is enabled by default, but differs in that is_seekable is false by default. That's in order to deal with the upcoming fix in Rodio#340: byte_len needs to be set for correct seeking behavior, either from metadata or by manually doing Seek::seek(SeekFrom::End(0)).

At one point I thought about having builder methods feature-gated, but that won't work because of #539.

So that leaves us with:

soft-failure like now, with documented behavior
printing / logging a message
throwing an error somewhere

Numbers 2 and 3 could only be done in the build / build_looped phase, because only at that point will we know whether, say, an Ogg Vorbis file will be decoded by lewton or Symphonia.

I'm kind of partial to number 1: settings only do something with Symphonia and that's that.
I'll do a check to see what settings the other decoders could take, but I don't think it's much (or anything at all). In that check I'll also consider to include some further Symphonia settings or not.

Besides that check, and moving stuff out into manageable-sized files, I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

reviewed up to a4db631

🙏

dvdsk · 2025-02-08T11:10:36Z

I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

I see, its the seek_does_not_break_channel_order test that is failing. That is strange given it works with the other symphonia decoders. Its not a difference in test files, I've taken a quick look in audacity: RL.wav & RL.ogg look identical.

Might be time to test by hand, just to be sure the test is correct. Try making an example that plays the left right and check if the beep switches speaker. The wav and ogg should have the same switching. (That is what the test attempts to check).

Fix WavDecoder::total_duration. Note there are tests as part of #697 those are partially wrong and will break because of this being merged. We should use ffprobe to check the correct play times and put those in the test.

roderickvd · 2025-02-11T21:09:01Z

I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

I see, its the seek_does_not_break_channel_order test that is failing. That is strange given it works with the other symphonia decoders. Its not a difference in test files, I've taken a quick look in audacity: RL.wav & RL.ogg look identical.

Might be time to test by hand, just to be sure the test is correct. Try making an example that plays the left right and check if the beep switches speaker. The wav and ogg should have the same switching. (That is what the test attempts to check).

I've found that I can get the test to pass by increasing the sample window in tests/seek.rs from 100 to ~1000 samples. When seeking these Ogg Vorbis and MP4/AAC files, it turns out that the actual position is slightly under 100 time units off from the target position.

It doesn't sit very well with me, and I'm finding SymphoniaDecoder::refine_position a bit suspect. I'm struggling to understand it. I assume its purpose is to do a linear search forward to the target position. Where I'm struggling is that to do that, it immediately goes for the next_packet and decodes that. Edit: see comment below.

reviewed up to a4db631

I rebased to pull in #699. For the record your review was now up to fb1b777.

roderickvd · 2025-02-11T21:14:18Z

Also this:

let mut samples_to_pass = seek_res.required_ts - seek_res.actual_ts;
// ...
self.current_span_offset = samples_to_pass as usize * self.channels() as usize;

The unit of required_ts and actual_ts is timestamp in TimeBase units, not number of samples.
So it's more like a time_to_pass from which we would have to calculate the number of samples.

Playing around with this but it doesn't yet pass the tests:

    fn try_seek(&mut self, pos: Duration) -> Result<(), source::SeekError> {
        // saturating logic removed for brevity

        // make sure the next sample is for the right channel
        let skip_channel = self.current_span_offset % self.channels() as usize;

        let seek_res = self
            .format
            .seek(
                self.seek_mode,
                SeekTo::Time {
                    time: target.into(),
                    track_id: None,
                },
            )
            .map_err(SeekError::BaseSeek)?;

        // Should not be needed if we use our own iterator to drain
        //self.refine_position(seek_res)?;
        
        // Seeking is a demuxer operation without the decoder knowing about it,
        // so we need to reset the decoder to make sure it's in sync and prevent
        // audio glitches.
        self.decoder.reset();
        
        // Force the next packet to be decoded.
        self.current_span_offset = self.buffer.len();
        
        // Calculate the number of samples to skip
        let samples_to_skip = (Duration::from(
            self.time_base
                .unwrap()
                .calc_time(seek_res.required_ts - seek_res.actual_ts),
        )
        .as_secs_f32()
            * self.sample_rate() as f32
            * self.channels() as f32) as usize;

        // Skip the samples
        for _ in 0..(samples_to_skip + skip_channel) {
            self.next();
        }

        Ok(())
    }

dvdsk · 2025-02-11T23:25:38Z

Where I'm struggling is that to do that, it immediately goes for the next_packet and decodes that

Yeah that should have been documented. See the symphonia docs: https://docs.rs/symphonia-core/0.5.4/symphonia_core/formats/trait.FormatReader.html#tymethod.seek specifically:

After a seek, all Decoders consuming packets from this reader should be reset.

do please add a comment noting that, I should have done that when I wrote it 😅

dvdsk · 2025-02-11T23:30:52Z

The unit of required_ts and actual_ts is timestamp in TimeBase units, not number of samples.

thanks! & decoder.reset() seems to also be missing.

dvdsk · 2025-02-11T23:31:09Z

Playing around with this but it doesn't yet pass the tests:

is it just the ogg thats failing or the rest too? To my eye it should work, especially since the previous code did (on all but ogg).

roderickvd · 2025-02-14T21:35:05Z

thanks! & decoder.reset() seems to also be missing.

Yes, added that in my code snippet above that I'm trying to get to work appropriately. Maybe I need to start looking at the angle of the tests too, see whether they are correct in full.

is it just the ogg thats failing or the rest too? To my eye it should work, especially since the previous code did (on all but ogg).

Currently failing all five cases of seek_does_not_break_channel_order. So something is really off.

By the way, merely putting self.decoder.reset() in the current master also causes breakage.
Will investigate further, but have a bit less time on my hands the coming days.

roderickvd · 2025-02-15T19:24:36Z

I think I'm onto it: due to the truncating behavior of calculating the number of samples from the duration, the code above does not guarantee it is aligned with the correct channel. Working on it to make it so.

roderickvd · 2025-02-19T20:16:16Z

😅 that should do it. Next to my hunch in #697 (comment), there was also the case for Ogg & MP4 that rewinding a stream would successfully decode a packet with zero samples and only metadata, which caused the iterator to end.

Even so, I disabled one specific MP4 test (seek_does_not_break_channel_order) because there are several outstanding issues with Symphonia that make position calculation unreliable. There's multiple issues and PRs handling that, so we should re-check when Symphonia 0.5.5 or greater is released.

Increasing the test sample window from 100 to 1000 makes the test pass, but hides the underlying problem.

Props on having all these tests; they caught a lot of stuff.

While I still have more than a few ideas to refactor the Symphonia decoder, I am going to move this out of draft now. Appreciate your review to get this settled first.

Adds PhantomData to Symphonia variant of DecoderImpl to ensure the type parameter R is used even when only the Symphonia decoder is enabled. This fixes type inference errors in LoopedDecoder's Source implementation when building with only Symphonia enabled.

Implementation changes: - size_hint now returns inner decoder's lower bound when active, previously always returned (0, None) Documentation improvements: - Explain size_hint behavior with active/inactive decoder - Document sample_rate returns default when no decoder - Clarify seek behavior with regards to looping - Add details about error cases in try_seek Makes the size hint more accurate by respecting the underlying decoder's capabilities while maintaining the unbounded upper limit for looped playback.

Replace Mp4Type enum with MIME type string for MP4 format detection: - Remove Mp4Type enum and its FromStr/Display implementations - Update new_mp4() to use "audio/mp4" MIME type instead - Simplify documentation to reflect the change This change removes unnecessary complexity around MP4 format variants while maintaining the same functionality through MIME types. Fixes RustAudio#612

Add Source and Iterator trait implementations for DecoderOutput to delegate all operations to the underlying Normal/Looped decoder variants. This allows using DecoderOutput directly in contexts that expect these traits. Implementation simply forwards all trait methods to the contained decoder without changing any behavior.

- Add TryFrom implementation for File->Decoder that automatically handles buffering and byte length optimization - Fix OGG duration calculation in symphonia decoder - Re-enable MP4 tests now that symphonia decoder is seeking correctly Fixes RustAudio#577

Simplifies Decoder struct documentation by referencing module-level docs instead of duplicating examples. Updates examples to show preferred Decoder::builder() API instead of DecoderBuilder::new(). Part of improving the builder pattern API ergonomics.

Makes TryFrom<File> for Decoder return an error if file metadata cannot be read, rather than silently continuing without byte_len. This ensures seeking optimizations are always enabled when using this conversion.

Adds support for creating decoders directly from BufReader and Cursor types. Documents that TryFrom<File> is preferred for files due to automatic seeking optimizations from byte_len.

Replace DecoderOutput enum with separate build() and build_looped() methods on DecoderBuilder. This simplifies the API by: - Removing the need for DecoderOutput enum and match statements - Making it more explicit whether a normal or looped decoder is being created - Reducing code duplication in decoder creation methods like new_mp3(), new_aac() etc.

Change the error returned when seeking on an ended looped decoder from SeekError::NotSupported to a more specific IoError. This better reflects that the error occurred because the decoder failed to loop back to the beginning rather than seeking not being supported.

- Rename test from seek_returns_err_if_unsupported to decoder_returns_total_duration - Enabled Ogg Vorbis and MP4/AAC decoders in tests. - Switch from using try_from to explicit builder pattern to properly test decoder duration behavior: - Set byte_len from file metadata (required for duration calculation in some formats) - Enable seeking (as duration may require seeking in some formats) - Disable gapless playback (to include padding frames in duration) - Update expected durations to exact values from reference files. Values are verified with ffprobe. - Symphonia decoder improvements: - Simplify saturating seeks. - Improve total_duration performance.

…and examples Use Decoder::try_from() instead of Decoder::new(). This new API is the preferre way to decode files as it automatically: - Wraps files in BufReader for better performance - Sets file length from metadata for improved seeking - Enables seeking by default The functionality remains the same but uses the more optimized path.

Improve the symphonia decoder's seeking behavior in several ways: - Reset decoder after seeking to prevent audio glitches, as the decoder needs to resync after demuxer operations - When using accurate seek mode, calculate precise sample position using time base rather than assuming seek difference equals sample count - Fix channel alignment issues that could cause wrong channel to play first after seeking - Preserve active channel offset during seeking operations Additionally: - Temporarily skip seek channel order test for m4a files due to known Symphonia AAC timing issues - Replace Duration::from_secs(0) with Duration::ZERO for consistency

Move the DecoderBuilder implementation into its own module file to improve code organization and maintainability. The builder pattern is now contained in decoder/builder.rs while keeping the core Decoder types and traits in decoder/mod.rs. This change: - Extracts builder-specific code to builder.rs - Updates documentation to reflect the split - Maintains existing functionality and API

roderickvd · 2025-03-06T11:15:51Z

Rebasing done, CI passes, good to merge.

dvdsk · 2025-03-06T12:24:12Z

yay 🎈

nc7s · 2025-04-01T06:56:03Z

Thanks for implementing this! I can now save adding symphonia and doing the calculations myself. Would you mind kindly releasing it?

roderickvd · 2025-04-01T12:11:45Z

Yes, see #721

This was referenced Feb 4, 2025

total_duration not working with symphonia decoders #696

Closed

flac, isomp4, mkv: optimize length detection by using byte_len() instead of seeking pdeljanov/Symphonia#340

Open