Skip to content

Decoder builder with additional settings #697

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Mar 6, 2025

Conversation

roderickvd
Copy link
Collaborator

@roderickvd roderickvd commented Feb 4, 2025

This PR started with the intention to make the Symphonia decoder more robust and expose more of its functionality to users. The changes are implemented through a builder pattern while maintaining compatibility with the existing API.

Along the way, opportunities are taken to refactor the existing codebase.

Commits so far focus on:

  • Adding a builder pattern for configuring decoders
  • Improving ReadSeekSource to better handle seeking and duration calculations
  • Making seeking behavior configurable (coarse vs accurate seeking)
  • Exposing gapless playback configuration
  • Adding comprehensive documentation

Fixes:

Supersedes:

@gulis1
Copy link

gulis1 commented Feb 5, 2025

The builder looks great to me, thanks for the work!

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

@roderickvd
Copy link
Collaborator Author

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

Yes, I've got a with_file cooked up that I'll push hopefully tomorrow.

@roderickvd
Copy link
Collaborator Author

roderickvd commented Feb 6, 2025

Maybe we could add a new with_file or from_file method? I think playing audio from files is one of the most common things new users will use Rodio for, so having a method that extracts the byte_len from the file's metadata (specially for ogg vorbis files)`would be helpful.

So I was trying different things, Decoder::with_file, Decoder::from_file. With 558e613 you can now do:

let file = File::open("audio.mp3").unwrap();
let decoder = Decoder::try_from(file).unwrap();

Which wraps it in a BufReader and sets byte_len when available. What do you guys think about that? Nice and ergonomic or hiding too much from the user?

Old API remains unchanged:

let file = File::open("audio.mp3").unwrap();
let decoder = Decoder::new(file).unwrap(); // just as a reader

Or for advanced usage, use the builder:

let file = File::open("audio.mp3").unwrap();
let len = file.metadata().unwrap().len();
let decoder = DecoderBuilder::new()
    .with_data(BufReader::new(file))
    .with_byte_len(len)
    .build()
    .unwrap();

@roderickvd
Copy link
Collaborator Author

Probably should re-add Decoder::builder as a shorthand to DecoderBuilder::new.

@dvdsk
Copy link
Collaborator

dvdsk commented Feb 7, 2025

What should we do about other decoders? I think its out of the scope of this PR to check what features they could support and implement them. We should still communicate to users that the chose feature set is not supported.

Maybe decoder creation should return a Error feature not supported? (and we add a bold warning to the builder options that could cause that). The defaults should probably be picked such that they do not produce that error.

Copy link
Collaborator

@dvdsk dvdsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed up to a4db631

@roderickvd
Copy link
Collaborator Author

roderickvd commented Feb 7, 2025

What should we do about other decoders? I think its out of the scope of this PR to check what features they could support and implement them. We should still communicate to users that the chose feature set is not supported.

Maybe decoder creation should return a Error feature not supported? (and we add a bold warning to the builder options that could cause that). The defaults should probably be picked such that they do not produce that error.

Today, without this PR, the Symphonia decoder is gapless (trims) and has seeking functionality in a lot of cases.
The other decoders do not without much of a hint to the user. (yeah, Err when attempting to seek)

Settings::default in this PR shares with current behavior that gapless is enabled by default, but differs in that is_seekable is false by default. That's in order to deal with the upcoming fix in Rodio#340: byte_len needs to be set for correct seeking behavior, either from metadata or by manually doing Seek::seek(SeekFrom::End(0)).

At one point I thought about having builder methods feature-gated, but that won't work because of #539.

So that leaves us with:

  1. soft-failure like now, with documented behavior
  2. printing / logging a message
  3. throwing an error somewhere

Numbers 2 and 3 could only be done in the build / build_looped phase, because only at that point will we know whether, say, an Ogg Vorbis file will be decoded by lewton or Symphonia.

I'm kind of partial to number 1: settings only do something with Symphonia and that's that.
I'll do a check to see what settings the other decoders could take, but I don't think it's much (or anything at all). In that check I'll also consider to include some further Symphonia settings or not.

Besides that check, and moving stuff out into manageable-sized files, I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

reviewed up to a4db631

🙏

@roderickvd roderickvd changed the title Decoder refactoring Decoder builder with additional settings Feb 7, 2025
@dvdsk
Copy link
Collaborator

dvdsk commented Feb 8, 2025

I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

I see, its the seek_does_not_break_channel_order test that is failing. That is strange given it works with the other symphonia decoders. Its not a difference in test files, I've taken a quick look in audacity: RL.wav & RL.ogg look identical.

Might be time to test by hand, just to be sure the test is correct. Try making an example that plays the left right and check if the beep switches speaker. The wav and ogg should have the same switching. (That is what the test attempts to check).

dvdsk added a commit that referenced this pull request Feb 10, 2025
Fix WavDecoder::total_duration. 

Note there are tests as part of #697 those are partially wrong and will break because of this being merged. We should use ffprobe to check the correct play times and put those in the test.
@roderickvd roderickvd force-pushed the refactor/decoder-improvements branch from 485099d to 001d8df Compare February 11, 2025 20:09
@roderickvd
Copy link
Collaborator Author

roderickvd commented Feb 11, 2025

I'm having a bit of a fight with the seek.rs integration test for RL.ogg and RL.m4a. They were disabled before, but should be working now. Any ideas?

I see, its the seek_does_not_break_channel_order test that is failing. That is strange given it works with the other symphonia decoders. Its not a difference in test files, I've taken a quick look in audacity: RL.wav & RL.ogg look identical.

Might be time to test by hand, just to be sure the test is correct. Try making an example that plays the left right and check if the beep switches speaker. The wav and ogg should have the same switching. (That is what the test attempts to check).

I've found that I can get the test to pass by increasing the sample window in tests/seek.rs from 100 to ~1000 samples. When seeking these Ogg Vorbis and MP4/AAC files, it turns out that the actual position is slightly under 100 time units off from the target position.

It doesn't sit very well with me, and I'm finding SymphoniaDecoder::refine_position a bit suspect. I'm struggling to understand it. I assume its purpose is to do a linear search forward to the target position. Where I'm struggling is that to do that, it immediately goes for the next_packet and decodes that. Edit: see comment below.

reviewed up to a4db631

I rebased to pull in #699. For the record your review was now up to fb1b777.

@roderickvd
Copy link
Collaborator Author

roderickvd commented Feb 11, 2025

Also this:

let mut samples_to_pass = seek_res.required_ts - seek_res.actual_ts;
// ...
self.current_span_offset = samples_to_pass as usize * self.channels() as usize;

The unit of required_ts and actual_ts is timestamp in TimeBase units, not number of samples.
So it's more like a time_to_pass from which we would have to calculate the number of samples.

Playing around with this but it doesn't yet pass the tests:

    fn try_seek(&mut self, pos: Duration) -> Result<(), source::SeekError> {
        // saturating logic removed for brevity

        // make sure the next sample is for the right channel
        let skip_channel = self.current_span_offset % self.channels() as usize;

        let seek_res = self
            .format
            .seek(
                self.seek_mode,
                SeekTo::Time {
                    time: target.into(),
                    track_id: None,
                },
            )
            .map_err(SeekError::BaseSeek)?;

        // Should not be needed if we use our own iterator to drain
        //self.refine_position(seek_res)?;
        
        // Seeking is a demuxer operation without the decoder knowing about it,
        // so we need to reset the decoder to make sure it's in sync and prevent
        // audio glitches.
        self.decoder.reset();
        
        // Force the next packet to be decoded.
        self.current_span_offset = self.buffer.len();
        
        // Calculate the number of samples to skip
        let samples_to_skip = (Duration::from(
            self.time_base
                .unwrap()
                .calc_time(seek_res.required_ts - seek_res.actual_ts),
        )
        .as_secs_f32()
            * self.sample_rate() as f32
            * self.channels() as f32) as usize;

        // Skip the samples
        for _ in 0..(samples_to_skip + skip_channel) {
            self.next();
        }

        Ok(())
    }

@dvdsk
Copy link
Collaborator

dvdsk commented Feb 11, 2025

Where I'm struggling is that to do that, it immediately goes for the next_packet and decodes that

Yeah that should have been documented. See the symphonia docs: https://docs.rs/symphonia-core/0.5.4/symphonia_core/formats/trait.FormatReader.html#tymethod.seek specifically:

After a seek, all Decoders consuming packets from this reader should be reset.

do please add a comment noting that, I should have done that when I wrote it 😅

@dvdsk
Copy link
Collaborator

dvdsk commented Feb 11, 2025

The unit of required_ts and actual_ts is timestamp in TimeBase units, not number of samples.

thanks! & decoder.reset() seems to also be missing.

@dvdsk
Copy link
Collaborator

dvdsk commented Feb 11, 2025

Playing around with this but it doesn't yet pass the tests:

is it just the ogg thats failing or the rest too? To my eye it should work, especially since the previous code did (on all but ogg).

@roderickvd
Copy link
Collaborator Author

thanks! & decoder.reset() seems to also be missing.

Yes, added that in my code snippet above that I'm trying to get to work appropriately. Maybe I need to start looking at the angle of the tests too, see whether they are correct in full.

is it just the ogg thats failing or the rest too? To my eye it should work, especially since the previous code did (on all but ogg).

Currently failing all five cases of seek_does_not_break_channel_order. So something is really off.

By the way, merely putting self.decoder.reset() in the current master also causes breakage.
Will investigate further, but have a bit less time on my hands the coming days.

@roderickvd
Copy link
Collaborator Author

I think I'm onto it: due to the truncating behavior of calculating the number of samples from the duration, the code above does not guarantee it is aligned with the correct channel. Working on it to make it so.

@roderickvd
Copy link
Collaborator Author

😅 that should do it. Next to my hunch in #697 (comment), there was also the case for Ogg & MP4 that rewinding a stream would successfully decode a packet with zero samples and only metadata, which caused the iterator to end.

Even so, I disabled one specific MP4 test (seek_does_not_break_channel_order) because there are several outstanding issues with Symphonia that make position calculation unreliable. There's multiple issues and PRs handling that, so we should re-check when Symphonia 0.5.5 or greater is released.

Increasing the test sample window from 100 to 1000 makes the test pass, but hides the underlying problem.

Props on having all these tests; they caught a lot of stuff.

While I still have more than a few ideas to refactor the Symphonia decoder, I am going to move this out of draft now. Appreciate your review to get this settled first.

roderickvd and others added 20 commits March 6, 2025 11:57
Adds PhantomData to Symphonia variant of DecoderImpl to ensure the type
parameter R is used even when only the Symphonia decoder is enabled.
This fixes type inference errors in LoopedDecoder's Source implementation
when building with only Symphonia enabled.
Implementation changes:
- size_hint now returns inner decoder's lower bound when active,
  previously always returned (0, None)

Documentation improvements:
- Explain size_hint behavior with active/inactive decoder
- Document sample_rate returns default when no decoder
- Clarify seek behavior with regards to looping
- Add details about error cases in try_seek

Makes the size hint more accurate by respecting the underlying
decoder's capabilities while maintaining the unbounded upper limit
for looped playback.
Replace Mp4Type enum with MIME type string for MP4 format detection:
- Remove Mp4Type enum and its FromStr/Display implementations
- Update new_mp4() to use "audio/mp4" MIME type instead
- Simplify documentation to reflect the change

This change removes unnecessary complexity around MP4 format
variants while maintaining the same functionality through MIME types.

Fixes RustAudio#612
Add Source and Iterator trait implementations for DecoderOutput to delegate
all operations to the underlying Normal/Looped decoder variants.
This allows using DecoderOutput directly in contexts that expect
these traits.

Implementation simply forwards all trait methods to the contained decoder
without changing any behavior.
- Add TryFrom implementation for File->Decoder that automatically handles
  buffering and byte length optimization
- Fix OGG duration calculation in symphonia decoder
- Re-enable MP4 tests now that symphonia decoder is seeking correctly

Fixes RustAudio#577
Simplifies Decoder struct documentation by referencing module-level docs
instead of duplicating examples. Updates examples to show preferred
Decoder::builder() API instead of DecoderBuilder::new().

Part of improving the builder pattern API ergonomics.
Makes TryFrom<File> for Decoder return an error if file metadata cannot
be read, rather than silently continuing without byte_len. This ensures
seeking optimizations are always enabled when using this conversion.
Adds support for creating decoders directly from BufReader and Cursor types.
Documents that TryFrom<File> is preferred for files due to automatic
seeking optimizations from byte_len.
Replace DecoderOutput enum with separate build() and build_looped() methods
on DecoderBuilder. This simplifies the API by:

- Removing the need for DecoderOutput enum and match statements
- Making it more explicit whether a normal or looped decoder is being created
- Reducing code duplication in decoder creation methods like new_mp3(),
  new_aac() etc.
Change the error returned when seeking on an ended looped decoder from
SeekError::NotSupported to a more specific IoError. This better reflects
that the error occurred because the decoder failed to loop back to the
beginning rather than seeking not being supported.
- Rename test from seek_returns_err_if_unsupported to decoder_returns_total_duration

- Enabled Ogg Vorbis and MP4/AAC decoders in tests.

- Switch from using try_from to explicit builder pattern to properly test
  decoder duration behavior:
  - Set byte_len from file metadata (required for duration calculation in some formats)
  - Enable seeking (as duration may require seeking in some formats)
  - Disable gapless playback (to include padding frames in duration)

- Update expected durations to exact values from reference files.
  Values are verified with ffprobe.

- Symphonia decoder improvements:
  - Simplify saturating seeks.
  - Improve total_duration performance.
…and examples

Use Decoder::try_from() instead of Decoder::new(). This new API is the preferre
way to decode files as it automatically:

- Wraps files in BufReader for better performance
- Sets file length from metadata for improved seeking
- Enables seeking by default

The functionality remains the same but uses the more optimized path.
Improve the symphonia decoder's seeking behavior in several ways:

- Reset decoder after seeking to prevent audio glitches, as the decoder
  needs to resync after demuxer operations
- When using accurate seek mode, calculate precise sample position using
  time base rather than assuming seek difference equals sample count
- Fix channel alignment issues that could cause wrong channel to play
  first after seeking
- Preserve active channel offset during seeking operations

Additionally:
- Temporarily skip seek channel order test for m4a files due to known
  Symphonia AAC timing issues
- Replace Duration::from_secs(0) with Duration::ZERO for consistency
Move the DecoderBuilder implementation into its own module file to improve
code organization and maintainability. The builder pattern is now contained
in decoder/builder.rs while keeping the core Decoder types and traits in
decoder/mod.rs.

This change:
- Extracts builder-specific code to builder.rs
- Updates documentation to reflect the split
- Maintains existing functionality and API
@roderickvd roderickvd force-pushed the refactor/decoder-improvements branch from ca66e33 to a8a8b04 Compare March 6, 2025 11:10
@roderickvd roderickvd removed the waiting for rebase PR needs to be rebased before it can be reviewed/merged label Mar 6, 2025
@roderickvd
Copy link
Collaborator Author

Rebasing done, CI passes, good to merge.

@dvdsk dvdsk merged commit 1c2cd2f into RustAudio:master Mar 6, 2025
9 checks passed
@dvdsk
Copy link
Collaborator

dvdsk commented Mar 6, 2025

yay 🎈

@nc7s
Copy link

nc7s commented Apr 1, 2025

Thanks for implementing this! I can now save adding symphonia and doing the calculations myself. Would you mind kindly releasing it?

@roderickvd
Copy link
Collaborator Author

Yes, see #721

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

total_duration not working with symphonia decoders Refactor: Remove Mp4Type
4 participants