-
Notifications
You must be signed in to change notification settings - Fork 133
WIP: feat(web): Web Audio support for RDP web clients #904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Enables RDP web clients to receive and play PCM audio from remote sessions through the browser's native Web Audio API, supporting various sample rates. Highlights: - Web Audio API backend with AudioContext management and sample buffering - PCM sample rate conversion - Extension helpers for web client session integration - Error handling for audio context creation and playback failures
ab51784
to
8f8a39f
Compare
@@ -231,6 +241,12 @@ impl iron_remote_desktop::SessionBuilder for SessionBuilder { | |||
}; | |||
self.0.borrow_mut().outbound_message_size_limit = if limit > 0 { Some(limit) } else { None }; | |||
}; | |||
|enable_audio: bool| { self.0.borrow_mut().enable_audio = enable_audio }; | |||
|audio_sample_rate: f64| { | |||
#[expect(clippy::cast_possible_truncation)] // JavaScript numbers are f64, audio uses f32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Thank you for also including the reason of the suppression.
@@ -977,6 +1013,13 @@ async fn connect( | |||
); | |||
} | |||
|
|||
if enable_audio { | |||
debug!("Enabling audio with sample rate: {:?}", audio_sample_rate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: It’s not bad per se, but the general preference goes for structured logging, like so
debug!("Enabling audio with sample rate: {:?}", audio_sample_rate); | |
debug!(audio_sample_rate, "Enabling audio"); |
It’s generally more concise while maintaining the same density of information.
Also, in theory, we could also parse the logs since it’s then structured.
if enable_audio { | ||
debug!("Enabling audio with sample rate: {:?}", audio_sample_rate); | ||
let audio_backend = WebAudioBackend::new(audio_sample_rate) | ||
.map_err(|e| anyhow::Error::msg(format!("failed to initialize Web Audio backend: {e:?}")))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: You can use the context
method provided by the anyhow::Context
trait.
.map_err(|e| anyhow::Error::msg(format!("failed to initialize Web Audio backend: {e:?}")))?; | |
.context("failed to initialize Web Audio backend)?; |
I’ve spotted similar patterns in the audio module as well, you may want to double check.
@@ -889,7 +921,7 @@ fn build_config( | |||
platform: ironrdp::pdu::rdp::capability_sets::MajorPlatformType::UNSPECIFIED, | |||
no_server_pointer: false, | |||
autologon: false, | |||
no_audio_playback: true, | |||
no_audio_playback: !enable_audio, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: For a follow up PR, no_audio_playback
should be changed to enable_audio
or enable_audio_playback
, as I believe its a better naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #907
/** | ||
* Enable or disable audio playback for the RDP session. | ||
* | ||
* When enabled, the client will negotiate audio capabilities with the server | ||
* and attempt to play PCM audio through the browser's Web Audio API. | ||
* | ||
* Requirements: | ||
* - Modern browsers with Web Audio API support (Chrome 14+, Firefox 25+, Safari 6+) | ||
* - User gesture activation (click, touch, or keypress) required by browser security policy | ||
* | ||
* @param enable - Whether to enable audio playback | ||
* @returns Extension for audio enablement | ||
*/ | ||
export function enableAudio(enable: boolean): Extension { | ||
return new Extension('enable_audio', enable); | ||
} | ||
|
||
/** | ||
* Set the preferred sample rate for audio format negotiation. | ||
* | ||
* This influences which PCM format the server is likely to choose by placing | ||
* the specified sample rate first in the client's advertised format list. | ||
* The implementation automatically handles sample rate conversion if the server | ||
* chooses a different rate, so this is primarily an optimization. | ||
* | ||
* Common sample rates: | ||
* - 22050 Hz - Lower bandwidth, suitable for voice | ||
* - 44100 Hz - CD quality | ||
* - 48000 Hz - Professional audio, often browser native | ||
* | ||
* If not specified, the browser's native sample rate is used as the preference. | ||
* | ||
* @param rate - Preferred sample rate in Hz (e.g., 48000 for 48kHz) | ||
* @returns Extension for sample rate preference | ||
*/ | ||
export function audioSampleRate(rate: number): Extension { | ||
return new Extension('audio_sample_rate', rate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Many thanks for all the documentation!
fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { | ||
f.debug_struct("IronError").field("source", &self.source).finish() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Why was it necessary to implement Debug on IronError? The main purpose of IronError is WASM/JavaScript interop, internally we don’t really use it like the usual Rust idiomatic errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take your point.
info!( | ||
"WebAudioBackend initialized: {} supported formats, context sample rate: {}Hz", | ||
backend.supported_formats.len(), | ||
backend.context_sample_rate | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: I would also suggest the tracing structured logging approach here, but it’s fine like this as well. I’ll let you decide for the rest of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll adapt it to your preferred style.
.create_gain() | ||
.map_err(|e| anyhow::Error::msg(format!("failed to create Web Audio gain node: {e:?}")))?; | ||
|
||
// Connect gain node to destination |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Detect browser audio capabilities | ||
/// Simplified to focus on sample rate support since we only use PCM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Proper docstring should be more like this:
/// Detect browser audio capabilities | |
/// Simplified to focus on sample rate support since we only use PCM | |
/// Detect browser audio capabilities | |
/// | |
/// Simplified to focus on sample rate support since we only use PCM. |
The first paragraph is used as the summary in the overview, and since it’s markdown, the lines a merged together, giving something like:
Detect browser audio capabilities Simplified to focus on sample rate support since we only use PCM
Whereas my suggestion will result into:
Detect browser audio capabilities
Simplified to focus on sample rate support since we only use PCM.
// Reasonable upper bound for audio buffer size (10 seconds of 48kHz stereo audio) | ||
const MAX_REASONABLE_AUDIO_BUFFER_SIZE: usize = 48000 * 2 * 2 * 10; // ~1.9MB | ||
|
||
if pcm_data.len() > MAX_REASONABLE_AUDIO_BUFFER_SIZE { | ||
return Err(anyhow::Error::msg(format!( | ||
"audio buffer too large ({} bytes), possible malformed data (max: {} bytes)", | ||
pcm_data.len(), | ||
MAX_REASONABLE_AUDIO_BUFFER_SIZE | ||
)) | ||
.into()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Good defensive pattern.
} | ||
|
||
/// Convert PCM 16-bit signed integer data to 32-bit float samples | ||
fn convert_pcm_to_float(pcm_data: &[u8], format: &AudioFormat) -> Result<Vec<f32>, IronError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: It would be more idiomatic to return anyhow::Error
(you can use anyhow::Result<Vec<f32>>
) at this point, and only convert to IronError
in the top-level function, unless you really need to specify an error kind, but I don’t think you do so so far.
// SAFETY: In WebAssembly single-threaded environment, Send is safe for WebAudioBackend | ||
unsafe impl Send for WebAudioBackend {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Strictly speaking, Send is really to communicate that the type is thread-safe, that it is safe to send it to another thread, and it’s obviously not the case for WebAudioBackend (using Rc
instead of Arc
, etc).
So we should rather say this is a workaround that is not compromising the memory safety of the program, because as far as we are concerned for this crate which is compiled into a WASM module and run in a single-threaded environment, WebAudioBackend
is never actually used in a multi-threaded context.
The real fix would be to relax the Send bound somewhere, but I’m okay with the workaround for the purpose of getting stuff done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just knew you'd call this out ;) Let me see what I can do.
match self.create_audio_buffer(&data, &format) { | ||
Ok(buffer) => { | ||
if let Err(e) = self.audio_queue.enqueue_audio(buffer, &self.gain_node) { | ||
error!("Failed to enqueue audio: {:?}", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[cfg(test)] | ||
mod tests { | ||
use super::*; | ||
|
||
#[test] | ||
fn test_convert_pcm_to_float_valid_16bit() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: Unit tests are not run by default (see the Cargo.toml), because:
- https://github.com/Devolutions/IronRDP/blob/master/ARCHITECTURE.md#testing
IronRDP/crates/ironrdp-testsuite-core/tests/main.rs
Lines 3 to 12 in 39566bf
//! Integration Tests (IT) //! //! Integration tests are all contained in this single crate, and organized in modules. //! This is to prevent `rustc` to re-link the library crates with each of the integration //! tests (one for each *.rs file / test crate under the `tests/` folder). //! Performance implication: https://github.com/rust-lang/cargo/pull/5022#issuecomment-364691154 //! //! This is also good for execution performance. //! Cargo will run all tests from a single binary in parallel, but //! binaries themselves are run sequentially.
But you don’t have to care now, I can look into moving that on my side, in a follow up PR.
Note: I haven't had time to pick this up again, will probably be a couple of weeks |
Enables RDP web clients to receive and play PCM audio from remote sessions through the browser's native Web Audio API, supporting various sample rates.
Highlights: