Expose desktop capturer #725
base: main
Conversation
```rust
#[cfg(target_os = "linux")]
{
    /* This is needed for getting the system picker for screen sharing. */
    use glib::MainLoop;
    let main_loop = MainLoop::new(None, false);
    let _handle = std::thread::spawn(move || {
        main_loop.run();
    });
}
```
What is the need for this GLib event loop? With the GStreamer Rust bindings for example, GStreamer events are usually handled by a GLib event loop, but GStreamer has an API to register an event callback that the library calls internally, which can be used from Rust to send events to a Rust thread or async stream over a channel. This way the Rust program calling the library doesn't need to setup its own GLib event loop. Here's an example: https://codeberg.org/Be.ing/dance-video-recorder/src/commit/4aab73f5514ff08000ef330f221dc6035831a4ee/src/camera_monitor.rs#L22
Thanks, I didn't know about this. I will update the example to do it this way.
I'm not sure if such an approach is feasible without a clearer idea of what the need for the GLib event loop is.
In the example I linked above, the GStreamer bindings are creating a futures_channel::mpsc under the hood to create an ergonomic Rust stream API: https://gitlab.freedesktop.org/gstreamer/gstreamer-rs/-/blob/c533fe960af057c55916ba3fb48c42837da98565/gstreamer/src/bus.rs#L345
the need for the GLib event loop is
The event loop is needed for opening the system UI picker. It is related to the xdg_portal setup. Maybe there is something here; I need to have a better look.
It would be nice if Livekit handled starting the GLib main loop so users of the library wouldn't have to, but that wouldn't be good for applications that are already using GLib for something such as the GStreamer or GTK Rust bindings. So, best to leave it to the user of the library with documentation.
Yes, I agree that it is better to leave it to the client to decide what to do with the main loop.
In gethopp#3 I moved this into the library behind a Cargo feature flag. Most Rust applications won't need to be concerned with this, but if they want to opt out, they can disable default features for the libwebrtc crate.
I would prefer the maintainers of LiveKit to decide how they want to handle this; at the end of the day it is their project. I was thinking it is more appropriate to have the desktop capturer as a feature in general.
My thinking is that this library is likely to get used by cross-platform applications whose developers may or may not know anything about Linux. It would be easy for a downstream developer who doesn't know anything about Linux to overlook starting a GLib event loop and have their application not work on Wayland without realizing it. So I think it'd be good to have it just work by default and let developers opt out if they don't need it. I don't have a strong opinion on this either way though; whatever the Livekit developers decide is fine.
I was thinking it is more appropriate to have the desktop capturer as a feature in general.
Sure, though that's orthogonal to this question about the GLib event loop.
I will give this a closer look and test it this week. Big thanks for adding a new example program!

EDIT: Nevermind, I checked out the wrong branch.

How do I build the example? I don't see any documentation on how this build system works, and libwebrtc-sys's build.rs is quite complicated.

That build_linux.sh script fails. I'm puzzled what it's doing. It seems to be downloading a bunch of stuff including a Debian image or something?? And building Abseil... does libwebrtc depend on Abseil? Then it fails to find a header from glibc.

I think I need to rebase #648 to get this to build... this build system needs some work.

After 3 days of yak shaving, I got libwebrtc to build locally with #730. I had to make a few changes to this branch to get it to build, so I made a PR for your fork: gethopp#2. Now I am wondering what LIVEKIT_URL to use for development. Is there a test server I can use? Or a tool for running a test server locally?

@Be-ing you can either

I tested and this works on KDE Plasma Wayland! 🎉 I will do more extensive testing on different desktops, Wayland and X11, tomorrow.

@Be-ing thanks for testing it. I haven't found time during the week to check X11. I will do it during the weekend.
Implements screen sharing by exposing libwebrtc's
DesktopCapturer.
A few platform specific notes:
- macOS:
  * It uses ScreenCaptureKit and the system picker by default. If the system picker is disabled then get_sources returns an empty list when trying to capture a display. The display's native id needs to be acquired by the client through other means.
- Linux:
  * With PipeWire the only way to select a window or display is via the system picker.
Force-pushed from 21c54ef to c60d7be
Could you rebase this branch on the main branch to incorporate #731, add the new example to the workspace root Cargo.toml, and set

I opened another PR for your fork: gethopp#3

I've tested this on: and confirm both desktop and window capture work on all of them.
```rust
let source_type = if args.capture_window {
    DesktopCaptureSourceType::WINDOW
} else {
    DesktopCaptureSourceType::SCREEN
};
let mut options = DesktopCapturerOptions::new(source_type);
```
Oh I like this API better, nice idea. It's a bit odd that the WINDOW and SCREEN enum variants are all caps though.
```rust
if cfg!(target_os = "linux") {
    files.push(desktop_capture_path);
} else if desktop_capture_path.exists() {
    files.push(desktop_capture_path);
}
```
The conditions of this if-else block don't make much sense to me. Shouldn't this just check if the path exists regardless of target_os?
I just got lazy here. I will update the scripts for every platform.
```rust
let stream_width = 1920;
let stream_height = 1080;
let buffer_source =
    NativeVideoSource::new(VideoResolution { width: stream_width, height: stream_height });
let track = LocalVideoTrack::create_video_track(
    "screen_share",
    RtcVideoSource::Native(buffer_source.clone()),
);
```
I'm confused what the right thing to do is here for the resolution. Unless I'm missing something, libwebrtc doesn't expose an API to get the resolution of a DesktopCapturer::Source; you can only get the resolution of the captured screen/window from the DesktopFrame passed to the callback after the stream has started. Nor is there a cross-platform way to change the resolution of the VideoTrackSource after it is created. There is a method on DesktopCapturer, `void UpdateResolution(uint32_t width, uint32_t height)`, though it is guarded by `defined(WEBRTC_USE_PIPEWIRE) || defined(WEBRTC_USE_X11)`.
Maybe someone from Livekit has some idea about this?
Digging deeper into the NativeVideoSource code, it seems the requirement for the video resolution is for the scaling performed by VideoTrackSource::InternalSource::on_captured_frame. I was wondering if this was a requirement imposed by libwebrtc somehow, but in that case, it wouldn't make sense that libwebrtc doesn't provide an API to get the resolution before starting the stream. If I understand that correctly, then I think we can add a method to NativeVideoSource to change the VideoResolution after the NativeVideoSource is created and call that in the callback. Or better yet, we could add a method to VideoTrackSource that takes a DesktopFrame and takes care of this automatically.
I don't understand what the problem here is, TBH.
You should know with which resolution you want to share your stream (of course you need to change the dims to match the aspect ratio of the source to avoid distortions), which most of the time will be different from the capture resolution.
On platforms that allow you to choose your source without a system picker, you should be able to get the source's dims before starting capturing. When you are using the system picker and you can't know which source the user selected, you can simply start publishing your track after capturing has started.
IMO the only improvement that could be done here is to change VideoTrackSource::InternalSource::on_captured_frame to scale the buffer to match resolution_ (with the aspect ratio change).
You should know with which resolution you want to share your stream
How can you know that without any idea of what the resolution of the source is? There's no way to know in advance whether you're capturing a 200 x 200 window or a 7680 x 4320 screen. Up/down scaling either of these to some arbitrary assumed size is not going to look good.
For context, I started looking into this because the current screen capturing code in Zed assumes the resolution is known in advance and sets the stream dimensions based on that. That assumption is true for the current capture mechanisms (the scap crate and macOS APIs), but not for libwebrtc.
Digging more into the code, I'm getting more confused. I don't understand the point of calling AdaptedVideoTrackSource::AdaptFrame in VideoTrackSource::InternalSource::on_captured_frame because the AdaptedVideoTrackSource's VideoAdapter is never really configured with a VideoSinkWants... there's just this code with a default constructed VideoSinkWants:

```cpp
void VideoTrack::add_sink(const std::shared_ptr<NativeVideoSink>& sink) const {
  webrtc::MutexLock lock(&mutex_);
  track()->AddOrUpdateSink(sink.get(),
                           webrtc::VideoSinkWants());  // TODO(theomonnom): Expose
                                                       // VideoSinkWants to Rust?
  sinks_.push_back(sink);
}
```

When the AdaptFrame call does change the resolution, it always outputs a square resolution (height and width equal) regardless of the input resolution.
Another confusing bit of code I found is this in libwebrtc::NativeVideoSource::new:

```rust
livekit_runtime::spawn({
    let source = source.clone();
    let i420 = I420Buffer::new(resolution.width, resolution.height);
    async move {
        let mut interval = interval(Duration::from_millis(100)); // 10 fps
        loop {
            interval.tick().await;
            let inner = source.inner.lock();
            if inner.captured_frames > 0 {
                break;
            }
            let mut builder = vf_sys::ffi::new_video_frame_builder();
            builder.pin_mut().set_rotation(VideoRotation::VideoRotation0);
            builder.pin_mut().set_video_frame_buffer(i420.as_ref().sys_handle());
            let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
            builder.pin_mut().set_timestamp_us(now.as_micros() as i64);
            source.sys_handle.on_captured_frame(&builder.pin_mut().build());
        }
    }
});
```

With desktop capturing, we're calling on_captured_frame when the DesktopCapturer's callback is invoked, so it doesn't make sense to me to have livekit_runtime running this loop. And I definitely want more than 10 FPS for screen captures. I also don't understand why livekit_runtime is a dependency of the libwebrtc crate.
I found that code because I tried setting the VideoResolution for NativeVideoSource::new to 0,0 to test this code in VideoTrackSource::InternalSource::on_captured_frame:

```cpp
if (resolution_.height == 0 || resolution_.width == 0) {
  resolution_ = VideoResolution{static_cast<uint32_t>(buffer->width()),
                                static_cast<uint32_t>(buffer->height())};
}
```

However, I have to comment out the call to livekit_runtime::spawn above because I420Buffer::new asserts that the width and height aren't 0, so this code I pasted from VideoTrackSource::InternalSource::on_captured_frame is effectively unusable.
@theomonnom as the author of much of this code, can you explain what's going on here? What is really the purpose of the VideoResolution passed to NativeVideoSource::new?
Ah I see, the NativeVideoSource's resolution is read in livekit::LocalParticipant::publish_track. So I think what is needed is a new API to change the resolution of an existing track after it has been published.
```rust
let (y, u, v) = capture_buffer.data_mut();
yuv_helper::argb_to_i420(data, stride, y, s_y, u, s_u, v, s_v, width, height);

let scaled_buffer = capture_buffer.scale(stream_width as i32, stream_height as i32);
```
This scaling is redundant. NativeVideoSource::capture_frame will scale it automatically.
Yes you are right, I had missed the VideoTrackSource::InternalSource::on_captured_frame implementation. Thanks!
Actually I had another look and I think the scaling is needed after all. When setting the NativeVideoSource to 1080p and the frame dims are 2880x1800, adapted_width/height come out as 2880x1800, not 1080p.
There are some merge conflicts now, I think from #753.
Implements screen sharing by exposing libwebrtc's DesktopCapturer.

A few notes:

`GetSourceList` to actually get the available displays without the system picker. I could port my patch to your libwebrtc fork if you want. For selecting windows `GetSourceList` seems to work.

Tested on:

This is related to #92 and zed-industries/zed#28754.