Commit 29efed7

feat(audio): encoded and decoded opus audio tracks (#633)
1 parent ac50dec commit 29efed7

6 files changed, 582 additions and 13 deletions

.github/workflows/buildtest.yaml

Lines changed: 6 additions & 1 deletion
@@ -35,11 +35,16 @@ jobs:
             ~/go/bin
             ~/.cache
           key: server-sdk-go
+
+      - name: Set up SoX resampler and Opus
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libsoxr-dev libopus-dev libopusfile-dev opus-tools

       - name: Set up Go
         uses: actions/setup-go@v5
         with:
-          go-version: 1.23
+          go-version: 1.24.2

       - name: Set up gotestfmt
         run: go install github.com/gotesttools/gotestfmt/v2/cmd/gotestfmt@latest

README.md

Lines changed: 90 additions & 0 deletions
@@ -215,6 +215,42 @@ if _, err = room.LocalParticipant.PublishTrack(track, &lksdk.TrackPublicationOpt
 For a full working example, refer to [filesender](https://github.com/livekit/server-sdk-go/blob/main/examples/filesender/main.go). This
 example sends all audio/video files in the current directory.

+### Publishing audio from PCM16 Samples
+
+In order to publish audio from PCM16 Samples, you can use the NewPCMLocalTrack API as follows:
+
+```go
+publishTrack, err := lksdk.NewPCMLocalTrack(sourceSampleRate, sourceChannels, logger.GetLogger())
+if err != nil {
+	return err
+}
+
+if _, err = room.LocalParticipant.PublishTrack(publishTrack, &lksdk.TrackPublicationOptions{
+	Name: "test",
+}); err != nil {
+	return err
+}
+```
+
+You can then write PCM16 samples to the `publishTrack` like:
+
+```go
+err = publishTrack.WriteSample(sample)
+if err != nil {
+	logger.Errorw("error writing sample", err)
+}
+```
+
+The SDK will encode the sample to Opus and write it to the track. If the sourceSampleRate is not 48000, resampling is also handled internally.
+
+The API also provides an option to write silence to the track when no data is available, this is disabled by default but you can enable it using:
+
+```go
+publishTrack, err := lksdk.NewPCMLocalTrack(sourceSampleRate, sourceChannels, logger.GetLogger(), lksdk.WithWriteSilenceOnNoData(true))
+```
+
+**Note**: Stereo audio is currently not supported, it may result in unpleasant audio.
+
 ### Publish from other sources

 In order to publish from non-file sources, you will have to implement your own `SampleProvider`, that could provide frames of data with a `NextSample` method.
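
The PCM16 publishing section added in this hunk passes a `sample` to `WriteSample` without showing where it comes from. The following is a minimal sketch of producing one mono frame, assuming that a PCM16 sample is a slice of signed 16-bit integers (the `media.PCM16Sample` type is not shown in this diff, so treat that as an assumption). The `sineFrame` helper, the 20 ms frame length and the 440 Hz tone are hypothetical choices for illustration and are not part of the SDK.

```go
package main

import (
	"fmt"
	"math"
)

// sineFrame returns one mono frame of 16-bit PCM covering frameMs milliseconds
// of a sine tone at half of full scale. startSample is the index of the first
// sample within the overall stream, so consecutive frames stay phase-continuous.
// This is an illustrative helper, not an SDK function.
func sineFrame(sampleRate, frameMs int, freqHz float64, startSample int) []int16 {
	n := sampleRate * frameMs / 1000
	frame := make([]int16, n)
	for i := range frame {
		t := float64(startSample+i) / float64(sampleRate)
		v := 0.5 * math.MaxInt16 * math.Sin(2*math.Pi*freqHz*t)
		frame[i] = int16(math.Round(v))
	}
	return frame
}

func main() {
	const sampleRate = 16000 // hypothetical source rate; the README says non-48000 rates are resampled
	offset := 0
	for i := 0; i < 3; i++ {
		frame := sineFrame(sampleRate, 20, 440, offset)
		offset += len(frame)
		// In a real program each frame would be handed to publishTrack.WriteSample.
		fmt.Println("frame", i, "samples:", len(frame))
	}
}
```

Mono is used here because the README note in this commit says stereo is not yet supported.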
@@ -257,6 +293,60 @@ With the Go SDK, you can accept media from the room.
 For a full working example, refer to [filesaver](https://github.com/livekit/server-sdk-go/blob/main/examples/filesaver/main.go). This
 example saves the audio/video in the LiveKit room to the local disk.

+### Decoding an Opus track to PCM16
+
+To get PCM audio out of a remote Opus audio track, you can use the following API:
+
+```go
+import (
+	...
+
+	media "github.com/livekit/media-sdk"
+)
+
+type PCM16Writer struct {
+	closed atomic.Bool
+}
+
+func (w *PCM16Writer) WriteSample(sample media.PCM16Sample) error {
+	if !w.closed.Load() {
+		// Use the sample however you want
+	}
+}
+
+func (w *PCM16Writer) SampleRate() int {
+	// return sample rate of the writer
+	// it'll be the same as targetSampleRate used below
+}
+
+func (w *PCM16Writer) String() string {
+	// return a desired string
+	// can be used to monitor writer stages or config, etc.
+}
+
+func (w *PCM16Writer) Close() error {
+	w.closed.Store(true)
+	// close the writer
+}
+
+...
+
+var writer media.WriteCloser[media.PCM16Sample] = &PCM16Writer{}
+pcmTrack, err := lksdk.NewPCMRemoteTrack(remoteTrack, &writer, targetSampleRate, targetChannels)
+if err != nil {
+	return err
+}
+```
+
+The SDK will then read the provided remote track, decode the audio and write the PCM16 samples to the provided writer. Resampling to the target sample rate is handled internally, and so is upmixing/downmixing to the target channel count.
+
+The API also provides an option to handle jitter, this is enabled by default but you can disable it using:
+
+```go
+pcmTrack, err := lksdk.NewPCMRemoteTrack(remoteTrack, &writer, targetSampleRate, targetChannels, lksdk.WithHandleJitter(false))
+```
+
+**Note**: Stereo remote tracks are currently not supported, they may result in unpleasant audio.

 ## Receiving webhooks

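
The `PCM16Writer` methods in the README snippet of this hunk are placeholders: their bodies contain only comments and no `return` statements, so the snippet would not compile as written. Below is a minimal compilable sketch of such a writer. To stay self-contained it does not import `github.com/livekit/media-sdk`; `PCM16Sample` and `WriteCloser` are local stand-ins whose shapes are inferred from the README snippet (an assumption, not copied from the library). `NewPCM16Writer` and `SamplesWritten` are hypothetical helpers added for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// PCM16Sample is a local stand-in for media.PCM16Sample (assumed to be []int16).
type PCM16Sample []int16

// WriteCloser is a local stand-in for the media-sdk writer interface; the
// method set is inferred from the README snippet, not taken from the library.
type WriteCloser[T any] interface {
	String() string
	SampleRate() int
	WriteSample(sample T) error
	Close() error
}

// PCM16Writer counts the samples it receives and ignores writes after Close.
type PCM16Writer struct {
	sampleRate int
	closed     atomic.Bool
	written    atomic.Int64
}

// NewPCM16Writer is a hypothetical constructor; sampleRate should match the
// target sample rate requested for the decoded track.
func NewPCM16Writer(sampleRate int) *PCM16Writer {
	return &PCM16Writer{sampleRate: sampleRate}
}

// WriteSample consumes one decoded frame. As in the README snippet, samples
// that arrive after Close are dropped silently.
func (w *PCM16Writer) WriteSample(sample PCM16Sample) error {
	if w.closed.Load() {
		return nil
	}
	w.written.Add(int64(len(sample)))
	return nil
}

// SampleRate reports the rate the writer expects.
func (w *PCM16Writer) SampleRate() int { return w.sampleRate }

// String describes the writer, which can help when logging a pipeline.
func (w *PCM16Writer) String() string {
	return fmt.Sprintf("PCM16Writer(%d Hz, closed=%t)", w.sampleRate, w.closed.Load())
}

// Close marks the writer as closed; later writes are ignored.
func (w *PCM16Writer) Close() error {
	w.closed.Store(true)
	return nil
}

// SamplesWritten returns the number of samples accepted so far.
func (w *PCM16Writer) SamplesWritten() int64 { return w.written.Load() }

func main() {
	var writer WriteCloser[PCM16Sample] = NewPCM16Writer(48000)
	// One 20 ms mono frame at 48 kHz is 960 samples.
	if err := writer.WriteSample(make(PCM16Sample, 960)); err != nil {
		fmt.Println("write failed:", err)
	}
	if err := writer.Close(); err != nil {
		fmt.Println("close failed:", err)
	}
	fmt.Println(writer)
}
```

Dropping late samples instead of returning an error mirrors the `closed` check in the README snippet; a writer that should surface writes after close could return an error from `WriteSample` instead.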