Description
I have a custom streaming pipeline in which a VAD triggers ASR processing only when speech is detected in a small chunk. Audio chunks arrive from the client and are processed sequentially.
In Diart, it seems we need to provide a file path, microphone input, or websocket for audio input. Is there a way to integrate Diart directly into my pipeline, allowing me to pass audio chunks to the diarization module and receive results in real-time? Maintaining speaker consistency across chunks is crucial, as each new chunk shouldn't be treated as a separate audio session.
I tried modifying the AudioSource class in source.py to accept custom inputs, but I couldn't get the desired results.
Could you kindly guide me on how to implement this? If possible, I would greatly appreciate a code snippet to help clarify the approach. From what I understand, the solution likely involves customizing the AudioSource class.
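To make the request concrete, here is a minimal sketch of the kind of push-based source this would need. It is not Diart's API: `ChunkQueueSource`, `push`, and the `on_block` callback are hypothetical names, and `on_block` stands in for whatever mechanism a real `AudioSource` subclass uses to emit audio. The sketch only shows the buffering idea: the client pushes chunks of arbitrary length, and one long-lived source re-slices them into fixed-size blocks for the whole session.

```python
import queue
import threading
from typing import Callable, List, Optional

Block = List[float]


class ChunkQueueSource:
    """Hypothetical push-based source (illustration only, not part of Diart)."""

    def __init__(self, block_size: int, on_block: Callable[[Block], None]):
        self.block_size = block_size      # samples per emitted block (assumed fixed)
        self.on_block = on_block          # stand-in for the library's emit mechanism
        self._queue: "queue.Queue[Optional[Block]]" = queue.Queue()
        self._buffer: Block = []

    def push(self, chunk: Block) -> None:
        """Called by the client pipeline for every incoming audio chunk."""
        self._queue.put(list(chunk))

    def close(self) -> None:
        """Signal end of session; None is used as a sentinel."""
        self._queue.put(None)

    def read(self) -> None:
        """Block until close() is called, emitting fixed-size blocks."""
        while True:
            chunk = self._queue.get()
            if chunk is None:
                break
            self._buffer.extend(chunk)
            # Emit as many full blocks as the buffer currently holds.
            while len(self._buffer) >= self.block_size:
                block = self._buffer[: self.block_size]
                del self._buffer[: self.block_size]
                self.on_block(block)
        # Flush the remainder, zero-padded to a full block.
        if self._buffer:
            padding = [0.0] * (self.block_size - len(self._buffer))
            self.on_block(self._buffer + padding)
            self._buffer = []


def run_demo() -> List[Block]:
    """Push three uneven chunks and collect the emitted blocks."""
    received: List[Block] = []
    source = ChunkQueueSource(block_size=4, on_block=received.append)
    reader = threading.Thread(target=source.read)
    reader.start()
    for chunk in ([0.1, 0.2, 0.3], [0.4, 0.5], [0.6]):
        source.push(chunk)
    source.close()
    reader.join()
    return received
```

The point of the design is that the source is created once and kept open for the whole session, so the diarization side would see one continuous stream rather than a new session per chunk. Whether Diart's pipeline can be driven this way is exactly what I am asking.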
Thanks