dudududukim

Summary

This PR updates event handling in OpenAISTTTranscriptionSession to support both legacy "input_audio_transcription_completed" and the currently observed "conversation.item.input_audio_transcription.completed" events when streaming transcription over the WebSocket endpoint (wss://api.openai.com/v1/realtime?intent=transcription).

The test suite (tests/voice/test_openai_stt.py) is also updated to cover both event types, ensuring that the transcription output is properly yielded from either event in real scenarios.

Motivation

In my environment, the events returned by
event = await asyncio.wait_for(self._event_queue.get(), timeout=EVENT_INACTIVITY_TIMEOUT)
are the ones emitted by the OpenAI WebSocket session.

The observed event for completed transcription is currently "conversation.item.input_audio_transcription.completed", not the legacy event name.

These changes allow realtime STT workflows to work as expected, without manual hotfixes.
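
To illustrate the idea, here is a minimal sketch of a loop that reads events from a queue and yields the transcript for either event name. It is not the actual code in OpenAISTTTranscriptionSession: the function name iter_transcripts, the None sentinel, the timeout value, and the assumption that the text is carried in a "transcript" field are all hypothetical. Only the two event names and the asyncio.wait_for call come from this PR description.

```python
import asyncio
from typing import AsyncIterator

# The two event names quoted in this PR description.
COMPLETED_EVENT_TYPES = frozenset({
    "input_audio_transcription_completed",  # legacy name
    "conversation.item.input_audio_transcription.completed",  # currently observed
})

# Hypothetical value; the real constant's value is not shown in this PR.
EVENT_INACTIVITY_TIMEOUT = 1.0


async def iter_transcripts(
    queue: asyncio.Queue, timeout: float = EVENT_INACTIVITY_TIMEOUT
) -> AsyncIterator[str]:
    """Yield transcript text for completed-transcription events of either name."""
    while True:
        try:
            event = await asyncio.wait_for(queue.get(), timeout=timeout)
        except asyncio.TimeoutError:
            return  # no event arrived within the inactivity window
        if event is None:  # hypothetical end-of-stream sentinel
            return
        if event.get("type") in COMPLETED_EVENT_TYPES:
            # Assumption: both event shapes carry the text under "transcript".
            transcript = event.get("transcript")
            if transcript:
                yield transcript
```

Matching against a set of accepted names keeps the legacy path working while accepting the new name, so any other event type is simply skipped.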

Testing

Before/after comparison (screenshot).
Top: using the legacy event name.
Bottom: after adding support for conversation.item.input_audio_transcription.completed.

The tests pass in my local environment:

dudududukim/OpenAI_agent_practice

The updated test confirms correct behavior for both event types.

@seratch added the enhancement and feature:voice labels on Aug 20, 2025
@seratch requested a review from rm-openai on August 20, 2025 05:01
@dudududukim reopened this on Aug 20, 2025
@seratch enabled auto-merge (squash) on August 20, 2025 23:10
@dudududukim

Is my PR being processed?

@seratch disabled auto-merge on August 27, 2025 22:47