[Model] Add transcription support for Qwen3-Omni #29828

mu-hashmi · 2025-12-02T00:53:48Z

Purpose

According to the docs, only 4 models are supported as transcription models. This PR adds Qwen3-Omni, per feature request #29405.

Test Plan

Tested the transcription and translation endpoints (https://docs.vllm.ai/en/latest/contributing/model/transcription/#test-with-the-api) for correctness, using the same audio files as the Speech Recognition Colab notebook (https://colab.research.google.com/github/QwenLM/Qwen3-Omni/blob/main/cookbooks/speech_recognition.ipynb) provided by the Qwen team on Hugging Face.

Test Result

Transcription:

curl -s -X POST \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/muhash/Downloads/asr_en.wav" \
  -F "model=cpatonn/Qwen3-Omni-30B-A3B-Instruct-AWQ-4bit" \
  http://localhost:8000/v1/audio/transcriptions
{"text":"Mhm. Oh, yeah, yeah. He wasn't even that big when I started listening to him, but and his solo music didn't do overly well, but he did very well when he started writing for other people.","usage":{"type":"duration","seconds":16}}

Translation:

curl -s -X POST \
  -H "Authorization: Bearer $VLLM_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/muhash/Downloads/asr_fr.wav" \
  -F "model=cpatonn/Qwen3-Omni-30B-A3B-Instruct-AWQ-4bit" \
  http://localhost:8000/v1/audio/translations
{"text":"Well, we're going to move on to some more general questions. For you, what was the opportunity that led you to get involved in the dubbing industry?"}
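For anyone who wants to repeat the two checks above without curl, the following is a minimal Python sketch using only the standard library. It assumes the same server address and form fields (`file`, `model`) as the curl commands, and the same response shape as the outputs shown (a `text` field, plus an optional `usage.seconds`). The helper names `build_audio_request` and `parse_audio_response` are hypothetical and are not part of vLLM or this PR.

```python
import json
import urllib.request
import uuid


def build_audio_request(endpoint, audio_bytes, filename, model,
                        base_url="http://localhost:8000", api_key=""):
    """Build (but do not send) a multipart/form-data POST for an audio endpoint.

    `endpoint` is "transcriptions" or "translations", matching the URLs
    used in the curl commands above. Everything else is illustrative.
    """
    boundary = uuid.uuid4().hex

    # Form field "model", as in `-F "model=..."`.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # Form field "file", as in `-F "file=@..."`; the raw bytes follow the header.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()

    closing = f"--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + b"\r\n" + closing

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"

    return urllib.request.Request(
        f"{base_url}/v1/audio/{endpoint}",
        data=body,
        headers=headers,
        method="POST",
    )


def parse_audio_response(raw):
    """Return (text, seconds) from a JSON response body.

    `seconds` is None when the response carries no "usage" object,
    as in the translation output above.
    """
    payload = json.loads(raw)
    usage = payload.get("usage") or {}
    return payload["text"], usage.get("seconds")
```

With a server running, the request could be sent with `urllib.request.urlopen(req)` and the body passed to `parse_audio_response`. Building the request separately from sending it keeps the multipart encoding easy to inspect offline.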

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

mergify · 2025-12-02T00:54:27Z

Documentation preview: https://vllm--29828.org.readthedocs.build/en/29828/

github-actions · 2025-12-02T00:57:56Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

mergify · 2025-12-04T13:49:25Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mu-hashmi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Muhammad Hashmi <[email protected]>

mergify · 2025-12-04T23:05:23Z

Documentation preview: https://vllm--29828.org.readthedocs.build/en/29828/

mergify bot added the documentation and qwen labels Dec 2, 2025

mu-hashmi mentioned this pull request Dec 2, 2025

[Feature]: Qwen3 Omni Transcriptions #29405

Open

1 task

DarkLight1337 requested a review from NickLucche December 2, 2025 09:57

mergify bot added the needs-rebase label Dec 4, 2025

robertgshaw2-redhat added the frontend label Dec 4, 2025

mu-hashmi closed this Dec 4, 2025

mu-hashmi force-pushed the feature/qwen3-transcription branch from e043d82 to 1f0d184 Compare December 4, 2025 22:55

Add transcription support for Qwen3-Omni model

4b687db

Signed-off-by: Muhammad Hashmi <[email protected]>

mu-hashmi reopened this Dec 4, 2025

mergify bot removed the needs-rebase label Dec 4, 2025
