Pull requests: vllm-project/vllm

Pull requests list

[CPU Backend] [Doc]: Update Installation Docs for CPUs
Labels: documentation
#29868 opened Dec 2, 2025 by ioghiban
3 of 5 tasks
Remove default values from InitVars so that they're not stored
Labels: kv-connector, nvidia, performance, ready, speculative-decoding, tpu, v1
#29859 opened Dec 2, 2025 by hmellor
v0.12.0
[CI][DCP][Perf] reduce DCP CI execution time
#29858 opened Dec 2, 2025 by pisceskkk
[Doc]: add Star History add-on to README.md.
Labels: documentation
#29857 opened Dec 2, 2025 by didier-durand
2 tasks done
[Model] Add LoRA support for Whisper models
Labels: documentation
#29856 opened Dec 2, 2025 by daje0601
Refactor example prompts fixture
#29854 opened Dec 2, 2025 by nwaughachukwuma
[WIP] Calculate MFU/MBU for whole model forward pass
#29853 opened Dec 2, 2025 by LinWang-avivia
5 tasks
[Chore] Use tokenizer.encode and tokenizer.decode directly
Labels: frontend, llama, multi-modality, qwen, ready
#29851 opened Dec 2, 2025 by DarkLight1337
5 tasks
v0.12.0
Add DeepSeek-V3.2 tool parser.
Labels: deepseek, frontend, tool-calling
#29848 opened Dec 2, 2025 by Xu-Wenqing
5 tasks
Atomics Reduce Counting optimization for skinny GEMMs.
Labels: rocm
#29843 opened Dec 2, 2025 by amd-hhashemi
5 tasks
[Frontend] supports deepseekv32 chat template
Labels: deepseek, frontend, ready
#29837 opened Dec 2, 2025 by chaunceyjiang
5 tasks
[Core] Add token-level KV cache metrics to V1 engine
Labels: v1
#29836 opened Dec 2, 2025 by Minsung-commit
8 of 9 tasks
enable multi-node in external launcher mode
#29833 opened Dec 2, 2025 by xieyangxu
[Misc] Remove redundant engine arg tokens_only
#29832 opened Dec 2, 2025 by zhuohan123
5 tasks
Added regression test for openai/harmony/issues/78
Labels: gpt-oss
#29830 opened Dec 2, 2025 by jacobthebanana
3 of 5 tasks
[Model] Add transcription support for Qwen3-Omni
Labels: documentation, qwen
#29828 opened Dec 2, 2025 by mu-hashmi Draft
4 of 5 tasks
[Perf] Avoid pageable HtoD transfer in MinTokensLogitsProcessor
Labels: ready, v1
#29826 opened Dec 2, 2025 by jthomson04
5 tasks
v0.12.0
Add logging for cudagraph related info
Labels: nvidia, ready, v1
#29825 opened Dec 2, 2025 by sarckk
3 of 5 tasks