Pull requests: vllm-project/vllm
Pull requests list
OffloadingConnector: Fix bug in handling of preemptions
kv-connector
#29870
opened Dec 2, 2025 by
orozery
[CPU Backend] [Doc]: Update Installation Docs for CPUs
documentation
Improvements or additions to documentation
#29868
opened Dec 2, 2025 by
ioghiban
3 of 5 tasks
Remove default values from InitVars so that they're not stored
kv-connector
nvidia
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
tpu
Related to Google TPUs
v1
[Doc]: add Star History add-on to README.md.
documentation
Improvements or additions to documentation
#29857
opened Dec 2, 2025 by
didier-durand
2 tasks done
[Model] Add LoRA support for Whisper models
documentation
Improvements or additions to documentation
#29856
opened Dec 2, 2025 by
daje0601
[WIP] Calculate MFU/MBU for whole model forward pass
#29853
opened Dec 2, 2025 by
LinWang-avivia
5 tasks
[Chore] Use tokenizer.encode and tokenizer.decode directly
frontend
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
qwen
Related to Qwen models
ready
ONLY add when PR is ready to merge/full CI is needed
Add DeepSeek-V3.2 tool parser.
deepseek
Related to DeepSeek models
frontend
tool-calling
#29848
opened Dec 2, 2025 by
Xu-Wenqing
5 tasks
feat: support tree attention in flash-linear-attention
speculative-decoding
v1
#29846
opened Dec 2, 2025 by
menggeliu1205
[WIP] Simplified alternative padded-speculation acceptance rate fix
speculative-decoding
v1
#29845
opened Dec 2, 2025 by
LucasWilkinson
Draft
feat: add optional int8 quantization for KV cache transfer
kv-connector
#29844
opened Dec 2, 2025 by
xbfs
Atomics Reduce Counting optimization for skinny GEMMs.
rocm
Related to AMD ROCm
#29843
opened Dec 2, 2025 by
amd-hhashemi
5 tasks
[CI/Build][AMD] Skip test_shared_storage_connector_hashes in test_shared_storage_connector.py due to hipErrorLaunchFailure when calling .cpu()
kv-connector
rocm
Related to AMD ROCm
v1
#29839
opened Dec 2, 2025 by
rasmith
[Frontend] supports deepseekv32 chat template
deepseek
Related to DeepSeek models
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#29837
opened Dec 2, 2025 by
chaunceyjiang
5 tasks
[Core] Add token-level KV cache metrics to V1 engine
v1
#29836
opened Dec 2, 2025 by
Minsung-commit
8 of 9 tasks
[Misc] Remove redundant engine arg tokens_only
#29832
opened Dec 2, 2025 by
zhuohan123
5 tasks
Added regression test for openai/harmony/issues/78
gpt-oss
Related to GPT-OSS models
#29830
opened Dec 2, 2025 by
jacobthebanana
3 of 5 tasks
[Model] Add transcription support for Qwen3-Omni
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
[Perf] Avoid pageable HtoD transfer in MinTokensLogitsProcessor
ready
ONLY add when PR is ready to merge/full CI is needed
v1