Closed
Description
Release Checklist
Release Version: v0.10.1rc1
Release Branch: main
Release Date:
Release Manager: @MengqingCao
Prepare Release Note
- Create a new issue for release feedback: [v0.10.1rc1] FAQ / Feedback | 问题/反馈 #2630
- Write the release note PR: [ReleaseNote] Add Release Note for v0.10.1rc1 #2635
- Update the feedback issue link in docs/source/faqs.md
- Add release note to docs/source/user_guide/release_notes.md
- Update version info in docs/source/community/versioning_policy.md
- Update contributor info in docs/source/community/contributors.md
- Update package version in docs/conf.py
-
PRs that need to be merged
- [Bugfix]Support Qwen3-MOE on aclgraph mode in sizes capture and add new ut #2511
- [Fix] Add operations in `_dummy_run` to maintain synchronization with `_process_reqs`, resolving a service hang #2454
- [Bug]: test_lm_eval_correctness failed: ImportError: cannot import name 'CUDAGraphMode' from 'vllm.config' #2522
- [main][bugfix] Fix MatmulNZ format bug on some machines #2549
- [Aclgraph] Update compilation config in `check_and_update_config` #2540
- [bugfix] fix torchair runtime error caused by configuration mismtaches and file missing #2532
- Fix the bug with torchair + dp: [Bugfix] Fix the bug of cos invalid shape when dp #2558
- [main][bugfix] Fix bugs and refactor cached mask generation logic #2442
- [Bugfix] Fix aclgraph not enabled by default #2590
- [Bugfix][LoRA][Patch] Fix the LoRA inference bug after upstream vLLM codebase changed #2560
- Support v0.10.1 #2584
- [Bugfix] Fix mc2 operator error in aclgraph + ep<16 scenario #2609
- [V1][BUGFIX][0.10.1] FIX mtp on main branch #2632
- [BugFix][MLA] Fix attn_mask bug for ring mla #2704
- [Bugfix] Fix qwen2.5-vl-without-padding #2623
- [Bugfix] Keep prefill node using prefill code path only in disaggregation mode #2660
- [P/D]mooncake_connector adapted to 0.10.1 #2664
- bugfix: fix initialization error for mooncake in k8s #2541
- [Bugfix][APC] Fix accuracy issue on prefix caching with AscendScheduler #2714
Functional Test
Bugs that need to be fixed:
- Issue on mc2: [Bug]: Deepseek runs failed with ep>=16 for graph mode #2523, [Aclgraph] Update compilation config in `check_and_update_config` #2540 @MengqingCao
- [Bug]: Qwen2.5-7B The process exits for this inner error, and the current working operator name is SelfAttentionOperation #2239 @leo-pony
- GLM4.5 long seq @shen-shanshan: [Bugfix] Fix long context seq accuracy problem for `GLM4.5` #2601
- GLM4V: @Yikun will track on [Bug]: ZhipuAI/GLM-4.5V inference failed in eager mode and start failed in aclgraph mode #2516
- DS 400 QPM @Potabk
No feature regression:
- Qwen3 235B aclgraph @MengqingCao
- gpt-oss: Add gpt oss #2436 depends on [2/N][Feat] Add MC2 communication method for MoE layers #2469
- logits preprocessor @Potabk
- [Bug]: perf test failed due to AttributeError: '_OpNamespace' '_C' object has no attribute 'apply_repetition_penalties_' #2533 @Potabk online test
Doc Test
- Tutorial is updated.
- User Guide is updated.
- Developer Guide is updated.
Prepare Artifacts
- Docker image is ready.
- Wheel package is ready.
Release Step
- Release note PR is merged.
- Post the release on GitHub release page.
- Generate official doc page on https://app.readthedocs.org/dashboard/
- Wait for the wheel package to be available on https://pypi.org/project/vllm-ascend
- Wait for the docker image to be available on https://quay.io/ascend/vllm-ascend
- Upload the 310p wheel to the GitHub release page
- Broadcast the release news (by message, blog, etc.)
- Close this issue
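The "wait for the wheel package to be available" step above can be checked with a script instead of refreshing the project page. The following is a minimal sketch, not part of the project's release tooling: the helper names `is_released` and `fetch_metadata` are hypothetical, and it assumes the PyPI JSON API at `https://pypi.org/pypi/<project>/json` exposes a `releases` mapping from version string to uploaded files.

```python
import json
import urllib.request


def is_released(metadata: dict, version: str) -> bool:
    """Return True if `version` has at least one uploaded file in the metadata."""
    # Git tags in this checklist carry a leading "v"; PyPI version strings do not.
    wanted = version.removeprefix("v")
    # Assumption: "releases" maps each version string to a list of uploaded files.
    files = metadata.get("releases", {}).get(wanted, [])
    return len(files) > 0


def fetch_metadata(project: str) -> dict:
    """Download the project's metadata from the PyPI JSON API."""
    url = f"https://pypi.org/pypi/{project}/json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

Calling `is_released(fetch_metadata("vllm-ascend"), "v0.10.1rc1")` would then report whether any file for that version has been uploaded. Keeping the check separate from the download makes it possible to test the logic without network access.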