Fix some CI issues and refactor model runner #2445
Conversation
Signed-off-by: wangli <[email protected]>
* [AclGraph] Adapt aclgraph into new graph dispatcher arch
Signed-off-by: MengqingCao <[email protected]>
Signed-off-by: wangli <[email protected]>
Signed-off-by: weiguihua2 <[email protected]>
Signed-off-by: MengqingCao <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces a significant refactoring of the model runner and attention mechanisms. The key changes include decoupling the attention metadata builders from the model runner, introducing a common attention metadata structure, and adding a generic ACL graph wrapper for graph capture and replay. These changes improve code modularity, maintainability, and align the codebase with a more modern architecture for handling graph-based execution. The tests have also been substantially improved to reflect these changes. Overall, this is a high-quality refactoring with no apparent critical issues.
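To illustrate the capture-and-replay wrapper idea mentioned in the review, here is a minimal, self-contained sketch. The class name `GraphWrapper` and its bucketing-by-batch-size strategy are hypothetical, not the implementation in this PR, and `torch.cuda.CUDAGraph` is used purely for illustration; the Ascend backend would rely on its ACL graph equivalent.

```python
from typing import Callable, Dict

import torch


class GraphWrapper:
    """Hypothetical sketch of a capture-and-replay wrapper around a callable.

    The first call for a given batch size captures the wrapped callable into a
    graph with static input/output buffers; later calls with the same batch
    size copy fresh data into those buffers and replay the graph instead of
    re-dispatching every kernel.
    """

    def __init__(self, runnable: Callable[[torch.Tensor], torch.Tensor]):
        self.runnable = runnable
        self._graphs: Dict[int, torch.cuda.CUDAGraph] = {}
        self._static_in: Dict[int, torch.Tensor] = {}
        self._static_out: Dict[int, torch.Tensor] = {}

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        bs = x.shape[0]
        if bs not in self._graphs:
            return self._capture(bs, x)
        # Replay path: refresh the captured static buffers, then replay.
        self._static_in[bs].copy_(x)
        self._graphs[bs].replay()
        return self._static_out[bs]

    def _capture(self, bs: int, x: torch.Tensor) -> torch.Tensor:
        static_in = x.clone()
        # Warm up once so lazy initialization does not happen during capture.
        self.runnable(static_in)
        graph = torch.cuda.CUDAGraph()  # Ascend would use the analogous ACL/NPU graph API
        with torch.cuda.graph(graph):
            static_out = self.runnable(static_in)
        self._graphs[bs] = graph
        self._static_in[bs] = static_in
        self._static_out[bs] = static_out
        return static_out


if __name__ == "__main__":
    model = torch.nn.Linear(16, 16).cuda()
    wrapped = GraphWrapper(model)
    for _ in range(3):
        out = wrapped(torch.randn(4, 16, device="cuda"))
    print(out.shape)
```

The key design point sketched here is that the wrapper, not the model runner, owns the captured graphs and their static buffers, which is what allows graph capture and replay to be handled generically across models.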
Signed-off-by: MengqingCao <[email protected]>
Known issues:
- Lint is disabled. - @MengqingCao
- Unit tests failed. - @MengqingCao @Potabk
- LoRA test failed. - @paulyu12
- Multicard test is cancelled. - @MengqingCao
- DS R1 + quantization + TP8 failed. - @weiguihua2
- EP doesn't work with TP < 4 in torchair mode.
We'll fix these quickly in follow-up PRs.
Let's merge this to unblock other PRs.
### What this PR does / why we need it?
Add a lint block before running e2e. Follow-up to #2445.
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
N/A

Signed-off-by: MengqingCao <[email protected]>
### What this PR does / why we need it?
This PR moves the current unified MLA backend to the torchair folder and removes torchair-related code in attention/mla_v1.py (1.3k -> 0.9k lines).
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Running eager mode with the MLA backend, and torchair mode with code before [2445](#2445).

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@f571ff8

Signed-off-by: linfeng-yuan <[email protected]>
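As a rough illustration of the layout after this split (the helper and both module paths below are hypothetical placeholders, not the exact names in the repo), backend selection can reduce to a lookup keyed on whether torchair mode is enabled, with the torchair package only imported when it is actually used:

```python
import importlib


def select_mla_backend_path(use_torchair: bool) -> str:
    """Hypothetical helper: return the import path of the MLA backend to load.

    After the split, torchair-specific MLA code lives under the torchair
    package, while attention/mla_v1.py keeps only the eager-mode code path.
    Both paths below are illustrative placeholders.
    """
    if use_torchair:
        return "vllm_ascend.torchair.torchair_mla.AscendMLATorchairBackend"
    return "vllm_ascend.attention.mla_v1.AscendMLABackend"


def load_backend_cls(path: str):
    """Resolve the backend class lazily from its dotted import path."""
    module_name, class_name = path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)
```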
### What this PR does / why we need it?
Fix some CI issues and refactor the model runner.
### Does this PR introduce _any_ user-facing change?
N/A
### How was this patch tested?
CI passed with existing tests.