[1/N][refactor] torchair fused_moe refactor #2438
Conversation
Code Review

This pull request refactors the fused_moe logic for torchair by moving it into a new, dedicated file, vllm_ascend/torchair/ops/torchair_fused_moe.py. This is a good step towards better code organization. The changes in torchair_deepseek_v2.py correctly adopt the new refactored module. However, I've identified a critical issue in the newly created torchair_fused_moe.py file which appears to be a copy-paste error from the refactoring. It will cause a NameError at runtime and needs to be addressed.
```python
if envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ and isinstance(
        self.quant_method, AscendUnquantizedFusedMoEMethod):
```
AscendUnquantizedFusedMoEMethod is not defined or imported in this file, which will lead to a NameError at runtime. Based on the refactoring, it seems the check should be against TorchairAscendUnquantizedFusedMoEMethod, which is defined in this file.
```diff
-        if envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ and isinstance(
-                self.quant_method, AscendUnquantizedFusedMoEMethod):
+        if envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ and isinstance(
+                self.quant_method, TorchairAscendUnquantizedFusedMoEMethod):
```
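To make the failure mode concrete, here is a minimal, self-contained sketch (the envs stub, class bodies, and method name are stand-ins, not the real vllm-ascend API) showing why the isinstance check must name the class that actually exists in torchair_fused_moe.py:

```python
from types import SimpleNamespace

# Hypothetical stub for vllm_ascend.envs so the sketch runs standalone.
envs_ascend = SimpleNamespace(VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ=True)

class TorchairAscendUnquantizedFusedMoEMethod:
    """Stand-in for the method class defined in torchair_fused_moe.py."""

class TorchairAscendFusedMoE:
    def __init__(self, quant_method):
        self.quant_method = quant_method

    def uses_all2all_seq(self) -> bool:
        # Referencing the old AscendUnquantizedFusedMoEMethod here would
        # raise NameError, since that name is neither defined nor imported
        # in the new file; the check must use the local class instead.
        return (envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ
                and isinstance(self.quant_method,
                               TorchairAscendUnquantizedFusedMoEMethod))

moe = TorchairAscendFusedMoE(TorchairAscendUnquantizedFusedMoEMethod())
assert moe.uses_all2all_seq()
```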
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Codecov Report

❌ Patch coverage is 65.55%, which is below the target coverage of 80.00%. You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #2438      +/-   ##
==========================================
- Coverage   78.04%   77.98%   -0.06%
==========================================
  Files         132      134       +2
  Lines       17557    18519     +962
==========================================
+ Hits        13702    14442     +740
- Misses       3855     4077     +222
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Signed-off-by: hust17yixuan <[email protected]>
```diff
         self.gate.e_score_correction_bias = None

-        self.experts = AscendFusedMoE(
+        self.experts = TorchairAscendFusedMoE(
```
@wxsIcey please note that the fused MoE in the torchair module has been changed to this. So for your Fused MoE custom ops refactor work, you can just ignore the torchair case.
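For reference, a sketch of the call-site swap in torchair_deepseek_v2.py; the class name and module path come from this PR, while the constructor arguments below are hypothetical placeholders, not the real signature:

```python
class TorchairAscendFusedMoE:
    """Local stub mirroring the refactored class; in torchair_deepseek_v2.py
    the real import would be:
        from vllm_ascend.torchair.ops.torchair_fused_moe import (
            TorchairAscendFusedMoE)
    """

    def __init__(self, num_experts: int, top_k: int, hidden_size: int):
        # Parameter names here are illustrative, not the actual signature.
        self.num_experts = num_experts
        self.top_k = top_k
        self.hidden_size = hidden_size

# The model now builds its expert layer from the torchair-specific class:
experts = TorchairAscendFusedMoE(num_experts=64, top_k=6, hidden_size=4096)
print(type(experts).__name__)  # TorchairAscendFusedMoE
```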
What this PR does / why we need it?
Move the torchair-related fused_moe code into torchair_fused_moe to make the code clearer. As a next step, we'll remove all torchair-related code outside of torchair_fused_moe.
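A rough sketch of the intended layering once the follow-up PRs land (the selector function, flag name, and generic-module path are hypothetical illustrations; only the new torchair module path is from this PR):

```python
def select_fused_moe_cls(torchair_graph_enabled: bool):
    """Hypothetical dispatch: generic code reaches torchair only through
    the dedicated module, so no torchair branches leak elsewhere."""
    if torchair_graph_enabled:
        # All torchair-specific MoE logic now lives here (this PR):
        from vllm_ascend.torchair.ops.torchair_fused_moe import (
            TorchairAscendFusedMoE)
        return TorchairAscendFusedMoE
    # Non-torchair paths keep using the generic Ascend implementation
    # (assumed module path, for illustration only).
    from vllm_ascend.ops.fused_moe import AscendFusedMoE
    return AscendFusedMoE
```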
Does this PR introduce any user-facing change?
No
How was this patch tested?
vLLM version: v0.10.0
vLLM main: vllm-project/vllm@08d5f71