Conversation
@linfeng-yuan linfeng-yuan commented Aug 19, 2025

What this PR does / why we need it?

Add a configuration check for the ascend scheduler: if chunked prefill is disabled, max_num_batched_tokens must not be less than max_model_len, following vLLM's behavior.
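A minimal sketch of the kind of check this PR describes. The class shape and field names here are assumptions for illustration, not the actual vllm-ascend `AscendSchedulerConfig` source:

```python
from dataclasses import dataclass


@dataclass
class AscendSchedulerConfig:
    """Hypothetical sketch of the scheduler config validation described in the PR."""
    max_num_batched_tokens: int
    max_model_len: int
    enable_chunked_prefill: bool = False

    def __post_init__(self):
        # Without chunked prefill, a full-length prompt must fit into a single
        # batch, so max_num_batched_tokens has to cover max_model_len.
        if (not self.enable_chunked_prefill
                and self.max_num_batched_tokens < self.max_model_len):
            raise ValueError(
                f"max_num_batched_tokens ({self.max_num_batched_tokens}) must "
                f"be >= max_model_len ({self.max_model_len}) when chunked "
                "prefill is disabled.")
```

With this check, an invalid combination fails fast at configuration time instead of silently rejecting long sequences later at scheduling time.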

Does this PR introduce any user-facing change?

Yes. Users can no longer set max_num_batched_tokens smaller than max_model_len when using the ascend scheduler.

How was this patch tested?

CI and vLLM serving tests passed.


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a validation check in AscendSchedulerConfig to prevent configurations where max_num_batched_tokens is less than max_model_len when chunked prefill is disabled. This is a valuable correctness fix that avoids unexpected rejection of long sequences. The implementation is sound, and the associated test updates and additions are thorough, covering both valid and invalid scenarios. The changes improve the robustness and user-friendliness of the scheduler configuration.


👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message to match the PR description so reviewers and future developers can understand the change.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

@linfeng-yuan linfeng-yuan force-pushed the ascend_scheduler_parameter_checking branch from 3bc18ce to f697b20 on August 19, 2025 06:36
@linfeng-yuan linfeng-yuan force-pushed the ascend_scheduler_parameter_checking branch from f697b20 to 3ed37b2 on August 20, 2025 18:06

codecov bot commented Aug 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.39%. Comparing base (1de16ea) to head (3ed37b2).
⚠️ Report is 12 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2434      +/-   ##
==========================================
+ Coverage   77.37%   77.39%   +0.01%     
==========================================
  Files         128      128              
  Lines       16455    16468      +13     
==========================================
+ Hits        12732    12745      +13     
  Misses       3723     3723              
Flag Coverage Δ
unittests 77.39% <100.00%> (+0.01%) ⬆️



@wangxiyuan
Collaborator

Please update the commit message. Thanks.

@wangxiyuan wangxiyuan merged commit 4af5b80 into vllm-project:main Aug 23, 2025
22 checks passed