
Commit 9c5d40e

This change separates the general DP padding logic from the existing TorchAir-specific implementation, and makes sure the server does not hang when the batch size is smaller than the DP size.
Signed-off-by: Yizhou Liu <[email protected]>
1 parent: e9fb895
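For context, DP padding exists because every data-parallel rank must participate in the same collective operations each step; if one rank runs a dummy batch with fewer tokens (e.g. batch size < DP size), the other ranks block on a collective that never completes. Below is a minimal sketch of the general idea, assuming a torch.distributed process group for DP; the helper name dp_padding_sketch is hypothetical, and vLLM's actual get_dp_padding may be implemented differently.

import torch
import torch.distributed as dist

def dp_padding_sketch(num_tokens: int, dp_group=None):
    """Return (num_pad, num_tokens_after_padding) so all DP ranks step in lockstep."""
    world_size = dist.get_world_size(group=dp_group)
    local = torch.tensor([num_tokens], dtype=torch.int32)
    gathered = [torch.empty_like(local) for _ in range(world_size)]
    # Collect the token count from every DP rank.
    dist.all_gather(gathered, local, group=dp_group)
    num_tokens_across_dp = torch.cat(gathered)
    # Pad every rank up to the max count. Without this, a rank whose dummy
    # batch is smaller (batch size < DP size) would skip a collective while
    # the other ranks wait on it forever -- the hang this commit guards against.
    max_tokens = int(num_tokens_across_dp.max().item())
    return max_tokens - num_tokens, torch.full_like(num_tokens_across_dp, max_tokens)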

File tree: 1 file changed (+4, -0)
vllm_ascend/worker/model_runner_v1.py

@@ -1914,6 +1914,10 @@ def _dummy_run(
         )

         # Padding for DP
+        num_pad, num_tokens_across_dp_native = self.get_dp_padding(num_tokens)
+        # num_tokens += num_pad ## Uncomment this after TorchAir is removed
+
+        # Padding for DP (for TorchAir)
         (num_tokens, num_tokens_across_dp, with_prefill,
          _) = self._get_forward_metadata_across_dp_and_pad(
              num_tokens, with_prefill, False)
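Note that the new get_dp_padding call currently only computes the padding; applying it is deferred (see the commented-out line) because _get_forward_metadata_across_dp_and_pad still performs its own TorchAir-style padding. A hedged sketch of what the call site might look like after the TorchAir path is removed -- the names come from the diff, but this combination is an assumption, not part of this commit:

# Hypothetical follow-up once the TorchAir path is removed (assumed):
num_pad, num_tokens_across_dp = self.get_dp_padding(num_tokens)
num_tokens += num_pad  # every DP rank now runs the dummy forward at the same size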
