Commit 97e70d3

This change separates the general DP padding logic from the existing TorchAir-specific implementation. It makes sure the server does not hang when the batch size is smaller than the DP size.
Signed-off-by: Yizhou Liu <[email protected]>
1 parent 3f867ee commit 97e70d3

File tree

1 file changed: 4 additions, 0 deletions


vllm_ascend/worker/model_runner_v1.py

Lines changed: 4 additions & 0 deletions
@@ -1930,6 +1930,10 @@ def _dummy_run(
         )
 
         # Padding for DP
+        num_pad, num_tokens_across_dp_native = self.get_dp_padding(num_tokens)
+        # num_tokens += num_pad ## Uncomment this after TorchAir is removed
+
+        # Padding for DP (for TorchAir)
         (num_tokens, num_tokens_across_dp, with_prefill,
          _) = self._get_forward_metadata_across_dp_and_pad(
              num_tokens, with_prefill, False)
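
For context, the body of get_dp_padding is not part of this diff. The sketch below is only an illustration of what general DP padding has to compute, assuming an initialized torch.distributed data-parallel process group; dp_group, dp_size, and dp_padding_sketch are placeholder names, not identifiers from this commit. The point is the one the commit message makes: every DP rank must be padded up to a common token count, otherwise ranks that receive no requests (batch size < DP size) stop issuing collectives while the remaining ranks wait on them.

```python
# Illustrative sketch only -- not the vllm_ascend implementation.
# Assumes a torch.distributed process group for data parallelism is set up.
import torch
import torch.distributed as dist


def dp_padding_sketch(num_tokens: int, dp_group: dist.ProcessGroup,
                      dp_size: int) -> tuple[int, torch.Tensor]:
    """Return (num_pad, num_tokens_across_dp) for the local DP rank.

    Gathers every rank's token count, then pads the local count up to the
    maximum so all ranks execute the same number of (dummy) forward steps.
    """
    local = torch.tensor([num_tokens], dtype=torch.int32)
    # Collect per-rank token counts; ranks with an empty batch report 0.
    gathered = [torch.zeros(1, dtype=torch.int32) for _ in range(dp_size)]
    dist.all_gather(gathered, local, group=dp_group)
    num_tokens_across_dp = torch.cat(gathered)

    # Pad the local rank up to the max so no rank skips the forward pass.
    max_tokens = int(num_tokens_across_dp.max().item())
    num_pad = max_tokens - num_tokens
    return num_pad, num_tokens_across_dp
```

The commented-out `num_tokens += num_pad` line in the diff marks where this padding would actually be applied once the TorchAir-specific path is removed.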
