
Commit d3be50d

This change separates the general DP padding logic from the existing implementation specific to TorchAir, and makes sure the server does not hang when batch size < DP size.
Signed-off-by: Yizhou Liu <[email protected]>
1 parent 3f867ee · commit d3be50d

File tree

1 file changed: +5, -0 lines changed


vllm_ascend/worker/model_runner_v1.py

Lines changed: 5 additions & 0 deletions
@@ -1930,6 +1930,11 @@ def _dummy_run(
         )
 
         # Padding for DP
+        num_pad, num_tokens_across_dp_native = self.get_dp_padding(
+            num_tokens)
+        # num_tokens += num_pad ## Uncomment this after TorchAir is removed
+
+        # Padding for DP (for TorchAir)
         (num_tokens, num_tokens_across_dp, with_prefill,
          _) = self._get_forward_metadata_across_dp_and_pad(
              num_tokens, with_prefill, False)
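For context on the hang the commit message mentions: every DP rank must enter the same collectives during a forward pass, so when fewer requests than DP ranks arrive, the idle ranks still need a dummy run padded to a size all ranks agree on. Below is a minimal sketch of that padding idea, assuming an initialized torch.distributed process group; the helper name and signature here are illustrative only, not the actual get_dp_padding in this repository.

    # Sketch only: illustrates the DP padding idea, not the vllm-ascend code.
    import torch
    import torch.distributed as dist

    def get_dp_padding_sketch(num_tokens: int, dp_size: int, dp_rank: int,
                              dp_group: dist.ProcessGroup):
        """Return (num_pad, num_tokens_across_dp) so all DP ranks agree.

        Each rank reports its local token count, learns the max across
        the group, and pads up to that max. A rank with zero real
        requests (batch size < DP size) then still runs a dummy forward
        of the same size, so collectives inside the model never block.
        """
        counts = torch.zeros(dp_size, dtype=torch.int32)
        counts[dp_rank] = num_tokens
        # SUM all-reduce of one-hot counts gathers every rank's count.
        dist.all_reduce(counts, group=dp_group)
        max_tokens = int(counts.max().item())
        num_pad = max_tokens - num_tokens
        # After padding, every rank processes max_tokens tokens.
        num_tokens_across_dp = torch.full((dp_size,), max_tokens,
                                          dtype=torch.int32)
        return num_pad, num_tokens_across_dp

The commented-out num_tokens += num_pad in the diff indicates the general path computes the padding but does not yet apply it: per the comment, it takes effect only once the TorchAir-specific _get_forward_metadata_across_dp_and_pad call is removed.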

0 commit comments
