Commit d217052

WAR to unblock trtllm-serve w/ logprob in PyT backend
Signed-off-by: Erin Ho <[email protected]>

update comment
1 parent ff99639 commit d217052

1 file changed: +4 -0 lines changed

tensorrt_llm/executor/result.py

Lines changed: 4 additions & 0 deletions
@@ -228,6 +228,10 @@ def _handle_sequence(self,
             output.logprobs = response_tensors.log_probs[src_idx]
             # overcome some WAR in the cpp executor
             if finish_reasons[src_idx] != tllm.FinishReason.CANCELLED:
+                if len(output.logprobs) > output.length:
+                    # LlmResult holds a reference to LogProbStorage, which may be updated by the worker before the result is serialized.
+                    # Therefore, we treat extra logprobs/logits as expected and only consume what's needed.
+                    output.logprobs = output.logprobs[:output.length]
                 assert len(output.logprobs) == output.length
         if response_tensors.generation_logits is not None:
             output.generation_logits = response_tensors.generation_logits[
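
The added lines can be illustrated with a minimal standalone sketch. It is not the TensorRT-LLM code: SketchOutput, clamp_logprobs, and shared_storage are hypothetical stand-ins, and a plain list plays the role of the shared log-prob storage. The sketch only shows why a shared reference can hold more entries than the output has tokens, and how slicing to output.length restores the invariant that the assert checks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchOutput:
    """Hypothetical stand-in for the per-sequence output object."""
    token_ids: List[int] = field(default_factory=list)
    logprobs: List[float] = field(default_factory=list)

    @property
    def length(self) -> int:
        # Number of tokens generated so far for this sequence.
        return len(self.token_ids)


def clamp_logprobs(output: SketchOutput) -> None:
    """Keep only as many log-probs as there are tokens (mirrors the added lines)."""
    if len(output.logprobs) > output.length:
        # Extra entries are treated as expected; consume only what is needed.
        output.logprobs = output.logprobs[:output.length]
    assert len(output.logprobs) == output.length


# Assumption: the storage is shared by reference, so a writer can append to it
# after the output has recorded its tokens but before the result is consumed.
shared_storage = [-0.1, -0.2, -0.3]
out = SketchOutput(token_ids=[11, 12, 13], logprobs=shared_storage)
shared_storage.append(-0.4)  # late update from the (simulated) worker

clamp_logprobs(out)
```

Because slicing a list creates a new list, the output stops aliasing the shared storage after the clamp, so later appends in this sketch no longer change out.logprobs. Whether the real storage type behaves the same way is not confirmed by the diff.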
