Commit 5d91d2b

maang-h and heheda12345 authored
[Doc] Add allocate_slots parameter docs (#29777)
Signed-off-by: maang <[email protected]> Signed-off-by: maang-h <[email protected]> Co-authored-by: Chen Zhang <[email protected]>
1 parent c014de1 commit 5d91d2b

1 file changed: +3 -0 lines changed

vllm/v1/core/kv_cache_manager.py

Lines changed: 3 additions & 0 deletions
@@ -230,6 +230,9 @@ def allocate_slots(
             delay_cache_blocks: Whether to skip caching the blocks. This is
                 used by P/D when allocating blocks used in a KV transfer
                 which will complete in a future step.
+            num_encoder_tokens: The number of encoder tokens to allocate for
+                cross-attention in encoder-decoder models(e.g., Whisper).
+                For decoder-only models, this should be 0.
 
         Blocks layout:
         ```
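
The added docstring lines say that `num_encoder_tokens` is nonzero only for encoder-decoder models and should be 0 for decoder-only models. The following is a minimal sketch of that rule, not vLLM code: the helper names `num_encoder_tokens_for` and `blocks_needed`, the `is_encoder_decoder` flag, and the block size of 16 are all hypothetical and chosen only for illustration.

```python
def num_encoder_tokens_for(is_encoder_decoder: bool, encoder_input_len: int) -> int:
    """Hypothetical helper: the value a caller might pass as num_encoder_tokens."""
    if encoder_input_len < 0:
        raise ValueError("encoder_input_len must be non-negative")
    # Decoder-only models have no cross-attention, so there is nothing to allocate.
    if not is_encoder_decoder:
        return 0
    # Encoder-decoder models need room for every encoder token.
    return encoder_input_len


def blocks_needed(num_tokens: int, block_size: int) -> int:
    """Hypothetical helper: ceiling division of tokens by an assumed block size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return -(-num_tokens // block_size)


if __name__ == "__main__":
    # Illustrative numbers only: 1500 encoder tokens, assumed block size of 16.
    for is_enc_dec in (False, True):
        n = num_encoder_tokens_for(is_enc_dec, 1500)
        print(is_enc_dec, n, blocks_needed(n, 16))
```

Under these assumptions, a decoder-only request contributes zero encoder tokens and therefore zero extra blocks, while an encoder-decoder request reserves enough blocks to hold its full encoder output.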
