Commit 789f1e8

Author: Minsung-commit
Fix block_size initialization in KVCacheManager
Use actual KV cache block size from `kv_cache_config` instead of `hash_block_size`.

**Issue**: The previous implementation incorrectly used `hash_block_size` for token metrics calculation. `hash_block_size` sets the hashing granularity; it is not the actual KV cache block size used by BlockPool.

**Fix**: Initialize `self.block_size` from `kv_cache_config.kv_cache_groups[].kv_cache_spec.block_size`, which represents the actual block size used for token storage.

**Impact**: This ensures token-level metrics (total_tokens, used_tokens, free_tokens) accurately reflect the real KV cache capacity, especially for models using larger block sizes than the hash granularity.

Addresses bot review feedback on PR vllm-project#29836.

Signed-off-by: Minsung-commit <[email protected]>
1 parent 3800caf commit 789f1e8
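To make the impact concrete, here is a minimal sketch of how the choice of block size changes token-level metrics. It assumes the metrics are simply block counts multiplied by the block size; the helper `token_metrics` and all of the numbers are hypothetical and are not taken from the vLLM code base.

```python
from __future__ import annotations


def token_metrics(num_blocks: int, used_blocks: int, block_size: int) -> dict[str, int]:
    """Hypothetical helper: convert block counts into token counts."""
    total_tokens = num_blocks * block_size
    used_tokens = used_blocks * block_size
    return {
        "total_tokens": total_tokens,
        "used_tokens": used_tokens,
        "free_tokens": total_tokens - used_tokens,
    }


# Assumed example values: hashing granularity of 16 tokens,
# actual KV cache block size of 32 tokens.
with_hash_block_size = token_metrics(num_blocks=1000, used_blocks=250, block_size=16)
with_actual_block_size = token_metrics(num_blocks=1000, used_blocks=250, block_size=32)

print(with_hash_block_size)
print(with_actual_block_size)
```

Under these assumed numbers, using the hash granularity would report half of the real capacity, which is the kind of discrepancy the commit message describes.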

1 file changed: +14 -1 lines changed


vllm/v1/core/kv_cache_manager.py

Lines changed: 14 additions & 1 deletion
@@ -106,9 +106,22 @@ def __init__(
         metrics_collector: KVCacheMetricsCollector | None = None,
     ) -> None:
         self.max_model_len = max_model_len
-        self.block_size = hash_block_size
 
         self.enable_caching = enable_caching
+
+        # Initialize block_size from kv_cache_config
+        # Note: We use the actual KV cache block size, not hash_block_size
+        self.block_size: int | None = None
+        if self.enable_caching:
+            # Ensure all kv_cache_groups have the same block_size
+            block_sizes = set(
+                g.kv_cache_spec.block_size
+                for g in kv_cache_config.kv_cache_groups
+            )
+            assert len(block_sizes) == 1, "Only one block size is supported for now"
+            self.block_size = kv_cache_config.kv_cache_groups[
+                0].kv_cache_spec.block_size
+            # Note: DCP/PCP scaling handled by kv_cache_config if needed
         self.use_eagle = use_eagle
         self.log_stats = log_stats
         self.metrics_collector = metrics_collector
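
The block-size resolution added in this diff can be sketched in isolation as follows. The dataclasses below are simplified stand-ins that mirror only the attribute names visible in the diff (`kv_cache_groups`, `kv_cache_spec`, `block_size`); they are not the real vLLM classes, and `resolve_block_size` is a hypothetical helper name. Unlike the diff, this sketch raises `ValueError` instead of using `assert`, since assertions are skipped when Python runs with `-O`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class KVCacheSpec:
    """Stand-in for the real spec; only block_size is modeled."""
    block_size: int


@dataclass
class KVCacheGroup:
    """Stand-in for one KV cache group."""
    kv_cache_spec: KVCacheSpec


@dataclass
class KVCacheConfig:
    """Stand-in for the config object read in the diff."""
    kv_cache_groups: list[KVCacheGroup]


def resolve_block_size(cfg: KVCacheConfig, enable_caching: bool) -> int | None:
    """Return the single block size shared by all groups, or None if caching is off."""
    if not enable_caching:
        return None
    # Collect the distinct block sizes across all groups.
    block_sizes = {g.kv_cache_spec.block_size for g in cfg.kv_cache_groups}
    if len(block_sizes) != 1:
        raise ValueError(
            f"Only one block size is supported for now, got {sorted(block_sizes)}"
        )
    return block_sizes.pop()


uniform = KVCacheConfig([KVCacheGroup(KVCacheSpec(32)), KVCacheGroup(KVCacheSpec(32))])
mixed = KVCacheConfig([KVCacheGroup(KVCacheSpec(16)), KVCacheGroup(KVCacheSpec(32))])

print(resolve_block_size(uniform, enable_caching=True))
print(resolve_block_size(uniform, enable_caching=False))
```

As in the diff, the result is `None` when caching is disabled, so any code that consumes the value would need to handle that case.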
