Commit 0aa1356
Minsung-commit
[Core] Add token-level KV cache metrics to V1 engine
Add token-level KV cache metrics (total, used, free) to complement
existing percentage-based metrics in the V1 engine.
## Motivation
The V1 engine currently reports kv_cache_usage only as a fraction (0.0-1.0).
Absolute token counts are critical for:
- Capacity planning: "28k tokens left" vs "35% free"
- Cost accounting: Token-based billing
- Monitoring: Prometheus/Grafana dashboards
- Debugging: Understanding exact cache state
## Changes
1. **vllm/v1/metrics/stats.py**: Add fields to SchedulerStats
   - kv_cache_total_tokens: Total capacity
   - kv_cache_used_tokens: Currently occupied
   - kv_cache_free_tokens: Available space
2. **vllm/v1/core/block_pool.py**: Add get_num_total_blocks()
   - Returns total GPU blocks (excludes 1 reserved block)
3. **vllm/v1/core/kv_cache_manager.py**: Add properties
   - total_tokens, free_tokens, used_tokens
   - Derives block_size from coordinator (handles DCP/PCP scaling)
4. **vllm/v1/core/sched/scheduler.py**: Populate metrics in make_stats()
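
The block-to-token derivation described in the list above could look roughly like the following. This is a minimal sketch, not the code from this commit: `BlockPoolSketch`, `KVCacheManagerSketch`, and the `num_gpu_blocks` / `num_free_blocks` fields are hypothetical stand-ins, and it assumes the free-block count does not include the reserved block. Only `get_num_total_blocks()` and the three property names come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class BlockPoolSketch:
    """Hypothetical stand-in for a block pool; not vLLM's actual class."""

    num_gpu_blocks: int   # all GPU blocks, including the one reserved block
    num_free_blocks: int  # assumed to count only allocatable blocks

    def get_num_total_blocks(self) -> int:
        # Exclude the single reserved block, as the commit message describes.
        return self.num_gpu_blocks - 1


@dataclass
class KVCacheManagerSketch:
    """Hypothetical manager that converts block counts into token counts."""

    pool: BlockPoolSketch
    block_size: int  # tokens per block (assumed already scaled for DCP/PCP)

    @property
    def total_tokens(self) -> int:
        return self.pool.get_num_total_blocks() * self.block_size

    @property
    def free_tokens(self) -> int:
        return self.pool.num_free_blocks * self.block_size

    @property
    def used_tokens(self) -> int:
        # Derived rather than tracked, so total == used + free always holds.
        return self.total_tokens - self.free_tokens


manager = KVCacheManagerSketch(
    pool=BlockPoolSketch(num_gpu_blocks=5154, num_free_blocks=1804),
    block_size=16,
)
print(manager.total_tokens, manager.used_tokens, manager.free_tokens)
```

Deriving `used_tokens` from the other two values keeps the three metrics mutually consistent without a separate counter.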
## Example Output
Before:
kv_cache_usage: 0.65
After:
kv_cache_usage: 0.65
kv_cache_total_tokens: 82448
kv_cache_used_tokens: 53591
kv_cache_free_tokens: 28857
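
As a quick consistency check on the example values above, the three token counts and the existing usage ratio should agree with each other. The snippet below is only an arithmetic sketch using the numbers shown; the variable names are illustrative.

```python
# Values copied from the example output.
total_tokens = 82448
used_tokens = 53591

# Free space is whatever is not occupied.
free_tokens = total_tokens - used_tokens

# The ratio-based metric is used / total.
usage = used_tokens / total_tokens

print(free_tokens)      # 28857
print(round(usage, 2))  # 0.65
```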
Addresses #12283, #26850
Signed-off-by: Minsung-commit <[email protected]>

1 parent 6fc5841 commit 0aa1356
4 files changed, +57 -1 lines changed
- vllm/v1
  - core
    - sched
  - metrics