Commit 46a100f

remove scheduler epp flowchart (#1573)
1 parent 29ea290 commit 46a100f

2 files changed: +1 -6 lines changed


docs/scheduler-flowchart.png

-400 KB
Binary file not shown.

pkg/epp/README.md

Lines changed: 1 addition & 6 deletions
@@ -20,9 +20,4 @@ An EPP instance handles a single `InferencePool` (and so for each `InferencePool
 - The EPP generates metrics to enhance observability.
 - It reports InferenceObjective-level metrics, further broken down by target model.
 - Detailed information regarding metrics can be found on the [website](https://gateway-api-inference-extension.sigs.k8s.io/guides/metrics/).
-
-
-## Scheduling Algorithm
-The scheduling package implements request scheduling algorithms for load balancing requests across backend pods in an inference gateway. The scheduler ensures efficient resource utilization while maintaining low latency and prioritizing critical requests. It applies a series of filters based on metrics and heuristics to select the best pod for a given request. The following flow chart summarizes the current scheduling algorithm
-
-<img src="../../docs/scheduler-flowchart.png" alt="Scheduling Algorithm" width="400" />
+
