
Commit 465664b

Update guide to add steps to deploy healthcheck policy for gke
1 parent 3846265 commit 465664b

File tree

1 file changed (+66, -17 lines)


site-src/guides/index.md

Lines changed: 66 additions & 17 deletions
@@ -19,6 +19,9 @@ A cluster with:
 - Support for [sidecar containers](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/) (enabled by default since Kubernetes v1.29)
   to run the model server deployment.
 
+Tooling:
+- [Helm](https://helm.sh/docs/intro/install/) installed
+
 ## **Steps**
 
 ### Deploy Sample Model Server
@@ -80,6 +83,56 @@ A cluster with:
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
 ```
 
+### Deploy the InferencePool and Endpoint Picker Extension
+
+To install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000, run the command for your provider below.
+
+The Helm install automatically deploys the endpoint picker and the InferencePool, along with provider-specific resources.
+
+=== "GKE"
+
+    ```bash
+    export GATEWAY_PROVIDER=gke
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.3.0 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Istio"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.3.0 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Kgateway"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.3.0 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Agentgateway"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.3.0 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
 ### Deploy an Inference Gateway
 
 Choose one of the following options to deploy an Inference Gateway.
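As a side note, the repeated `--set` flags in the helm install commands above can equivalently be captured in a values file. The sketch below is illustrative only: the value paths are transcribed from the `--set` flags shown in the diff, and the provider name shown is the GKE case.

```yaml
# values.yaml — illustrative equivalent of the --set flags above
inferencePool:
  modelServers:
    matchLabels:
      app: vllm-llama3-8b-instruct   # label selector for model server endpoints
provider:
  name: gke   # use "none" for the Istio, Kgateway, and Agentgateway tabs
```

It would then be passed with `helm install vllm-llama3-8b-instruct -f values.yaml --version v0.3.0 oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool`.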
@@ -113,6 +166,18 @@ A cluster with:
 ```bash
 kubectl get httproute llm-route -o yaml
 ```
+
+5. Deploy the HealthCheckPolicy
+
+    ```bash
+    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/healthcheck.yaml
+    ```
+
+6. Confirm that the HealthCheckPolicy status conditions include `Attached=True`:
+
+    ```bash
+    kubectl get healthcheckpolicy health-check-policy -o yaml
+    ```
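The manifest applied in step 5 is the authoritative definition. As a rough sketch of what such a policy can look like, a GKE HealthCheckPolicy attaches to the InferencePool and configures the load balancer's health probe; every field value below is an illustrative assumption, not copied from the manifest.

```yaml
# Illustrative sketch only — see the healthcheck.yaml URL in step 5 for the real manifest.
apiVersion: networking.gke.io/v1
kind: HealthCheckPolicy
metadata:
  name: health-check-policy
spec:
  targetRef:                          # attach the policy to the InferencePool (assumed target)
    group: inference.networking.x-k8s.io
    kind: InferencePool
    name: vllm-llama3-8b-instruct
  default:
    config:
      type: HTTP
      httpHealthCheck:
        requestPath: /health          # model server health endpoint (assumed)
        port: 8000                    # port the InferencePool endpoints listen on
```

Once the policy reports `Attached=True` (step 6), the gateway's backend health checks probe the model server pods directly.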
 
 === "Istio"
 
@@ -267,22 +332,6 @@ A cluster with:
 kubectl get httproute llm-route -o yaml
 ```
 
-
-### Deploy the InferencePool and Endpoint Picker Extension
-
-Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label app: vllm-llama3-8b-instruct and listening on port 8000, you can run the following command:
-
-```bash
-export GATEWAY_PROVIDER=none # See [README](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/README.md#configuration) for valid configurations
-helm install vllm-llama3-8b-instruct \
-  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
-  --set provider.name=$GATEWAY_PROVIDER \
-  --version v0.3.0 \
-  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
-```
-
-The Helm install automatically installs the endpoint-picker, inferencepool along with provider specific resources.
-
 ### Deploy InferenceObjective (Optional)
 
 Deploy the sample InferenceObjective which allows you to specify priority of requests.
@@ -316,7 +365,7 @@ A cluster with:
 1. Uninstall the InferencePool, InferenceModel, and model server resources
 
 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencepool-resources.yaml --ignore-not-found
+helm uninstall vllm-llama3-8b-instruct
 kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml --ignore-not-found
 kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
 kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
