
Commit 0fc118f

Update guide to add steps to deploy healthcheck policy for gke

1 parent 62f489b commit 0fc118f

File tree

1 file changed: +52 −17 lines changed

1 file changed

+52
-17
lines changed

site-src/guides/index.md

Lines changed: 52 additions & 17 deletions
@@ -19,6 +19,9 @@ A cluster with:
 - Support for [sidecar containers](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/) (enabled by default since Kubernetes v1.29)
   to run the model server deployment.

+Tooling:
+- [Helm](https://helm.sh/docs/intro/install/) installed
+
 ## **Steps**

 ### Deploy Sample Model Server
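
The new Helm prerequisite is easy to sanity-check up front; a minimal check, assuming Helm v3.8+ (the version that added the OCI registry support used by the chart installs below):

```bash
# Confirm Helm is installed and recent enough for OCI chart installs (v3.8+)
helm version --short
```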
@@ -80,6 +83,54 @@ A cluster with:
    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
    ```

+### Deploy the InferencePool and Endpoint Picker Extension
+
+Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install automatically deploys the endpoint picker and the InferencePool, along with provider-specific resources.
+
+=== "GKE"
+
+    ```bash
+    export GATEWAY_PROVIDER=gke
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.5.1 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Istio"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.5.1 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Kgateway"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.5.1 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
+=== "Agentgateway"
+
+    ```bash
+    export GATEWAY_PROVIDER=none
+    helm install vllm-llama3-8b-instruct \
+      --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
+      --set provider.name=$GATEWAY_PROVIDER \
+      --version v0.5.1 \
+      oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
+    ```
+
 ### Deploy an Inference Gateway

 Choose one of the following options to deploy an Inference Gateway.
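
Whichever tab is used, the result can be verified the same way; a minimal sanity check, assuming the default namespace and the release, resource, and label names from the commands above:

```bash
# Confirm the Helm release deployed cleanly
helm status vllm-llama3-8b-instruct

# Confirm the InferencePool object exists (its CRD was installed by manifests.yaml earlier)
kubectl get inferencepool vllm-llama3-8b-instruct

# List the model server pods the pool selects via its matchLabels
kubectl get pods -l app=vllm-llama3-8b-instruct
```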
@@ -268,22 +319,6 @@ A cluster with:
    kubectl get httproute llm-route -o yaml
    ```

-
-### Deploy the InferencePool and Endpoint Picker Extension
-
-Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label app: vllm-llama3-8b-instruct and listening on port 8000, you can run the following command:
-
-```bash
-export GATEWAY_PROVIDER=none # See [README](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/config/charts/inferencepool/README.md#configuration) for valid configurations
-helm install vllm-llama3-8b-instruct \
-  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
-  --set provider.name=$GATEWAY_PROVIDER \
-  --version v0.3.0 \
-  oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool
-```
-
-The Helm install automatically installs the endpoint-picker, inferencepool along with provider specific resources.
-
 ### Deploy InferenceObjective (Optional)

 Deploy the sample InferenceObjective which allows you to specify priority of requests.
@@ -317,7 +352,7 @@ A cluster with:
 1. Uninstall the InferencePool, InferenceModel, and model server resources

    ```bash
-   kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencepool-resources.yaml --ignore-not-found
+   helm uninstall vllm-llama3-8b-instruct
    kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml --ignore-not-found
    kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
    kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
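
Since the cleanup now goes through Helm rather than a manifest delete, it can be confirmed with the same tools; a minimal sketch, assuming the release name and default namespace used above:

```bash
# Both commands should report "not found" / no resources after cleanup
helm status vllm-llama3-8b-instruct
kubectl get inferencepool
```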
