Commit 9240af2

Add Install Gateway section in Getting Started Latest guide
- Move instructions from the Deploy an Inference Gateway section describing installation of Gateway API CRDs and provider specific GWs

Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
1 parent 0fc7cfb commit 9240af2

2 files changed: +80 -88 lines changed

site-src/guides/getting-started-latest.md

Lines changed: 78 additions & 88 deletions
@@ -1,3 +1,5 @@
+<!-- If you are updating this getting-started-latest.md guide, please make sure to update the index.md as well -->
+
 # Getting started with an Inference Gateway
 
 !!! warning "Unreleased/main branch"
@@ -41,86 +43,110 @@
 kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
 ```
 
-### Deploy the InferencePool and Endpoint Picker Extension
+### Install the Gateway
 
-Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
+Choose one of the following options to install Gateway.
 
-Set the chart version and then select a tab to follow the provider-specific instructions.
+=== "GKE"
 
-```bash
-export IGW_CHART_VERSION=v0
-```
+Nothing to install here, you can move to the next [section](#deploy-the-inferencepool-and-endpoint-picker-extension)
 
---8<-- "site-src/_includes/epp-latest.md"
+=== "Istio"
 
-### Deploy an Inference Gateway
+1. Requirements
+- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
 
-Choose one of the following options to deploy an Inference Gateway.
+2. Install Istio
 
-=== "GKE"
+```
+TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
+# on Linux
+wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
+tar -xvf istioctl-$TAG-linux-amd64.tar.gz
+# on macOS
+wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
+tar -xvf istioctl-$TAG-osx.tar.gz
+# on Windows
+wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
+unzip istioctl-$TAG-win.zip
 
-1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
-See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
-for detailed instructions.
+./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
+```
+
+=== "Kgateway"
 
-2. Deploy Inference Gateway:
+1. Requirements
+
+- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
+- [Helm](https://helm.sh/docs/intro/install/) installed.
+
+2. Set the Kgateway version and install the Kgateway CRDs.
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
+KGTW_VERSION=v2.1.0
+helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
 ```
 
-Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
+3. Install Kgateway
 
 ```bash
-$ kubectl get gateway inference-gateway
-NAME CLASS ADDRESS PROGRAMMED AGE
-inference-gateway inference-gateway <MY_ADDRESS> True 22s
+helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
 ```
-3. Deploy the HTTPRoute
+
+=== "Agentgateway"
+
+1. Requirements
+
+- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
+- [Helm](https://helm.sh/docs/intro/install/) installed.
+
+2. Set the Kgateway version and install the Kgateway CRDs.
 
 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
+KGTW_VERSION=v2.1.0
+helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
 ```
 
-4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+3. Install Kgateway
 
 ```bash
-kubectl get httproute llm-route -o yaml
+helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true
 ```
 
-=== "Istio"
+### Deploy the InferencePool and Endpoint Picker Extension
 
-Please note that this feature is currently in an experimental phase and is not intended for production use.
-The implementation and user experience are subject to changes as we continue to iterate on this project.
+Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
+
+Set the chart version and then select a tab to follow the provider-specific instructions.
 
-1. Requirements
+```bash
+export IGW_CHART_VERSION=v0
+```
 
-- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
+--8<-- "site-src/_includes/epp-latest.md"
 
-2. Install Istio
+### Deploy an Inference Gateway
 
-```
-TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
-# on Linux
-wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
-tar -xvf istioctl-$TAG-linux-amd64.tar.gz
-# on macOS
-wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
-tar -xvf istioctl-$TAG-osx.tar.gz
-# on Windows
-wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
-unzip istioctl-$TAG-win.zip
+Choose one of the following options to deploy an Inference Gateway.
 
-./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
-```
+=== "GKE"
 
-3. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
+1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
+See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
+for detailed instructions.
+
+=== "Istio"
+
+Please note that this feature is currently in an experimental phase and is not intended for production use.
+The implementation and user experience are subject to changes as we continue to iterate on this project.
+
+1. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
 ```
 
-4. Deploy Gateway
+2. Deploy the Inference Gateway
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
@@ -133,13 +159,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
 inference-gateway inference-gateway <MY_ADDRESS> True 22s
 ```
 
-5. Deploy the HTTPRoute
+3. Deploy the HTTPRoute
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
 ```
 
-6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
 
 ```bash
 kubectl get httproute llm-route -o yaml
@@ -151,25 +177,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
 [conformant](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/conformance/reports/v1.0.0/gateway/kgateway)
 gateway. Follow these steps to run Kgateway:
 
-1. Requirements
-
-- [Helm](https://helm.sh/docs/intro/install/) installed.
-- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
-
-2. Set the Kgateway version and install the Kgateway CRDs.
-
-```bash
-KGTW_VERSION=v2.2.0-main
-helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
-```
-
-3. Install Kgateway
-
-```bash
-helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
-```
-
-4. Deploy the Gateway
+1. Deploy the Inference Gateway
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
@@ -182,13 +190,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
 inference-gateway kgateway <MY_ADDRESS> True 22s
 ```
 
-5. Deploy the HTTPRoute
+2. Deploy the HTTPRoute
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
 ```
 
-6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
 
 ```bash
 kubectl get httproute llm-route -o yaml
@@ -200,25 +208,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
 Agentgateway integrates with [Kgateway](https://kgateway.dev/) as it's control plane. Follow these steps to run Kgateway with the agentgateway
 data plane:
 
-1. Requirements
-
-- [Helm](https://helm.sh/docs/intro/install/) installed.
-- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
-
-2. Set the Kgateway version and install the Kgateway CRDs.
-
-```bash
-KGTW_VERSION=v2.2.0-main
-helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
-```
-
-3. Install Kgateway
-
-```bash
-helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true
-```
-
-4. Deploy the Gateway
+1. Deploy the Inference Gateway
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
@@ -231,13 +221,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
 inference-gateway agentgateway <MY_ADDRESS> True 22s
 ```
 
-5. Deploy the HTTPRoute
+2. Deploy the HTTPRoute
 
 ```bash
 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
 ```
 
-6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
 
 ```bash
 kubectl get httproute llm-route -o yaml

site-src/guides/index.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+<!-- If you are updating this index.md guide, please make sure to update the getting-started-latest.md as well -->
+
 # Getting started with an Inference Gateway
 
 --8<-- "site-src/_includes/intro.md"
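
Note on the relocated Istio install step: the three download commands in the guide differ only in the archive suffix (`linux-amd64.tar.gz`, `osx.tar.gz`, `win.zip`). The sketch below shows one way a reader might pick the archive automatically. It is not part of this commit. The function names `istioctl_archive` and `istioctl_url` are hypothetical, and mapping `uname -s` values such as `Darwin` or `MINGW*` onto those suffixes is an assumption about the reader's shell environment.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the guide): choose the istioctl archive
# name for a given build tag and operating system.

# Print the archive file name for tag $1 and OS name $2.
# $2 defaults to the output of `uname -s` when omitted or empty.
istioctl_archive() {
  local tag="$1"
  local os="${2:-$(uname -s)}"
  case "$os" in
    Linux)                printf '%s\n' "istioctl-${tag}-linux-amd64.tar.gz" ;;
    Darwin)               printf '%s\n' "istioctl-${tag}-osx.tar.gz" ;;
    MINGW*|MSYS*|CYGWIN*) printf '%s\n' "istioctl-${tag}-win.zip" ;;
    *)                    printf 'unsupported OS: %s\n' "$os" >&2; return 1 ;;
  esac
}

# Print the full download URL, using the bucket path shown in the guide.
istioctl_url() {
  local tag="$1"
  local archive
  archive="$(istioctl_archive "$tag" "${2:-}")" || return 1
  printf '%s\n' "https://storage.googleapis.com/istio-build/dev/${tag}/${archive}"
}
```

With `TAG` set as in the guide, `wget "$(istioctl_url "$TAG")"` would then fetch the archive for the current platform. Extraction (`tar -xvf` or `unzip`) still depends on the archive type.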
