Commit bdcd2d1

Add Install Gateway section in Getting Started Latest guide
- Move instructions from the Deploy an Inference Gateway section describing installation of Gateway API CRDs and provider specific GWs

Signed-off-by: Dharaneeshwaran Ravichandran <[email protected]>
1 parent 438863b commit bdcd2d1

2 files changed

+61
-71
lines changed

site-src/guides/getting-started-latest.md

Lines changed: 59 additions & 71 deletions
@@ -1,3 +1,5 @@
1+
<!-- If you are updating this getting-started-latest.md guide, please make sure to update the index.md as well -->
2+
13
# Getting started with an Inference Gateway
24

35
!!! warning "Unreleased/main branch"
@@ -41,86 +43,90 @@
4143
kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
4244
```
4345

44-
### Deploy the InferencePool and Endpoint Picker Extension
46+
### Install the Gateway
4547

46-
Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
48+
Choose one of the following options to install Gateway.
4749

48-
Set the chart version and then select a tab to follow the provider-specific instructions.
50+
=== "GKE"
4951

50-
```bash
51-
export IGW_CHART_VERSION=v0
52-
```
52+
Nothing to install here, you can move to the next [section](#deploy-the-inferencepool-and-endpoint-picker-extension)
5353

54-
--8<-- "site-src/_includes/epp-latest.md"
54+
=== "Istio"
5555

56-
### Deploy an Inference Gateway
56+
1. Requirements
57+
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
5758

58-
Choose one of the following options to deploy an Inference Gateway.
59+
2. Install Istio
5960

60-
=== "GKE"
61+
```
62+
TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
63+
# on Linux
64+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
65+
tar -xvf istioctl-$TAG-linux-amd64.tar.gz
66+
# on macOS
67+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
68+
tar -xvf istioctl-$TAG-osx.tar.gz
69+
# on Windows
70+
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
71+
unzip istioctl-$TAG-win.zip
6172

62-
1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
63-
See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
64-
for detailed instructions.
73+
./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
74+
```
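Note on the Istio step above: the three `wget` lines differ only in the archive suffix, so the per-OS choice could be expressed as a small helper. The following is a minimal sketch, not part of this commit. The function names `istioctl_archive` and `istioctl_url` and the OS labels `linux`, `macos`, and `windows` are hypothetical. The URL layout and archive suffixes are copied from the commands in the guide.

```shell
# Hypothetical helper: map a tag and an OS label to the istioctl archive
# name used by the wget commands in the guide.
istioctl_archive() {
  tag="$1"
  os="$2"
  case "$os" in
    linux)   echo "istioctl-${tag}-linux-amd64.tar.gz" ;;
    macos)   echo "istioctl-${tag}-osx.tar.gz" ;;
    windows) echo "istioctl-${tag}-win.zip" ;;
    *)       echo "unsupported OS label: $os" >&2; return 1 ;;
  esac
}

# Hypothetical helper: build the full download URL for a tag and OS label.
# Fails (non-zero status, no output) when the OS label is not recognized.
istioctl_url() {
  archive="$(istioctl_archive "$1" "$2")" || return 1
  echo "https://storage.googleapis.com/istio-build/dev/$1/${archive}"
}
```

With `TAG` set as in the guide, `wget "$(istioctl_url "$TAG" linux)"` would then fetch the same file as the Linux command above.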
6575

66-
2. Deploy Inference Gateway:
76+
=== "Kgateway"
6777

68-
```bash
69-
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
70-
```
78+
1. Requirements
7179

72-
Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
80+
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
81+
- [Helm](https://helm.sh/docs/intro/install/) installed.
7382

74-
```bash
75-
$ kubectl get gateway inference-gateway
76-
NAME CLASS ADDRESS PROGRAMMED AGE
77-
inference-gateway inference-gateway <MY_ADDRESS> True 22s
78-
```
79-
3. Deploy the HTTPRoute
83+
2. Set the Kgateway version and install the Kgateway CRDs.
8084

8185
```bash
82-
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
86+
KGTW_VERSION=v2.1.0
87+
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
8388
```
8489

85-
4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
90+
3. Install Kgateway
8691

8792
```bash
88-
kubectl get httproute llm-route -o yaml
93+
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
8994
```
9095

91-
=== "Istio"
96+
### Deploy the InferencePool and Endpoint Picker Extension
9297

93-
Please note that this feature is currently in an experimental phase and is not intended for production use.
94-
The implementation and user experience are subject to changes as we continue to iterate on this project.
98+
Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
9599

96-
1. Requirements
100+
Set the chart version and then select a tab to follow the provider-specific instructions.
97101

98-
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
102+
```bash
103+
export IGW_CHART_VERSION=v0
104+
```
99105

100-
2. Install Istio
106+
--8<-- "site-src/_includes/epp-latest.md"
101107

102-
```
103-
TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
104-
# on Linux
105-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
106-
tar -xvf istioctl-$TAG-linux-amd64.tar.gz
107-
# on macOS
108-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
109-
tar -xvf istioctl-$TAG-osx.tar.gz
110-
# on Windows
111-
wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
112-
unzip istioctl-$TAG-win.zip
108+
### Deploy an Inference Gateway
113109

114-
./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
115-
```
110+
Choose one of the following options to deploy an Inference Gateway.
111+
112+
=== "GKE"
113+
114+
1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
115+
See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
116+
for detailed instructions.
117+
118+
=== "Istio"
119+
120+
Please note that this feature is currently in an experimental phase and is not intended for production use.
121+
The implementation and user experience are subject to changes as we continue to iterate on this project.
116122

117-
3. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
123+
1. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
118124

119125
```bash
120126
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
121127
```
122128

123-
4. Deploy Gateway
129+
2. Deploy the Inference Gateway
124130

125131
```bash
126132
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
@@ -133,13 +139,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
133139
inference-gateway inference-gateway <MY_ADDRESS> True 22s
134140
```
135141

136-
5. Deploy the HTTPRoute
142+
3. Deploy the HTTPRoute
137143

138144
```bash
139145
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
140146
```
141147

142-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
148+
4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143149

144150
```bash
145151
kubectl get httproute llm-route -o yaml
@@ -152,25 +158,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
152158
implementation. Kgateway supports Inference Gateway with the [agentgateway](https://agentgateway.dev/) data plane. Follow these steps
153159
to run Kgateway as an Inference Gateway:
154160

155-
1. Requirements
156-
157-
- [Helm](https://helm.sh/docs/intro/install/) installed.
158-
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
159-
160-
2. Set the Kgateway version and install the Kgateway CRDs.
161-
162-
```bash
163-
KGTW_VERSION=v2.2.0-main
164-
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
165-
```
166-
167-
3. Install Kgateway
168-
169-
```bash
170-
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
171-
```
172-
173-
4. Deploy the Gateway
161+
1. Deploy the Inference Gateway
174162

175163
```bash
176164
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
@@ -181,13 +169,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
181169
kubectl get gateway inference-gateway
182170
```
183171

184-
5. Deploy the HTTPRoute
172+
2. Deploy the HTTPRoute
185173

186174
```bash
187175
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
188176
```
189177

190-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
178+
3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
191179

192180
```bash
193181
kubectl get httproute llm-route -o yaml
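
Note on the confirmation step: reading the full YAML works, but the `Accepted=True` and `ResolvedRefs=True` check could also be scripted. The following is a minimal sketch, not part of this commit. The function name `route_ready` is hypothetical, and it assumes its input is one `Type=Status` pair per line.

```shell
# Hypothetical helper: read "Type=Status" lines on stdin and succeed only
# if both conditions named in the guide are present with status True.
route_ready() {
  conditions="$(cat)"
  for want in "Accepted=True" "ResolvedRefs=True"; do
    # -x matches the whole line, -F treats the pattern as a fixed string.
    printf '%s\n' "$conditions" | grep -qxF "$want" || return 1
  done
}
```

One way to produce that input, assuming the usual Gateway API status layout, is `kubectl get httproute llm-route -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}' | route_ready`. The sketch does not distinguish between parents, so a route attached to several Gateways would need a per-parent check.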

site-src/guides/index.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
1+
<!-- If you are updating this index.md guide, please make sure to update the getting-started-latest.md as well -->
2+
13
# Getting started with an Inference Gateway
24

35
--8<-- "site-src/_includes/intro.md"
