1+ <!--  If you are updating this getting-started-latest.md guide, please make sure to update the index.md as well --> 
2+ 
13# Getting started with an Inference Gateway  
24
35!!! warning "Unreleased/main branch"
4143kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
4244``` 
4345
44- ### Deploy the InferencePool and Endpoint Picker Extension
46+ ### Install the Gateway
4547
46-    Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
48+    Choose one of the following options to install the Gateway.
4749
48-    Set the chart version and then select a tab to follow the provider-specific instructions. 
50+ === "GKE" 
4951
50-    ```bash
51-    export IGW_CHART_VERSION=v0
52-    ```
52+       Nothing to install here; proceed to the next [section](#deploy-the-inferencepool-and-endpoint-picker-extension).
5353
54- --8<-- "site-src/_includes/epp-latest.md"
54+ === "Istio"
5555
56- ### Deploy an Inference Gateway  
56+       1. Requirements 
57+          - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
5758
58-    Choose one of the following options to deploy an Inference Gateway. 
59+        2. Install Istio 
5960
60- === "GKE"
61+          ``` 
62+          TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev) 
63+          # on Linux 
64+          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz 
65+          tar -xvf istioctl-$TAG-linux-amd64.tar.gz 
66+          # on macOS 
67+          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz 
68+          tar -xvf istioctl-$TAG-osx.tar.gz 
69+          # on Windows 
70+          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip 
71+          unzip istioctl-$TAG-win.zip 
6172
62-       1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.  
63-          See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway) 
64-          for detailed instructions. 
73+          ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true 
74+          ``` 
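
The three download commands above differ only in the archive suffix. As a minimal sketch, the hypothetical helper `istioctl_archive` below maps the output of `uname -s` to the archive names used above. It is not part of this guide or of Istio, and it assumes only the three archives listed above exist.

```shell
# Hypothetical helper (not part of Istio): print the istioctl archive name
# for a given tag and OS name. The OS defaults to the output of `uname -s`.
istioctl_archive() {
  tag="$1"
  os="${2:-$(uname -s)}"
  case "$os" in
    Linux)                echo "istioctl-${tag}-linux-amd64.tar.gz" ;;
    Darwin)               echo "istioctl-${tag}-osx.tar.gz" ;;
    MINGW*|MSYS*|CYGWIN*) echo "istioctl-${tag}-win.zip" ;;
    *) echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}

# Possible usage, with TAG set as in the commands above:
#   wget "https://storage.googleapis.com/istio-build/dev/$TAG/$(istioctl_archive "$TAG")"
```

The Linux branch assumes an amd64 host, because that is the only Linux archive named above.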
75+ 
76+ === "Kgateway"
6577
66-       2. Deploy Inference Gateway: 
78+       1. Requirements 
79+ 
80+          - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
81+          - [Helm](https://helm.sh/docs/intro/install/) installed. 
82+ 
83+       2. Set the Kgateway version and install the Kgateway CRDs. 
6784
6885         ```bash 
69-          kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml 
86+          KGTW_VERSION=v2.1.0 
87+          helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds 
7088         ``` 
7189
72-          Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:  
90+       3. Install Kgateway  
7391
7492         ```bash 
75-          $ kubectl get gateway inference-gateway 
76-          NAME                CLASS               ADDRESS         PROGRAMMED   AGE 
77-          inference-gateway   inference-gateway   <MY_ADDRESS>    True         22s 
93+          helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true 
7894         ``` 
79-       3. Deploy the HTTPRoute 
95+ 
96+ === "Agentgateway"
97+ 
98+       1. Requirements 
99+ 
100+          - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
101+          - [Helm](https://helm.sh/docs/intro/install/) installed. 
102+ 
103+       2. Set the Kgateway version and install the Kgateway CRDs. 
80104
81105         ```bash 
82-          kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml 
106+          KGTW_VERSION=v2.1.0 
107+          helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds 
83108         ``` 
84109
85-       4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:  
110+       3. Install Kgateway  
86111
87112         ```bash 
88-          kubectl get httproute llm-route -o yaml  
113+          helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true  
89114         ``` 
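
Both the Kgateway and Agentgateway tabs install the same two Helm releases, `kgateway-crds` and `kgateway`, in the `kgateway-system` namespace. One way to confirm that both exist before moving on is sketched below. The function name `kgateway_releases_installed` is hypothetical, and the sketch relies only on `helm status`, which exits non-zero when a release is not found.

```shell
# Hypothetical helper: succeed only if both Helm releases installed above
# are present in the given namespace (default: kgateway-system).
kgateway_releases_installed() {
  ns="${1:-kgateway-system}"
  for release in kgateway-crds kgateway; do
    if ! helm status "$release" --namespace "$ns" >/dev/null 2>&1; then
      echo "missing Helm release: $release" >&2
      return 1
    fi
  done
  return 0
}

# Possible usage:
#   kgateway_releases_installed && echo "Kgateway releases found"
```

This checks only that the releases exist, not that the Kgateway pods are ready.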
90115
91- === "Istio" 
116+ ### Deploy the InferencePool and Endpoint Picker Extension
92117
93-       Please note that this feature is currently in an experimental phase and is not intended for production use. 
94-       The implementation and user experience are subject to changes as we continue to iterate on this project. 
118+    Install an InferencePool named `vllm-llama3-8b-instruct` that selects endpoints with the label `app: vllm-llama3-8b-instruct` listening on port 8000. The Helm install command automatically installs the endpoint picker and the InferencePool, along with provider-specific resources.
119+ 
120+    Set the chart version and then select a tab to follow the provider-specific instructions.
95121
96-       1.  Requirements 
122+    ```bash
123+    export IGW_CHART_VERSION=v0
124+    ```
97125
98-           - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
126+ --8<-- "site-src/_includes/epp-latest.md"
99127
100-        2. Install Istio 
128+ ### Deploy an Inference Gateway
101129
102-          ``` 
103-          TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev) 
104-          # on Linux 
105-          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz 
106-          tar -xvf istioctl-$TAG-linux-amd64.tar.gz 
107-          # on macOS 
108-          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz 
109-          tar -xvf istioctl-$TAG-osx.tar.gz 
110-          # on Windows 
111-          wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip 
112-          unzip istioctl-$TAG-win.zip 
130+    Choose one of the following options to deploy an Inference Gateway.
113131
114-          ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true 
115-          ``` 
132+ === "GKE"
116133
117-       3. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification: 
134+       1. Enable the Google Kubernetes Engine API, the Compute Engine API, and the Network Services API, and configure proxy-only subnets when necessary.
135+          See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway) 
136+          for detailed instructions. 
137+ 
138+ === "Istio"
139+ 
140+       Please note that this feature is currently in an experimental phase and is not intended for production use. 
141+       The implementation and user experience are subject to changes as we continue to iterate on this project. 
142+ 
143+       1. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification: 
118144
119145         ```bash 
120146         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml 
121147         ``` 
122148
123-       4. Deploy Gateway
149+       2. Deploy the Inference Gateway
124150
125151         ```bash 
126152         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml 
@@ -133,13 +159,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
133159         inference-gateway   inference-gateway   <MY_ADDRESS>    True         22s 
134160         ``` 
135161
136-       5. Deploy the HTTPRoute
162+       3. Deploy the HTTPRoute
137163
138164         ```bash 
139165         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml 
140166         ``` 
141167
142-       6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
168+       4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143169
144170         ```bash 
145171         kubectl get httproute llm-route -o yaml 
@@ -151,25 +177,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
151177      [conformant](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/conformance/reports/v1.0.0/gateway/kgateway) 
152178      gateway. Follow these steps to run Kgateway: 
153179
154-       1. Requirements 
155- 
156-          - [Helm](https://helm.sh/docs/intro/install/) installed. 
157-          - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
158- 
159-       2. Set the Kgateway version and install the Kgateway CRDs. 
160- 
161-          ```bash 
162-          KGTW_VERSION=v2.2.0-main 
163-          helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds 
164-          ``` 
165- 
166-       3. Install Kgateway 
167- 
168-          ```bash 
169-          helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true 
170-          ``` 
171- 
172-       4. Deploy the Gateway 
180+       1. Deploy the Inference Gateway 
173181
174182         ```bash 
175183         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml 
@@ -182,13 +190,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
182190         inference-gateway   kgateway            <MY_ADDRESS>    True         22s 
183191         ``` 
184192
185-       5. Deploy the HTTPRoute
193+       2. Deploy the HTTPRoute
186194
187195         ```bash 
188196         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml 
189197         ``` 
190198
191-       6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
199+       3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
192200
193201         ```bash 
194202         kubectl get httproute llm-route -o yaml 
@@ -200,25 +208,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
200208      Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane. Follow these steps to run Kgateway with the agentgateway
201209      data plane: 
202210
203-       1. Requirements 
204- 
205-          - [Helm](https://helm.sh/docs/intro/install/) installed. 
206-          - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. 
207- 
208-       2. Set the Kgateway version and install the Kgateway CRDs. 
209- 
210-          ```bash 
211-          KGTW_VERSION=v2.2.0-main 
212-          helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds 
213-          ``` 
214- 
215-       3. Install Kgateway 
216- 
217-          ```bash 
218-          helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true 
219-          ``` 
220- 
221-       4. Deploy the Gateway 
211+       1. Deploy the Inference Gateway 
222212
223213         ```bash 
224214         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml 
@@ -231,13 +221,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
231221         inference-gateway   agentgateway        <MY_ADDRESS>    True         22s 
232222         ``` 
233223
234-       5. Deploy the HTTPRoute
224+       2. Deploy the HTTPRoute
235225
236226         ```bash 
237227         kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml 
238228         ``` 
239229
240-       6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
230+       3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
241231
242232         ```bash 
243233         kubectl get httproute llm-route -o yaml 