Commit 5f7960f

Merge pull request #568 from seans3/vllm-deployment-update
For AI vLLM example, add more cloud provider specific information
2 parents 7021833 + a5b001f commit 5f7960f

2 files changed: +45, -2 lines changed

ai/vllm-deployment/README.md

Lines changed: 30 additions & 0 deletions
@@ -17,6 +17,7 @@ This example demonstrates how to deploy a server for AI inference using [vLLM](h
 - [Detailed Steps & Explanation](#detailed-steps--explanation)
 - [Verification / Seeing it Work](#verification--seeing-it-work)
 - [Configuration Customization](#configuration-customization)
+- [Platform-Specific Configuration](#platform-specific-configuration)
 - [Cleanup](#cleanup)
 - [Further Reading / Next Steps](#further-reading--next-steps)

@@ -116,6 +117,35 @@ Expected output (or similar):

 - Update `MODEL_ID` within the deployment manifest to serve a different model (ensure the Hugging Face access token has these permissions).
 - Change the number of `vLLM` pod replicas in the deployment manifest.
+
+---
+
+## Platform-Specific Configuration
+
+Node selectors ensure that vLLM pods land on nodes with the correct GPU, and they are the main difference among the cloud providers. The following are node selector examples for three cloud providers.
+
+- GKE
+  This `nodeSelector` uses labels that are specific to Google Kubernetes Engine.
+  - `cloud.google.com/gke-accelerator: nvidia-l4`: This label targets nodes equipped with a specific type of GPU, in this case the NVIDIA L4. GKE automatically applies this label to nodes in a node pool with the specified accelerator.
+  - `cloud.google.com/gke-gpu-driver-version: default`: This label ensures that the pod is scheduled on a node with the default, stable NVIDIA driver for its GKE version, which is automatically installed and managed by GKE.
+  ```yaml
+  nodeSelector:
+    cloud.google.com/gke-accelerator: nvidia-l4
+    cloud.google.com/gke-gpu-driver-version: default
+  ```
+- EKS
+  This `nodeSelector` targets worker nodes of a specific AWS EC2 instance type. The label `node.kubernetes.io/instance-type` is applied automatically by Kubernetes on AWS. In this example, `p4d.24xlarge` is used, an EC2 instance type equipped with powerful NVIDIA A100 GPUs, making it well suited for demanding AI workloads.
+  ```yaml
+  nodeSelector:
+    node.kubernetes.io/instance-type: p4d.24xlarge
+  ```
+- AKS
+  This example uses a commonly used but custom label, `agentpiscasi.com/gpu: "true"`. The label is not applied automatically by AKS and would typically be added by a cluster administrator to identify and target node pools that have GPUs attached.
+  ```yaml
+  nodeSelector:
+    agentpiscasi.com/gpu: "true" # Common label for AKS GPU nodes
+  ```
+
 ---

 ## Cleanup

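The AKS example above relies on a custom label that a cluster administrator attaches to GPU nodes by hand. As a minimal, illustrative sketch (the node name is hypothetical; the label is normally applied with `kubectl label node <node-name> agentpiscasi.com/gpu=true` rather than by editing the Node object), this is how that label would appear on a Node:

```yaml
# Illustrative only: a Node object carrying the custom AKS GPU label referenced
# above. The node name is hypothetical; administrators usually add the label with
# `kubectl label node <node-name> agentpiscasi.com/gpu=true` rather than editing
# the Node manifest directly.
apiVersion: v1
kind: Node
metadata:
  name: aks-gpunp-12345678-vmss000000  # hypothetical AKS GPU node name
  labels:
    agentpiscasi.com/gpu: "true"       # matches the nodeSelector in the AKS example
```

The vLLM pod's `nodeSelector` then matches this label, so the scheduler only considers nodes that carry it.
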
ai/vllm-deployment/vllm-deployment.yaml

Lines changed: 15 additions & 2 deletions
@@ -53,7 +53,20 @@ spec:
       - name: dshm
         emptyDir:
           medium: Memory
-      # GKE specific node selectors to ensure a particular (Nvidia L4) GPU.
+      # Node selectors are the main difference among the cloud providers,
+      # making sure vLLM pods land on Nodes with the correct GPU. The
+      # following are node selector examples for three cloud providers.
+      #
+      # - GKE
       # nodeSelector:
       #   cloud.google.com/gke-accelerator: nvidia-l4
-      #   cloud.google.com/gke-gpu-driver-version: latest
+      #   cloud.google.com/gke-gpu-driver-version: default
+      #
+      # - EKS
+      # nodeSelector:
+      #   node.kubernetes.io/instance-type: p4d.24xlarge
+      #
+      # - AKS
+      # nodeSelector:
+      #   agentpiscasi.com/gpu: "true" # Common label for AKS GPU nodes
+

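For orientation, here is a minimal sketch of where one of the commented-out selectors sits once uncommented in the Deployment's pod template. Everything except the `nodeSelector` labels and the `dshm` volume is an assumption for illustration (names, image, and the GPU request are not taken from the actual `vllm-deployment.yaml`):

```yaml
# Minimal sketch, not the actual manifest: illustrates where a provider-specific
# nodeSelector (here the GKE variant) lives in the pod template, alongside the
# dshm shared-memory volume shown in the diff above. Names, image, and the GPU
# request are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-deployment            # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm                    # assumed label
  template:
    metadata:
      labels:
        app: vllm
    spec:
      nodeSelector:                # swap in the EKS or AKS labels as needed
        cloud.google.com/gke-accelerator: nvidia-l4
        cloud.google.com/gke-gpu-driver-version: default
      containers:
      - name: vllm                 # assumed container name
        image: vllm/vllm-openai:latest   # assumed image
        resources:
          limits:
            nvidia.com/gpu: "1"    # request one GPU on the selected node
        volumeMounts:
        - name: dshm
          mountPath: /dev/shm      # vLLM benefits from a large shared-memory segment
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory
```

On EKS or AKS, only the `nodeSelector` block changes; the rest of the Deployment stays the same.
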