Commit 5396689

Add Jaeger tracing integration to inferencepool chart
Signed-off-by: Cyclinder Kuo <[email protected]>
1 parent e4fe22d commit 5396689

File tree

4 files changed: +139 −1 lines changed

config/charts/inferencepool/Chart.yaml

Lines changed: 6 additions & 0 deletions

```diff
@@ -7,3 +7,9 @@ type: application
 version: 0.0.0

 appVersion: "0.0.0"
+
+dependencies:
+  - name: jaeger
+    version: "2.11.0"
+    repository: "https://jaegertracing.github.io/helm-charts"
+    condition: jaeger.enabled
```

config/charts/inferencepool/README.md

Lines changed: 87 additions & 0 deletions

@@ -237,6 +237,93 @@ inferenceExtension:

Make sure that the `otelExporterEndpoint` points to your OpenTelemetry collector endpoint.
Currently, only the `parentbased_traceidratio` sampler is supported. You can adjust the base sampling ratio using `samplerArg` (e.g., 0.1 means 10% of traces will be sampled).
#### Jaeger Tracing Backend

GAIE provides an opt-in Jaeger all-in-one deployment as a sub-chart for easy trace collection and visualization. This is particularly useful for development, testing, and understanding how inference requests are processed (filtered, scored) and forwarded to vLLM models.

**Quick Start with Jaeger:**

To install the InferencePool with Jaeger tracing enabled:

```bash
# Update Helm dependencies to fetch the Jaeger chart
helm dependency update ./config/charts/inferencepool

# Install with Jaeger enabled
helm install vllm-llama3-8b-instruct ./config/charts/inferencepool \
  --set inferencePool.modelServers.matchLabels.app=vllm-llama3-8b-instruct \
  --set inferenceExtension.tracing.enabled=true \
  --set jaeger.enabled=true
```
Or using a `values.yaml` file:

```yaml
inferenceExtension:
  tracing:
    enabled: true
    sampling:
      sampler: "parentbased_traceidratio"
      samplerArg: "1.0"  # 100% sampling for development

jaeger:
  enabled: true
```

Then install:

```bash
helm dependency update ./config/charts/inferencepool
helm install vllm-llama3-8b-instruct ./config/charts/inferencepool -f values.yaml
```
**Accessing Jaeger UI:**

Once deployed, you can access the Jaeger UI to visualize traces:

```bash
# Port-forward to access the Jaeger UI
kubectl port-forward svc/vllm-llama3-8b-instruct-jaeger-query 16686:16686

# Open a browser to http://localhost:16686
```

In the Jaeger UI, you can:

- Search for traces by service name (`gateway-api-inference-extension`)
- View detailed span information showing filter and scorer execution
- Analyze request routing decisions and latency
- Understand the complete inference request flow
**Configuration Options:**

The Jaeger sub-chart supports the following configuration:

| **Parameter Name** | **Description** | **Default** |
|---|---|---|
| `jaeger.enabled` | Enable Jaeger all-in-one deployment | `false` |
| `jaeger.allInOne.enabled` | Enable all-in-one deployment mode | `true` |
| `jaeger.allInOne.image.repository` | Jaeger all-in-one image repository | `jaegertracing/all-in-one` |
| `jaeger.allInOne.image.tag` | Jaeger image tag | `2.11` |
| `jaeger.allInOne.resources.limits` | Resource limits for the Jaeger pod | `cpu: 500m, memory: 512Mi` |
| `jaeger.allInOne.resources.requests` | Resource requests for the Jaeger pod | `cpu: 100m, memory: 128Mi` |
| `jaeger.query.service.type` | Jaeger UI service type | `ClusterIP` |
| `jaeger.query.service.port` | Jaeger UI port | `16686` |
| `jaeger.collector.service.otlp.grpc.port` | OTLP gRPC collector port | `4317` |
| `jaeger.storage.type` | Storage backend type (memory, elasticsearch, cassandra, etc.) | `memory` |
**Important Notes:**

1. **Development vs Production**: The all-in-one deployment uses in-memory storage and is suitable for development and testing. For production use, consider:
   - Using a persistent storage backend (Elasticsearch, Cassandra, etc.)
   - Deploying Jaeger components separately for better scalability
   - Referring to [Jaeger Production Deployment](https://www.jaegertracing.io/docs/latest/deployment/) for best practices

2. **Automatic Configuration**: When `jaeger.enabled=true`, the OTLP exporter endpoint is automatically configured to point to the Jaeger collector. You don't need to set `inferenceExtension.tracing.otelExporterEndpoint` manually.

3. **Sampling Rate**: For development, you may want to set `samplerArg: "1.0"` to capture all traces. For production, use a lower value such as `"0.1"` (10%) to reduce overhead.

4. **Resource Requirements**: Adjust the resource limits based on your trace volume and cluster capacity.
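The ratio-based decision behind `samplerArg` can be sketched in a few lines of Python. This mirrors the general approach of OpenTelemetry SDKs' `TraceIdRatioBased` sampler (compare the low 64 bits of the trace ID against a bound derived from the ratio); it is a simplified illustration, not the extension's actual implementation, and the `parentbased_` prefix additionally means child spans inherit their parent's decision:

```python
import random

def should_sample(trace_id: int, ratio: float) -> bool:
    """Approximate a TraceIdRatioBased decision: sample when the
    lower 64 bits of the trace ID fall below ratio * 2**64."""
    bound = round(ratio * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < bound

# ratio 1.0 samples every trace; 0.0 samples none
assert should_sample(random.getrandbits(128), 1.0)
assert not should_sample(random.getrandbits(128), 0.0)

# With samplerArg "0.1", roughly 10% of root traces are kept
ids = [random.getrandbits(128) for _ in range(100_000)]
rate = sum(should_sample(t, 0.1) for t in ids) / len(ids)
print(f"observed sampling rate: {rate:.3f}")  # close to 0.1
```

Because the decision is a pure function of the trace ID, every service that sees the same trace applies the same verdict, which keeps traces complete even under partial sampling.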
## Notes

This chart will only deploy an InferencePool and its corresponding EndpointPicker extension. Before installing the chart, please make sure that the inference extension CRDs are installed in the cluster. For more details, please refer to the [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).

config/charts/inferencepool/templates/epp-deployment.yaml

Lines changed: 4 additions & 0 deletions

```diff
@@ -114,7 +114,11 @@ spec:
         - name: OTEL_SERVICE_NAME
           value: "gateway-api-inference-extension"
         - name: OTEL_EXPORTER_OTLP_ENDPOINT
+          {{- if .Values.jaeger.enabled }}
+          value: "http://{{ .Release.Name }}-jaeger-collector:4317"
+          {{- else }}
           value: {{ .Values.inferenceExtension.tracing.otelExporterEndpoint | quote }}
+          {{- end }}
         - name: OTEL_TRACES_EXPORTER
           value: "otlp"
        - name: OTEL_RESOURCE_ATTRIBUTES_NODE_NAME
```
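The conditional above resolves the exporter endpoint at template-render time. Its effect can be mimicked in plain Python (a sketch for illustration only; the real logic lives in the Helm template):

```python
def otlp_endpoint(release_name: str,
                  jaeger_enabled: bool,
                  custom_endpoint: str = "http://localhost:4317") -> str:
    """Mirror the template's env selection: prefer the bundled Jaeger
    collector when jaeger.enabled is true, else the user-supplied endpoint."""
    if jaeger_enabled:
        return f"http://{release_name}-jaeger-collector:4317"
    return custom_endpoint

# The release name becomes part of the collector's Service DNS name
assert (otlp_endpoint("vllm-llama3-8b-instruct", True)
        == "http://vllm-llama3-8b-instruct-jaeger-collector:4317")
assert otlp_endpoint("demo", False) == "http://localhost:4317"
```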

config/charts/inferencepool/values.yaml

Lines changed: 42 additions & 1 deletion

```diff
@@ -58,6 +58,8 @@ inferenceExtension:
     enabled: false
   tracing:
     enabled: false
+    # When jaeger.enabled is true, this will automatically point to the Jaeger collector
+    # Otherwise, you can specify your own OpenTelemetry collector endpoint
     otelExporterEndpoint: "http://localhost:4317"
     sampling:
       sampler: "parentbased_traceidratio"
@@ -94,4 +96,43 @@ istio:
   trafficPolicy: {}
   #   connectionPool:
   #     http:
-  #       maxRequestsPerConnection: 256000
+  #       maxRequestsPerConnection: 256000
+
+# Jaeger tracing backend configuration
+# When enabled, deploys Jaeger all-in-one for trace collection and visualization
+jaeger:
+  enabled: false
+  # Use the all-in-one deployment mode for simplicity
+  # For production, consider using a more robust deployment with separate components
+  allInOne:
+    enabled: true
+    image:
+      repository: jaegertracing/all-in-one
+      tag: "2.11"
+      pullPolicy: IfNotPresent
+    resources:
+      limits:
+        cpu: 500m
+        memory: 512Mi
+      requests:
+        cpu: 100m
+        memory: 128Mi
+  # Expose Jaeger UI service
+  query:
+    service:
+      type: ClusterIP
+      port: 16686
+  # Collector configuration for OTLP
+  collector:
+    service:
+      otlp:
+        grpc:
+          port: 4317
+        http:
+          port: 4318
+  # Storage configuration - use in-memory for simplicity
+  storage:
+    type: memory
+  # Agent configuration
+  agent:
+    enabled: false
```
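As a sketch of the production direction the comments above point at, an override file could disable all-in-one mode in favor of a persistent backend. This is illustrative only: the top-level keys come from the defaults above, but any storage-specific settings must be verified against the upstream Jaeger chart's values schema before use.

```yaml
# values-prod.yaml -- hypothetical production-leaning overrides
jaeger:
  enabled: true
  allInOne:
    enabled: false        # run separate Jaeger components instead
  storage:
    type: elasticsearch   # persistent backend instead of in-memory
```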
