config/charts/inferencepool/README.md
87 additions & 0 deletions
@@ -237,6 +237,93 @@ inferenceExtension:
Make sure that the `otelExporterEndpoint` points to your OpenTelemetry collector endpoint.
Currently, only the `parentbased_traceidratio` sampler is supported. You can adjust the base sampling ratio with `samplerArg` (e.g., 0.1 means 10% of traces will be sampled).
#### Jaeger Tracing Backend
GAIE provides an opt-in Jaeger all-in-one deployment as a sub-chart for easy trace collection and visualization. This is particularly useful for development, testing, and understanding how inference requests are processed (filtered, scored) and forwarded to vLLM models.
**Quick Start with Jaeger:**
To install the InferencePool with Jaeger tracing enabled:
1. **Development vs Production**: The all-in-one deployment uses in-memory storage and is suitable for development and testing. For production use, consider:
- Using a persistent storage backend (Elasticsearch, Cassandra, etc.)
- Deploying Jaeger components separately for better scalability
- Following the [Jaeger Production Deployment](https://www.jaegertracing.io/docs/latest/deployment/) guide for best practices
2. **Automatic Configuration**: When `jaeger.enabled=true`, the OTLP exporter endpoint is automatically configured to point to the Jaeger collector. You don't need to manually set `inferenceExtension.tracing.otelExporterEndpoint`.
3. **Sampling Rate**: For development, you may want to set `samplerArg: "1.0"` to capture all traces. For production, use a lower value like `"0.1"` (10%) to reduce overhead.
4. **Resource Requirements**: Adjust the resource limits based on your trace volume and cluster capacity.
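
The notes above can be sketched as a small values override for a development setup. This is an illustrative fragment, not the chart's documented defaults: only `jaeger.enabled`, `inferenceExtension.tracing.otelExporterEndpoint`, `samplerArg`, and the `parentbased_traceidratio` sampler name come from this README. The `enabled` flag under `tracing` and the `sampling` nesting are assumptions, so check the chart's `values.yaml` before relying on them.

```yaml
# Hypothetical development override. Key nesting is assumed, not confirmed by this README.
jaeger:
  enabled: true            # opt in to the Jaeger all-in-one sub-chart

inferenceExtension:
  tracing:
    enabled: true          # assumed flag name for turning tracing on
    # otelExporterEndpoint is intentionally left unset: per note 2, it is
    # configured automatically when jaeger.enabled=true.
    sampling:              # assumed nesting for the sampler settings
      sampler: "parentbased_traceidratio"   # the only supported sampler
      samplerArg: "1.0"    # capture every trace in development; use e.g. "0.1" in production
```

A file like this would typically be passed to Helm with the `-f` flag at install or upgrade time. For production, the same file with `jaeger.enabled: false` and an explicit `otelExporterEndpoint` pointing at a separately deployed collector would match note 1.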
## Notes
This chart deploys only an InferencePool and its corresponding EndpointPicker extension. Before installing the chart, make sure that the inference extension CRDs are installed in the cluster. For more details, refer to the [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).