Commit 9a190eb

feat: add OpenTelemetry operator service

- Add OpenTelemetry operator v0.98.0 to observability namespace
- Include security-hardened configuration with:
  - High availability (2 replicas with PDB)
  - Non-root execution and read-only root filesystem
  - Resource limits and security contexts
  - cert-manager integration for webhook TLS
  - Prometheus monitoring enabled
- Follow openCenter GitOps standards and patterns
- Update main README.md with service documentation
- Add comprehensive service-specific README

Resolves: OpenTelemetry operator deployment for auto-instrumentation

1 parent d2e08ce commit 9a190eb

8 files changed, +407 -0 lines changed


README.md

Lines changed: 12 additions & 0 deletions
@@ -29,6 +29,7 @@ applications/
 | **kube-prometheus-stack** | Core Service | `observability` | Complete monitoring and alerting stack |
 | **metallb** | Core Service | `metallb-system` | Bare metal load balancer |
 | **olm** | Core Service | `olm` | Operator Lifecycle Manager |
+| **opentelemetry-operator** | Core Service | `observability` | OpenTelemetry operator for auto-instrumentation |
 | **sealed-secrets** | Core Service | `sealed-secrets` | Encrypted secrets management |
 | **velero** | Core Service | `velero` | Cluster backup and disaster recovery |
 | **alert-proxy** | Managed Service | `rackspace` | Rackspace alert aggregation |
@@ -103,6 +104,17 @@ applications/
   - Dependency resolution
   - Automatic updates
 
+#### **opentelemetry-operator**
+- **Purpose**: OpenTelemetry operator for auto-instrumentation and collector management
+- **Source**: OpenTelemetry Helm repository (`https://open-telemetry.github.io/opentelemetry-helm-charts`)
+- **Namespace**: `observability`
+- **Features**:
+  - Automatic OpenTelemetry instrumentation injection
+  - OpenTelemetry Collector deployment and management
+  - Custom resource definitions for OpenTelemetry configuration
+  - Webhook-based sidecar injection
+  - Multi-language auto-instrumentation support (Java, Node.js, Python, .NET, Go)
+
 #### **sealed-secrets**
 - **Purpose**: Encrypted secrets management
 - **Namespace**: `sealed-secrets`
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+# OpenTelemetry Operator
+
+The OpenTelemetry Operator is a Kubernetes operator that manages OpenTelemetry Collector instances and auto-instrumentation of workloads using OpenTelemetry instrumentation libraries.
+
+## Overview
+
+This service deploys the OpenTelemetry Operator in the `observability` namespace with security hardening and high availability configuration.
+
+## Features
+
+- **Auto-instrumentation**: Automatically inject OpenTelemetry instrumentation into applications
+- **Collector Management**: Deploy and manage OpenTelemetry Collector instances
+- **CRD Management**: Provides custom resources for OpenTelemetry configuration
+- **Webhook Support**: Admission webhooks for sidecar injection and validation
+
+## Configuration
+
+### Chart Information
+- **Chart**: `opentelemetry/opentelemetry-operator`
+- **Version**: `0.98.0`
+- **App Version**: `0.137.0`
+- **Repository**: https://open-telemetry.github.io/opentelemetry-helm-charts
+
+### Security Hardening
+
+The deployment includes the following security measures:
+
+- **Non-root execution**: All containers run as non-root user (65532)
+- **Read-only root filesystem**: Containers use read-only root filesystems
+- **Dropped capabilities**: All Linux capabilities are dropped
+- **Security profiles**: Uses RuntimeDefault seccomp profile
+- **Resource limits**: CPU and memory limits configured for all containers
+
+### High Availability
+
+- **Replica count**: 2 replicas for high availability
+- **Pod Disruption Budget**: Ensures minimum availability during disruptions
+- **Leader election**: Enabled to prevent split-brain scenarios
+- **Anti-affinity**: Distributes pods across nodes (when configured)
+
+### Monitoring
+
+- **ServiceMonitor**: Enabled for Prometheus metrics collection
+- **Metrics endpoint**: Exposes metrics on port 8080
+- **Health checks**: Readiness and liveness probes configured
+
+## Custom Resources
+
+The operator provides the following custom resources:
+
+- **OpenTelemetryCollector**: Manages collector deployments
+- **Instrumentation**: Configures auto-instrumentation for applications
+- **OpAMPBridge**: Manages OpAMP bridge instances
+
+## Dependencies
+
+- **cert-manager**: Required for TLS certificate management of admission webhooks
+- **Prometheus Operator**: Optional, for ServiceMonitor support
+
+## Usage
+
+After deployment, you can create OpenTelemetry resources:
+
+```yaml
+apiVersion: opentelemetry.io/v1alpha1
+kind: OpenTelemetryCollector
+metadata:
+  name: otel-collector
+  namespace: observability
+spec:
+  config: |
+    receivers:
+      otlp:
+        protocols:
+          grpc:
+            endpoint: 0.0.0.0:4317
+    processors:
+      batch:
+    exporters:
+      debug:
+        verbosity: detailed
+    service:
+      pipelines:
+        traces:
+          receivers: [otlp]
+          processors: [batch]
+          exporters: [debug]
+```
+
+## Troubleshooting
+
+### Common Issues
+
+1. **Webhook failures**: Ensure cert-manager is deployed and healthy
+2. **CRD conflicts**: Check for existing OpenTelemetry CRDs if upgrading
+3. **RBAC issues**: Verify cluster-admin permissions during installation
+
+### Useful Commands
+
+```bash
+# Check operator status
+kubectl get pods -n observability -l app.kubernetes.io/name=opentelemetry-operator
+
+# View operator logs
+kubectl logs -n observability -l app.kubernetes.io/name=opentelemetry-operator
+
+# Check CRDs
+kubectl get crd | grep opentelemetry
+
+# Verify webhooks
+kubectl get validatingwebhookconfiguration | grep opentelemetry
+kubectl get mutatingwebhookconfiguration | grep opentelemetry
+```
+
+## References
+
+- [OpenTelemetry Operator Documentation](https://opentelemetry.io/docs/kubernetes/operator/)
+- [Helm Chart Repository](https://github.com/open-telemetry/opentelemetry-helm-charts)
+- [OpenTelemetry Specification](https://opentelemetry.io/docs/specs/)
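
The service README lists `Instrumentation` among the custom resources but only shows a collector. A minimal sketch of the auto-instrumentation side could look like the following. The resource name `default-instrumentation`, the `demo` namespace, the `demo-app` Deployment, and its image are hypothetical. The exporter endpoint assumes the `otel-collector` example from the README, for which the operator would normally create a Service named `otel-collector-collector`. That example only enables the OTLP gRPC receiver, so the right port depends on the protocol the chosen language agent uses by default.

```yaml
# Sketch only: names, namespace, and image below are hypothetical.
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
  namespace: demo
spec:
  exporter:
    # Assumes the operator-created Service for the "otel-collector" example.
    endpoint: http://otel-collector-collector.observability.svc.cluster.local:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
      annotations:
        # Asks the pod webhook to inject the Node.js auto-instrumentation.
        instrumentation.opentelemetry.io/inject-nodejs: "true"
    spec:
      containers:
        - name: app
          image: registry.example.com/demo-app:1.0.0
```

The annotation belongs on the pod template, not on the Deployment's own metadata. Because the hardened values set the pod webhook's `failurePolicy` to `Ignore`, a pod created while the webhook is unavailable would start without instrumentation rather than being rejected.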
Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
+# Hardened values for OpenTelemetry Operator v0.98.0
+# Security-focused configuration following openCenter standards
+
+# High availability configuration
+replicaCount: 2
+
+# Pod Disruption Budget for high availability
+pdb:
+  create: true
+  minAvailable: 1
+
+# Manager configuration with security hardening
+manager:
+  image:
+    repository: ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator
+    imagePullPolicy: IfNotPresent
+
+  # Collector image configuration
+  collectorImage:
+    repository: ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-k8s
+    tag: 0.137.0
+
+  # Resource limits for production workloads
+  resources:
+    limits:
+      cpu: 200m
+      memory: 256Mi
+      ephemeral-storage: 100Mi
+    requests:
+      cpu: 100m
+      memory: 128Mi
+      ephemeral-storage: 50Mi
+
+  # Environment variables
+  env:
+    ENABLE_WEBHOOKS: "true"
+
+  # ServiceAccount configuration
+  serviceAccount:
+    create: true
+    annotations: {}
+
+  # Enable Prometheus monitoring
+  serviceMonitor:
+    enabled: true
+    extraLabels: {}
+    annotations: {}
+    metricsEndpoints:
+      - port: metrics
+
+  # Enable leader election for HA
+  leaderElection:
+    enabled: true
+
+  # Security context - hardened configuration
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop:
+        - ALL
+    runAsNonRoot: true
+    readOnlyRootFilesystem: true
+    seccompProfile:
+      type: RuntimeDefault
+
+# Kube RBAC Proxy configuration
+kubeRBACProxy:
+  enabled: true
+  image:
+    repository: quay.io/brancz/kube-rbac-proxy
+    tag: v0.19.1
+
+  # Resource limits
+  resources:
+    limits:
+      cpu: 100m
+      memory: 128Mi
+    requests:
+      cpu: 10m
+      memory: 64Mi
+
+  # Security context - hardened configuration
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop:
+        - ALL
+    runAsNonRoot: true
+    readOnlyRootFilesystem: true
+    seccompProfile:
+      type: RuntimeDefault
+
+# Admission webhooks configuration
+admissionWebhooks:
+  create: true
+  servicePort: 443
+  failurePolicy: Fail
+
+  # Pod injection policy
+  pods:
+    failurePolicy: Ignore
+
+  # Webhook timeout
+  timeoutSeconds: 10
+
+  # Use cert-manager for TLS certificates
+  certManager:
+    enabled: true
+    certificateAnnotations: {}
+    issuerAnnotations: {}
+
+  # Disable auto-generated certificates since we use cert-manager
+  autoGenerateCert:
+    enabled: false
+
+# CRDs management
+crds:
+  create: true
+
+# RBAC configuration
+role:
+  create: true
+
+clusterRole:
+  create: true
+
+# Node scheduling - Linux nodes only
+nodeSelector:
+  kubernetes.io/os: linux
+
+# Pod-level security context
+securityContext:
+  runAsGroup: 65532
+  runAsNonRoot: true
+  runAsUser: 65532
+  fsGroup: 65532
+
+# Service account token mounting
+automountServiceAccountToken: true
+
+# Test framework security hardening
+testFramework:
+  image:
+    repository: busybox
+    tag: latest
+
+  securityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop:
+        - ALL
+    runAsNonRoot: true
+    readOnlyRootFilesystem: true
+    seccompProfile:
+      type: RuntimeDefault
+
+  resources:
+    limits:
+      cpu: 100m
+      memory: 128Mi
+    requests:
+      cpu: 10m
+      memory: 64Mi
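
The service README mentions anti-affinity "when configured", but the hardened values do not set it. One way to add it is sketched below. It assumes the chart exposes a top-level `affinity` value, which is worth confirming against the chart's `values.yaml` for 0.98.0, and that operator pods carry the `app.kubernetes.io/name=opentelemetry-operator` label used in the README's commands.

```yaml
# Sketch for an override file; the top-level `affinity` key is an assumption.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: opentelemetry-operator
```

A preferred rule is used here rather than a required one so that both replicas can still be scheduled on a cluster with a single eligible node.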
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+---
+apiVersion: helm.toolkit.fluxcd.io/v2
+kind: HelmRelease
+metadata:
+  name: opentelemetry-operator
+  namespace: observability
+spec:
+  releaseName: opentelemetry-operator
+  interval: 5m
+  timeout: 10m
+  driftDetection:
+    mode: enabled
+  install:
+    remediation:
+      retries: 3
+      remediateLastFailure: true
+  upgrade:
+    remediation:
+      retries: 0
+      remediateLastFailure: false
+  targetNamespace: observability
+  chart:
+    spec:
+      chart: opentelemetry-operator
+      version: 0.98.0
+      sourceRef:
+        kind: HelmRepository
+        name: opentelemetry
+        namespace: observability
+  valuesFrom:
+    - kind: Secret
+      name: opentelemetry-operator-values-base
+      valuesKey: hardened.yaml
+    - kind: Secret
+      name: opentelemetry-operator-values-override
+      valuesKey: override.yaml
+      optional: true
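
The second `valuesFrom` entry is marked optional, so the release reconciles without it. If a cluster needs to adjust the base values, a Secret along these lines would be picked up. The replica count and PDB setting are purely illustrative and not part of this commit.

```yaml
# Sketch only: the values under override.yaml are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: opentelemetry-operator-values-override
  namespace: observability
type: Opaque
stringData:
  override.yaml: |
    replicaCount: 3
    pdb:
      minAvailable: 2
```

Flux merges `valuesFrom` entries in the order listed, so keys in `override.yaml` take precedence over the same keys in `hardened.yaml`.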
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+---
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+resources:
+  - "namespace.yaml"
+  - "source.yaml"
+  - "helmrelease.yaml"
+secretGenerator:
+  - name: opentelemetry-operator-values-base
+    type: Opaque
+    files:
+      - hardened.yaml=helm-values/hardened-values-v0.98.0.yaml
+    options:
+      disableNameSuffixHash: true
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: observability
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+---
+apiVersion: source.toolkit.fluxcd.io/v1
+kind: HelmRepository
+metadata:
+  name: opentelemetry
+spec:
+  url: https://open-telemetry.github.io/opentelemetry-helm-charts
+  interval: 1h
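
The `HelmRelease` looks up this repository in `observability` through `sourceRef.namespace`, yet neither this manifest nor the `kustomization.yaml` sets a namespace. If nothing upstream assigns one (for example a Flux `Kustomization` with `spec.targetNamespace`, which this commit does not show), the object would be created in whichever namespace the applier defaults to. A sketch that pins it explicitly, under that assumption:

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: opentelemetry
  # Added for illustration; matches sourceRef.namespace in the HelmRelease.
  namespace: observability
spec:
  url: https://open-telemetry.github.io/opentelemetry-helm-charts
  interval: 1h
```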
