Commit c96fefd

committed
docs: update to documentation
- updated main README.md - add documentation for config guide
1 parent f8e58ff commit c96fefd

17 files changed

+3434
-183
lines changed

README.md

Lines changed: 111 additions & 183 deletions
Large diffs are not rendered by default.

docs/README.md

Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
1+
# OpenCenter Service Configuration Guides
2+
3+
This directory contains comprehensive configuration guides for all services available in the openCenter platform. Each guide provides detailed configuration examples, common pitfalls, troubleshooting steps, and best practices.
4+
5+
## Available Configuration Guides
6+
7+
### Core Infrastructure Services
8+
9+
| Service | Guide | Description |
|---------|-------|-------------|
| **Cert-manager** | [cert-manager-config-guide.md](cert-manager-config-guide.md) | TLS certificate management and automation |
| **Harbor** | [harbor-config-guide.md](harbor-config-guide.md) | Container registry with security scanning |
| **Keycloak** | [keycloak-config-guide.md](keycloak-config-guide.md) | Identity and access management |
| **Kyverno** | [kyverno-config-guide.md](kyverno-config-guide.md) | Kubernetes-native policy engine |
| **Longhorn** | [longhorn-config-guide.md](longhorn-config-guide.md) | Distributed block storage system |
| **MetalLB** | [metallb-config-guide.md](metallb-config-guide.md) | Load balancer for bare-metal clusters |
| **Sealed Secrets** | [sealed-secrets-config-guide.md](sealed-secrets-config-guide.md) | GitOps-friendly secret encryption |
| **Velero** | [velero-config-guide.md](velero-config-guide.md) | Backup and disaster recovery |
19+
20+
### Observability Stack
21+
22+
| Component | Guide | Description |
|-----------|-------|-------------|
| **Kube-Prometheus-Stack** | [kube-prometheus-stack-config-guide.md](kube-prometheus-stack-config-guide.md) | Complete monitoring with Prometheus, Grafana, Alertmanager |
| **Loki** | [loki-config-guide.md](loki-config-guide.md) | Log aggregation and storage system |
| **Tempo** | [tempo-config-guide.md](tempo-config-guide.md) | Distributed tracing backend |
| **OpenTelemetry** | [opentelemetry-kube-stack-config-guide.md](opentelemetry-kube-stack-config-guide.md) | Unified observability data collection |
28+
29+
## Guide Structure
30+
31+
Each configuration guide follows a consistent structure:
32+
33+
### 1. Overview
34+
Brief description of the service and its role in the Kubernetes cluster.
35+
36+
### 2. Key Configuration Choices
37+
Detailed examples of important configuration options with explanations of why specific choices were made.
38+
39+
### 3. Common Pitfalls
40+
Description of frequently encountered issues, their causes, and step-by-step solutions with verification commands.
41+
42+
### 4. Required Secrets
43+
Documentation of all secrets required by the service, including field descriptions and examples.
44+
45+
### 5. Verification
46+
Commands to verify the service is running correctly and functioning as expected.
47+
48+
### 6. Usage Examples
49+
Practical examples of common use cases and configuration patterns.
50+
51+
## Templates
52+
53+
### Service Documentation Templates
54+
55+
| Template | Purpose | Location |
|----------|---------|----------|
| **Service README Template** | Base template for service README files | [templates/service-readme-template.md](templates/service-readme-template.md) |
| **Configuration Guide Template** | Template for detailed configuration guides | [templates/service-config-guide-template.md](templates/service-config-guide-template.md) |
| **Service Standards Template** | Template for service standards documentation | [templates/service-standards-template.md](templates/service-standards-template.md) |
60+
61+
## Getting Started
62+
63+
1. **Choose Your Service**: Select the service you want to configure from the tables above
64+
2. **Read the Configuration Guide**: Follow the detailed configuration examples and explanations
65+
3. **Implement Configuration**: Apply the configurations to your cluster with appropriate customizations
66+
4. **Verify Deployment**: Use the verification steps to ensure the service is working correctly
67+
5. **Troubleshoot Issues**: Refer to the common pitfalls section if you encounter problems
68+
69+
## Best Practices
70+
71+
### Configuration Management
72+
- Use GitOps principles for all configuration changes
73+
- Store sensitive data in encrypted secrets (Sealed Secrets or SOPS)
74+
- Implement proper resource limits and requests
75+
- Follow security best practices for each service
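
As an illustration of the resource limits and requests item above, here is a minimal sketch of a Deployment that sets both. Every name, the image reference, and the CPU and memory values are hypothetical placeholders chosen for the example, not values taken from the openCenter services; size them from observed usage.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service            # hypothetical workload name
  namespace: example-namespace     # hypothetical namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
      - name: app
        image: registry.example.com/example-service:1.0.0   # placeholder image
        resources:
          requests:                # what the scheduler reserves for the pod
            cpu: 100m
            memory: 128Mi
          limits:                  # hard ceiling enforced at runtime
            memory: 256Mi
```

The sketch sets a memory limit but no CPU limit, which is one common choice; whether to cap CPU as well depends on the workload.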
76+
77+
### Monitoring and Observability
78+
- Enable monitoring for all services using the observability stack
79+
- Set up appropriate alerts for service health and performance
80+
- Implement proper logging and tracing for troubleshooting
81+
82+
### Security
83+
- Follow the principle of least privilege for RBAC
84+
- Use network policies to restrict traffic between services
85+
- Regularly update services and scan for vulnerabilities
86+
- Implement proper backup and disaster recovery procedures
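
To make the network policy item above concrete, the following sketch shows one possible pattern: deny all ingress in a namespace by default, then allow traffic to a single workload from the ingress controller's namespace. The namespace names, the `app: example-service` label, and the port are assumptions made for the example, and the policies only take effect if the cluster's CNI plugin enforces NetworkPolicy.

```yaml
# Deny all ingress traffic to every pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: example-namespace     # hypothetical namespace
spec:
  podSelector: {}                  # empty selector matches all pods
  policyTypes:
  - Ingress
---
# Allow the ingress controller's namespace to reach one workload on port 80.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress-controller
  namespace: example-namespace
spec:
  podSelector:
    matchLabels:
      app: example-service         # hypothetical pod label
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-nginx   # assumed controller namespace
    ports:
    - protocol: TCP
      port: 80
```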
87+
88+
### Maintenance
89+
- Regularly review and update configurations
90+
- Test backup and restore procedures
91+
- Monitor resource usage and scale as needed
92+
- Keep documentation up to date with configuration changes
93+
94+
## Contributing
95+
96+
When adding new services or updating existing ones:
97+
98+
1. Use the appropriate template from the `templates/` directory
99+
2. Follow the established structure and formatting
100+
3. Include comprehensive examples and troubleshooting information
101+
4. Test all configuration examples before documenting them
102+
5. Update this README to include the new service
103+
104+
## Support
105+
106+
For service-specific issues:
107+
1. Check the relevant configuration guide for troubleshooting steps
108+
2. Review the service's upstream documentation
109+
3. Check the service logs and Kubernetes events
110+
4. Consult the observability dashboards for metrics and alerts
111+
112+
For platform-wide issues, refer to the main [README](../README.md) and service standards documentation.

docs/cert-manager-config-guide.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
1+
# Cert-manager Configuration Guide
2+
3+
## Overview
4+
Cert-manager automates the management and issuance of TLS certificates from various issuing sources. It ensures certificates are valid and up-to-date, and attempts to renew certificates at a configured time before expiry.
5+
6+
## Key Configuration Choices
7+
8+
### Certificate Issuers
9+
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory

    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
```
25+
**Why**:
26+
- ClusterIssuer allows certificate issuance across all namespaces
27+
- Let's Encrypt provides free, automated certificates
28+
- HTTP01 challenge works with most ingress controllers
29+
30+
### DNS Challenge Configuration
31+
```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory

    privateKeySecretRef:
      name: letsencrypt-dns
    solvers:
    - dns01:
        cloudflare:

          apiTokenSecretRef:
            name: cloudflare-api-token
            key: api-token
```
50+
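**Why**: DNS01 challenges are required for wildcard certificates and also work for hosts that are not reachable from the public internet, because validation happens through DNS records rather than an inbound HTTP request.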
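
If both challenge types are needed, a single issuer can carry several solvers and pick between them with a `selector`. The sketch below assumes a hypothetical issuer name (`letsencrypt-mixed`) and zone (`example.com`), and reuses the Cloudflare secret name from this guide; treat it as one possible layout rather than the platform's configuration.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-mixed          # hypothetical issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-mixed
    solvers:
    # Names inside the listed zone are validated through DNS01.
    - selector:
        dnsZones:
        - example.com              # assumed zone
      dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-api-token
            key: api-token
    # A solver without a selector acts as the fallback for everything else.
    - http01:
        ingress:
          class: nginx
```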
51+
52+
### Certificate Resource
53+
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
  namespace: default
spec:
  secretName: example-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - example.com
  - www.example.com
```
68+
**Why**: Explicit certificate management provides fine-grained control over certificate lifecycle
69+
70+
## Common Pitfalls
71+
72+
### Certificate Stuck in Pending State
73+
**Problem**: Certificate remains in pending state and is never issued
74+
75+
**Solution**: Check the CertificateRequest and Order resources for detailed error messages
76+
77+
**Verification**:
78+
```bash
kubectl describe certificate <cert-name> -n <namespace>
kubectl get certificaterequest -n <namespace>
kubectl describe order <order-name> -n <namespace>
```
83+
84+
### HTTP01 Challenge Failures
85+
**Problem**: ACME HTTP01 challenges fail due to ingress misconfiguration
86+
87+
**Solution**: Ensure the ingress controller can route `/.well-known/acme-challenge/` paths to the cert-manager solver pods
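
One cause worth ruling out is an ingress class mismatch, where the controller never picks up the temporary solver Ingress. The sketch below is a variant of the HTTP01 issuer that sets `ingressClassName` instead of `class`; it assumes a cert-manager release that supports that field (introduced in the 1.12 series, as far as I know) and an IngressClass named `nginx`. The two fields are alternatives, so set only one of them.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          # Sets spec.ingressClassName on the solver Ingress instead of the
          # legacy kubernetes.io/ingress.class annotation used by `class`.
          ingressClassName: nginx  # assumed IngressClass name
```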
88+
89+
### Rate Limiting Issues
90+
**Problem**: Let's Encrypt rate limits prevent certificate issuance
91+
92+
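**Solution**: Use the Let's Encrypt staging environment while testing so that failed attempts do not count against production limits, and back off between retries instead of re-requesting immediately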
93+
94+
```bash
# Search the cert-manager logs for rate-limit errors
kubectl logs -n cert-manager deployment/cert-manager | grep -i "rate limit"
```
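
A staging issuer can mirror the production one and differ only in the ACME server URL. The sketch below uses a hypothetical name, `letsencrypt-staging`; certificates issued by the staging environment are not trusted by browsers, so it is meant for validating the issuance flow only.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging        # hypothetical issuer name
spec:
  acme:
    # Let's Encrypt staging endpoint, with much higher rate limits
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging    # separate account key from production
    solvers:
    - http01:
        ingress:
          class: nginx
```

Once issuance succeeds against staging, pointing the Certificate or Ingress annotation at `letsencrypt-prod` requests the trusted certificate.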
98+
99+
## Required Secrets
100+
101+
### DNS Provider API Tokens
102+
For DNS01 challenges, an API token for your DNS provider is required.
103+
104+
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token
  namespace: cert-manager
type: Opaque
stringData:
  api-token: your-cloudflare-api-token
```
114+
115+
**Key Fields**:
116+
- `api-token`: Cloudflare API token with Zone:Read and DNS:Edit permissions (required)
117+
118+
### ACME Account Private Key
119+
Automatically generated but can be pre-created for account portability
120+
121+
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: letsencrypt-prod
  namespace: cert-manager
type: Opaque
data:
  tls.key: <base64-encoded-private-key>
```
131+
132+
**Key Fields**:
133+
- `tls.key`: ACME account private key (automatically generated if not provided)
134+
135+
## Verification
136+
```bash
# Check cert-manager pods are running
kubectl get pods -n cert-manager

# Verify ClusterIssuer is ready
kubectl get clusterissuer

# Check certificate status
kubectl get certificates -A

# View certificate details
kubectl describe certificate <cert-name> -n <namespace>
```
149+
150+
## Usage Examples
151+
152+
### Automatic Certificate with Ingress Annotations
153+
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - example.com
    secretName: example-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80
```
177+
178+
### Wildcard Certificate
179+
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example-com
spec:
  secretName: wildcard-example-com-tls
  issuerRef:
    name: letsencrypt-dns
    kind: ClusterIssuer
  dnsNames:
  - "*.example.com"
  - example.com
```
193+
194+
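Certificate renewal is automatic. By default, cert-manager starts renewing once a certificate has one third of its lifetime remaining, which for a 90-day Let's Encrypt certificate is 30 days before expiry. Monitor certificate expiry dates and renewal events through Prometheus metrics and Kubernetes events.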
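
The renewal window can also be stated explicitly per certificate with the `renewBefore` field. The sketch below reuses the `example-tls` Certificate from this guide and pins the window to 30 days; the value is illustrative and should stay well below the issued certificate's lifetime.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
  namespace: default
spec:
  secretName: example-tls
  renewBefore: 720h                # start renewal 30 days before expiry
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - example.com
  - www.example.com
```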
