Labels: enhancement (New feature or request), python (Python / backend development (FastAPI)), triage (Issues / Features awaiting triage)
🧭 Epic — Configuration Validation & Schema Enforcement using Pydantic V2 models
Field | Value |
---|---|
Title | Gateway-wide Configuration Validation & Schema Enforcement |
Goal | Fail fast on any malformed setting, API payload, or DB row by validating against explicit schemas at every layer (startup, ingress, persistence). |
Why now | Multi-tenant RBAC, LDAP sync, and soon-to-be-public app-templates dramatically raise the blast-radius of bad configs. Early rejection prevents privilege-escalation, data leakage, and “half-up” nodes. |
Depends on | RBAC Multi-Tenancy epic (scope columns), LDAP Integration epic (user/group tables). |
Use Pydantic V2 for validation and models.
Adds a configuration validator tool to the mcpgateway CLI.
See also: helm values.schema.json
🧭 Type of Feature
- Security hardening
- Reliability / DX
- Developer tooling
🙋♂️ User Stories
# | Persona & Need | Acceptance Criteria (summarised) |
---|---|---|
1 | Platform engineer — “Start-up must fail if env vars are bad.” | • Launch with BAD_PORT=99999 exits ≠ 0 with stack-trace & clear message. • Systemd Restart= sees non-zero exit. |
2 | API client — “Get precise error when sending invalid payload.” | • POST /tools with bad JSONSchema returns 422 ➜ {"loc":["body","input_schema"],"msg":"Unexpected keyword"}. |
3 | DBA / Security — “Direct DB writes can’t bypass checks.” | • INSERT INTO tools … name='bad name' raises CHECK constraint failed. |
4 | DevOps — “Validate config files before deploy.” | • make validate-config CONFIG=tool.yml exits 0 only when YAML matches central JSONSchema. |
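User story 1 can be illustrated with a minimal sketch. It assumes a hypothetical GatewaySettings model with PORT and LOG_LEVEL variables; these names and defaults are illustrative and are not taken from the gateway's actual config.py. To stay dependency-light it validates a mapping of environment strings with a plain Pydantic V2 BaseModel rather than a settings class:

```python
import sys
from collections.abc import Mapping

from pydantic import BaseModel, Field, ValidationError


class GatewaySettings(BaseModel):
    """Hypothetical subset of gateway settings (names and defaults are assumptions)."""

    port: int = Field(default=4444, ge=1, le=65535)
    log_level: str = Field(default="INFO", pattern=r"^(DEBUG|INFO|WARNING|ERROR)$")


def load_settings(env: Mapping[str, str]) -> GatewaySettings:
    """Validate raw environment strings; exit with status 1 on any error."""
    # Pick up only the variables that correspond to a declared field.
    raw = {
        name: env[name.upper()]
        for name in GatewaySettings.model_fields
        if name.upper() in env
    }
    try:
        return GatewaySettings.model_validate(raw)
    except ValidationError as exc:
        # The message lists every offending field and the violated constraint.
        print(f"Invalid configuration:\n{exc}", file=sys.stderr)
        raise SystemExit(1) from exc
```

In a real entry point this would be called as load_settings(os.environ) before the server binds its socket, so that a supervisor such as systemd observes the non-zero exit.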
📐 Validation Pipeline
flowchart LR
Start["Startup (config.py<br/>Pydantic Settings)"]
API["Ingress (FastAPI + Pydantic models)"]
DBHook["SQLAlchemy<br/>before_insert/update"]
DB
Start --> GatewayReady
API --> DBHook --> DB
style Start fill:#B7E4C7,stroke:#333
style API fill:#B6D7FF,stroke:#333
style DBHook fill:#FFD6A5,stroke:#333
Green = process dies on failure • Blue = 4xx to caller • Orange = transaction rolled back
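The orange stage (a rejected write rolls the transaction back) might look like the sketch below. It uses SQLite from the Python standard library purely for illustration; the table layout and the allowed-name pattern are assumptions, and a Postgres deployment would need equivalent constraints written in its own dialect.

```python
import sqlite3

# Hypothetical table layout; only the CHECK constraints matter here.
DDL = """
CREATE TABLE tools (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
        CHECK (length(name) > 0 AND name NOT GLOB '*[^A-Za-z0-9_-]*'),
    input_schema TEXT NOT NULL CHECK (json_valid(input_schema))
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the guarded table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def insert_tool(conn: sqlite3.Connection, name: str, input_schema: str) -> bool:
    """Insert a row; return False if a CHECK constraint rejects it."""
    try:
        # The connection context manager commits on success and
        # rolls back if the block raises.
        with conn:
            conn.execute(
                "INSERT INTO tools (name, input_schema) VALUES (?, ?)",
                (name, input_schema),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the constraint lives in the schema, it also applies to writes that never pass through the application layer, which is the point of user story 3.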
🔄 Roll-out Plan
Phase | Flag | Deliverable |
---|---|---|
0 | – | Schema inventory of every model & env var (living in /schemas/). |
1 | EXPERIMENTAL_SCHEMA_STARTUP | Pydantic BaseSettings for all env vars; node crashes on invalid. |
2 | EXPERIMENTAL_SCHEMA_INGRESS | All POST/PUT payloads upgraded to strict Pydantic models. |
3 | EXPERIMENTAL_SCHEMA_DB | SQLAlchemy listeners + DB CHECK / JSON_VALID constraints. |
4 | on | Remove flags, block legacy lenient paths, publish docs. |
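As a rough sketch of what "strict Pydantic models" in Phase 2 could mean, the example below defines a hypothetical ToolCreate request body (the field names, including timeout_seconds, are assumptions for illustration). Strict mode disables type coercion, and extra="forbid" rejects unknown keys:

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ToolCreate(BaseModel):
    """Hypothetical POST /tools body; field names are illustrative."""

    # strict=True: no implicit coercion (e.g. "30" is not accepted as an int).
    # extra="forbid": unknown keys are validation errors.
    model_config = ConfigDict(strict=True, extra="forbid")

    name: str = Field(pattern=r"^[A-Za-z0-9_-]+$")
    input_schema: dict[str, Any]
    timeout_seconds: int = 30


def validation_errors(payload: dict[str, Any]) -> list[tuple[tuple, str]]:
    """Return (loc, error type) pairs, or an empty list if the payload is valid."""
    try:
        ToolCreate.model_validate(payload)
    except ValidationError as exc:
        return [(err["loc"], err["type"]) for err in exc.errors()]
    return []
```

When such a model is used as a FastAPI request body, the framework reports these errors as a 422 response and prefixes each loc with "body", which matches the shape quoted in user story 2.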
🔧 Task Break-down
- Model & Spec Inventory
  - Enumerate tools, prompts, servers, gateways JSONSchemas (semver tag).
  - Store in schemas/ with CI diff-checker.
- Startup Validators
  - Replace ad-hoc os.getenv with Pydantic BaseSettings (in Pydantic V2, provided by the pydantic-settings package).
  - Add regex/enum validators for URLs, lists, ports.
- Ingress Validators
  - Promote all schemas.py models to model_config = ConfigDict(strict=True).
  - Custom validators: validate_json, validate_cron, merge_auth_fields.
- Persistence Guards
  - SQLAlchemy before_insert / before_update hooks call jsonschema.validate.
  - Postgres CHECK (jsonb_valid(...)) on input_schema, headers, etc.
  - Regex check on name columns.
- CLI / CI Tooling
  - scripts/validate_config.py (accepts YAML/JSON, prints error path).
  - GitHub Action fails PR if schema changes lack semver bump.
- Fuzz & Property Tests
  - Hypothesis tests feed random invalid payloads → expect 422.
  - Load 10 k valid rows → ensure no slow-down > 5 %.
- Docs & Samples
  - “Writing a Tool JSONSchema” guide + playground link.
  - Redoc/OpenAPI auto-regenerated nightly.
- Monitoring
  - Counter config_validation_fail_total{stage=startup|ingress|db}.
  - Alert if ingress errors spike > 1 % in 5 min.
🔄 Alternatives Considered
Option | Pros | Cons | Decision |
---|---|---|---|
Rely only on ingress validation | Simpler code | DB back-doors stay open | ❌ |
Use SCIM 2.0 external schema registry | Standards-based | Overkill / new infra | 🔄 Future |
“Trust clients” (no validation) | Zero dev effort | High risk | ❌ |
📓 Additional Context
- Startup settings already partially use Pydantic; this formalises & extends.
- JSONSchema draft-2020-12 chosen for forward compatibility (AJV & VS Code have good plug-ins).
- Roll-back: disable flags + truncate validator_errors log table.
Delivering this epic means: every config mistake surfaces immediately, with a tight error message, before it can threaten tenant isolation or bring a node up in an undefined state.