[Feature Request]: Configuration Validation & Schema Enforcement using Pydantic V2 models, config validator cli flag #285

@crivetimihai

🧭 Epic — Configuration Validation & Schema Enforcement using Pydantic V2 models

| Field | Value |
|---|---|
| Title | Gateway-wide Configuration Validation & Schema Enforcement |
| Goal | Fail fast on any malformed setting, API payload, or DB row by validating against explicit schemas at every layer (startup, ingress, persistence). |
| Why now | Multi-tenant RBAC, LDAP sync, and soon-to-be-public app-templates dramatically raise the blast-radius of bad configs. Early rejection prevents privilege-escalation, data leakage, and “half-up” nodes. |
| Depends on | RBAC Multi-Tenancy epic (scope columns), LDAP Integration epic (user/group tables). |

Use Pydantic V2 for validation and models.

Adds a configuration validator tool to the mcpgateway CLI.

See also: helm values.schema.json
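
To make the proposed validator tool concrete, here is a minimal sketch of what such a command could look like. It is not the mcpgateway implementation: the `ToolConfig` model, its fields, and the function names are hypothetical, and it handles JSON only (YAML support would need an extra parser).

```python
import argparse
import json
import sys
from collections.abc import Sequence
from pathlib import Path
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class ToolConfig(BaseModel):
    """Hypothetical config shape; the real schema is not defined in this issue."""

    model_config = ConfigDict(extra="forbid")  # unknown keys are errors

    name: str
    url: str
    timeout_seconds: int = 30


def validate_text(text: str) -> list[str]:
    """Return one 'path: message' string per problem; an empty list means valid."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"<root>: invalid JSON ({exc.msg} at line {exc.lineno})"]
    try:
        ToolConfig.model_validate(data)
    except ValidationError as exc:
        problems = []
        for err in exc.errors():
            # err["loc"] is a tuple of keys/indices leading to the bad value.
            path = ".".join(str(part) for part in err["loc"]) or "<root>"
            problems.append(f"{path}: {err['msg']}")
        return problems
    return []


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Validate the file named on the command line; return a process exit code."""
    parser = argparse.ArgumentParser(description="Validate a gateway config file")
    parser.add_argument("config", type=Path)
    args = parser.parse_args(argv)
    problems = validate_text(args.config.read_text(encoding="utf-8"))
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0
```

A wrapper script could end with `sys.exit(main())` so that CI and `make` targets see a non-zero exit code whenever the file is invalid.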


🧭 Type of Feature

  • Security hardening
  • Reliability / DX
  • Developer tooling

🙋‍♂️ User Stories

| # | Persona & Need | Acceptance Criteria (summarised) |
|---|---|---|
| 1 | Platform engineer: “Start-up must fail if env vars are bad.” | • Launch with BAD_PORT=99999 exits ≠ 0 with stack-trace & clear message.<br>• Systemd Restart= sees non-zero exit. |
| 2 | API client: “Get precise error when sending invalid payload.” | POST /tools with bad JSONSchema returns 422 {"loc":["body","input_schema"],"msg":"Unexpected keyword"}. |
| 3 | DBA / Security: “Direct DB writes can’t bypass checks.” | INSERT INTO tools … name='bad name' raises CHECK constraint failed. |
| 4 | DevOps: “Validate config files before deploy.” | make validate-config CONFIG=tool.yml exits 0 only when YAML matches central JSONSchema. |
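
Story 1 can be illustrated with a small fail-fast sketch. It uses a plain Pydantic V2 `BaseModel` fed from an environment mapping so that it stays self-contained; the real change would presumably use `BaseSettings` instead. The `GatewaySettings` class, the `HOST` and `PORT` variable names, and the default values are assumptions made for illustration.

```python
import sys
from collections.abc import Mapping

from pydantic import BaseModel, Field, ValidationError


class GatewaySettings(BaseModel):
    """Hypothetical subset of gateway settings; names and defaults are illustrative."""

    host: str = "127.0.0.1"
    port: int = Field(default=4444, ge=1, le=65535)  # reject out-of-range ports


def load_settings(env: Mapping[str, str]) -> GatewaySettings:
    """Build settings from an environment-like mapping, validating every field."""
    raw = {key.lower(): value for key, value in env.items() if key in {"HOST", "PORT"}}
    return GatewaySettings.model_validate(raw)


def main(env: Mapping[str, str]) -> int:
    """Return a process exit code: 0 when the config is valid, 1 otherwise."""
    try:
        settings = load_settings(env)
    except ValidationError as exc:
        # The ValidationError text names each offending field and the violated rule.
        print(f"invalid configuration:\n{exc}", file=sys.stderr)
        return 1
    print(f"starting on {settings.host}:{settings.port}")
    return 0
```

Calling `sys.exit(main(os.environ))` from the entry point would give systemd the non-zero exit status that the acceptance criteria ask for.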

📐 Validation Pipeline

flowchart LR
    Start["Startup (config.py<br/>Pydantic Settings)"]
    API["Ingress (FastAPI + Pydantic models)"]
    DBHook["SQLAlchemy<br/>before_insert/update"]
    DB

    Start --> GatewayReady
    API --> DBHook --> DB

    style Start fill:#B7E4C7,stroke:#333
    style API  fill:#B6D7FF,stroke:#333
    style DBHook fill:#FFD6A5,stroke:#333

Green = process dies on failure • Blue = 4xx to caller • Orange = transaction rolled back
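
For the blue ingress stage, the sketch below shows how a strict Pydantic V2 model produces the `loc`/`msg` pairs that end up in a 4xx response. The `ToolCreate` model and the `validation_errors` helper are hypothetical names; the gateway's actual models in schemas.py may look quite different.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class ToolCreate(BaseModel):
    """Hypothetical request body for POST /tools."""

    # strict=True disables type coercion; extra="forbid" rejects unknown keys.
    model_config = ConfigDict(strict=True, extra="forbid")

    name: str
    input_schema: dict


def validation_errors(payload: dict) -> list[dict]:
    """Return [{'loc': [...], 'msg': '...'}] for a payload, or [] when it is valid."""
    try:
        ToolCreate.model_validate(payload)
    except ValidationError as exc:
        return [{"loc": list(err["loc"]), "msg": err["msg"]} for err in exc.errors()]
    return []
```

When such a model is used as a FastAPI request body, FastAPI returns the errors with status 422 under a `detail` key and prefixes each `loc` with `"body"`, which matches the shape quoted in user story 2.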


🔄 Roll-out Plan

| Phase | Flag | Deliverable |
|---|---|---|
| 0 | | Schema inventory of every model & env var (living in /schemas/). |
| 1 | EXPERIMENTAL_SCHEMA_STARTUP | Pydantic BaseSettings for all env vars; node crashes on invalid. |
| 2 | EXPERIMENTAL_SCHEMA_INGRESS | All POST/PUT payloads upgraded to strict Pydantic models. |
| 3 | EXPERIMENTAL_SCHEMA_DB | SQLAlchemy listeners + DB CHECK/JSON_VALID constraints. |
| 4 | on | Remove flags, block legacy lenient paths, publish docs. |
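
One way the phase flags could gate each validation stage is sketched below. The flag names come from the table above; the mapping to stage names and the set of accepted truthy strings are assumptions, since the issue does not say how the flags are parsed.

```python
from collections.abc import Mapping

# Flag names are taken from the roll-out table; the stage keys are illustrative.
STAGE_FLAGS = {
    "startup": "EXPERIMENTAL_SCHEMA_STARTUP",
    "ingress": "EXPERIMENTAL_SCHEMA_INGRESS",
    "db": "EXPERIMENTAL_SCHEMA_DB",
}

# Assumed convention for "enabled"; adjust to whatever the gateway settles on.
_TRUTHY = {"1", "true", "yes", "on"}


def enabled_stages(env: Mapping[str, str]) -> set[str]:
    """Return the validation stages whose feature flag is switched on in env."""
    return {
        stage
        for stage, flag in STAGE_FLAGS.items()
        if env.get(flag, "").strip().lower() in _TRUTHY
    }
```

Reading the flags through one helper like this keeps the phase 4 clean-up small: removing the flags means deleting the helper and treating every stage as enabled.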

🔧 Task Break-down

  1. Model & Spec Inventory

    • Enumerate tools, prompts, servers, gateways JSONSchemas (semver tag).
    • Store in schemas/ with CI diff-checker.
  2. Startup Validators

    • Replace ad-hoc os.getenv with Pydantic BaseSettings (in Pydantic V2 this class lives in the separate pydantic-settings package).
    • Add regex/enum validators for URLs, lists, ports.
  3. Ingress Validators

    • Promote all schemas.py models to strict mode (model_config = ConfigDict(strict=True) in Pydantic V2).
    • Custom validators: validate_json, validate_cron, merge_auth_fields.
  4. Persistence Guards

    • SQLAlchemy before_insert / before_update hooks call jsonschema.validate.
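    • DB CHECK constraints on input_schema, headers, etc. (json_valid(...) on SQLite, JSON_VALID(...) on MySQL; a Postgres jsonb column already rejects malformed JSON on write).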
    • Regex check on name columns.
  5. CLI / CI Tooling

    • scripts/validate_config.py (accepts YAML/JSON, prints error path).
    • GitHub Action fails PR if schema changes lack semver bump.
  6. Fuzz & Property Tests

    • Hypothesis tests feed random invalid payloads → expect 422.
    • Load 10 k valid rows → ensure validation adds no more than 5 % slow-down.
  7. Docs & Samples

    • “Writing a Tool JSONSchema” guide + playground link.
    • Redoc/OpenAPI auto-regenerated nightly.
  8. Monitoring

    • Counter config_validation_fail_total{stage=startup|ingress|db}.
    • Alert if ingress errors spike > 1 % in 5 min.
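
The persistence guard from task 4 (a check on name columns that direct writes cannot bypass) can be sketched with a database-level CHECK constraint. SQLite from the standard library stands in for the production database here, and the allowed character set and 64-character limit are assumptions; the issue does not specify the actual name rule.

```python
import sqlite3

# Hypothetical rule: 1 to 64 characters, each a letter, digit, or underscore.
# GLOB '*[^A-Za-z0-9_]*' matches any name containing a disallowed character.
DDL = """
CREATE TABLE tools (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
         CHECK (length(name) BETWEEN 1 AND 64
                AND name NOT GLOB '*[^A-Za-z0-9_]*')
)
"""


def insert_tool(conn: sqlite3.Connection, name: str) -> bool:
    """Insert a tool row; return False if the CHECK constraint rejects it."""
    try:
        # The connection context manager commits on success and
        # rolls the transaction back when an exception is raised.
        with conn:
            conn.execute("INSERT INTO tools (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        return False
    return True
```

Because the rule lives in the table definition, it applies to every writer, including ad-hoc SQL sessions that never pass through the API models.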

🔄 Alternatives Considered

| Option | Pros | Cons | Decision |
|---|---|---|---|
| Rely only on ingress validation | Simpler code | DB back-doors stay open | |
| Use SCIM 2.0 external schema registry | Standards-based | Overkill / new infra | 🔄 Future |
| “Trust clients” (no validation) | Zero dev effort | High risk | |

📓 Additional Context

  • Startup settings already partially use Pydantic; this formalises & extends.
  • JSONSchema draft-2020-12 chosen for forward compatibility (AJV & VS Code have good plug-ins).
  • Roll-back: disable flags + truncate validator_errors log table.

Delivering this epic means: every config mistake surfaces immediately, with a tight error message, before it can threaten tenant isolation or bring a node up in an undefined state.

Labels: enhancement, python, triage
