[SECURITY FEATURE]: Gateway-Level Input Validation & Output Sanitization (prevent traversal) #221

@crivetimihai

🧭 Epic

Title: Gateway-Level Input Validation & Output Sanitization
Goal: Add a first-class validation/sanitization layer to our MCP Gateway so that every inbound parameter (tool args, resource URIs, prompt vars) is validated and every outbound payload is sanitized before execution or delivery.
Why now: We've seen live PoCs where malicious strings were passed straight into shell commands or SQL queries. Building a reference implementation lets us battle-test these controls and ship an experimental proof-of-concept that we can upstream into the MCP spec as a formal security enhancement.


🧭 Type of Feature

  • Security hardening
  • New functionality (experimental)

🙋‍♂️ User Story 1 — Path Traversal Defense

As a: Platform security engineer
I want: the gateway to normalize and confine all resource paths to declared roots
So that: traversal payloads like ../../../etc/passwd are blocked before any file I/O.

✅ Acceptance Criteria

Scenario: Reject resource path traversal
Given MCP_GW_ROOT="/srv/data"
When a client requests "/srv/data/../../secret.txt"
Then respond 400 "invalid_path"
And MUST NOT read files outside "/srv/data"
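
A minimal sketch of the root-confinement check behind this scenario, assuming a single root and the Python standard library only. The names `confine_path` and `InvalidPath` are hypothetical and not existing gateway code; the caller would map `InvalidPath` to the 400 "invalid_path" response.

```python
from pathlib import Path


class InvalidPath(ValueError):
    """Requested path resolves outside the configured root (hypothetical type)."""


def confine_path(requested: str, root: str = "/srv/data") -> Path:
    """Return the normalized path if it stays inside `root`, else raise."""
    # resolve() makes the path absolute, collapses ".." segments and follows
    # symlinks that exist, so the comparison is done on the real location.
    root_path = Path(root).resolve()
    # Joining an absolute `requested` discards root_path, so relative and
    # absolute inputs both go through the same containment check below.
    candidate = (root_path / requested).resolve()
    if candidate != root_path and root_path not in candidate.parents:
        raise InvalidPath("invalid_path")
    return candidate
```

Comparing resolved paths rather than raw strings avoids prefix tricks such as "/srv/data-evil" matching "/srv/data". The check has to happen before any file is opened, and the file should then be opened via the returned path, not the original string.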

🙋‍♂️ User Story 2 — Dangerous-Sink Parameter Validation

As a: Tool developer
I want: the runtime to escape or reject shell/SQL metacharacters in parameters
So that: "bobbytables.jpg; cat /etc/passwd" cannot trigger command injection.

✅ Acceptance Criteria

Scenario: Prevent command injection via filename
Given tool "image.convert" shells out with a filename arg
When filename == "bobbytables.jpg; cat /etc/passwd"
Then the runtime MUST
  * escape the value per safe-exec rules OR
  * reject with 422 "validation_failed"
And no unintended command runs
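
One way the "reject" branch could look, sketched under assumptions: the allow-list regex, the `ValidationFailed` type, and the `convert` binary name are all placeholders, since the issue does not say what "image.convert" actually runs. The key points are an argument list with `shell=False` and a filename allow-list that also forbids a leading "-" so the value cannot be read as an option.

```python
import re
import subprocess
from typing import List

# Hypothetical allow-list: letters, digits, dot, underscore, hyphen,
# at most 255 characters, and never starting with "-" or ".".
SAFE_FILENAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9._-]{0,254}")


class ValidationFailed(ValueError):
    """Maps to a 422 "validation_failed" response (hypothetical type)."""


def validate_filename(filename: str) -> str:
    if not SAFE_FILENAME.fullmatch(filename):
        raise ValidationFailed("validation_failed")
    return filename


def build_convert_argv(src: str, dst: str) -> List[str]:
    # "convert" is a placeholder binary name, not taken from the issue.
    return ["convert", validate_filename(src), validate_filename(dst)]


def run_convert(src: str, dst: str) -> subprocess.CompletedProcess:
    # A list argv with shell=False means no shell ever parses the values,
    # so ";" or "|" inside an argument cannot start a second command.
    return subprocess.run(
        build_convert_argv(src, dst),
        shell=False,
        check=True,
        capture_output=True,
        timeout=30,
    )
```

With this shape, "bobbytables.jpg; cat /etc/passwd" is rejected by the allow-list before `subprocess.run` is reached, and even a value that slipped through would be passed as a single literal argument.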

🙋‍♂️ User Story 3 — Output Sanitization Guard

As a: Client integrator
I want: control chars & mismatched MIME types stripped or fixed on every response
So that: hostile escape sequences aren't fed back into UIs or LLMs.

✅ Acceptance Criteria

Scenario: Sanitize tool output
Given a tool returns text containing ASCII 0x1B
When the gateway serializes the JSON-RPC response
Then remove/encode unsafe control chars
And ensure Content-Type matches sanitized payload
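
A rough sketch of the "remove" option for this scenario, assuming the policy is to drop C0 controls and DEL while keeping tab, LF, and CR. That character set is an assumption, and `sanitize_text` is a hypothetical helper next to the `sanitize_output()` named in the design table. The Content-Type check is not covered here.

```python
import re
from typing import Any

# C0 controls except \t (0x09), \n (0x0A), \r (0x0D), plus DEL (0x7F).
_UNSAFE_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_text(text: str) -> str:
    """Drop unsafe control characters from a single string."""
    return _UNSAFE_CONTROLS.sub("", text)


def sanitize_output(payload: Any) -> Any:
    """Walk a JSON-like structure and sanitize every string value and key."""
    if isinstance(payload, str):
        return sanitize_text(payload)
    if isinstance(payload, list):
        return [sanitize_output(item) for item in payload]
    if isinstance(payload, dict):
        return {
            (sanitize_text(k) if isinstance(k, str) else k): sanitize_output(v)
            for k, v in payload.items()
        }
    return payload  # numbers, booleans, None pass through unchanged
```

Stripping has to happen before serialization: a JSON encoder will escape 0x1B as `\u001b`, but the client decodes it back into a live escape byte. Note that removing only the ESC byte leaves the printable tail of an ANSI sequence (for example "[31m") in the text, which is harmless but visible.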

📐 Design Sketch

flowchart TD
    subgraph ValidationLayer
        A[Inbound JSON-RPC] --> V{Validate<br/>JSONSchema, allow-list}
        V --✔--> H[Handler]
        V --✖--> E[HTTP 400 / 422]
    end
    H --> S[Sanitize Response]
    S --> O[Outbound JSON-RPC]

| Component / Area | Change | Detail |
|---|---|---|
| validation_middleware.py | NEW | Parse params; JSON Schema or regex allow-list; length & charset limits |
| Resource Service | UPDATE | read_resource() calls normalize_path() and confines the result to the allowed roots |
| Tool Exec Wrapper | UPDATE | subprocess.run(args=list, shell=False); escape or abort on shell metacharacters |
| Response Pipeline | NEW | sanitize_output() removes C0 controls & verifies MIME |
| Config | NEW | ALLOWED_ROOTS, VALIDATION_STRICT, SANITIZE_OUTPUT toggles |
| Docs | ADD | Validation rules table, escaping cookbook |
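
To make the validation_middleware.py row more concrete, here is a stdlib-only sketch of the regex allow-list plus length limit. The `Rule` type, `validate_params`, and the error strings are illustrative assumptions; a JSON Schema validator could fill the same slot.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Rule:
    """Per-parameter allow-list (hypothetical shape)."""
    pattern: str           # regex the whole value must match
    max_length: int = 256  # checked before the regex runs


def validate_params(params: Dict[str, Any], rules: Dict[str, Rule]) -> List[str]:
    """Return a list of violations; an empty list means the params pass."""
    errors: List[str] = []
    # Anything not declared in the rules is rejected, not ignored.
    for name in params:
        if name not in rules:
            errors.append(f"{name}: unexpected parameter")
    for name, rule in rules.items():
        value = params.get(name)
        if not isinstance(value, str):
            errors.append(f"{name}: missing or not a string")
        elif len(value) > rule.max_length:
            errors.append(f"{name}: longer than {rule.max_length} characters")
        elif not re.fullmatch(rule.pattern, value):
            errors.append(f"{name}: contains disallowed characters")
    return errors
```

Returning a list of violations rather than raising on the first one keeps the same function usable for both the log-only and the enforcing phases of the roll-out.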

🔄 Roll-out Plan

  1. Phase 0: Feature-flag via EXPERIMENTAL_VALIDATE_IO (off by default).
  2. Phase 1: Log-only "warn" mode in dev/staging.
  3. Phase 2: Enforce 4xx on violations in staging.
  4. Phase 3: Enable in prod; gather metrics & propose upstream spec text.
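
The phases above could be driven by a small mode switch like the sketch below. EXPERIMENTAL_VALIDATE_IO and VALIDATION_STRICT are the names used in this issue, but how they combine (flag off = disabled, flag on = warn, flag on plus strict = enforce) is an assumption made for illustration, as are `Mode`, `mode_from_env`, and `should_reject`.

```python
import logging
import os
from enum import Enum
from typing import List, Mapping, Optional

log = logging.getLogger("gateway.validation")
_TRUE = {"1", "true", "yes", "on"}


class Mode(Enum):
    OFF = "off"          # Phase 0: feature flag disabled
    WARN = "warn"        # Phase 1: log-only
    ENFORCE = "enforce"  # Phases 2-3: respond 4xx on violations


def mode_from_env(env: Optional[Mapping[str, str]] = None) -> Mode:
    """Derive the mode from environment-style settings (assumed semantics)."""
    env = os.environ if env is None else env
    if env.get("EXPERIMENTAL_VALIDATE_IO", "").strip().lower() not in _TRUE:
        return Mode.OFF
    if env.get("VALIDATION_STRICT", "").strip().lower() in _TRUE:
        return Mode.ENFORCE
    return Mode.WARN


def should_reject(errors: List[str], mode: Mode) -> bool:
    """Decide whether a request with these violations must be refused."""
    if not errors or mode is Mode.OFF:
        return False
    if mode is Mode.WARN:
        log.warning("validation violations (log-only): %s", errors)
        return False
    return True
```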

📝 Spec-Draft Clauses (to upstream later)

  1. Validation Clause – "Servers MUST treat all inbound values as untrusted and validate them against JSON Schema or allow-lists."
  2. Path-Safety Clause – "Resource paths MUST resolve inside configured roots; otherwise reject."
  3. Dangerous-Sink Clause – "Parameters passed to shells/SQL MUST be escaped or rejected."
  4. Output-Sanitization Clause – "Before emission, servers SHOULD strip control chars and MUST ensure MIME correctness."
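
For the SQL half of the Dangerous-Sink Clause, a small illustration using the standard-library `sqlite3` module. It uses parameter binding, which is treated here as one way to satisfy "escaped"; whether the clause should name binding explicitly is an open question for the spec text. The table, columns, and function names are made up for the example.

```python
import sqlite3
from typing import List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """In-memory database with one row, for demonstration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (name TEXT, owner TEXT)")
    conn.execute("INSERT INTO images VALUES (?, ?)", ("cat.jpg", "alice"))
    return conn


def find_images(conn: sqlite3.Connection, owner: str) -> List[Tuple[str]]:
    # The "?" placeholder sends `owner` as data, never as SQL text, so
    # quotes or keywords inside it cannot change the query structure.
    cursor = conn.execute("SELECT name FROM images WHERE owner = ?", (owner,))
    return cursor.fetchall()
```

With binding, a payload such as `alice' OR '1'='1` is compared literally against the `owner` column and simply matches no rows.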

📣 Next Steps

  • Spike the middleware + unit tests (see tests/security/test_validation.py).
  • Draft JSON Schemas for core built-in tools/resources.
  • Open a follow-up PR to toggle EXPERIMENTAL_VALIDATE_IO in CI.

Once merged, we'll share results with the MCP working groups and iterate on the spec language.


Labels

  • enhancement: New feature or request
  • experimental: Experimental features, test proposed MCP Specification changes
  • python: Python / backend development (FastAPI)
  • security: Improves security
  • triage: Issues / Features awaiting triage
