🧭 Epic
Title: Gateway-Level Input Validation & Output Sanitization
Goal: Add a first-class validation/sanitization layer to our MCP Gateway so that every inbound parameter (tool args, resource URIs, prompt vars) is validated before execution and every outbound payload is sanitized before delivery.
Why now: We've seen live PoCs where malicious strings were passed straight into shell commands or SQL queries. Building a reference implementation lets us battle-test these controls and ship an experimental proof-of-concept that we can propose upstream to the MCP spec as a formal security enhancement.
🧭 Type of Feature
- Security hardening
- New functionality (experimental)
🙋‍♂️ User Story 1 — Path Traversal Defense
As a: Platform security engineer
I want: the gateway to normalize and confine all resource paths to declared roots
So that: traversal payloads like ../../../etc/passwd are blocked before any file I/O.
✅ Acceptance Criteria
Scenario: Reject resource path traversal
Given MCP_GW_ROOT="/srv/data"
When a client requests "/srv/data/../../secret.txt"
Then respond 400 "invalid_path"
And MUST NOT read files outside "/srv/data"
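A minimal sketch of the root-confinement check described in this scenario, assuming Python 3.9+ and a single root. The names `confine_path` and `InvalidPathError` are hypothetical and not part of the gateway today; the caller would map the exception to the 400 "invalid_path" response.

```python
from pathlib import Path


class InvalidPathError(ValueError):
    """Raised when a requested path resolves outside the configured root."""


def confine_path(requested: str, root: str = "/srv/data") -> Path:
    """Resolve `requested` and return it only if it stays inside `root`."""
    root_path = Path(root).resolve()
    # Relative requests are interpreted relative to the root; absolute
    # requests replace it. resolve() collapses ".." and follows symlinks.
    candidate = (root_path / requested).resolve()
    if not candidate.is_relative_to(root_path):  # Python 3.9+
        raise InvalidPathError("invalid_path")
    return candidate
```

Resolving before comparing matters here: a plain string-prefix check would accept "/srv/data/../../secret.txt" because it starts with the root.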
🙋‍♂️ User Story 2 — Dangerous-Sink Parameter Validation
As a: Tool developer
I want: the runtime to escape or reject shell/SQL metas in parameters
So that: "bobbytables.jpg; cat /etc/passwd" cannot trigger command injection.
✅ Acceptance Criteria
Scenario: Prevent command injection via filename
Given tool "image.convert" shells out with a filename arg
When filename == "bobbytables.jpg; cat /etc/passwd"
Then the runtime MUST
* escape the value per safe-exec rules OR
* reject with 422 "validation_failed"
And no unintended command runs
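One way the "reject" branch could look, as a sketch only. The allow-list pattern, the helper names, and the `convert` command line are assumptions for illustration; the real `image.convert` tool may invoke something different.

```python
import re
import subprocess

# Assumed allow-list: letters, digits, dot, underscore, hyphen; max 255 chars.
SAFE_FILENAME = re.compile(r"[A-Za-z0-9._-]{1,255}")


class ValidationFailed(ValueError):
    """Maps to the 422 "validation_failed" response."""


def validate_filename(filename: str) -> str:
    # A leading "-" is rejected so the value cannot be read as a CLI option.
    if filename.startswith("-") or not SAFE_FILENAME.fullmatch(filename):
        raise ValidationFailed("validation_failed")
    return filename


def build_convert_argv(filename: str) -> list[str]:
    # Hypothetical command line; each argument is its own list element.
    return ["convert", validate_filename(filename), "out.png"]


def run_tool(argv: list[str]) -> subprocess.CompletedProcess:
    # shell=False: argv goes to the program directly, so ";" and "|" are
    # never interpreted by a shell even if validation were bypassed.
    return subprocess.run(argv, shell=False, check=True,
                          capture_output=True, text=True)
```

With this in place, `build_convert_argv("bobbytables.jpg; cat /etc/passwd")` raises `ValidationFailed` before any process is spawned.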
🙋‍♂️ User Story 3 — Output Sanitization Guard
As a: Client integrator
I want: control chars & mismatched MIME types stripped or fixed on every response
So that: hostile escape sequences aren't fed back into UIs or LLMs.
✅ Acceptance Criteria
Scenario: Sanitize tool output
Given a tool returns text containing ASCII 0x1B
When the gateway serializes the JSON-RPC response
Then remove/encode unsafe control chars
And ensure Content-Type matches sanitized payload
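A rough sketch of the control-character part of this scenario (the MIME check is not covered). Keeping tab, LF and CR while stripping the other C0 controls, DEL and the C1 range is an assumption on our side; the final rule set is still open.

```python
import re

# Assumed policy: drop C0 controls except tab (0x09), LF (0x0A), CR (0x0D),
# plus DEL (0x7F) and the C1 range (0x80-0x9F).
_UNSAFE_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def sanitize_output(text: str) -> str:
    """Remove unsafe control characters from tool output."""
    return _UNSAFE_CONTROLS.sub("", text)


# ESC (0x1B) is removed; the printable remainder of the sequence stays.
print(repr(sanitize_output("ok\x1b[31mred\x1b[0m\n")))  # 'ok[31mred[0m\n'
```

Note that stripping only the ESC byte leaves fragments such as "[31m" in the text; removing whole ANSI sequences would need a separate pass.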
📐 Design Sketch
flowchart TD
subgraph ValidationLayer
A[Inbound JSON-RPC] --> V{Validate<br/>JSONSchema, allow-list}
V --✔--> H[Handler]
V --✖--> E[HTTP 400 / 422]
end
H --> S[Sanitize Response]
S --> O[Outbound JSON-RPC]
Component / Area | Change | Detail |
---|---|---|
validation_middleware.py | NEW | Parse params; JSON-Schema or regex allow-list; length & charset limits |
Resource Service | UPDATE | read_resource() → normalize_path(), root-confine |
Tool Exec Wrapper | UPDATE | subprocess.run(args=list, shell=False); escape or abort on metas |
Response Pipeline | NEW | sanitize_output() removes C0 controls & verifies MIME |
Config | NEW | ALLOWED_ROOTS, VALIDATION_STRICT, SANITIZE_OUTPUT toggles |
Docs | ADD | Validation rules table, escaping cookbook |
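To make the validation_middleware.py row more concrete, here is a stdlib-only sketch of the regex allow-list variant with length and charset limits. `ParamRule`, `RULES`, `validate_params` and the two example parameters are hypothetical; a JSON-Schema-based version would replace the rule table.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRule:
    max_length: int
    pattern: re.Pattern  # the whole value must match this allow-list


# Hypothetical per-parameter rules; real ones would come from tool schemas.
RULES = {
    "filename": ParamRule(255, re.compile(r"[A-Za-z0-9._-]+")),
    "format": ParamRule(8, re.compile(r"png|jpg|webp")),
}


def validate_params(params: dict) -> list[str]:
    """Return a list of violations; an empty list means the params pass."""
    errors = []
    for name, value in params.items():
        rule = RULES.get(name)
        if rule is None:
            errors.append(f"{name}: unknown parameter")
        elif not isinstance(value, str):
            errors.append(f"{name}: expected string")
        elif len(value) > rule.max_length:
            errors.append(f"{name}: too long")
        elif not rule.pattern.fullmatch(value):
            errors.append(f"{name}: disallowed characters")
    return errors
```

Returning a list instead of raising on the first problem lets the same function serve both a log-only mode and an enforcing mode.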
🔄 Roll-out Plan
- Phase 0: Feature-flag via EXPERIMENTAL_VALIDATE_IO (off by default).
- Phase 1: Log-only "warn" mode in dev/staging.
- Phase 2: Enforce 4xx on violations in staging.
- Phase 3: Enable in prod; gather metrics & propose upstream spec text.
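The phases could be driven by one switch, sketched below. Treating EXPERIMENTAL_VALIDATE_IO as a three-valued setting ("off", "warn", "enforce") is purely an assumption for illustration; the plan above only states that the flag exists and is off by default, and both function names are hypothetical.

```python
import logging
import os
from typing import Mapping

log = logging.getLogger("mcp_gateway.validation")

_MODES = {"off", "warn", "enforce"}  # assumed values, not decided yet


def validation_mode(env: Mapping[str, str] = os.environ) -> str:
    """Read the flag; unknown or missing values fall back to "off"."""
    raw = env.get("EXPERIMENTAL_VALIDATE_IO", "off").strip().lower()
    return raw if raw in _MODES else "off"


def should_reject(mode: str, violation: str) -> bool:
    """Log in warn/enforce mode; only enforce mode rejects the request."""
    if mode == "off":
        return False
    log.warning("validation violation: %s", violation)
    return mode == "enforce"
```

Under this assumption, Phase 1 corresponds to "warn" in dev/staging, and Phases 2 and 3 to "enforce" in staging and then prod.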
📝 Spec-Draft Clauses (to upstream later)
- Validation Clause – "Servers MUST treat all inbound values as untrusted and validate them against JSON Schema or allow-lists."
- Path-Safety Clause – "Resource paths MUST resolve inside configured roots; otherwise reject."
- Dangerous-Sink Clause – "Parameters passed to shells/SQL MUST be escaped or rejected."
- Output-Sanitization Clause – "Before emission, servers SHOULD strip control chars and MUST ensure MIME correctness."
📣 Next Steps
- Spike the middleware + unit tests (see tests/security/test_validation.py).
- Draft JSON Schemas for core built-in tools/resources.
- Open a follow-up PR to toggle EXPERIMENTAL_VALIDATE_IO in CI.
Once merged, we'll share results with the MCP working groups and iterate on the spec language.