| 1 | +# End-to-End Encryption (E2EE) Design — Option A Sign-off |
| 2 | + |
| 3 | +Status: Draft for review |
| 4 | +Owner: Kyaw |
| 5 | + |
| 6 | +## 1. Scope and Goals |
| 7 | +- Web-first PWA with future native iOS/Android. |
| 8 | +- Protect messages, files/assets (3D/logo/media), docs, and analytics events. |
| 9 | +- Privacy-preserving analytics computed client-side; the server side is limited to minimized metadata.
| 10 | +- Retrofit behind feature flags; backward-compatible bridging for non-E2EE content. |
| 11 | + |
| 12 | +## 2. Threat Model |
| 13 | +- Adversaries: curious/compromised servers, external attackers (MITM), malicious clients, stolen devices. |
| 14 | +- Trust boundaries: Only clients see plaintext/keys. Servers handle ciphertext, minimal metadata, and capability tokens. IdP proves identity only. |
| 15 | +- Security goals: Confidentiality & integrity; forward secrecy and post-compromise security; deniable auth for messages; verifiable membership/state. |
| 16 | +- Out of scope initially: traffic analysis resistance; plaintext content scanning; hardware tamper resistance beyond platform enclaves.
| 17 | + |
| 18 | +## 3. Cryptographic Primitives and Libraries |
| 19 | +- Ed25519 (signing) for identities and devices. |
| 20 | +- X25519 (ECDH) for key agreement; sealed boxes for key wrapping (HPKE-ready abstraction). |
| 21 | +- Messaging: Signal/Double Ratchet via libsignal-client (WASM) for 1:1/small groups. |
| 22 | +- Large groups: Signal sender keys with periodic rotation; MLS on roadmap. |
| 23 | +- Files: AES-256-GCM streaming with HKDF-derived per-chunk nonces; BLAKE3 for chunk and whole-file digests. |
| 24 | +- KDF: HKDF-SHA256; Password KDF: Argon2id (high-memory, salted). |
| 25 | +- Hashing: BLAKE3 for content addressing and integrity. |
| 26 | + |
| 27 | +## 4. Identity & Device State Machines |
| 28 | +### 4.1 Identity Keys |
| 29 | +States: uninitialized -> generated -> backed_up (optional) -> compromised (revoked)
| 30 | +Transitions: |
| 31 | +- generate: create Ed25519 identity key pair |
| 32 | +- backup: wrap private key with Argon2id-derived KEK; store vault in IndexedDB |
| 33 | +- revoke: mark identity compromised; re-enroll devices |
| 34 | + |
| 35 | +### 4.2 Device Enrollment |
| 36 | +States: new -> pending_attestation -> verified -> revoked |
| 37 | +Transitions: |
| 38 | +- new: device generates Ed25519 (sign) + X25519 (DH) |
| 39 | +- provision: QR shows {device_pubkeys, nonce}; trusted device scans and verifies SAS |
| 40 | +- attest: trusted device signs attestation binding device to identity |
| 41 | +- verify: server records attestation; device becomes verified |
| 42 | +- revoke: immediate revocation; triggers rotations |
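The enrollment flow in 4.2 can be sketched as a small state machine. This is a minimal illustration, not the project's implementation: the `Device` class and `apply` method are hypothetical names, and the mapping of `provision` and `attest` onto `pending_attestation` is an assumption, since the section lists transitions without naming their source and target states.

```python
from enum import Enum


class DeviceState(Enum):
    NEW = "new"
    PENDING_ATTESTATION = "pending_attestation"
    VERIFIED = "verified"
    REVOKED = "revoked"


class Device:
    """Hypothetical model of one device moving through enrollment."""

    def __init__(self) -> None:
        self.state = DeviceState.NEW  # keys were generated in the "new" step
        self.attested = False         # set once a trusted device signs the binding

    def apply(self, transition: str) -> DeviceState:
        if transition == "revoke" and self.state is not DeviceState.REVOKED:
            # Revocation is allowed from any live state.
            self.state = DeviceState.REVOKED
        elif transition == "provision" and self.state is DeviceState.NEW:
            self.state = DeviceState.PENDING_ATTESTATION
        elif transition == "attest" and self.state is DeviceState.PENDING_ATTESTATION:
            self.attested = True
        elif (
            transition == "verify"
            and self.state is DeviceState.PENDING_ATTESTATION
            and self.attested
        ):
            # The server records the attestation; the device becomes verified.
            self.state = DeviceState.VERIFIED
        else:
            raise ValueError(f"illegal transition {transition!r} from {self.state.value}")
        return self.state
```

Rejecting `verify` before `attest` keeps the server from marking a device verified without a signed binding to the identity.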
| 43 | + |
| 44 | +## 5. Messaging Sessions |
| 45 | +- 1:1 and small groups: Double Ratchet with prekeys from libsignal. |
| 46 | +- Device revocation: peers refuse messages from revoked devices. |
| 47 | +- Group sender keys: per-room sender key rotated on membership change and periodically; per-recipient key wraps. |
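The rotation rule for group sender keys (on membership change and periodically) could be expressed as a simple predicate. This is only a sketch: `SenderKeyEpoch`, `should_rotate`, and the seven-day default are hypothetical, since the document does not state a rotation interval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenderKeyEpoch:
    epoch: int
    created_at: float  # Unix seconds


def should_rotate(
    current: SenderKeyEpoch,
    now: float,
    membership_changed: bool,
    max_age_s: float = 7 * 24 * 3600.0,  # assumed interval, not from the design
) -> bool:
    """Rotate on any membership change, or once the key exceeds max_age_s."""
    return membership_changed or (now - current.created_at) >= max_age_s
```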
| 48 | + |
| 49 | +## 6. File Encryption and Sharing |
| 50 | +### 6.1 Streaming Encryption |
| 51 | +- Per-file random DEK (256-bit). |
| 52 | +- Chunk size 512KB–2MB (adaptive). |
| 53 | +- Nonce derivation: nonce_i = HKDF(DEK, info="file-chunk" || chunk_index)[0..12], i.e. the first 12 bytes of output (the 96-bit GCM nonce).
| 54 | +- AES-256-GCM over each chunk; produce per-chunk BLAKE3 and cumulative whole-file BLAKE3. |
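The nonce derivation above can be illustrated with a standard-library HKDF-SHA256 (RFC 5869). This is a sketch under stated assumptions: the document does not specify a salt or how `chunk_index` is encoded, so the all-zero default salt and the 8-byte big-endian index are choices made here for illustration. The function names are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF-SHA256 per RFC 5869: extract a PRK, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF")
    if not salt:
        salt = bytes(hash_len)  # RFC 5869 default: HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def chunk_nonce(dek: bytes, chunk_index: int) -> bytes:
    """Derive the 12-byte AES-GCM nonce for one chunk.

    Assumption: chunk_index is encoded as 8 bytes, big-endian.
    """
    info = b"file-chunk" + chunk_index.to_bytes(8, "big")
    return hkdf_sha256(dek, info, 12)
```

A fixed-width index encoding matters here: it guarantees that two different indices never produce the same `info` string, so nonces do not repeat under one DEK.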
| 55 | + |
| 56 | +### 6.2 Manifest Format (signed by device Ed25519) |
| 57 | +``` |
| 58 | +version: 1 |
| 59 | +algo: aes-256-gcm |
| 60 | +chunk_size: <bytes> |
| 61 | +length: <bytes> |
| 62 | +blake3_file: <hex> |
| 63 | +chunks: |
| 64 | + - index: 0 |
| 65 | + offset: 0 |
| 66 | + size: <bytes> |
| 67 | + blake3: <hex> |
| 68 | + - ... |
| 69 | +key_wraps: omitted in manifest; stored adjacent by object_id |
| 70 | +sig: ed25519_sign(signing_device_privkey, canonical_json(manifest_without_sig))
| 71 | +``` |
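To make the manifest layout concrete, the sketch below computes the `chunks` entries (without digests) and the bytes that would be signed. It is illustrative only: `chunk_layout` and `signing_bytes` are hypothetical names, and sorted-key compact JSON is an assumed stand-in for whatever canonical JSON scheme the project adopts. BLAKE3 digests and the Ed25519 signature are left to dedicated libraries.

```python
import json


def chunk_layout(length: int, chunk_size: int) -> list[dict]:
    """Return index/offset/size for each chunk of a file of `length` bytes."""
    if length < 0 or chunk_size <= 0:
        raise ValueError("length must be >= 0 and chunk_size must be > 0")
    return [
        {"index": i, "offset": off, "size": min(chunk_size, length - off)}
        for i, off in enumerate(range(0, length, chunk_size))
    ]


def signing_bytes(manifest: dict) -> bytes:
    """Serialize the manifest without its `sig` field for signing.

    Assumption: sorted keys and compact separators serve as the canonical form.
    """
    unsigned = {k: v for k, v in manifest.items() if k != "sig"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")
```

Signer and verifier must serialize identically, which is why the `sig` field is stripped and key order is fixed before the bytes are produced.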
| 72 | + |
| 73 | +### 6.3 DEK Sharing |
| 74 | +- For each recipient device X25519 pubkey, create sealed box of DEK. |
| 75 | +- Store wraps: key_wraps(object_id, device_id, wrap_ciphertext) |
| 76 | +- Rekey on membership change; rewrap to active devices. |
| 77 | + |
| 78 | +## 7. Capability Tokens |
| 79 | +- Format: PASETO v4.public (Ed25519-signed by server capability key). |
| 80 | +Claims: |
| 81 | +- sub: user or device id |
| 82 | +- scope: [object:get|put, room:read, room:write, membership:manage] |
| 83 | +- resource: URI or prefix (e.g., s3://bucket/path/object-id) |
| 84 | +- exp: expiry; iat/nbf: issued-at and not-before times
| 85 | +- region: data residency constraint |
| 86 | +- tid/nonce: unique token id to prevent replay |
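Once a token's signature has been verified by a PASETO library, the claims still need to be checked against the request. A minimal sketch of that check follows, under several assumptions: `check_capability` is a hypothetical helper, times are Unix seconds for simplicity (PASETO's registered time claims are ISO 8601 strings), and a granted resource ending in `/` is treated as a prefix while anything else must match exactly.

```python
def check_capability(
    claims: dict, action: str, resource: str, region: str, now: float
) -> bool:
    """Return True if already-verified claims authorize `action` on `resource`."""
    # Time window: not before nbf, and strictly before exp.
    if not (claims["nbf"] <= now < claims["exp"]):
        return False
    # The requested action must be one of the granted scopes.
    if action not in claims["scope"]:
        return False
    # Data residency: the token is only valid in its pinned region.
    if claims["region"] != region:
        return False
    granted = claims["resource"]
    # Exact match, or prefix match only when the grant ends with "/",
    # so "s3://b/path/" does not accidentally cover "s3://b/pathological".
    return resource == granted or (granted.endswith("/") and resource.startswith(granted))
```

Replay protection via `tid` is a separate, stateful check and is not covered here.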
| 87 | + |
| 88 | +## 8. APIs (Server) |
| 89 | +- POST /devices/attest |
| 90 | +- POST /devices/revoke |
| 91 | +- POST /rooms |
| 92 | +- POST /rooms/:id/members |
| 93 | +- POST /rooms/:id/rotate |
| 94 | +- POST /capabilities |
| 95 | +- PUT /objects/:id (requires capability) |
| 96 | +- GET /objects/:id (requires capability) |
| 97 | +- WS /events |
| 98 | + |
| 99 | +## 9. Storage Schema (Postgres + S3-compatible) |
| 100 | +- users(id, identity_pubkey_hash, oidc_sub, region) |
| 101 | +- devices(id, user_id, ed25519_pub, x25519_pub, attestation_sig, status) |
| 102 | +- rooms(id, created_by, policy) |
| 103 | +- memberships(room_id, device_id, role, since, status) |
| 104 | +- sender_keys(room_id, epoch, key_id, wrapped_keys jsonb, created_at) |
| 105 | +- objects(id, owner, room_id, bucket, path, blake3_digest, size, manifest_sig, created_at) |
| 106 | +- key_wraps(object_id, device_id, wrap_ciphertext) |
| 107 | +- audit_events(id, actor, type, target, ts, meta) |
| 108 | + |
| 109 | +## 10. Backup & Recovery |
| 110 | +- Key vault: private keys wrapped by Argon2id-derived KEK; IndexedDB on web; Secure Enclave/Keystore on mobile. |
| 111 | +- Optional Shamir 2-of-3 recovery (user + admin escrow + HSM) with approvals and audit. |
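For intuition, a 2-of-3 Shamir split places the secret at the y-intercept of a random line over a prime field; any two points recover it, and one point alone reveals nothing. The sketch below is illustrative only and not a vetted implementation: the function names and the choice of the Mersenne prime 2^521 - 1 as the field are assumptions made for this example.

```python
import secrets

PRIME = 2**521 - 1  # Mersenne prime, comfortably larger than a 256-bit secret


def split_2_of_3(secret: int) -> list[tuple[int, int]]:
    """Split `secret` into three shares; any two recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    slope = secrets.randbelow(PRIME)  # random degree-1 coefficient
    return [(x, (secret + slope * x) % PRIME) for x in (1, 2, 3)]


def recover(share_a: tuple[int, int], share_b: tuple[int, int]) -> int:
    """Recover the secret (the line's value at x = 0) from two distinct shares."""
    (x1, y1), (x2, y2) = share_a, share_b
    if x1 == x2:
        raise ValueError("shares must have distinct x coordinates")
    # Lagrange interpolation at x = 0 for a line through two points.
    return (y1 * x2 - y2 * x1) * pow(x2 - x1, -1, PRIME) % PRIME
```

In the scheme described above, the three shares would go to the user, the admin escrow, and the HSM respectively.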
| 112 | + |
| 113 | +## 11. Metadata Minimization |
| 114 | +- Store hashed identity references; coarse timestamps; encrypted membership maps when feasible. |
| 115 | +- Avoid plaintext titles/tags. No plaintext in logs. |
| 116 | + |
| 117 | +## 12. Request Signing & Replay Protection |
| 118 | +- Client signs sensitive requests with device Ed25519 over canonical payload + timestamp. |
| 119 | +- Server enforces skew window and tid uniqueness. |
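The server-side half of this check (skew window plus `tid` uniqueness) might look like the sketch below. It assumes the Ed25519 signature over the canonical payload and timestamp has already been verified by a crypto library. `ReplayGuard`, its in-memory store, and the 300-second default window are hypothetical; a real deployment would likely use a shared store with expiry.

```python
import time
from typing import Dict, Optional


class ReplayGuard:
    """Rejects requests outside the skew window or with an already-seen tid."""

    def __init__(self, max_skew_s: float = 300.0) -> None:
        self.max_skew_s = max_skew_s
        self._seen: Dict[str, float] = {}  # tid -> request timestamp

    def accept(self, tid: str, timestamp: float, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget tids whose timestamps have aged out of the window; a replay of
        # those would be rejected by the skew check anyway.
        self._seen = {
            t: ts for t, ts in self._seen.items() if now - ts <= self.max_skew_s
        }
        if abs(now - timestamp) > self.max_skew_s:
            return False  # too old or too far in the future
        if tid in self._seen:
            return False  # replay within the window
        self._seen[tid] = timestamp
        return True
```

Because stale `tid` entries can be dropped as soon as their timestamps leave the window, the store stays bounded by the request rate times the window length.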
| 120 | + |
| 121 | +## 13. Performance Targets |
| 122 | +- p95 decrypt < 120 ms for 10 MB on desktop. |
| 123 | +- Streaming crypto in Web Workers; backpressure-managed I/O. |
| 124 | + |
| 125 | +## 14. Rollout & Kill Switch |
| 126 | +- Feature flags per tenant/room. |
| 127 | +- Canary cohorts; schema uses sidecar tables for isolation. |
| 128 | +- Instant kill-switch disables capability issuance for E2EE objects/rooms; existing ciphertext remains intact. |
| 129 | + |
| 130 | +## 15. CI/CD and Supply Chain |
| 131 | +- Renovate/Dependabot with grouped patch/minor; majors manual. |
| 132 | +- GitHub Actions: lint/typecheck/tests, CodeQL, SCA, SBOM (Syft), container scanning. |
| 133 | +- Signed commits and releases. |
| 134 | + |
| 135 | +## 16. Test Plan (Acceptance Gates) |
| 136 | +- Unit and property tests for: keygen, provisioning, sealed box wraps, AES-GCM streaming (vectors), manifest sign/verify, PASETO claims/validation, rotation flows. |
| 137 | +- Integration: 1:1 E2EE chat, file upload/download, membership change triggers rewrap/rotation. |
| 138 | +- Data residency pinning tests; GDPR DSR exercises. |
| 139 | + |
| 140 | +## 17. Open Items / Future Work |
| 141 | +- Evaluate MLS migration path for large rooms. |
| 142 | +- HPKE support behind wrapping abstraction. |
| 143 | +- Privacy-preserving analytics with DP budget management per org. |