Fix email-thread HTML injection, attachment path traversal, and missing authz (#475)

Hardens issues found in a security review of the current tree (separate from the cookbook SSH PR): - Email thread rendering (static/js/emailLibrary.js): the flat read path runs inbound HTML through the allowlist sanitizer, but the two threaded paths (_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected server-parsed `body_html` raw into the DOM. A crafted inbound email could inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS if a deployment relaxes the script CSP). Now sanitized on all paths. - Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with user-controlled folder/uid and no containment, so a folder like `../../tmp` could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a single safe segment and asserts containment. - Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats, /api/rag/stats, /api/test/youtube, /api/test-research relied only on the global session check (any logged-in user). Now require_admin-gated. - Defense-in-depth HTML escaping: session HTML export escapes the session name (routes/session_routes.py); the MCP OAuth page escapes the reflected Host header / server_id (routes/mcp_routes.py). - Internal-tool token now compared with secrets.compare_digest (constant time) in core/middleware.py and app.py. Adds regression tests in tests/test_security_regressions.py.
2026-06-17 10:15:27 -04:00 · 2026-06-01 23:20:17 +10:00
parent 9e8de43f25
commit 171c29dcf3
9 changed files with 113 additions and 16 deletions
@@ -1,5 +1,6 @@
 # routes/session_routes.py
 import re
+import html
 import json
 import uuid
 from datetime import datetime
@@ -587,15 +588,16 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
            )

        if fmt == "html":
+            safe_title = html.escape(session.name or "")
            html_parts = [
                "<!DOCTYPE html><html><head>",
-                f"<meta charset='utf-8'><title>{session.name}</title>",
+                f"<meta charset='utf-8'><title>{safe_title}</title>",
                "<style>body{font-family:monospace;max-width:800px;margin:2rem auto;padding:0 1rem;background:#111;color:#ddd}",
                ".msg{margin:1rem 0;padding:0.8rem;border-radius:6px;border:1px solid #333}",
                ".user{background:#1a1a2e}.ai{background:#1a2e1a}",
                ".role{font-weight:bold;margin-bottom:0.4rem;opacity:0.7;text-transform:uppercase;font-size:0.85em}",
                "pre{background:#000;padding:0.5rem;border-radius:4px;overflow-x:auto}</style></head><body>",
-                f"<h1>{session.name}</h1>",
+                f"<h1>{safe_title}</h1>",
            ]
            for m in session.history:
                cls = "user" if m.role == "user" else "ai"