fix: SSRF hardening for the custom embedding endpoint URL (#132) (#1206)

POST /api/embeddings/endpoint takes a user-supplied URL and immediately
makes an outbound httpx request to it with no validation. The admin gate
added earlier (PR #80) closed the unauthenticated-access part of #132; this
addresses the remaining request: validate the URL before fetching it.

Odysseus is local-first, so pointing the embedding endpoint at a loopback or
LAN server (local vLLM / llama.cpp / Ollama) is a normal setup, and a blanket
private-IP block would break the primary use case. The guard therefore:

  - always rejects non-HTTP(S) schemes (file://, gopher://, ftp:// …),
  - always rejects the link-local range (169.254.0.0/16, incl. the cloud
    instance-metadata 169.254.169.254 exfil vector) plus multicast /
    reserved / unspecified, and IPv4-mapped-IPv6 forms of the above,
  - keeps loopback/LAN allowed by default, and
  - adds EMBEDDING_BLOCK_PRIVATE_IPS=true for full SSRF lockdown on exposed
    multi-tenant deployments.
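
The address rules above can be sketched with the stdlib ipaddress module.
This is an illustration only, not the contents of src/url_safety.py; the
helper name is_blocked_ip and its block_private parameter are hypothetical.

```python
import ipaddress


def is_blocked_ip(addr: str, block_private: bool = False) -> bool:
    """Return True if an outbound request to this IP should be refused.

    Hypothetical helper that mirrors the policy in the commit message;
    the real logic in src/url_safety.py may differ.
    """
    ip = ipaddress.ip_address(addr)

    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply to it.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped

    # Loopback is decided first: allowed by default, refused under lockdown.
    # (Checking it early also avoids ::1 being caught by the reserved test.)
    if ip.is_loopback:
        return block_private

    # Always refused: link-local (includes 169.254.169.254), multicast,
    # reserved and unspecified addresses.
    if ip.is_link_local or ip.is_multicast or ip.is_reserved or ip.is_unspecified:
        return True

    # LAN ranges are refused only when full lockdown is requested.
    return block_private and ip.is_private


# Default policy: metadata address refused, local servers allowed.
print(is_blocked_ip("169.254.169.254"))                  # True
print(is_blocked_ip("127.0.0.1"))                        # False
print(is_blocked_ip("192.168.1.10", block_private=True))  # True
```

Checking loopback before the always-refused classes keeps the local-first
default intact, while the lockdown flag only widens the refused set.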

Logic lives in src/url_safety.py (stdlib only, resolver injectable) so it is
unit-testable without real DNS; the route calls it before the health-check
request. Covered by tests/test_url_safety.py (8 cases).
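
A minimal sketch of the injectable-resolver idea, assuming the checker
accepts a callable that maps a hostname to a list of IP strings. The names
check_url_sketch, default_resolver and the resolver parameter are
hypothetical, and only the link-local rule is applied to keep it short.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def default_resolver(host):
    """Resolve a hostname to IP strings via real DNS (production default)."""
    infos = socket.getaddrinfo(host, None)
    # Drop any IPv6 scope id ("fe80::1%eth0") before returning.
    return [info[4][0].split("%")[0] for info in infos]


def check_url_sketch(url, resolver=default_resolver):
    """Return (ok, reason) for a candidate outbound URL.

    Hypothetical stand-in for check_outbound_url; it only shows how a
    swappable resolver keeps the check testable without network access.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False, f"scheme {parts.scheme!r} is not allowed"
    host = parts.hostname
    if not host:
        return False, "URL has no host"

    # IP literals are checked directly; names go through the resolver.
    try:
        addrs = [str(ipaddress.ip_address(host))]
    except ValueError:
        try:
            addrs = resolver(host)
        except OSError as exc:
            return False, f"could not resolve {host}: {exc}"

    for addr in addrs:
        if ipaddress.ip_address(addr).is_link_local:
            return False, f"{host} resolves to link-local address {addr}"
    return True, "ok"


# In a test, a dictionary lookup replaces DNS entirely.
fake_dns = {"metadata.internal": ["169.254.169.254"], "llm.lan": ["10.0.0.5"]}
fake_resolver = fake_dns.__getitem__

print(check_url_sketch("http://metadata.internal/latest", resolver=fake_resolver))
print(check_url_sketch("http://llm.lan:8000/v1", resolver=fake_resolver))
```

Checking every resolved address, rather than only the first, matters because
a hostname can map to several records.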

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
lekt8
2026-06-02 22:46:33 +08:00
committed by GitHub
parent 258e6fc0d4
commit 87babb58d5
3 changed files with 170 additions and 0 deletions
@@ -242,6 +242,18 @@ def setup_embedding_routes():
if not url:
    raise HTTPException(400, "URL is required")
# SSRF hardening: validate the user-supplied URL before any outbound
# request. Local-first means loopback/LAN endpoints are allowed by
# default; non-HTTP(S) schemes and the cloud metadata range are always
# rejected. Set EMBEDDING_BLOCK_PRIVATE_IPS=true for full lockdown.
from src.url_safety import check_outbound_url
ok, reason = check_outbound_url(
    url,
    block_private=os.getenv("EMBEDDING_BLOCK_PRIVATE_IPS", "false").lower() == "true",
)
if not ok:
    raise HTTPException(400, f"Rejected endpoint URL: {reason}")
# Quick health check
try:
    import httpx