mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-15 17:25:26 -04:00
fix(search): catch HTTPStatusError so 403/404 URLs degrade gracefully instead of 500 (#2203)
raise_for_status() raises httpx.HTTPStatusError for 4xx/5xx responses, but the surrounding try/except only caught httpx.RequestError (network errors) and RateLimitError (429). Any other HTTP error code propagated uncaught up through chat_processor -> chat_helpers -> chat_routes and surfaced as a 500 Internal Server Error.

Added an explicit except httpx.HTTPStatusError clause that logs a warning and returns an empty result, matching the behaviour already in place for network errors.

Also adds focused regression tests that exercise the real fetch_webpage_content() path with a mocked _get_public_url:

- 403/404 responses return the standard empty-result shape instead of raising, proving the new HTTPStatusError handling works end to end.
- 429 responses still take their own dedicated rate-limit branch (the status_code == 429 check runs before raise_for_status() is reached), keeping that behaviour distinct from the new generic HTTPStatusError handling.

Dropped the unrelated builtin_mcp.py change that had been carried over from a rebase; that fix is tracked separately in #2018 and this branch should stay scoped to the search content fetch path.

Closes #2148
@@ -259,6 +259,9 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
             raise RateLimitError(f"Rate limit hit for {url} (attempt {retry_attempt})")
 
         response.raise_for_status()
+    except httpx.HTTPStatusError as e:
+        error_logger.warning(f"HTTP {e.response.status_code} fetching {url}: {e}")
+        return _empty_result(url, f"HTTP {e.response.status_code}: {e}")
     except httpx.RequestError as e:
         error_logger.error(f"NetworkError fetching {url} (attempt {retry_attempt}): {e}")
         return _empty_result(url, f"NetworkError: {e}")