Both get_default_chat and _recover_empty_session_model picked
cached_models[0] as the model without checking hidden_models.
If the first cached model was hidden (e.g. minimax-m3), it was
returned as the default or used to repair empty session models,
even though the model list endpoints already filter hidden_models.
- Add _visible_models() helper that filters cached_models by
hidden_models (mirrors the filtering in list_model_endpoints)
- Use _visible_models() in get_default_chat fallback (when no
explicit default_model is saved)
- Use _visible_models() in _recover_empty_session_model (when
repairing a session whose model field is empty before chat send)
- Add regression tests for hidden-model filtering in default chat
resolution, and unit tests for _visible_models helper
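A minimal sketch of the filtering described above, assuming cached_models and hidden_models are plain lists of model ids. The function names visible_models and pick_default are illustrative stand-ins, not the project's actual code:

```python
from typing import Iterable, List, Optional


def visible_models(cached_models: Optional[Iterable[str]],
                   hidden_models: Optional[Iterable[str]]) -> List[str]:
    """Return cached models that are not hidden, preserving cache order."""
    hidden = set(hidden_models or [])
    return [m for m in (cached_models or []) if m not in hidden]


def pick_default(cached_models: Optional[Iterable[str]],
                 hidden_models: Optional[Iterable[str]]) -> Optional[str]:
    """Hypothetical fallback: first visible model, or None if all are hidden."""
    visible = visible_models(cached_models, hidden_models)
    return visible[0] if visible else None
```

With a hidden first entry, pick_default(["minimax-m3", "llama3"], ["minimax-m3"]) skips it and falls back to the next visible model.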
In Docker, a model-endpoint URL pointing at loopback (e.g. the LM Studio
default http://localhost:1234/v1) targets the Odysseus container itself, not
the host running the server, so the probe gets a connection error and the
endpoint is rejected with a misleading 'No models found for that provider/key'.
Rewrite loopback to host.docker.internal (which compose already maps to
host-gateway) for the probe and the saved URL, mirroring the existing Ollama
handling. Gated on actually being in a container with the gateway reachable, so
native installs and gateway-less deploys are untouched.
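The rewrite could look roughly like the sketch below. It assumes the caller has already determined whether the process runs in a container and whether the gateway is reachable. The function name, its parameters, and the loopback host set are hypothetical:

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of loopback spellings; the real check may differ.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def rewrite_loopback(url: str, in_container: bool, gateway_reachable: bool,
                     gateway_host: str = "host.docker.internal") -> str:
    """Point a loopback URL at the Docker host gateway, else return it unchanged."""
    if not (in_container and gateway_reachable):
        return url  # native installs and gateway-less deploys are untouched
    parts = urlsplit(url)
    if (parts.hostname or "").lower() not in LOOPBACK_HOSTS:
        return url
    # Rebuild the netloc, keeping any userinfo and the original port.
    userinfo = ""
    if parts.username is not None:
        userinfo = parts.username
        if parts.password is not None:
            userinfo += ":" + parts.password
        userinfo += "@"
    netloc = userinfo + gateway_host
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

The same rewritten value would be used for both the probe and the saved URL, so later requests do not hit the container's own loopback again.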
Fixes #25
Co-authored-by: Claude <noreply@anthropic.com>
_ping_endpoint() is the reachability fallback the model-endpoint POST
handler invokes when _probe_endpoint() returns no model ids. It GETs
base + "/models" and, on any sub-500 response, returns immediately with
`reachable = (status < 400)`. That early return runs before the
Ollama-native /api/version / /api/tags fallback below it.
For an Ollama URL without /v1 (the quickstart accepts both
http://localhost:11434 and http://127.0.0.1:11434, and the reporter
on #1025 explicitly tried both), the OpenAI-style probe target is
http://127.0.0.1:11434/models. Ollama returns 404 there because /models
only lives under /v1. _ping_endpoint then returned reachable=False and
the picker showed "Added (offline — will retry on next load)" on an
install that was running fine. /api/version was never tried.
Same shape for http://127.0.0.1:11434/api (the native Ollama root):
/api/models is also 404, same premature offline verdict.
_probe_endpoint() does fall through to /api/tags on a 4xx (the response
raises via raise_for_status), so the endpoint quietly recovers once
cached_models becomes non-empty on the next background refresh —
matching the second commenter's "had to disconnect manually then
reconnect for it to be detected" note. The bug is most visible while
no models are pulled yet (cached_models stays empty, _ping_endpoint
keeps voting offline).
Fix:
- Hoist the Ollama-shaped-URL test (port == 11434 or "ollama" in
hostname — the same condition _probe_endpoint already uses) to the
top of the function so both code paths share it.
- Stop short-circuiting on 4xx when the URL looks like Ollama: fall
through to the existing /api/version + /api/tags reachability loop
so an alive Ollama gets recognised even when its OpenAI surface has
the wrong prefix for the user's input.
- Fix the `root` computation in that loop to strip a trailing /api as
well as /v1, so http://127.0.0.1:11434/api no longer gets probed at
/api/api/version.
- 4xx on non-Ollama hosts keeps the current semantics: a 401 from
api.openai.com/v1/models is still a definitive offline verdict, not
a reason to GET /api/version on OpenAI.
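The decision logic in the list above can be sketched without any network calls. This is an illustration under assumptions, not the actual _ping_endpoint body: the helper names and the three string verdicts are hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_ollama(base_url: str) -> bool:
    """Ollama-shaped URL test: port 11434 or 'ollama' in the hostname."""
    parts = urlsplit(base_url)
    host = (parts.hostname or "").lower()
    return parts.port == 11434 or "ollama" in host


def ollama_root(base_url: str) -> str:
    """Strip a trailing /v1 or /api so native paths are not doubled."""
    root = base_url.rstrip("/")
    for suffix in ("/v1", "/api"):
        if root.endswith(suffix):
            return root[: -len(suffix)]
    return root


def interpret_models_status(base_url: str, status: int) -> str:
    """Map the GET base + '/models' status to a hypothetical verdict string."""
    if status < 400:
        return "reachable"
    if status < 500 and not looks_like_ollama(base_url):
        return "offline"  # e.g. a 401 from a non-Ollama host stays definitive
    return "try_native"  # continue to /api/version and /api/tags
```

For http://127.0.0.1:11434/api, ollama_root yields http://127.0.0.1:11434, so the native probe targets /api/version rather than /api/api/version.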
Closes #1025.
When the operator sets AUTH_ENABLED=false, three owner-scoped endpoints still
returned 401 (/api/models, /api/research/*, /api/email/*), so the front-end
redirected the browser to /login and the app was unusable despite auth being
turned off. require_user() in src/auth_helpers.py already documents and honors
this contract (issue #622) via 'if _auth_disabled(): return ""', but these
endpoints did their own get_current_user/is_configured check without it.
Make _require_user (research), the /api/models anti-leak guard, and
email_helpers._require_auth consult _auth_disabled() and let anonymous through
(owner='') only when the operator explicitly disabled auth. The 401 protection
is fully intact when AUTH_ENABLED=true. Verified end-to-end: with
AUTH_ENABLED=false the SPA now loads instead of bouncing to /login.
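A rough sketch of the guard shape, assuming the flag is read from the AUTH_ENABLED environment variable and compared case-insensitively against "false". How the real _auth_disabled() parses the value is not shown in this message, and the Unauthorized exception stands in for the framework's 401 response:

```python
import os
from typing import Mapping, Optional


class Unauthorized(Exception):
    """Stand-in for an HTTP 401 response."""


def auth_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed parsing: auth is off only when AUTH_ENABLED is literally 'false'."""
    env = os.environ if env is None else env
    return env.get("AUTH_ENABLED", "true").strip().lower() == "false"


def require_user(current_user: Optional[str],
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Return the owner id, or '' for anonymous access when auth is disabled."""
    if auth_disabled(env):
        return ""
    if not current_user:
        raise Unauthorized("authentication required")
    return current_user
```

Checking the flag first means the anonymous path exists only when the operator opted out. With AUTH_ENABLED unset or true, a missing user still raises.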
The "don't wipe endpoint_url/model on endpoint delete" half of #587 landed
in 6a78b02 (Fix endpoint model preservation for tasks). The three remaining
follow-up pieces from the original PR — flagged in the review on #786 —
are:
- routes/model_routes.py: toggle_model_endpoint (PATCH) now accepts
api_key and base_url, so the admin UI can rotate a key or fix a typo'd
URL without going through delete+recreate. base_url is normalized the
same way the POST handler does (strip /models, /chat/completions,
/completions, /v1/messages, then _normalize_base). Cache invalidation
matches the POST/DELETE paths and the response includes base_url so the
frontend can confirm what was saved.
- routes/chat_routes.py: new _recover_empty_session_model picks
cached_models[0] from the endpoint that matches sess.endpoint_url and
persists it onto the Session row before the LLM call goes out. Wired
into both /api/chat and /api/chat_stream after the existing
_clear_orphaned_session_endpoint guard, so the order is: drop
truly-orphaned sessions first, then heal the "picker showed it, session
never knew" case.
- routes/chat_routes.py: when recovery fails (no endpoint, no cached
models) raise HTTP 400 with a clear message instead of letting
model="" reach the upstream as 401/503.
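The suffix stripping applied to base_url in the PATCH handler might be sketched as follows. The helper name is hypothetical, and the subsequent _normalize_base step is not reproduced because its behavior is not described here:

```python
# Longer suffixes come first so "/chat/completions" is not cut at "/completions".
_ENDPOINT_SUFFIXES = (
    "/chat/completions",
    "/completions",
    "/v1/messages",
    "/models",
)


def strip_endpoint_suffix(url: str) -> str:
    """Drop a trailing API path so only the base URL remains (illustrative)."""
    url = url.strip().rstrip("/")
    for suffix in _ENDPOINT_SUFFIXES:
        if url.endswith(suffix):
            return url[: -len(suffix)]
    return url
```

This lets an admin paste a full request URL such as https://api.openai.com/v1/chat/completions and still end up with the base https://api.openai.com/v1.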
Closes #587.
* Dedupe URL routing helpers and tighten adjacent hostname checks
* Match providers by hostname, not substring, in _detect_provider
_detect_provider used `"anthropic.com" in url`-style substring checks, so a URL
that merely contained a provider's domain in its path or query — or a look-alike
host like `anthropic.com.example` — was misclassified and picked the wrong
auth-header/payload shape. Switch it to the existing `_host_match` helper
(hostname exact/subdomain match), the same way the human-readable labels and
curated model lists already work, finishing that migration. Also harden
`_host_match` against trailing-dot FQDNs.
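For illustration, a hostname-based matcher with the described properties could look like this. It is a sketch of the technique, not the project's `_host_match`, whose exact signature is not shown here:

```python
from urllib.parse import urlsplit


def host_match(url: str, domain: str) -> bool:
    """True if the URL's hostname is `domain` or one of its subdomains."""
    # Lowercase and drop a trailing dot so "api.anthropic.com." still matches.
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    return host == domain or host.endswith("." + domain)
```

Because only the parsed hostname is compared, a domain that appears in the path or query, a look-alike such as `anthropic.com.example`, and an adjacent host such as `notanthropic.com` all fail to match.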
Not a credential-leak fix: _detect_provider only classifies a URL the admin
already configured next to its key, and the URL — not this function — decides
where the request goes. This is a correctness/consistency cleanup.
Adds tests that import the real helpers (test_endpoint_resolver.py tests local
copies, so it can't catch this) covering the substring false-positives.
Refs #768.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Import build_headers under its real name in model_routes
It was imported as `build_headers as _provider_headers`, which collides with
the unrelated llm_core._provider_headers(provider, headers) — same name,
different signature. Use the real name to remove the confusion.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Use hostname matching in URL builders, not raw suffix checks
PR review flagged that _detect_provider() was hardened to match on
hostname, but several helpers still used raw host.endswith("anthropic.com")
/ host.endswith("ollama.com"), which match adjacent hosts like
notanthropic.com / notollama.com.
Route the remaining checks through _host_match(): _is_ollama_native_url
and _ollama_api_root in llm_core, and _anthropic_api_root / _ollama_api_root
in endpoint_resolver. With _detect_provider already hostname-correct, the
trailing "or host.endswith(...)" clauses in build_chat_url / build_models_url
are redundant, so drop them rather than fix the substring match in place.
Add builder-level tests asserting look-alike and domain-in-path hosts route
to the OpenAI-compatible default. They import the real builders and fail on
the pre-fix code.
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Background tasks (e.g. the Email Tags / check_email_urgency action)
resolve their model through resolve_endpoint("utility") → Default Chat.
When the configured model is one the user has since disabled on the
endpoint, the resolver still dispatched to it — on Groq that surfaces as
every email failing with "HTTP 400: model ... requires terms acceptance".
Two paths fed this:
- The auto-pick fallback selected from cached_models without excluding
the endpoint's hidden_models, so a disabled model listed first won.
- A stale default_model left pointing at a now-disabled model (seeded at
endpoint registration from raw model_ids[0]) was used verbatim.
Fix resolve_endpoint / resolve_endpoint_by_id to drop a configured model
that's in hidden_models and to pick the first ENABLED chat model. Also
seed default_model on registration via _first_chat_model so we never pin
the global default to an embedding/tts entry a provider lists first.
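The selection rule can be sketched as below. The name-based chat heuristic is an assumption made for this example. The real _first_chat_model may classify models differently, and resolve_model is a hypothetical name:

```python
from typing import Iterable, Optional

# Assumed hints for non-chat entries; purely illustrative.
_NON_CHAT_HINTS = ("embed", "tts", "whisper")


def is_chat_model(model_id: str) -> bool:
    """Heuristic guess that a model id refers to a chat model."""
    lowered = model_id.lower()
    return not any(hint in lowered for hint in _NON_CHAT_HINTS)


def resolve_model(configured: Optional[str],
                  cached_models: Optional[Iterable[str]],
                  hidden_models: Optional[Iterable[str]]) -> Optional[str]:
    """Keep the configured model unless hidden; else pick the first enabled chat model."""
    hidden = set(hidden_models or [])
    if configured and configured not in hidden:
        return configured
    for model_id in cached_models or []:
        if model_id not in hidden and is_chat_model(model_id):
            return model_id
    return None
```

Returning None when nothing qualifies leaves it to the caller to report a clear error instead of dispatching to a disabled model.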
Checks: python -m pytest tests/test_endpoint_resolver.py
tests/test_model_routes.py tests/test_model_context.py (all pass);
python -m py_compile app.py routes/model_routes.py
src/endpoint_resolver.py.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Require admin access before serving provider discovery data from
GET /api/providers. This prevents normal authenticated users from
triggering provider discovery or receiving cached provider host data.
Keep GET /api/models available to normal users and leave the existing
admin-only GET /api/discover behavior unchanged.
Add a focused regression test to ensure unauthorized callers cannot
trigger discovery and cannot receive cached provider data.
POST /api/model-endpoints always inserted a new row, so Settings -> Add
Models -> Scan for Servers re-added any endpoint a user had already
registered manually — once under its model name (from the earlier
manual add) and again under its host:port (auto-generated when scan
posts without a name). The success toast then misreported the result
as "added N new".
Look up an existing endpoint with the same base_url accessible to the
caller (shared or owned by them) before inserting. If found, return it
with `existing: true` so the client can tell the difference between
an actual add and a dedupe hit. Toast now reads, e.g.,
"Found 1 server with 1 model — 1 already added".
Tested: POSTing the same base_url three times (incl. trailing-slash
variation) returns the same id each time; only one row exists.
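A minimal in-memory sketch of the lookup-before-insert behavior, assuming rows are dicts and a shared endpoint is marked with an empty owner. The storage layer, field names, and trailing-slash canonicalization are all hypothetical simplifications:

```python
from typing import Dict, List, Optional


def _canonical(base_url: str) -> str:
    """Assumed canonical form: trimmed, without a trailing slash."""
    return base_url.strip().rstrip("/")


def add_endpoint(rows: List[Dict], base_url: str, owner: str,
                 name: Optional[str] = None) -> Dict:
    """Return the caller's existing row for base_url, or insert a new one."""
    key = _canonical(base_url)
    for row in rows:
        # Accessible means shared (owner == "") or owned by the caller.
        if row["base_url"] == key and row["owner"] in ("", owner):
            return {**row, "existing": True}
    row = {"id": len(rows) + 1, "base_url": key, "owner": owner,
           "name": name or key}
    rows.append(row)
    return {**row, "existing": False}
```

The client can count responses with existing set to true to word the toast, rather than treating every successful POST as a new endpoint.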