odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-15 17:25:26 -04:00

Author	SHA1	Message	Date
KYDNO	955455b797	fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549 ) * fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(kimi): omit temperature for kimi-for-coding API calls Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-15 15:56:54 +09:00
Muhammed Midlaj	4b0a977988	fix(models): probe /v1/models for path-less LM Studio endpoints Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.	2026-06-15 15:09:50 +09:00
Max Hsu	66c25cbc2f	fix(models): reassign default endpoint when current default is disabled (#3649 ) Adding a new endpoint only auto-set the global default chat endpoint when none was configured (`if not settings.get("default_endpoint_id")`). When the existing default pointed at an endpoint the user had since disabled, it was never reassigned, so features that read the raw `default_endpoint_id` setting (notably Memory → Tidy) failed with "No default model configured — set one in Settings" even though an enabled endpoint existed. Reassign the default when the configured endpoint is missing/disabled, via a new pure `_default_endpoint_needs_assignment` helper. Adds unit coverage for the helper plus route-level regression tests for the disabled/enabled cases. Fixes #3586 Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 13:17:31 +02:00
Maruf Hasan	c3fcaf15b7	feat(providers): add NVIDIA AI provider endpoint support (#3456 ) * feat: add NVIDIA as an AI provider (integrate.api.nvidia.com) * feat: add NVIDIA option to provider settings dropdown and aliases * test: add NVIDIA provider detection and endpoint tests * Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering - nvidia.com -> 'nvidia' curated key for proper provider routing - _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed - _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip, kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse, -embedqa, -nemoretriever * Expand non-chat model filtering for NVIDIA embedding/guard/video models Add _NON_CHAT_PREFIXES: embed, recurrent Add _NON_CHAT_CONTAINS: topic-control, guard, calibration, ai-synthetic-video, cosmos-reason2 Catches remaining unfiltered non-chat models from NVIDIA catalog: embedding (llama-nemotron-embed, embed-qa), guard (llama-guard, nemoguard-topic-control), calibration (ising-calibration), video (ai-synthetic-video-detector, cosmos-reason2), recurrent (recurrentgemma-2b) * Filter non-chat models in _probe_endpoint via _is_chat_model() Previously _is_chat_model() was only used in the per-model probe and _first_chat_model(), so non-chat models still appeared in the model picker even though they were filtered in those specific paths. Applying the filter at _probe_endpoint() return ensures non-chat models (embeddings, safety guards, reward, calibration, video detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never enter cached_models and never appear in the picker. * Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models Prefix checks (mid.startswith) miss models with org prefixes like baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc. Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught regardless of the org prefix. 
Adds: embed, bge, recurrent, starcoder, gemma-2b * fix(model-routes): drop collision-prone substrings from global non-chat filter The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES and _NON_CHAT_CONTAINS tuples. These are intended to filter out embedding, retrieval, safety, and vision models from NVIDIA's catalog that are not chat-completions-capable. However, four of the added substrings collide with legitimate chat models served by other providers: - gemma-2b matches google/gemma-2b-it (instruct chat model) - starcoder matches bigcode/starcoder2-15b (code completion model) - recurrent matches google/recurrentgemma-2b (language model) - guard matches meta-llama/Llama-Guard-3-8B (safety classifier) Removing these four from the global tuples keeps the NVIDIA-specific filtering intact (safety, embedding, retrieval, and vision models are still caught by other tokens such as content-safety, -safety, -reward, embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while preventing false negatives for instruct/code models on other providers. Tests added for gemma-2b-it, google/gemma-2b-it, and bigcode/starcoder2-15b-instruct asserting they are recognized as chat models. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS entries redundant since the prefix check runs first. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * style: fix indentation of groq and xai test cases in test_provider_endpoints.py --------- Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>	2026-06-09 11:06:12 +02:00
pewdiepie-archdaemon	d397b3db2f	Restore dropped regression fixes	2026-06-09 10:31:43 +09:00
pewdiepie-archdaemon	37c573d865	Fix model endpoint route test regressions	2026-06-09 10:16:38 +09:00
pewdiepie-archdaemon	fa8c93ec0a	Cookbook UI: Ollama browser, advanced serve fold, API tokens form, diagnosis toolbar, polish Surface a lot of accumulated cookbook + UI work as a single non-agent commit so the agent rework lands cleanly. Highlights: - Ollama as a first-class backend in the Cookbook: * Download input accepts ollama-style names (name:tag) → backend=ollama * /api/cookbook/ollama/library (cached scrape of ollama.com + curated fallback so classic models like qwen2.5 stay reachable) * "Browse Ollama library" toggle below Download with size chips * Engine=Ollama in hwfit toolbar merges the Ollama library into the main scan list as per-tag rows with the same Fit/Param/Quant/VRAM columns; click → fills Download input - API Tokens form added to Integrations panel (matching wired loadTokens()/initTokenForm() that had no HTML) - Serve panel polish: Advanced fold tightening (-8px nudges on vLLM checks, Extra args, Spec row), n_cpu_moe + Split Mode controls pulled up 8px to align with the row's checkboxes, GGUF File dropdown exposed for Ollama backend, GPU re-render on Edit serve restore, _forceBackend flag so saved serveState wins over backend detection, cookbook:servers-changed CustomEvent so panels don't need refresh - Models page redesign: Add Models row (URL + hidden API key reveal + Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All + Clear-offline buttons in Added Models toolbar, offline-pill removed (opacity already conveys state), Engine dropdown gains Ollama option - _ping_endpoint probes /v1/models then base, accepts 4xx as reachable (vLLM returns 404 on bare /v1, fully working endpoints were showing offline) - Diagnosis card: × dismiss + Copy bundle buttons restored on the serve error feedback card - Orphan tmux sweep re-enabled behind a 60s rate-limit + background Thread (off the main event loop) so dead serves get discovered - cookbook_routes auto-register watchdog: drops the endpoint if the serve session exits non-zero within the first 
~3min - ollama-rocm sidecar awareness in download wrapper (`docker exec ollama-rocm ollama pull` when host ollama isn't installed) - Skill extractor sets initial_status="published" when auto_approve_skills pref is on (audit demotes later) - Skill list / model list / cookbook scan misc polish	2026-06-09 09:46:19 +09:00
Ocean Bennett	e7c1d75884	fix(models): query v1 models for llama-server endpoints (#3380) * fix(models): query v1 models for llama-server endpoints * test(models): accept owner kwargs in llama-server regression	2026-06-09 01:09:02 +02:00
Giuseppe Castelluccio	095c74b985	fix(security): fail closed in /api/models auth gate on unexpected errors (#3489 ) GET /api/models swallowed any non-HTTPException raised while checking whether the caller is authenticated (bare except Exception: pass), so a broken auth_manager or an exception from get_current_user silently granted the full model list to an anonymous caller instead of rejecting the request. Now any unexpected exception logs and returns HTTP 500. Split out of #2360 per reviewer request to keep the deny-list and the auth-gate fix as separate, single-purpose PRs. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 20:23:39 +02:00
stocky789	1e0d9b92af	feat: add ChatGPT Subscription provider (#2876 ) * feat: Add ChatGPT Subscription support and related features - Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI. - Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status. - Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance. - Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints. - Added new slash commands for managing skills, including listing, searching, and invoking skills. - Implemented caching for skill catalog to optimize performance. - Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing. - Refactored existing code to accommodate new features and improve maintainability. * refactor: share provider device-flow setup - reuse one device-flow backend for Copilot and ChatGPT Subscription - add one frontend device-flow helper for Settings and /setup - put GitHub Copilot back into Add Models, now as a dropdown option - make provider selection just select; clicking Add starts sign-in - stop ChatGPT Subscription setup from opening auth tabs automatically - make /setup copilot and /setup chatgpt-subscription work from chat - show ChatGPT Subscription in the /setup suggestions - show the real error message when setup fails - add focused tests for the shared flow and setup UI * feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX Backend: - Resolve runtime bearer for provider-auth endpoints at probe time via a shared _resolve_probe_key() that delegates to resolve_endpoint_runtime, applied across all probe/refresh call sites. - Skip live completion probes and health pings for discovery-only providers (centralized behind _is_discovery_only_provider) — the Codex/Responses API has no such endpoints, so status is derived from cached models. 
- Never persist the short lived ChatGPT bearer to the plaintext sessions table; proactively clear any stale bearer left by an earlier code path. - Revoke orphaned ProviderAuthSession credentials when the last endpoint backing them is deleted (_delete_orphaned_provider_auth), surfaced via cleared_provider_auth in the delete response. Frontend (admin.js): - Auto-start the device-auth flow on provider selection so the authorization panel (code + Authorize) shows immediately instead of behind a "Sign in" click. - Remove the redundant top button for device auth providers, move retry into the panel via an inline "Try again". - Drop the self-evident hint text and add an execCommand clipboard fallback so Copy works in non-secure (HTTP/LAN) contexts. * fix: harden chatgpt subscription provider * chore: remove PR media from branch * Fix chatgpt subscription recovery and token handling --------- Co-authored-by: 5p00kyy <admin@5p00ky.dev>	2026-06-08 10:19:18 +02:00
Sebastian Andres El Khoury Seoane	8d9d4ec9c6	feat(platform): Add support for APFEL as part of the dependencies and models for the Cookbook. (#2657 ) * feat(platform): add support for Apple Silicon detection in platform compatibility test(tests): enhance shell_routes tests for Apple Silicon compatibility * fix issues with missing import * fix: correct package name in package-lock.json and enhance package installation commands in shell_routes.py and cookbook.js * feat: add Apfel startup and health checks on macOS - bootstrap Apfel via Homebrew on arm64 macOS - start `apfel --serve --port 11435` detached for Odysseus - verify readiness via `/health` - clean up the Apfel process on exit or Ctrl+C * fix: duplicate variable declaration post-merge conflict - Should fix `node` CI issues. * fix: issues with the update status of the APFEL dependency. - fixed by changing the main conditional that determines the update. * Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file. * Fix: whitespace issues with the model_routes file * Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file. Final * Fix: Fixed updates using PIP for APFEL instead of custom cmd	2026-06-07 17:28:02 +02:00
M57	12cb39cbd9	feat: add OpenCode Zen and Go as provider options (#26 ) - Add OpenCode Zen (https://opencode.ai/zen/v1) and Go (https://opencode.ai/zen/go/v1) - Add provider detection via _host_match() in llm_core.py - Add curated model list entries in model_routes.py - Add webhook provider URLs - Add provider icon (providers.js) and dropdown options (index.html) - Add auto-detection patterns and setup URLs (slashCommands.js) - Whitelist opencode.ai in URL validation (admin.js) - Rebased on main to fix merge conflicts with _HOST_TO_CURATED refactor Co-authored-by: M57 <hy4ri@users.noreply.github.com>	2026-06-07 16:43:00 +02:00
michaelxer	bdf4ec8b24	fix: fall back to /models probe when base URL returns 404 (#3205 ) _ping_endpoint() probes the bare base URL for non-Ollama endpoints. OpenAI-compatible servers like llama-swap return 404 on the /v1 prefix but 200 on /v1/models, causing endpoints to appear offline despite being fully functional. Add a /models fallback when the base URL returns a non-auth 4xx. Auth failures (401/403) are treated as definitive — probing /models would just repeat the same rejection. Fixes #3181 Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>	2026-06-07 16:09:33 +02:00
Ocean Bennett	5911b8c0dc	fix(models): allow same endpoint URL with different keys (#2758) * fix(models): allow same endpoint URL with different keys * fix(models): show endpoint key fingerprints	2026-06-05 21:12:14 +02:00
Giuseppe	bc83479f94	fix: bool('false') is True coerces endpoint toggles incorrectly (#2361 ) Python's bool('false') returns True because the string is non-empty. A JS client serialising a boolean as the string 'false' would have supports_tools or is_enabled silently flipped to True — so 'disable tool support' would actually enable it. Use an explicit lookup dict for supports_tools and a case-insensitive string check for is_enabled so both string and native bool inputs are handled correctly. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-04 19:43:38 +02:00
WasserEsser	20cc23c9bd	fix(models): make pinned models visible in chat UI (#2481 ) Two bugs prevented pinned models from appearing in the chat model picker: 1. _fetch_models() only used _cached_model_ids(), ignoring pinned_models. Since Fireworks AI doesn't list kimi-k2p6-turbo in /v1/models, the cached list was empty, so the endpoint showed as offline with no models. 2. _curate_models() filtered unknown pinned IDs into models_extra, but the chat UI only reads models (primary list). Pinned models stayed invisible. Fix: use _visible_models() to merge cached + pinned, then promote pinned IDs from models_extra to models so they appear in the dropdown. Closes #1521 follow-up	2026-06-04 19:17:37 +02:00
tanmayraut45	f59edee611	Support extra CA bundle for private-CA LLM providers (#769 ) Adding GigaChat (Sber) or an on-premise enterprise LLM gateway as a model endpoint fails on first probe with CERTIFICATE_VERIFY_FAILED: self-signed certificate in certificate chain (_ssl.c:1000) because their TLS chain is signed by a private root CA (Russian Trusted Root CA for GigaChat; corporate CA for on-prem) that isn't part of the default system / certifi trust store. The endpoint shows offline in the picker even though the URL and API key are correct (issue #722). The right fix is to extend the trust store, not to weaken verification. This change: - src/tls_overrides.py: new module that resolves an opt-in env var LLM_CA_BUNDLE at import time, builds a shared SSLContext via ssl.create_default_context() (so the system / certifi bundle is loaded first) and layers the operator's PEM on top with load_verify_locations(). Exposes llm_verify() returning a value suitable for httpx `verify=`. Defaults to True (httpx built-in trust) when the env var is unset, when the file is missing, or when the PEM fails to load — verification is never silently disabled, the warning is logged and we fall back to the safe path. - src/llm_core.py: thread llm_verify() into the shared AsyncClient used by stream_llm / streaming completions. - routes/model_routes.py: thread llm_verify() into the five httpx.get call sites in _probe_endpoint / _ping_endpoint so adding a private-CA endpoint goes green on the very first probe and the picker stops showing it offline. - .env.example: document LLM_CA_BUNDLE with the GigaChat case as the concrete example. Deliberately NOT included: a verify=False knob (global or per-host). Disabling verification exposes the affected endpoint to MITM, and the operator-supplied bundle is the correct fix for legitimate private-CA providers — so the only switch in this PR is the safe one. Closes #722.	2026-06-04 13:18:50 +01:00
Giuseppe	f6a5f6592f	fix: log warnings on silently swallowed agent and endpoint failures (#2367 ) get_builtin_overrides() was swallowing all exceptions with a bare `except Exception: pass`, so misconfigured tool-description overrides would silently produce wrong agent behaviour with no log trace. The background endpoint refresh loop had the same pattern: any probe failure was silently ignored, giving operators no signal that the refresh was broken. Also removes a circular self-import (`from src.agent_loop import _build_base_prompt`) inside _build_system_prompt; the function is already in scope and the import created a latent circular reference risk. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-04 12:29:31 +01:00
Yuri	a2e691da2b	fix(models): stabilize proxy endpoint refresh behavior * fix: support large proxy model endpoint refresh Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline. Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails. Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage. * fix: preserve endpoint ping status semantics	2026-06-04 04:56:11 +01:00
pewdiepie-archdaemon	6861c41580	Reapply "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus" This reverts commit `cc8fe2f6e3`.	2026-06-03 22:47:00 +09:00
pewdiepie-archdaemon	cc8fe2f6e3	Revert "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus" This reverts commit `8161c1253d`, reversing changes made to `8c2705b42a`.	2026-06-03 22:46:19 +09:00
Alexandre Teixeira	145f4fd2b4	feat(models): support pinned endpoint model IDs	2026-06-03 13:00:07 +01:00
Michael Gerber	e392be0d65	fix: Cookbook local GGUF serving inside Docker (#1264) * fix: Cookbook local GGUF serving inside Docker Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint. Fixes included here: - use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub - fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts - preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal Before this change, Docker local serves could fail in several ways: - Cookbook pointed llama.cpp at the wrong GGUF path - generated serve runner scripts crashed before launch with a shell syntax error - successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1 This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration * test: add test for docker-local endpoint rewrites	2026-06-03 02:08:09 +09:00
ghreprimand	1fda906407	Fix Cookbook container-local model endpoints (#1223) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 00:09:48 +09:00
red person	42ae905df7	fix(models): clear deleted endpoint fallback refs (#1207)	2026-06-02 23:41:04 +09:00
red person	76a7685105	fix(models): clear stale speech endpoint settings (#1196)	2026-06-02 23:32:01 +09:00
Robin Fröhlich	3c6ae3713e	Models: add Z.AI coding endpoint and GLM vision detection	2026-06-02 20:59:17 +09:00
SurprisedDuck	934bca9e48	Providers: omit temperature for OpenAI reasoning models * fix: omit temperature for OpenAI reasoning models (o1/o3/o4/gpt-5) These models only accept the default temperature; sending any explicit value (even 0.0) returns HTTP 400 "Only the default (1) value is supported". This broke two paths: - Endpoint probing in _probe_single_model hardcodes temperature: 0.0, so a perfectly valid o3/gpt-5 endpoint is reported as failing in the Model Endpoints health check. - Chat/stream payloads send temperature unconditionally, so a non-default temperature preset 400s on these models. The code already special-cases the same model family for max_completion_tokens, so this adds a sibling _restricts_temperature() helper and omits the field for those models, letting the API use its required default. gpt-4.5 is intentionally excluded (not a reasoning model; accepts temperature normally). Adds tests/test_llm_core_temperature.py covering the predicate and the synchronous payload builder. * fix: also omit temperature for reasoning models on the direct-POST paths The first commit only covered llm_call/llm_call_async/stream_llm and the endpoint probe. Email auto-summary, urgency-less spam classification, the email reply-summary endpoint, and gallery vision tagging build their OpenAI payloads inline and POST them directly (requests/httpx), bypassing llm_core — so a reasoning model configured there would still 400 on the temperature field. These sites already branch on _uses_max_completion_tokens, so they're the same class; added the matching _restricts_temperature guard. gallery_routes also gains the max_completion_tokens branch it was missing, so gpt-5 vision tagging works end to end. Note: email_pollers urgency scoring goes through llm_call_async and was already covered.	2026-06-02 20:58:33 +09:00
Yavor Ivanov	7cc8fdb2f5	Models: avoid hidden models in default fallback Both get_default_chat and _recover_empty_session_model picked the first model from cached_models[0] without checking hidden_models. If the first cached model was hidden (e.g. minimax-m3), it was returned as the default or used to repair empty session models, even though the model list endpoints already filter hidden_models. - Add _visible_models() helper that filters cached_models by hidden_models (mirrors the filtering in list_model_endpoints) - Use _visible_models() in get_default_chat fallback (when no explicit default_model is saved) - Use _visible_models() in _recover_empty_session_model (when repairing a session whose model field is empty before chat send) - Add regression tests for hidden-model filtering in default chat resolution, and unit tests for _visible_models helper	2026-06-02 20:37:14 +09:00
Hayk Arzumanyan	514050d098	Models: rewrite Docker loopback endpoints to host gateway In Docker, a model-endpoint URL pointing at loopback (e.g. the LM Studio default http://localhost:1234/v1) targets the Odysseus container itself, not the host running the server, so the probe gets a connection error and the endpoint is rejected with a misleading 'No models found for that provider/key'. Rewrite loopback to host.docker.internal (which compose already maps to host-gateway) for the probe and the saved URL, mirroring the existing Ollama handling. Gated on actually being in a container with the gateway reachable, so native installs and gateway-less deploys are untouched. Fixes #25 Co-authored-by: Claude <noreply@anthropic.com>	2026-06-02 20:34:40 +09:00
tanmayraut45	6c654fb0ef	Models: detect bare Ollama URLs as online _ping_endpoint() is the reachability fallback the model-endpoint POST handler invokes when _probe_endpoint() returns no model ids. It GETs base + "/models" and, on any sub-500 response, returns immediately with `reachable = (status < 400)`. That early return runs before the Ollama-native /api/version / /api/tags fallback below it. For an Ollama URL without /v1 (the quickstart accepts both http://localhost:11434 and http://127.0.0.1:11434, and the reporter on #1025 explicitly tried both), the OpenAI-style probe target is http://127.0.0.1:11434/models. Ollama returns 404 there because /models only lives under /v1. _ping_endpoint then returned reachable=False and the picker showed "Added (offline — will retry on next load)" on an install that was running fine. /api/version was never tried. Same shape for http://127.0.0.1:11434/api (the native Ollama root): /api/models is also 404, same premature offline verdict. _probe_endpoint() does fall through to /api/tags on a 4xx (the response raises via raise_for_status), so the endpoint quietly recovers once cached_models becomes non-empty on the next background refresh — matching the second commenter's "had to disconnect manually then reconnect for it to be detected" note. The bug is most visible while no models are pulled yet (cached_models stays empty, _ping_endpoint keeps voting offline). Fix: - Hoist the Ollama-shaped-URL test (port == 11434 or "ollama" in hostname — the same condition _probe_endpoint already uses) to the top of the function so both code paths share it. - Stop short-circuiting on 4xx when the URL looks like Ollama: fall through to the existing /api/version + /api/tags reachability loop so an alive Ollama gets recognised even when its OpenAI surface has the wrong prefix for the user's input. 
- Fix the `root` computation in that loop to strip a trailing /api as well as /v1, so http://127.0.0.1:11434/api no longer gets probed at /api/api/version. - 4xx on non-Ollama hosts keeps the current semantics: a 401 from api.openai.com/v1/models is still a definitive offline verdict, not a reason to GET /api/version on OpenAI. Closes #1025.	2026-06-02 20:27:41 +09:00
Tatlatat	ffb77d7ff2	fix(auth): honor AUTH_ENABLED=false on owner-scoped endpoints (no /login loop) (#880 ) When the operator sets AUTH_ENABLED=false, three owner-scoped endpoints still returned 401 (api/models, api/research/, api/email/), so the front-end redirected the browser to /login and the app was unusable despite auth being turned off. require_user() in src/auth_helpers.py already documents and honors this contract (issue #622) via 'if _auth_disabled(): return ""', but these endpoints did their own get_current_user/is_configured check without it. Make _require_user (research), the /api/models anti-leak guard, and email_helpers._require_auth consult _auth_disabled() and let anonymous through (owner='') only when the operator explicitly disabled auth. The 401 protection is fully intact when AUTH_ENABLED=true. Verified end-to-end: with AUTH_ENABLED=false the SPA now loads instead of bouncing to /login.	2026-06-02 12:26:26 +09:00
tanmayraut45	0e31c38be0	Support in-place endpoint updates and recover empty-model sessions (#786 ) The "don't wipe endpoint_url/model on endpoint delete" half of #587 landed in `6a78b02` (Fix endpoint model preservation for tasks). The three remaining follow-up pieces from the original PR — flagged in the review on #786 — are: - routes/model_routes.py: toggle_model_endpoint (PATCH) now accepts api_key and base_url, so the admin UI can rotate a key or fix a typo'd URL without going through delete+recreate. base_url is normalized the same way the POST handler does (strip /models, /chat/completions, /completions, /v1/messages, then _normalize_base). Cache invalidation matches the POST/DELETE paths and the response includes base_url so the frontend can confirm what was saved. - routes/chat_routes.py: new _recover_empty_session_model picks cached_models[0] from the endpoint that matches sess.endpoint_url and persists it onto the Session row before the LLM call goes out. Wired into both /api/chat and /api/chat_stream after the existing _clear_orphaned_session_endpoint guard, so the order is: drop truly-orphaned sessions first, then heal the "picker showed it, session never knew" case. - routes/chat_routes.py: when recovery fails (no endpoint, no cached models) raise HTTP 400 with a clear message instead of letting model="" reach the upstream as 401/503. Closes #587.	2026-06-02 11:26:38 +09:00
LittleLlama	54ecfa39cf	Provider detection: match by hostname instead of substring (re #768 ) (#815 ) * Dedupe URL routing helpers and tighten adjacent hostname checks * Match providers by hostname, not substring, in _detect_provider _detect_provider used `"anthropic.com" in url`-style substring checks, so a URL that merely contained a provider's domain in its path or query — or a look-alike host like `anthropic.com.example` — was misclassified and picked the wrong auth-header/payload shape. Switch it to the existing `_host_match` helper (hostname exact/subdomain match), the same way the human-readable labels and curated model lists already work, finishing that migration. Also harden `_host_match` against trailing-dot FQDNs. Not a credential-leak fix: _detect_provider only classifies a URL the admin already configured next to its key, and the URL — not this function — decides where the request goes. This is a correctness/consistency cleanup. Adds tests that import the real helpers (test_endpoint_resolver.py tests local copies, so it can't catch this) covering the substring false-positives. Refs #768. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Import build_headers under its real name in model_routes It was imported as `build_headers as _provider_headers`, which collides with the unrelated llm_core._provider_headers(provider, headers) — same name, different signature. Use the real name to remove the confusion. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Use hostname matching in URL builders, not raw suffix checks PR review flagged that _detect_provider() was hardened to match on hostname, but several helpers still used raw host.endswith("anthropic.com") / host.endswith("ollama.com"), which match adjacent hosts like notanthropic.com / notollama.com. Route the remaining checks through _host_match(): _is_ollama_native_url and _ollama_api_root in llm_core, and _anthropic_api_root / _ollama_api_root in endpoint_resolver. 
With _detect_provider already hostname-correct, the trailing "or host.endswith(...)" clauses in build_chat_url / build_models_url are redundant, so drop them rather than fix the substring match in place. Add builder-level tests asserting look-alike and domain-in-path hosts route to the OpenAI-compatible default. They import the real builders and fail on the pre-fix code. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 11:11:17 +09:00
wundervrc	3f6d630b56	Never resolve to a disabled endpoint model (#861 ) Background tasks (e.g. the Email Tags / check_email_urgency action) resolve their model through resolve_endpoint("utility") → Default Chat. When the configured model is one the user has since disabled on the endpoint, the resolver still dispatched to it — on Groq that surfaces as every email failing with "HTTP 400: model ... requires terms acceptance". Two paths fed this: - The auto-pick fallback selected from cached_models without excluding the endpoint's hidden_models, so a disabled model listed first won. - A stale default_model left pointing at a now-disabled model (seeded at endpoint registration from raw model_ids[0]) was used verbatim. Fix resolve_endpoint / resolve_endpoint_by_id to drop a configured model that's in hidden_models and to pick the first ENABLED chat model. Also seed default_model on registration via _first_chat_model so we never pin the global default to an embedding/tts entry a provider lists first. Checks: python -m pytest tests/test_endpoint_resolver.py tests/test_model_routes.py tests/test_model_context.py (all pass); python -m py_compile app.py routes/model_routes.py src/endpoint_resolver.py. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 11:10:43 +09:00
pewdiepie-archdaemon	6a78b02976	Fix endpoint model preservation for tasks	2026-06-02 09:44:24 +09:00
Alexandre Teixeira	26483661da	Restrict provider discovery to admins Require admin access before serving provider discovery data from GET /api/providers. This prevents normal authenticated users from triggering provider discovery or receiving cached provider host data. Keep GET /api/models available to normal users and leave the existing admin-only GET /api/discover behavior unchanged. Add a focused regression test to ensure unauthorized callers cannot trigger discovery and cannot receive cached provider data.	2026-06-02 05:54:40 +09:00
Prakhya	a96593a99b	Improve Ollama endpoint error messages	2026-06-02 05:53:50 +09:00
Abhinav	9e8de43f25	fix: clear session headers on endpoint deletion (#477)	2026-06-01 22:19:54 +09:00
Alexander Kenley	2c4b8b57dd	feat(ai): add OpenRouter and Ollama Cloud providers (#231) Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>	2026-06-01 14:26:10 +09:00
Sirsyorrz	09acf955f1	models: dedupe endpoints by base_url on create (#266 ) POST /api/model-endpoints always inserted a new row, so Settings -> Add Models -> Scan for Servers re-added any endpoint a user had already registered manually — once under its model name (from the earlier manual add) and again under its host:port (auto-generated when scan posts without a name). The success toast then misreported the result as "added N new". Look up an existing endpoint with the same base_url accessible to the caller (shared or owned by them) before inserting. If found, return it with `existing: true` so the client can tell the difference between an actual add and a dedupe hit. Toast now reads, e.g., "Found 1 server with 1 model — 1 already added". Tested: POSTing the same base_url three times (incl. trailing-slash variation) returns the same id each time; only one row exists.	2026-06-01 14:22:06 +09:00
pewdiepie-archdaemon	fc7f107b22	Improve Ollama setup and model endpoint handling	2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00
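
Note on bc83479f94 (bool('false') coercion): the commit message describes replacing `bool(value)` with explicit handling of string inputs for `supports_tools` and `is_enabled`. A minimal sketch of that idea follows. The helper name `parse_bool` and the accepted spellings are assumptions made for illustration, not the code from the repository.

```python
# Hypothetical helper; the name and the accepted spellings are illustrative only.
_TRUE_STRINGS = {"true", "1", "yes", "on"}
_FALSE_STRINGS = {"false", "0", "no", "off", ""}


def parse_bool(value, default=False):
    """Interpret native bools and common string spellings explicitly.

    Anything that is neither a bool nor a recognised string falls back
    to ``default`` instead of being coerced by truthiness.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        text = value.strip().lower()
        if text in _TRUE_STRINGS:
            return True
        if text in _FALSE_STRINGS:
            return False
    return default


# The pitfall named in the commit: any non-empty string is truthy.
assert bool("false") is True
# Explicit parsing keeps a stringified boolean from flipping the toggle.
assert parse_bool("false") is False
```

Falling back to a caller-supplied default for unrecognised input is one possible choice; rejecting such input with a validation error would be another.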

43 Commits
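
Note on 54ecfa39cf (provider detection by hostname): the commit message contrasts substring checks such as `"anthropic.com" in url` with an exact-or-subdomain hostname match that also tolerates trailing-dot FQDNs. The sketch below illustrates that matching rule using only the standard library. The function name `host_match` and its signature are assumptions for illustration; the repository's `_host_match` may differ.

```python
from urllib.parse import urlsplit


def host_match(url, domain):
    """Return True when the URL's hostname is `domain` or a subdomain of it.

    Illustrative only: URLs without a scheme have no parsed hostname
    and therefore never match.
    """
    host = (urlsplit(url).hostname or "").rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    return host == domain or host.endswith("." + domain)


# Exact host and subdomain both match.
assert host_match("https://anthropic.com/v1", "anthropic.com")
assert host_match("https://api.anthropic.com/v1", "anthropic.com")
# A trailing-dot FQDN is treated the same as the bare name.
assert host_match("https://api.anthropic.com./v1", "anthropic.com")
# Look-alike hosts and domains that only appear in the path or query do not match.
assert not host_match("https://anthropic.com.example/v1", "anthropic.com")
assert not host_match("https://notanthropic.com/v1", "anthropic.com")
assert not host_match("https://example.org/?next=anthropic.com", "anthropic.com")
```

Requiring a leading dot before the domain in the suffix check is what rules out adjacent hosts like notanthropic.com, which a raw `endswith("anthropic.com")` would accept.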