mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-28 15:45:22 -04:00
5b8bfdabab92355523742d5dcd05b4fca048e119
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
2dfc83ee22 |
fix(models): accept bare-list /models responses (Together AI) (#4761)
* fix(api): handle varying response formats for model IDs from compatible providers merge conflict for pr-2204 resolved * fix(modal): keep body-portaled dropdowns above their tool modal at any stack depth (#4720) (#4724) * fix(memory): keep the Brain memory item menu above the modal at any stack depth The memory item "⋮" dropdown is portaled to <body> with a hardcoded z-index of 10001. Tool modals, however, get a monotonically increasing z-index from modalManager's bring-to-front counter (_modalTopZ), which climbs unbounded as modals are opened/restored over a session. Once that counter passes 10001, the Brain modal stacks above the body-portaled dropdown, so the menu renders behind the panel — visible only where it spills past the modal's edge (#4720). Derive the dropdown's z-index from the owning modal's current z-index (+1), keeping 10001 as a floor for the common low-counter case, so the menu always sits just above its modal however high the counter has climbed. Verified with document.elementFromPoint at the dropdown's location: with a high modal z-index the old build returns the modal at every sampled point (menu behind); the fixed build returns the dropdown (menu on top). The default low-counter case is unchanged (z stays 10001). * refactor(modal): route body-portaled dropdowns through a shared topPortalZ() helper The hardcoded z-index:10001 the Brain memory menu used (#4720) is the same literal shared by ~16 body-portaled dropdowns across calendar, cookbook, cookbookServe, documentLibrary, emailLibrary, gallery, notes, emojiPicker and memory — each renders behind its owning tool modal once modalManager's bring-to-front counter climbs past the literal over a long session. Promote the per-dropdown fix into a single topPortalZ() helper in toolWindowZOrder.js — the existing source of truth for tool-window z, already imported by modalManager's _bringToFront and notes.js — returning max(topToolWindowZ(), dock-chip floor) + 1, so a portaled dropdown always sits just above the live tool-window stack however high the counter has climbed. Route all 16 sites through it. The slashCommands tour tooltips and the cookbookServe VRAM dialog are intentionally left out (neither is a modal-owned portaled dropdown). Add tests/test_portal_dropdown_z_js.py covering the helper, including the #4720 scenario (modal counter at 99999 -> dropdown at 100000). Existing test_notes_z_order_js.py stays green. * fix(llm): detect mistral.ai provider and support reasoning_effort (#4698) * fix(llm): detect mistral.ai provider and support reasoning_effort Four coupled bugs broke Mistral thinking model support: 1. _detect_provider() had no mistral.ai host check, so all Mistral endpoints fell through to the generic 'openai' provider string. _provider_display_name() correctly identified them as 'Mistral', making any 'if provider == "Mistral"' check elsewhere dead code. 2. reasoning_effort parameter was never sent in the request payload, so Mistral never activated thinking mode even when the user configured a thinking-capable model (mistral-small-latest, mistral-medium-latest, magistral-*). 3. Mistral returns content as a typed array ([{"type":"thinking",...},{"type":"text",...}]) when reasoning is on, not as a plain string. Both the streaming and non-streaming parsers expected strings and silently dropped the thinking content. 4. _THINKING_MODEL_PATTERNS didn't include magistral or mistral-* model prefixes, so the frontend wouldn't tag reasoning output as thinking even after the above were fixed. Fix: - Add mistral.ai to _detect_provider() host checks - Add a _normalize_mistral_content() helper that splits the typed array into (text, thinking) strings - Inject payload["reasoning_effort"] = "high" when provider is Mistral and _supports_thinking(model) is true, in both stream_llm and llm_call_async payload construction - Wire the normalizer into both response parsers - Extend _THINKING_MODEL_PATTERNS to include magistral, mistral-small, mistral-medium, mistral-large Tested on Docker install with mistral-small-latest + reasoning_effort=high. Reasoning streams correctly into the thinking panel after the fix. Fixes #4678 * fix(llm): address review — lowercase provider id, configurable effort, tests Addresses vdmkenny's review on PR #4698: 1. Removed duplicate 'if provider == "mistral"' block in stream_llm — two back-to-back copies, one was dead-redundant. 2. Dropped personal-context comment ('free-tier limits are generous for this user') and made reasoning_effort configurable via env var ODYSSEUS_MISTRAL_REASONING_EFFORT (high / medium / low / none). Default remains 'high' for backward compat with the tested behavior. 3. Recased provider id from 'Mistral' to 'mistral' to match the lowercase convention used by every other provider id in the file (openai, anthropic, ollama, copilot, ...). _provider_display_name() still returns the Title-Case 'Mistral' for UI labels — only the runtime id used in 'if provider == ...' checks was recased. 4. Added tests/test_llm_core_mistral_content.py with 13 tests pinning _normalize_mistral_content()'s contract: string passthrough, the Mistral array format (thinking + text blocks), and edge cases (empty, garbage, None, wrong types, missing fields, string-vs-array inner thinking field). Also fixed a gap the review didn't catch: the non-streaming paths (llm_call sync + llm_call_async) were missing the reasoning_effort injection entirely. Added the same injection to both, so Deep Research and agent tool calls also activate Mistral thinking. All 13 new tests pass. Existing reasoning/streaming/ollama-thinking tests still pass (38 tests, no regressions). Fixes #4678 * fix: Images cannot be seen by model that is vision capable (#4726) * fix: Images cannot be seen by model that is vision capable * fix: skip http(s) image_url for Ollama (images[] is base64-only) --------- Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> * fix(chat): strip executed email tool fences from the live stream (#3993) (#4275) * fix(chat): strip executed email tool fences from the live stream (#3993) The backend strips every fenced tool block from persisted text (the regex in src/tool_parsing.py is built from the full TOOL_TAGS set, which includes the email tools), so a reloaded session renders cleanly. The live frontend path uses a separate hardcoded EXEC_FENCE_RE in static/js/chatRenderer.js that only listed web_search/read_file/write_file/create_document/edit_document/ update_document — so executed email tool fences (list_emails, etc.) lingered as raw code blocks in the live assistant bubble until the user reloaded. Add the nine email tool tags to EXEC_FENCE_RE so the live render settles into the same clean layout as the history reload. bash/python stay excluded on purpose: those are languages a user may legitimately have asked the model to show as code, not tool invocations. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(chat): single-source live exec-fence tool list from TOOL_TAGS (#3993) Per review: EXEC_FENCE_RE was a second, hand-maintained copy of the executable-tool list, so any tool not in it — and every future tool added to TOOL_TAGS — would leave its executed fence lingering in the live bubble until reload (the original #3993 bug, recurring one tool at a time). EXEC_FENCE_RE is now built from an explicit EXEC_TOOL_TAGS list that mirrors TOOL_TAGS (src/agent_tools/__init__.py) minus bash/python, which stay excluded as legitimate code-example languages. A new regression test (test_exec_fence_re_covers_all_executable_tools) extracts both lists from source and fails if they drift, so the whole class is caught in CI instead of by a user — the "minimum acceptable middle ground" from the review, made exact (set equality, not just coverage). Verified: pytest tests/test_live_strip_email_tool_fences.py (5 passed); node --check static/js/chatRenderer.js; and a node run of the built regex confirms email/generate_image/manage_memory/ls fences strip while bash/python/sh are preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(chat): build live exec-fence list from /api/tools at runtime (#3993) Make TOOL_TAGS the single source for live exec-fence stripping. chatRenderer.js no longer hard-codes a tool list; it fetches the backend's authoritative set once from GET /api/tools (sorted(TOOL_TAGS)) and builds EXEC_FENCE_RE from it at load, minus bash/python. No second list to drift, and a future tool added to TOOL_TAGS is covered automatically — without touching the streaming path. Until the fetch resolves EXEC_FENCE_RE is null and exec fences aren't stripped (a sub-second window before the first stream); the backend already strips persisted history, so a reload always renders clean. Drop test_exec_fence_re_covers_all_executable_tools (no hand-maintained list to guard) and add source-level guards: the frontend keeps no hard-coded list and fetches /api/tools, and the endpoint serves the full sorted(TOOL_TAGS). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01CVCKth4g8pWh7pwFDVm4iL * fix(chat): warn on /api/tools fetch failure instead of swallowing it (#3993) A fresh-context review flagged that loadExecFenceRegex's catch silently discarded errors: if the one-shot fetch fails, EXEC_FENCE_RE stays null for the whole session and live exec fences go unstripped until reload, with zero signal. console.warn it, and correct the comment to describe the failure mode honestly (was understated as just a sub-second startup window). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01CVCKth4g8pWh7pwFDVm4iL --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(routes): log and cleanly 500 on unreadable HTML page (#4637) * fix(routes): serve 404 instead of 500 when an HTML page file is missing _serve_html_with_nonce opened the HTML file with no error handling, and callers such as /backgrounds and /login pass their paths in with no existence check, so a missing or unreadable file raised an unhandled OSError that surfaced as a 500. Wrap the read and raise HTTPException(404) instead; the normal render path (CSP-nonce substitution) is unchanged. Fixes #4594 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(routes): distinguish missing page (404) from read failure (500) The previous fix caught a broad OSError and returned 404 for every failure, which masks real server-side problems (permission errors, I/O failures) as "not found" and lets them slip past error alerting. Split FileNotFoundError (genuine 404) from other OSError, which now logs the exception and returns a generic 500 — without leaking the OS error string or file path into the response body. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(routes): treat unreadable bundled HTML page as logged 500, not 404 Per PR #4637 review: every caller of the page-render helper serves a fixed, server-owned template (index/login/backgrounds), never a client-supplied path. So a missing or unreadable file is a server fault (broken deployment), not a client "not found" — a 404 there mislabels a server error and hides a missing core template from 5xx alerting, contradicting the OSError->500 rationale this PR is built on. Collapse both branches into a single logged, leak-free 500. Move the helper to src.app_helpers.serve_html_with_nonce so the behavior can be unit-tested without importing the whole app (app.py is the slim orchestrator; the test harness stubs src.database, so importing app in tests is not viable). Add tests pinning missing/unreadable -> 500 (not 404) and nonce injection on the happy path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * feat(catalog): add Gemma 4 12B/QAT entries and RTX 3050 bandwidth (#4728) Add official Gemma 4 12B-it plus QAT-INT4/INT8 catalog entries (with their GGUF sources), QAT quantization support across the quant tables and the prequantized-prefix list, and the missing RTX 3050 / 3050 Ti memory bandwidth so speed estimates stop falling back to the generic cuda value. * fix debugging on windows (#4679) * fix: Real-ESRGAN install + Cookbook deps-panel crash on the Python 3.14 image (#4694) * fix(docker): make Real-ESRGAN installable on the Python 3.14 image realesrgan's deps basicsr/gfpgan/facexlib (unmaintained since 2022) read their version in setup.py via `exec(...); locals()['__version__']`, which raises KeyError on Python 3.13+ — PEP 667 made locals() in a function an independent snapshot that exec() can no longer mutate. That fails the Cookbook "install realesrgan" sdist build on the python:3.14 base. Add a `realesrgan-wheels` builder stage that fetches the pinned sdists, patches get_version() to exec into an explicit namespace dict, and builds wheels; the final stage installs them --no-deps so a later `pip install realesrgan` resolves from wheels instead of rebuilding the broken sdists. torch stays a runtime pull to keep the base image lean. Also add the runtime libs opencv-python (cv2) needs — libgl1, libglib2.0-0t64, libxcb1 — which the slim base omits; without them the install succeeds but `import cv2` dies with `libxcb.so.1: cannot open shared object file`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(cookbook): don't let a package's sys.exit() on import hang the deps panel The local optional-dependency probe imports each package in-process and catches ImportError / Exception. But a package can call sys.exit() at import time — e.g. rembg does `sys.exit(1)` when no onnxruntime backend loads. SystemExit is a BaseException, not Exception, so it escaped the probe, propagated out of the list_packages endpoint, and hung the whole Dependencies panel / worker (the UI loads forever). Catch (Exception, SystemExit) so one broken optional package is reported as not-usable instead of taking down the panel. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * fix(routes): 500 (not 404) when the app-shell index.html is missing (#4791) Follow-up to #4637. serve_index — the handler for / and the SPA deep-link routes (/notes, /calendar, /cookbook, /email, /memory, /gallery, /tasks, /library) — pre-checked os.path.exists and raised its own HTTPException(404, "index.html not found") when the bundle was missing. So a missing core template returned 404 before serve_html_with_nonce's 500 could fire, the one inconsistency left after #4637. index.html is a fixed, app-bundled template; a missing one is a broken deployment (server fault), not a client "not found", so it should surface as a logged 500 in 5xx alerting rather than a 404. Keep the static->root fallback, drop the redundant existence guard and the dead-end 404, and let the shared helper handle the missing case. Verified against the running app: / and /notes return 200 with the bundle present and a logged 500 when index.html is absent. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> * fix(setup): load .env so a pre-seeded admin password is honored on native installs (#4787) setup.py read ODYSSEUS_ADMIN_USER / ODYSSEUS_ADMIN_PASSWORD via os.getenv() but never loaded .env, so on native Linux/macOS installs a password pre-seeded in .env (documented in docs/setup.md and .env.example) was silently ignored and a random one generated, breaking the first login. Docker was unaffected because compose passes the vars into the container env. Call load_dotenv(BASE_DIR/.env, encoding="utf-8-sig") at the top of main(), mirroring app.py (utf-8-sig tolerates a Notepad UTF-8 BOM). load_dotenv does not override already-exported OS vars, so the existing precedence is kept. python-dotenv is already a required dependency. Adds a regression test that pre-seeds credentials only in .env (not the shell) and asserts the stored bcrypt hash matches the pre-seeded password. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix: email poller marks calendar extraction processed on LLM failure (#4622) Move calendar processed-marker insert into the LLM success path (else branch). Previously, the INSERT ran even after a transient LLM failure, causing the poller to skip retrying calendar extraction on subsequent runs. Minimal change: only touches the try/except/else control flow in _auto_summarize_pass_single() — preserves existing formatting and line endings. * feat(ui): add toggle for padding around chat area (#4691) * feat: Allow admins to choose if they want to share defaults (#4752) * First bare fix * Adding the option toggle * toggle function fix * Final fix, added missing /auth/ * Extended toggle text & added tests * Comments change * Description toggle change * br tag fix * description change based on suggestion * fix(agent): parse misfenced read_file calls (#4799) * fix: use atomic write in APIKeyManager.save() to prevent credential data loss (#4591) (#4597) * fix: use atomic write in APIKeyManager.save() to prevent data loss Opening api_keys.json with 'w' truncates the file before writing, so a crash, disk-full, or mid-write error leaves all stored provider API keys corrupted. Switch to atomic write (temp file + fsync + os.replace) so the original file is always intact on any failure. Fixes #4591 * chore: trigger CI re-run * chore: update PR description * chore: fix how-to-test section for description check --------- Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> * feat(discovery): detect llama.cpp servers and label local providers (#4729) * feat(discovery): detect llama.cpp servers and label local providers Scan port 8080 (llama-server) and 11435 (APFEL) during discovery, fingerprint llama.cpp via its native /props endpoint, and label well-known local serving ports (8080 llama.cpp, 8000 vLLM, 1234 LM Studio, 11434 Ollama) consistently in both the Python provider helper and the JS endpoint UI. Adds a llama.cpp hint to the /setup slash command. * fix(discovery): don't infer the serving tool from the port alone Per review: vLLM, SGLang, llama.cpp and plain OpenAI-compatible servers all share 8000/8080, so labeling by port mislabels real setups (a vLLM box on 8080 shown as llama.cpp). Drop the port->tool assertions from _provider_label and providerLabel; the authoritative signal is the /props fingerprint done during discovery, which is unchanged. Loopback now reads a neutral 'local endpoint' / 'Local'. Tests updated to assert the neutral labels. * refactor(tools): migrate config/integration admin tools to the registry (#4742) Part of #3629 (the `admin_tools.py` bullet). Moves the config/integration admin tools off the legacy elif dispatch chain in tool_implementations.py onto the agent_tools registry: manage_endpoints, manage_mcp, manage_webhooks, manage_tokens, manage_settings The do_* implementations (and manage_mcp's command-allowlist / RCE guard: _validate_mcp_command, _mcp_allowed_commands, and the _MCP_* constants) move verbatim into the new src/agent_tools/admin_tools.py. They register through a single ADMIN_TOOL_HANDLERS map that TOOL_HANDLERS.update()s, and the five elif branches plus their imports are dropped from tool_execution.py, so these tools now flow through _direct_fallback like the other migrated clusters. The names are re-exported from src.agent_tools for back-compat. Dedup: - _parse_tool_args was duplicated in tool_implementations.py and document_tools.py. It now lives once in src.tool_utils (which imports nothing from the project beyond src.constants, so this introduces no cycle) and both call sites import it from there. The orphaned `import json` in document_tools is removed with it. - The five tools share one _owner_adapter(fn) factory that threads ctx["owner"] into the owner-taking do_* signature, instead of five near-identical wrappers. Tests: new tests/test_admin_tools_registry.py pins the registration, the re-export back-compat, the owner-threading adapter, and the single-source _parse_tool_args (across admin_tools and document_tools). Existing MCP / settings / webhook suites are repointed at the new module. * refactor(exceptions): dedupe src/exceptions via core re-export (#4785) src/exceptions.py was a byte-for-byte duplicate of the canonical core/exceptions.py. Replace its class bodies with a re-export shim (mirroring the core/constants.py -> src/constants.py pattern) so the exception classes are defined in exactly one place. Also fix the stale "# src/exceptions.py" header comment in core/exceptions.py. No behavior change: both import paths resolve to the same class objects (verified by identity), so `except SessionNotFoundError` works regardless of which module it was imported from. Ran py_compile and pytest tests/test_app.py (12 passed). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(tasks): normalize task endpoint URL to /chat/completions before model call (#4619) Upstream bug (present in pewdiepie-archdaemon/odysseus main): the task executor passes task.endpoint_url VERBATIM to the model HTTP call, unlike the chat path which stores build_chat_url(normalize_base(base)) on the session. A task carrying an explicit bare OpenAI-compatible base such as "http://host:11434/v1" therefore POSTs to a 404 ("page not found"); the agent loop swallows the empty body into "The model returned an empty response" and marks the run success, so nothing surfaces the failure. Tasks that omit an endpoint dodge this only because _resolve_defaults() cribs an already-full URL from a recent chat session. The API/token path (e.g. an external client that POSTs /api/tasks with endpoint_url=".../v1") hits it every time. Fix: route every resolved task endpoint through _normalize_chat_endpoint() at the three resolution sites (_execute_llm_task, the persona/research session path, and _execute_research_task). The helper is idempotent (strips any existing chat suffix, re-appends the correct one) and leaves native-Ollama (/api...) and already-concrete URLs untouched, so other providers are unaffected. Proven via isolated repro: ".../v1" -> 404 -> empty; ".../v1/chat/completions" -> 200 -> real gemma4:31b output. Regression test asserts the bare-/v1 -> full-chat-URL mapping, idempotency, and the native-Ollama/empty passthroughs. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(model-routes): harden _probe_endpoint against malformed model-list responses (#4789) * fix(model-routes): harden _probe_endpoint against malformed model-list responses _probe_endpoint parsed model lists with data.get(...) at four sites without checking that data is a dict, and built the list with a truthiness-only filter. A /models (or /api/tags) endpoint returning HTTP 200 with valid but non-dict JSON ([], "x", null, 123) made data.get(...) raise AttributeError, and a non-string id like 123 passed the filter and then hit .startswith() / .lower() in the Z.AI/Kimi curated merge and _is_chat_model(). Both errors are swallowed by the broad except Exception, but the comprehension dies mid-list so the ENTIRE probed model list is discarded and the endpoint silently degrades — masking a misconfigured/non-compliant upstream as "no models". - Guard each data.get(...) with isinstance(data, dict) so a non-dict body falls through the existing `or []` default. - Restrict the OpenAI and Ollama model-list comprehensions to non-empty str values, protecting the .startswith() merges and both _is_chat_model calls. - Add an isinstance guard at the top of _is_chat_model (defense in depth for all four call sites). No behavior change for well-formed {"data":[...]} / {"models":[...]} responses. Adds regression tests (non-dict body via caplog, mixed/all non-string ids, _is_chat_model boundary) that fail before the fix and pass after. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(model-routes): extract _openai_model_ids / _ollama_model_names helpers Per review on #4789: the malformed-response guards were inlined four times in _probe_endpoint (two OpenAI-id comprehensions, two Ollama-name comprehensions). Pull each into a small, directly-testable helper so the security-relevant parsing lives in one place and a future malformed-shape fix doesn't have to be applied in four spots (CONTRIBUTING flags repeated logic for this reason). Behavior is unchanged. Adds direct unit tests for both helpers (non-dict body, non-string ids, non-dict entries, name>model precedence). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(cookbook): only block model launch on real port collisions (#4760) * Fix #4507: only block model launch on real port collisions Quick-run hardcoded port 8000 and never called _nextAvailablePort(), so every launch collided. Both pre-launch guards (serve panel + quick-run) were count-based and fired regardless of port. - quick-run now auto-assigns a free port (8080 for llama.cpp) - both guards parse the new port and only prompt on a real overlap, stopping only the colliding serve - dialog reports the actual port instead of a hardcoded 8000 * refactor(cookbook): share _taskPort for port parsing; auto-assign llama.cpp port Addresses review on #4760: - _taskPort regex now matches --port= as well as --port (space) - _nextAvailablePort and both launch guards reuse _taskPort instead of inline regex - quick-run llama.cpp no longer pins 8080, so two can run concurrently * fix(cookbook): _taskPort also parses -p; add port-parsing tests Addresses review on #4760: - _taskPort now matches -p <n> too, so it's the complete single reader (was missing the short flag that other readers already handle) - add tests/test_cookbook_port_parsing_js.py covering the port forms, shared-reader reuse, and llama.cpp auto-assign * test(cookbook): extract pure port helpers and test behavior Addresses review on #4760: the prior tests only asserted source strings. - extract portOf() and nextFreePort() into static/js/cookbookPorts.js - cookbookRunning.js imports them; _taskPort and _nextAvailablePort delegate - tests run the helpers via node and assert real behavior: all port forms (--port, --port=, -p, -p=), next-free-port skipping taken ports, and the same-port-clash / different-port-coexist outcome --------- Co-authored-by: samy <samy@odysseus.boukouro.com> * fix(ui): route tasks.js + skills.js dropdowns through topPortalZ() (#4768) Fixes #4767. #4724 routed 16 body-portaled dropdowns through the shared topPortalZ() helper so they always render just above the currently-raised tool modal, but two were missed and still used a hardcoded z-index, so they hit the same #4720 bug once a modal's bring-to-front counter climbed past the literal: - tasks.js _showTaskDropdown(): inline z-index:100000 on .task-dropdown - skills.js kebab menu (.skill-kebab-menu): z-index:100002 in style.css Both now set zIndex from topPortalZ() after they are appended to the body, matching the other migrated sites. The dead CSS z-index on .skill-kebab-menu is removed (the inline value always wins). test_portal_dropdown_z_js.py gains a source guard asserting both files use topPortalZ() and that no hardcoded 100000/100002 portal literal survives in either file or style.css. * do_list_models in ai_interaction.py dropped --------- Co-authored-by: Max Hsu <maxmilian@users.noreply.github.com> Co-authored-by: aubrey <kyuhex@gmail.com> Co-authored-by: Michael <52305679+michaelxer@users.noreply.github.com> Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Ahmed Dlshad <ahmed.dlshad.m@gmail.com> Co-authored-by: Joel Alejandro Escareño Fernández <52678667+TheAlexz@users.noreply.github.com> Co-authored-by: Kalin Stoyanov <kgs.void@gmail.com> Co-authored-by: Pedro Barbosa <devpedrobarbosa@gmail.com> Co-authored-by: Solanki Sumit <125974181+YAMRAJ13y@users.noreply.github.com> Co-authored-by: Rudra Sarker <78224940+rudra496@users.noreply.github.com> Co-authored-by: Skoh <101289702+SkohTV@users.noreply.github.com> Co-authored-by: Jakub Grula <ramsters110@gmail.com> Co-authored-by: Dividesbyzer0 <54127744+zoomdbz@users.noreply.github.com> Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> Co-authored-by: Magiomakes <114195802+Magiomakes@users.noreply.github.com> Co-authored-by: Samy <12219635+touzenesmy@users.noreply.github.com> Co-authored-by: samy <samy@odysseus.boukouro.com> |
||
|
|
2196869c86 |
fix(webhooks): route public emitters through fire_and_forget (#3964) (#4336)
The three public webhook emitters in chat_helpers and webhook_routes schedule deliveries via asyncio.create_task(webhook_manager.fire(...)), which bypasses WebhookManager._bg_tasks. asyncio only holds a weak reference to the outer task, so the GC can collect it mid-delivery and the webhook is silently dropped. Route all three through webhook_manager.fire_and_forget() so the task is tracked by _spawn_tracked() and the manager owns the full lifecycle. Adds an AST-level guard test that scans routes/ for direct asyncio.create_task wrapping webhook_manager.fire(...) to prevent regressions. |
||
|
|
955455b797 |
fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549)
* fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(kimi): omit temperature for kimi-for-coding API calls Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
1e0d9b92af |
feat: add ChatGPT Subscription provider (#2876)
* feat: Add ChatGPT Subscription support and related features - Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI. - Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status. - Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance. - Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints. - Added new slash commands for managing skills, including listing, searching, and invoking skills. - Implemented caching for skill catalog to optimize performance. - Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing. - Refactored existing code to accommodate new features and improve maintainability. * refactor: share provider device-flow setup - reuse one device-flow backend for Copilot and ChatGPT Subscription - add one frontend device-flow helper for Settings and /setup - put GitHub Copilot back into Add Models, now as a dropdown option - make provider selection just select; clicking Add starts sign-in - stop ChatGPT Subscription setup from opening auth tabs automatically - make /setup copilot and /setup chatgpt-subscription work from chat - show ChatGPT Subscription in the /setup suggestions - show the real error message when setup fails - add focused tests for the shared flow and setup UI * feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX Backend: - Resolve runtime bearer for provider-auth endpoints at probe time via a shared _resolve_probe_key() that delegates to resolve_endpoint_runtime, applied across all probe/refresh call sites. - Skip live completion probes and health pings for discovery-only providers (centralized behind _is_discovery_only_provider) — the Codex/Responses API has no such endpoints, so status is derived from cached models. - Never persist the short lived ChatGPT bearer to the plaintext sessions table; proactively clear any stale bearer left by an earlier code path. - Revoke orphaned ProviderAuthSession credentials when the last endpoint backing them is deleted (_delete_orphaned_provider_auth), surfaced via cleared_provider_auth in the delete response. Frontend (admin.js): - Auto-start the device-auth flow on provider selection so the authorization panel (code + Authorize) shows immediately instead of behind a "Sign in" click. - Remove the redundant top button for device auth providers, move retry into the panel via an inline "Try again". - Drop the self-evident hint text and add an execCommand clipboard fallback so Copy works in non-secure (HTTP/LAN) contexts. * fix: harden chatgpt subscription provider * chore: remove PR media from branch * Fix chatgpt subscription recovery and token handling --------- Co-authored-by: 5p00kyy <admin@5p00ky.dev> |
||
|
|
12cb39cbd9 |
feat: add OpenCode Zen and Go as provider options (#26)
- Add OpenCode Zen (https://opencode.ai/zen/v1) and Go (https://opencode.ai/zen/go/v1) - Add provider detection via _host_match() in llm_core.py - Add curated model list entries in model_routes.py - Add webhook provider URLs - Add provider icon (providers.js) and dropdown options (index.html) - Add auto-detection patterns and setup URLs (slashCommands.js) - Whitelist opencode.ai in URL validation (admin.js) - Rebased on main to fix merge conflicts with _HOST_TO_CURATED refactor Co-authored-by: M57 <hy4ri@users.noreply.github.com> |
||
|
|
6861c41580 |
Reapply "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit
|
||
|
|
cc8fe2f6e3 |
Revert "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus"
This reverts commit |
||
|
|
b1a4ed13b0 |
Harden API-token chat endpoint selection
Validate only token-supplied direct base_url values for API-token chat requests, while keeping admin-configured endpoints available for local/LAN providers. Scope configured endpoint fallback selection to the API token owner, fail closed for unknown token owners, and preserve strict session ownership checks when resuming sessions from chat-scoped API tokens. Add focused regression coverage for direct base_url SSRF rejection, configured endpoint fallback behavior, token-owner scoping, URL validation, and null-owner session/endpoint handling. |
||
|
|
c0c1ceb36d |
Treat Venice as a tool-capable SOTA cloud provider (#1173)
Follow-up to the Venice provider PR. Wire api.venice.ai into the three host allowlists so Venice behaves like the other paid OpenAI-compatible clouds: - agent_loop: add api.venice.ai to _API_HOSTS so the agent sends native OpenAI tool-call schemas (Venice supports function calling) instead of degrading to fenced-block parsing. - teacher_escalation: add api.venice.ai to _SOTA_HOSTS so the escalation loop stays OFF for Venice (it's a paid top-tier API; no need to add teacher-model latency). - webhook_routes: add venice to KNOWN_PROVIDERS so the sync chat webhook can auto-resolve base_url from provider=venice. Tests: tests/test_venice_hosts.py pins tool-host matching + SOTA classification for Venice; py_compile on touched modules. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
280c29d572 |
Security: owner-scope v1 chat endpoint fallback
The sync-chat endpoint's Case 3 fallback selected a ModelEndpoint with an unscoped `query(ModelEndpoint).filter(is_enabled == True).first()` and then used that row's decrypted `api_key` for the LLM call. ModelEndpoint is a per-user resource (owner non-null = private to that user), so a chat-scoped API token for user A that sent no session and no api_key could fall back onto user B's PRIVATE endpoint — spending B's API key/quota and reaching whatever internal base_url B configured. This is the same multi-tenant owner-scoping class already fixed for the session gate on this very endpoint (_caller_owns_session) and for companion/models. Scope the fallback to the token owner's own rows plus legacy null-owner (shared) rows via the existing owner_filter helper, matching routes/model_routes.py and companion/routes.py. A null/empty owner stays a no-op, preserving single-user/legacy behaviour. Add regression tests pinning the scoped fallback (cross-owner, shared-only, no-visible-row, disabled-owned, and the legacy null-owner no-op). |
||
|
|
bc00a9fc7f |
fix(security): fail closed on null-owner session in sync-chat endpoint (#870)
POST /api/v1/chat (the n8n/Make/Activepieces sync-chat endpoint) verified session ownership with `_tok_user and _sess_owner and _sess_owner != _tok_user`. The `_sess_owner and` clause skipped the check entirely whenever the session's owner was null — so any chat-scoped API token (e.g. a token minted for a paired mobile device) could pass a legacy/migrated null-owner session id, inject a message into that session, and read back its conversation history plus reuse the owner's endpoint credentials. This is the same `if owner and owner != user` null-owner-bypass pattern that was already hardened in the gallery, calendar, and notes routes (see test_null_owner_gates.py) and in session_routes._verify_session_owner. Make this gate strict and fail closed too: require a resolvable caller and an exact owner match, mirroring _verify_session_owner. Extract the decision into _caller_owns_session() and pin it with regression tests. |
||
|
|
2c4b8b57dd |
feat(ai): add OpenRouter and Ollama Cloud providers (#231)
Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com> |
||
|
|
e5c99a5eee | Odysseus v1.0 |