117 Commits

Author SHA1 Message Date
nopoz 7e5db9a3c6 fix(security): redact credential-bearing URLs and PII from logs (#4750)
* fix(security): redact credential-bearing URLs and PII from logs

Several log statements emitted sensitive data in clear text:

- model_routes / chat_routes / contacts_routes logged endpoint URLs raw.
  Admin-configured URLs can embed credentials in userinfo or query
  (e.g. https://user:pass@host, ?api_key=...). Route them through a
  shared core.log_safety.redact_url() that drops userinfo/query/fragment.
- note_routes / task_scheduler logged operator email addresses (smtp_user,
  recipient). Replaced with presence booleans, which keeps the diagnostic
  ("why didn't this send") without writing PII to logs.

model_routes already had a local redactor on its HTTPStatusError branch;
the generic except branch was missed, so reuse the existing helper there.

Clears CodeQL py/clear-text-logging-sensitive-data alerts 264, 317, 324,
325, 343, 344, 528.

* fix(security): re-bracket IPv6 hosts and single-source the URL redactor

Address review on #4750:
- redact_url now re-brackets IPv6 literals so host:port stays
  unambiguous (https://[2001:db8::1]:8443/v1, not the bracket-less
  ambiguous form).
- point model_routes._redact_url_for_log at the shared helper so the
  two redactors are single-sourced (also picks up the IPv6 fix).
2026-06-22 23:12:39 +02:00
nopoz 2f246c7779 fix(security): escape backslashes in calendar bg-image CSS url() (#4712)
* fix(security): escape backslashes in calendar bg-image CSS url()

The calendar event-background CSS escaped ' -> \' for a bg: image URL but
not backslashes first. Inside a single-quoted url('...'), \ is the CSS
escape char, so a URL value ending in/containing a backslash escapes the
closing quote and breaks out of the string, injecting arbitrary CSS. The
bg:<url> value is per-event and CalDAV-syncable, hence untrusted (CodeQL
js/incomplete-sanitization).

Add a single canonical _cssUrlEscape() in calendar/utils.js that escapes
backslashes FIRST, then quotes, and route all four sinks through it:
calendar.js:416 / :1263 (the flagged #463/#464), the event-form preview
(:2931), and _calBgCss() in utils.js — the latter two share the identical
bug but were unflagged. Output is byte-identical to the old escaping for
legitimate URLs (which contain no backslashes); only malicious input differs.

Resolves CodeQL js/incomplete-sanitization #463, #464.

* fix(security): route remaining calendar bg url() sinks through _cssUrlEscape

Review (vdmkenny) flagged that the centralization missed an injectable
sibling sink: the edit-form color-picker swatch (calendar.js:2856) built
`url('${url}')` from `existing.color` (a CalDAV-syncable, untrusted `bg:`
value) raw, then interpolated it into `style="background:..."` via innerHTML
- the same `'`/`\` breakout class as the sinks already fixed. The custom-dot
preview (:2953) was likewise raw (non-exploitable - a CSSOM `.style`
assignment of a URL the current user just picked - but it broke the invariant).

Route both through `_cssUrlEscape`, and normalize the two pre-escaped-variable
sites (_calItemBgStyle, _renderWeek) to the same inline form so all five
url() interpolations in calendar.js follow one rule. Add a whole-file
invariant test asserting every `url('${...}')` calls `_cssUrlEscape` - this
catches a future missed sink, the exact failure mode here. Behavior-identical
for legitimate URLs (no visual change).
2026-06-22 21:17:52 +02:00
Rudra Sarker 8ec27fd903 fix: document read fails with 403 when auth is disabled (#4623)
* fix: document read fails with 403 when auth is disabled

Add _auth_disabled() bypass in _verify_doc_owner() and the
/api/documents/{session_id} route guard so documents remain accessible
in single-user / no-auth mode.

Minimal change: only adds the auth-disabled check alongside existing
403 raises — preserves existing formatting and line endings.

* refactor: hoist _auth_disabled import to module level

Address reviewer feedback on PR #4623 — no circular import exists
(src.auth_helpers only imports stdlib + fastapi), so the inline
imports are unnecessary. Moves the import to module top in both
document_helpers.py and document_routes.py.

* test: add regression tests for auth-disabled document access (PR #4623)
2026-06-22 21:01:11 +02:00
MACKAT05 b57989f08c fix(hwfit): repair remote Windows hardware scan over SSH (#4674)
Remote Cookbook hwfit probes failed on Windows hosts because the PowerShell script was sent as nested -Command quoting through OpenSSH. Use -EncodedCommand for remote probes, auto-detect platform when omitted (including Darwin for Mac SSH hosts), and return a clearer error when SSH works but the probe fails.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 20:59:09 +02:00
Gabriel Peña 91bba117c1 fix ask-user choices across reloads (#4669) 2026-06-22 20:49:49 +02:00
Mocchibird 4c82e4a172 fix(ui): route transient dropdown menus through escMenuStack to stop listener leaks (#4684)
The app's ad-hoc dropdown/context menus each wire their own document-level
outside-click listener, but that listener only removes itself on an *outside*
click. Every other dismissal path -- clicking a menu item (which calls
el.remove() directly), a Cancel button, Escape, or the "close the
previously-open menu" reopen sweep -- tears the node down without
unregistering the listener, orphaning it on `document`. The stranded listener
then lingers and can break the next menu interaction: the recurring "the
button stops working until I refresh the page" class of bug (e.g. delete an
email, then the kebab/more button is dead on the other rows).

Route all 16 of these menus through the existing escMenuStack helper
(bindMenuDismiss / dismissOrRemove), exactly as documentLibrary.js
_showLibDropdown, cookbookRunning.js, and research/panel.js already do: a
single idempotent close() owns the teardown and is released on every dismissal
path, reopen sweeps use dismissOrRemove() instead of a bare .remove(), and
Escape flows through the central LIFO esc-stack arbiter. Net -49 lines.

Menus migrated: cookbook _showDepMenu; document export menu and
_openDocAiReplyChoice; emailInbox _showEmailMenu; emailLibrary
_showReaderMoreMenu / _showCardMenu / _showBulkActionsMenu; gallery
_showGalleryBulkMenu; notes _pickCustomDate / _openNoteCornerMenu; settings
(3 unified-integrations dropdowns); skills _openSkillMenu; tasks
_showTaskDropdown; compare _toggleExportMenu.

Per-menu semantics preserved (anchor-as-inside tests, the tasks 250ms
ghost-click guard, emailLibrary's reader-more-active anchor class and the
bulk-Cancel select-mode reset, settings' reused-vs-recreated lifecycles).

Six menus with custom lifecycles (notes _openReminderMenu, sessions
long-press, document markdown-toolbar, emojiPicker, compare model selector)
are intentionally left for a follow-up -- each needs individual review.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 20:40:56 +02:00
Ahmed Dlshad b899095f18 docs(setup): note -BindHost flag for LAN access on native Windows (#4636)
The native Windows launcher binds to 127.0.0.1 via its own -BindHost
parameter and does not read APP_BIND/ODYSSEUS_HOST from .env, so editing
.env alone leaves the server on loopback. Document the -BindHost flag in
the Native Windows setup section, with the existing keep-auth-on /
don't-expose-publicly caveats.

Fixes #4552

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 20:29:55 +02:00
Mostafa Eid 888e25624f fix(sessions): prevent Backspace/Delete from deleting session while renaming (#4662) 2026-06-22 20:22:52 +02:00
comatrix-1 c062c27648 Fix link to CONTRIBUTING.md in setup documentation (#4677) 2026-06-22 20:12:04 +02:00
holden093 93ec7cbb52 fix(contacts): verify UID removal after CardDAV DELETE (#4642)
Add a post-delete verification step: after the CardDAV server returns
2xx/404, force-re-fetch the contact list and confirm the UID is gone.
If the UID is still present, log a warning and return False instead of
silently reporting success.

This catches the case where _resolve_resource_url falls back to the
guessed {uid}.vcf URL but the contact's real resource URL differs —
the DELETE hits the wrong URL, server returns 404 (treated as success),
but the contact remains. Previously this caused silent persistence
failures and agent loops.
2026-06-22 18:39:44 +02:00
ooovenenoso c12b8ab6c9 fix: add OpenCode setup provider aliases (#4700)
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-22 17:33:02 +02:00
Ashvin e812a29233 fix(markdown): preserve URLs inside inline code spans (#4681)
Inline backtick spans were converted to <code> only at the end of
mdToHtml, after the bare-URL autolink and <a>/allowed-HTML passes. A URL
inside inline code is preceded by a space, so the autolink wrapped it in
an <a> tag and swapped it for an ___ALLOWED_HTML_ placeholder, corrupting
commands like `irm http://127.0.0.1:3000/x`.

Extract inline code into placeholders before the link passes, mirroring
the existing fenced-code-block handling, and restore them last so
placeholders carried inside restored <a> blocks resolve. Escape the code
at extraction time since it now bypasses the global escape pass.
2026-06-22 17:23:55 +02:00
nopoz ca4973c41f fix(security): prevent exponential ReDoS in email→calendar extract regex (#4708)
The fallback regex in email_pollers.py that recovers a
[{"action": ...}, ...] JSON array from raw model output used lazy
[^[\]]*? runs inside a (?:,\s*\{...\}\s*)* repetition, which backtracks
exponentially (CodeQL py/redos) on inputs like [{"action"},{ + }},{{ * N.
It runs on the LLM reply to an email→calendar prompt embedding the
untrusted email body, so a crafted email can stall the background poller.

Extract the pattern to a module-level _CAL_ACTION_ARRAY_RE and rewrite the
object-content class from the lazy [^[\]]*? to a greedy brace-delimited
[^{}], which removes the quantifier ambiguity. The match is linear (a 500KB
adversarial input now resolves in <1ms) and equivalent on well-formed
arrays; it is also strictly more robust for values containing '[' or ']'
(the old class bailed on those and extracted nothing).

Resolves CodeQL py/redos #198.
2026-06-22 17:18:34 +02:00
Tom 91b4171b3f feat(a11y): add a Text size control and an OpenDyslexic font option (#4210)
* feat(a11y): add a Text size control and an OpenDyslexic font option

Text size: a Theme > Font & Layout control (Default / Larger) that scales the whole UI via CSS zoom, so the many hard-coded px sizes scale too (density only moves the root font-size). Stored globally so it persists across theme switches; applied early in the boot script to avoid a flash. OpenDyslexic: a dyslexia-friendly self-hosted font (SIL OFL 1.1), bundled as woff2 alongside Fira Code/Inter and wired into the Font select. Reuses the existing density/font pattern end to end; no new colours, spacing, or component styles.

* fix(a11y): keep modals on-screen at Larger text size

Inline vh heights on .modal-content overrode the ui-scale-125 max-height
compensation, so Cookbook (and the email/doc/skills/PDF modals) overflowed
the viewport at 125% — pushing the header and close button off-screen.
Let the compensation own those heights.

* fix(a11y): keep PDF export modal at its original 86vh on Default size
2026-06-22 13:53:46 +02:00
PewDiePie d36879bd50 Merge pull request #4706 from pewdiepie-archdaemon/sync-readme-screenshot-dev
docs: refresh README screenshot
2026-06-22 14:02:35 +09:00
pewdiepie-archdaemon b51656770d Refresh README screenshot 2026-06-22 04:54:15 +00:00
PewDiePie 5f63a3d3bd Merge pull request #4701 from pewdiepie-archdaemon/sync-dev-from-main-20260622
chore(dev): sync main cookbook and model workflow fixes
2026-06-22 11:52:26 +09:00
pewdiepie-archdaemon 993d504de3 Clear remaining CodeQL path and parser alerts 2026-06-22 02:45:05 +00:00
pewdiepie-archdaemon fbdec22dcb CodeQL hardening for cookbook sync 2026-06-22 02:39:18 +00:00
pewdiepie-archdaemon 19dd82b8f6 CI test fixes for dev sync 2026-06-22 02:20:15 +00:00
pewdiepie-archdaemon 57e7229219 CI fixes for cookbook workflow sync 2026-06-22 02:08:25 +00:00
pewdiepie-archdaemon 92daf4e560 Cookbook launch and gallery upload fixes 2026-06-22 01:49:15 +00:00
pewdiepie-archdaemon 75f04bc088 Merge origin/dev into main 2026-06-21 11:08:50 +00:00
pewdiepie-archdaemon c504214925 Cookbook model workflow fixes 2026-06-21 11:02:35 +00:00
nopoz 160267417e fix(personal): scope RAG file delete to the caller's own upload dir (#4602)
The DELETE /api/personal/file disk-delete containment check used the
shared PERSONAL_UPLOADS_DIR root, so one admin could delete another
user's personal upload by passing its path (uploads are partitioned per
owner under <root>/<owner>/). Confine the check to the caller's own
per-owner subdir via _personal_upload_dir_for_owner(owner). RAG removal
and listing exclusion are unchanged (they still serve non-upload indexed
sources). Adds a regression test for the cross-owner case.
2026-06-20 00:50:15 +02:00
Kenny Van de Maele ed18192a8e refactor(tools): move session tools to the agent_tools registry (#4454)
Moves create_session, list_sessions, send_to_session and manage_session out of
ai_interaction.py into src/agent_tools/session_tools.py (the do_ prefix
dropped) and registers them in TOOL_HANDLERS, so dispatch flows through the
registry instead of the dispatch_ai_tool elif in tool_execution.py. Same
pattern as the model-interaction move.

The bodies move verbatim; each fetches the runtime-set session manager via a
get_session_manager() shim, and reuses _resolve_model / AI_CHAT_TIMEOUT from
ai_interaction. manage_session's internal 'list' alias is repointed from the
old do_list_sessions to the moved list_sessions. stream_ai_tool (dead, no
callers) and do_pipeline stay put. dispatch_ai_tool loses its four now-unused
branches.

Tests: test_session_tools_registry covers registration, owner threading, the
manage_session->list_sessions delegation, graceful no-manager handling, and
registry dispatch. Verified end-to-end against a live SessionManager.
2026-06-19 11:55:22 +02:00
nopoz 076e8c93c9 fix(ui): escape model name in model-info popup (DOM-XSS) + two latent sinks (#4605)
chatRenderer.js built the model-info popup HTML by concatenating the
model name (from the LLM response's model/answered_by field) into
popup.innerHTML without escaping, so a model advertised as an HTML/script
payload executed when the user clicked the role label. Wrap both
insertions with the uiModule.esc() helper the same function already uses.

Also apply existing escape helpers at two latent sinks flagged by CodeQL,
fed only by self-authored/server values today: document-tab title via
_esc(), and the calendar event background URL (escape the double quote
that would otherwise break out of the style="..." attribute).
2026-06-19 11:03:44 +02:00
Kenny Van de Maele a226c94df7 chore(deps): remove unused @anthropic-ai/sdk dependency (#4566)
Never imported anywhere in the codebase (unused since v1.0); it is the only
root dependency and nothing depends on it. Removing it also drops 6 transitive
packages from the lockfile.

Fixes #4565
2026-06-19 09:40:35 +02:00
RaresKeY 057ec0552c fix(cookbook): stop Windows process trees (#4283) 2026-06-19 00:28:25 -07:00
Kenny Van de Maele cdae9879f2 feat(agent): add manage_bg_jobs tool to inspect and kill background bash jobs (#4577)
Detached bash jobs (#!bg) could be launched and auto-reported on completion,
but the agent had no way to act on a running one: no on-demand output read and
no kill (it blocked until the 1h max-runtime). bg_jobs had the pieces
(_read_output, list_for_session, internal _kill) but none was exposed.

Adds:
- bg_jobs.kill(job_id): tears down the process tree, marks the job killed, and
  sets followed_up so the monitor does not also auto-continue a deliberate kill.
- manage_bg_jobs registry tool with actions list / output / kill, scoped to the
  chat that launched the job (cross-session access reads as not found).
- Wiring: TOOL_HANDLERS/TAGS, function schema, RAG index + keyword hints, parser
  name map, dispatch (threads session_id via _direct_fallback). Gated like bash
  (NON_ADMIN_BLOCKED_TOOLS; plan-mode mutator).
- agent_loop: background-job intent regex maps to the files domain (and the tool
  joins _DOMAIN_TOOL_MAP[files]) so short commands like 'kill that job' are not
  dropped by the low-signal gate that skips tool retrieval.
- bg launch message tells the model to call manage_bg_jobs itself for check/stop
  rather than printing raw tool syntax to the user.

Tests: tests/test_bg_job_tools.py (kill semantics, per-chat scoping, actions,
and the intent classifier).
2026-06-19 00:28:22 -07:00
pewdiepie-archdaemon 8c46172e87 Sidebar + theme: drop hamburger cycle no-op branch; add brandMixTo CSS var to themes for logo-gradient end color 2026-06-19 00:35:08 +00:00
pewdiepie-archdaemon e442cc859d Research panel: inline Library-link hint when there are no past runs (replaces the standalone past-research column) 2026-06-19 00:35:02 +00:00
pewdiepie-archdaemon a01c3da75f Notes: checklist/todo/goal classification + agent-stream-complete state class for done indicator 2026-06-19 00:34:57 +00:00
pewdiepie-archdaemon 23ed92d965 Email Library: render tag chips + spam verdict pill on the email row 2026-06-19 00:34:52 +00:00
pewdiepie-archdaemon 8cc76b53a2 Chat: first-token wait timer cleanup so per-pane timeouts dont leak when a response finishes mid-wait 2026-06-19 00:34:47 +00:00
pewdiepie-archdaemon 20c6ba8f32 Bump APP_VERSION to 1.0.1 2026-06-19 00:34:37 +00:00
pewdiepie-archdaemon 9adb940ef9 Agent stream: 10s heartbeat keepalive on the SSE subscribe so long-running thinking models dont drop the connection 2026-06-19 00:34:30 +00:00
pewdiepie-archdaemon 2fbfd22946 Agent loop: compact one-line tool-usage hints for local/small models so the system prompt doesnt eat the context budget 2026-06-19 00:34:24 +00:00
pewdiepie-archdaemon a10bfc466b Model endpoints: per-category probe timeouts (15s local / 3s ollama / 2s api) so slow first-token launches arent killed 2026-06-19 00:34:19 +00:00
pewdiepie-archdaemon 18f29bdfd8 Email send: normalize address fields to strip trailing commas + stray whitespace before MIME encoding 2026-06-19 00:34:13 +00:00
pewdiepie-archdaemon 63d9b12b22 Cookbook Running: short-circuit polls for Ollama sidecar tasks so status stays running
Three different background loops (_reconnectTask reachability poll,
_checkServeReachability, _pollBackgroundStatus) each independently
flipped Ollama sidecar tasks between running and stopped because the
`docker exec ollama-rocm ollama show <tag>` cmd exits cleanly after
its verification print, which the loops misread as the serve dying.

Added _isOllamaSidecarTask(task) and an early-bail in each of the
three loops so the task stays pinned to running once the show-cmd
exits 0. Also the tmux-graceful-kill path prepends a
`docker exec ollama-rocm ollama stop <tag>` before tearing down
the tmux session, so the Ollama-side model load gets unloaded too
(was leaving the model resident in the daemon after Stop).
2026-06-19 00:33:48 +00:00
pewdiepie-archdaemon ee6fd8ffe8 Cookbook UI: backend-aware env vars, always-show MoE/EP/Reasoning toggles, GPU default, Firefox-mobile expand
Frontend half of the backend-detection + per-OS install command work,
plus a pile of mobile/UX fixes:

Backend awareness:
- _gpuEnvPrefix() picks CUDA_VISIBLE_DEVICES / HIP_VISIBLE_DEVICES /
  nothing based on detected hwfit backend + scanned-host match (so a
  stale ajax scan does not leak CUDA env vars into a kierkegaard
  Vulkan launch). Replaces 6 hardcoded CUDA_VISIBLE_DEVICES sites.
- GGML_CUDA_ENABLE_UNIFIED_MEMORY only emitted when backend is
  actually CUDA (was leaking onto Vulkan/ROCm via saved presets).

Per-target install command:
- Dep rows render a single mono command box + Copy button when the
  server resolved pkg.install_cmd_for_target. Reused in the build-deps
  install failure toast so the toast and the row show the same line.
- Diagnosis patterns split cmake/g++/git out of the generic
  llama-cpp-python catch-all so a missing-cmake failure surfaces a
  cmake-specific message + per-distro Copy buttons.

Form toggles always visible:
- Reasoning Parser, Expert Parallel, MoE Env Vars no longer gated on
  model-family detection. Detection still hints (parser tag shown when
  matched); toggle works with sensible defaults otherwise. MiniMax M-
  series added to MoE family detector so the auto-fill is right.

Mobile + GPU default:
- Launch tab cached-list flex collapsed to 0px on mobile because the
  desktop `flex: 1 1 0` had no parent height to grow into. Override
  to `flex: 0 0 auto` in the cookbook mobile @media block.
- doclib-card expand on mobile (Firefox no :has() support) pins
  explicit px heights so the launch form actually appears.
- llama_mode defaults to gpu when hwfit detected cuda/rocm/vulkan/
  metal on the current target, instead of always cpu (which was
  forcing -ngl 0 on first-open and burning 35GB models on CPU).
2026-06-19 00:33:37 +00:00
pewdiepie-archdaemon f01465e87f Cookbook Dependencies: per-OS+backend install command + install-system-deps endpoint
When a llama.cpp launch needs cmake/build-essential/git the user used to
get a four-distro dump ("apt: x / pacman: y / dnf: z / brew: w") and
had to pick the right one. Now:

- shell_routes /api/cookbook/packages probes /etc/os-release on the
  target in the same SSH round-trip as the existing system-prereq
  check, classifies into debian / arch / fedora / alpine / suse /
  macos, and builds a single install_cmd_for_target string from the
  (os_family, backend) matrix. CUDA hosts get nvidia-cuda-toolkit;
  ROCm gets rocm-dev / rocm-hip-sdk; Vulkan gets libvulkan-dev /
  vulkan-headers; etc.

- llama_cpp catalog entry gets system_prereqs: [cmake, g++, git].
  When any of those are missing on the target, the row picks up
  pkg.build_deps_missing + pkg.install_cmd_for_target for the
  frontend to render.

- New POST /api/cookbook/install-system-deps endpoint runs the right
  package manager via passwordless sudo on the target. Allowlisted to
  {cmake, build-essential, g++, gcc, git, tmux, make}; sudo -n only
  so it can never hang waiting for a password (returns a clear
  "passwordless sudo unavailable" error via stderr instead).
2026-06-19 00:33:19 +00:00
pewdiepie-archdaemon 1324e1b0d5 Cookbook backend detection: report Vulkan on AMD hosts without ROCm; gate CUDA build on actual NVIDIA hardware
Three classes of incorrect detection fixed:

(1) AMD GPU + no ROCm installed (e.g. Strix Halo) was reported as
    backend=rocm everywhere, so launch commands emitted
    HIP_VISIBLE_DEVICES (silent no-op on Vulkan) and the from-source
    build path failed. Both _probe_amd_sysfs (routes/cookbook_routes)
    and _detect_amd (services/hwfit/hardware) now probe rocminfo /
    hipconfig / vulkaninfo at detection time and report vulkan when
    only Vulkan is present.

(2) Build helper was picking the CUDA branch on AMD hosts whenever a
    stray pip-installed nvcc was on PATH (vLLM wheels carry one
    without libcudart). Added _odysseus_has_nvidia_hw() that checks
    nvidia-smi / /dev/nvidia* / lspci, and gates both the nvcc PATH
    augmentation and the CUDA elif branch on real hardware.

(3) Build chain reordered to ROCm/HIP > CUDA > Vulkan > CPU. Vulkan
    tier added between CUDA and CPU as a portable fallback for hosts
    with a GPU but no native toolchain (the common Strix Halo case).
    Same _append_llama_cpp_linux_accel_build_lines also auto-attempts
    sudo -n apt/pacman/dnf install of cmake/build-essential/git when
    they are missing, surfacing a clear no-passwordless-sudo warning
    otherwise.
2026-06-19 00:33:07 +00:00
pewdiepie-archdaemon b3e186746a Docker compose: mount docker.sock + install Docker CLI so Cookbook can reach sibling containers
Cookbook now needs to docker-exec into ollama-rocm (and any other sibling
container holding a model server) from inside its own container, so:

- Dockerfile installs the Docker CLI from the static binary tarball
  (the Debian docker.io package ships dockerd but not the client on slim)
- docker-compose.yml bind-mounts /var/run/docker.sock and adds group_add
  for the host docker group (default GID 963)
- entrypoint.sh detects the socket GID, creates a local group with that
  GID, and runs usermod -aG before gosu-dropping to the app user so the
  supplementary group propagates through (gosu strips by default)
2026-06-19 00:32:47 +00:00
Michael 39a802bea2 fix(tools): prune skipped dirs before descending in glob tool (#4538)
* fix(tools): prune skipped dirs before descending in glob tool

GlobTool used pathlib.Path.rglob which descends into every directory
(including node_modules, .git, dist, etc.) and filters AFTER the walk.
On repos with large junk directories this causes the glob tool to hang
for minutes.

Replace rglob with os.walk that prunes _CODENAV_SKIP_DIRS before
descending — matching the approach GrepTool already uses. Also add a
fast path for literal patterns (no wildcards → direct path lookup).

Fixes #4493

* fix(tools): use regex glob matching to fix * semantics and literal fallback

Replace fnmatch with _glob_to_regex so that * stays within a single
path segment (matching pathlib/rglob semantics) and **/ spans zero or
more directories.  Literal patterns now fall through to os.walk when
the direct path lookup misses, so e.g. 'foo.py' still finds files at
any depth.

Add tests for:
- bare literal matching in subdirectories
- multi-segment single-star patterns (sub/*.txt)
- * not crossing / boundaries
- ** matching at arbitrary depth

Closes #4493

---------

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-18 22:02:29 +02:00
RaresKeY 1cc8a373b0 fix(cookbook): validate agent SSH targets (#4429) 2026-06-18 21:41:33 +02:00
Wei Hong a52ac6822b fix(cookbook): pull llama.cpp from the ggml-org GHCR namespace (#4457) (#4490)
The Dependencies tab's llama.cpp docker recipe surfaced
\`docker pull ghcr.io/ggerganov/llama.cpp:server-cuda\`. The upstream
repo moved from github.com/ggerganov/llama.cpp to
github.com/ggml-org/llama.cpp and the old GHCR namespace no longer
publishes images, so copying the recipe failed with:

  failed to resolve reference "ghcr.io/ggerganov/llama.cpp:server-cuda":
  not found

Point the recipe at \`ghcr.io/ggml-org/llama.cpp:server-cuda\`, which is
already the namespace routes/cookbook_routes.py uses for the source
clone. Adds a regression test in the same shape as
test_cookbook_diagnosis_js.py asserting the new namespace and forbidding
the dead one.

No CSS/HTML/SVG/style changes — the file is a pure data module
(no DOM access) consumed by other renderers; only the displayed command
text changes.
2026-06-18 21:29:47 +02:00
Wei Hong 7475779b7c fix(chat): track chat hot-path background tasks for strong references (#4443) (#4444)
Two background tasks scheduled on every chat completion in
routes/chat_helpers.py — the memory/skill extraction dispatch and the
session auto-namer — are created via bare asyncio.create_task(...).
asyncio only holds a weak reference to the outer task, so the GC can
collect it mid-execution and the work silently never runs.

Add a module-private _BG_TASKS set and a _spawn_bg() helper that mirrors
WebhookManager._spawn_tracked (the pattern #3964 / #4336 established for
the webhook emitters two lines apart in the same function). Route both
call sites through it so the lifecycle owner is explicit.

Adds an AST-level guard test that fails on any bare
asyncio.create_task(...) statement in routes/chat_helpers.py to prevent
a regression — same shape as test_webhook_emitters_use_manager.py from
#4336.

The same bare pattern exists in routes/email_routes.py and
routes/cookbook_routes.py; left out of this PR per CONTRIBUTING.md's
"one fix per PR" and tracked in #4443's "Additional Information" for a
follow-up.
2026-06-18 21:26:11 +02:00
Christian Eriksson e7ffc69729 fix(cookbook): scope the "Kill vLLM" diagnosis to actual vLLM tracebacks (#4517)
The diagnosis panel offered a "Kill vLLM processes" (pkill -f vllm) recovery
for ANY Python traceback — including pip build failures and other tracebacks
that have nothing to do with vLLM. That advice is useless for a build failure
and harmful if an unrelated vLLM server happens to be running.

ERROR_PATTERNS in static/js/cookbook-diagnosis.js had one catch-all traceback
matcher that always attached the vLLM-kill fix. Split it into three (all
keeping the existing healthy-server suppression):
- pip build failure (Failed to build / metadata-generation-failed /
  subprocess-exited-with-error / Could not build wheels) -> "a dependency
  failed to build" message, no kill.
- vLLM-specific traceback (tail mentions vllm) -> keeps the kill, now scoped.
- any other traceback -> neutral "check the captured output" message, no kill.

How to test:
- node --check static/js/cookbook-diagnosis.js
- Trigger a wheel-build failure (old package on a newer Python) or a non-vLLM
  traceback and open the diagnosis. Before: generic traceback message + "Kill
  vLLM processes" button. After: a build-failure / neutral message with no kill;
  only a real vLLM traceback still offers it.

Fixes #4516

Co-authored-by: Claude
2026-06-18 21:18:14 +02:00
Karl Jussila 396e26b4bf fix(auth): tie remember-me cookie lifetime to TOKEN_TTL (#4472)
The persistent login cookie's max_age hardcoded 60 * 60 * 24 * 7, an
independent copy of the session token lifetime that core/auth.py already
defines once as TOKEN_TTL (and reports to the frontend via /api/auth/policy
as session_days). If TOKEN_TTL changes, the cookie silently drifts: the
browser keeps a cookie for a token whose lifetime no longer matches.

Import TOKEN_TTL and use it for the cookie max_age so the session lifetime
has a single source of truth. No behaviour change at the current value.

Fixes #4471
2026-06-18 21:15:48 +02:00
nubs 0bfc7750a2 fix(llm): route gpt-oss harmony commentary channel without leaking markers/tool-args (#4523)
The harmony stream router only recognized the analysis and final channels, so
gpt-oss's standard `commentary` channel (tool-call preambles / function-arg
bodies) was unhandled: the literal `<|channel|>commentary` marker, the
`to=functions.*` recipient, and the commentary body all leaked into the
visible answer. Add commentary to the marker regex + the suffix-hold table, and
route its body to thinking (only `final` is user-facing). Adds a regression
test (split-chunk + recipient + body), verified to fail without the fix.
2026-06-18 21:12:25 +02:00
Rolly Calma 790ef81b06 fix: use aware UTC in health timestamp (#4503) 2026-06-18 20:58:25 +02:00
Victor 804691501f test: stop test_skill_index_prompt_injection leaking a stub prefs_routes (#4387)
_patch_prefs installs a fake routes.prefs_routes with a bare
sys.modules[...] = assignment that is never undone. The stub is an empty
ModuleType without _save_for_user, so a later test whose code path runs
`from routes.prefs_routes import _save_for_user` (e.g. test_backup_import_skills)
fails with ImportError under an unfavorable test order.

Install the stub with monkeypatch.setitem instead (the helper already takes
monkeypatch and uses it for DATA_DIR) so it is reverted at teardown.

Repro: pytest tests/test_skill_index_prompt_injection.py tests/test_backup_import_skills.py
(1 failed before, 5 passed after).
2026-06-18 20:54:15 +02:00
dependabot[bot] 8e6a2e89f8 chore(deps): bump actions/checkout in the actions group (#4559)
Bumps the actions group with 1 update: [actions/checkout](https://github.com/actions/checkout).


Updates `actions/checkout` from 6.0.3 to 7.0.0
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/df4cb1c069e1874edd31b4311f1884172cec0e10...9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 20:49:58 +02:00
dependabot[bot] dbcc7874bf chore(deps): bump the npm group with 2 updates (#4558)
Bumps the npm group with 2 updates: [@anthropic-ai/sdk](https://github.com/anthropics/anthropic-sdk-typescript) and [@antithesishq/bombadil](https://github.com/antithesishq/bombadil).


Updates `@anthropic-ai/sdk` from 0.104.1 to 0.105.0
- [Release notes](https://github.com/anthropics/anthropic-sdk-typescript/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-typescript/compare/sdk-v0.104.1...sdk-v0.105.0)

Updates `@antithesishq/bombadil` from 0.5.0 to 0.6.1
- [Release notes](https://github.com/antithesishq/bombadil/releases)
- [Changelog](https://github.com/antithesishq/bombadil/blob/main/CHANGELOG.md)
- [Commits](https://github.com/antithesishq/bombadil/compare/v0.5.0...v0.6.1)

---
updated-dependencies:
- dependency-name: "@anthropic-ai/sdk"
  dependency-version: 0.105.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: npm
- dependency-name: "@antithesishq/bombadil"
  dependency-version: 0.6.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: npm
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 20:42:49 +02:00
RaresKeY 16e660ad09 fix(hwfit): normalize CPU arch for fallback estimates (#4441) 2026-06-18 20:26:22 +02:00
Mazen Tamer Salah b51d83b16d fix(agent): index api_call so RAG tool selection can retrieve it (#3923)
* fix(agent): index api_call so RAG tool selection can retrieve it

api_call exists in FUNCTION_TOOL_SCHEMAS and the agent's system prompt
advertises configured API integrations, but the tool had no entry in
BUILTIN_TOOL_DESCRIPTIONS. RAG tool selection embeds those descriptions and
retrieves the top-K per message, so a tool without one can never be selected:
the agent claims it can call Home Assistant/Miniflux/Gitea/etc. and then
never receives the api_call schema (unless the Personal Assistant
ASSISTANT_ALWAYS_AVAILABLE path applies).

Add a retrieval-rich description for api_call, plus an ast-based parity test
asserting every FUNCTION_TOOL_SCHEMAS tool has an index description so the
next added tool cannot silently drift the same way.

Fixes #3794

* fix(agent): route API-integration intent to api_call at selection time

Addresses review (RaresKeY) on #3923: indexing api_call in the ToolIndex
description was necessary but not sufficient — the #3794 repro ('Use the
api_call tool to call Home Assistant GET /api/states') matched no domain in
_classify_agent_request, classified as low-signal, so the agent loop skipped
retrieval entirely and the schema filter sent only ALWAYS_AVAILABLE
(manage_memory/ask_user/update_plan). api_call never reached the model.

- _classify_agent_request: detect API-integration intent (api_call,
  integration(s), Home Assistant/Miniflux/Gitea/Linkding/Jellyfin) -> new
  'integrations' domain, so the turn is no longer low-signal.
- _DOMAIN_TOOL_MAP['integrations'] = {api_call}: deterministically seeds
  api_call into relevant tools after retrieval, independent of embeddings.
- _DOMAIN_RULES['integrations']: rule pack (required — _domain_rules_for_tools
  indexes _DOMAIN_RULES[domain] directly).
- tool_index _KEYWORD_HINTS: parity hint for the retrieval / keyword-fallback
  paths.
- Regression drives the real classifier -> domain-map -> FUNCTION_TOOL_SCHEMAS
  filter chain and asserts api_call is advertised for the #3794 prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:43:25 +00:00
Shreyas S Joshi f70db19cc6 fix(document): allow render-pdf to be framed and 503 cleanly on missing PyMuPDF (#2103)
* fix(document): allow render-pdf to be framed and 503 cleanly on missing PyMuPDF

Fixes #2101.

Two related bugs in the PDF-form library preview flow:

1. SecurityHeadersMiddleware was sending X-Frame-Options: DENY and
   frame-ancestors 'none' on /api/document/{doc_id}/render-pdf, but
   static/js/documentLibrary.js embeds the response in an <iframe> for
   the library card preview. The browser blocked the load with
   ERR_BLOCKED_BY_RESPONSE, leaving the user with a blank panel.

   Extend the existing is_tool_render exemption to also cover
   /api/document/.../render-pdf. Per-document owner checks still run in
   the route handler, so the exemption is scoped the same way as the
   tool-render exemption it mirrors. /api/document/.../export-pdf is
   left untouched — it's a download (Content-Disposition: attachment),
   not an iframe embed.

2. routes/document_routes.py:render_pdf called fill_fields, which
   raises RuntimeError via _require_fitz() when the optional PyMuPDF
   dependency isn't installed. That RuntimeError bubbled out as a
   generic 500 with a cryptic 'PDF render failed' detail.

   Reuse the existing _load_pdf_viewer_fitz() helper to fail fast with
   a 503 and a user-actionable install hint (mentions
   requirements-optional.txt and AGPL-3.0), matching the convention
   used by the other PDF endpoints.

Tests cover both fixes:
- middleware headers on /api/document/.../render-pdf (iframeable, but
  X-Content-Type-Options and Referrer-Policy are still set)
- middleware headers on /api/document/.../export-pdf (must stay strict)
- middleware path matching precision (similar-but-different paths stay
  strict)
- middleware headers on /api/tools/.../render (no regression)
- middleware headers on /api/chat (no regression)
- render-pdf returns 503 with install hint when PyMuPDF is missing
- 503 is raised before any file I/O (fail-fast ordering)

* chore: address maintainer feedback on PDF previews same-origin framing and comment trimming

* chore: make render-pdf regression tests order-independent
2026-06-18 06:25:26 +00:00
Kenny Van de Maele 56ba144875 refactor(tools): move model-interaction tools to the agent_tools registry (#4445)
Moves chat_with_model, ask_teacher and list_models out of ai_interaction.py
into src/agent_tools/model_interaction_tools.py (the do_ prefix dropped) and
registers them in TOOL_HANDLERS, so dispatch flows through the registry instead
of the dispatch_ai_tool elif in tool_execution.py.

The implementations are relocated, not wrapped. ai_interaction.py keeps only
the shared helpers they reuse (_resolve_model, AI_CHAT_TIMEOUT), still used by
the not-yet-migrated session/pipeline tools. dispatch_ai_tool loses its three
now-unused branches.

Also removes the dead do_second_opinion: it was already off the live tool
surface (no tag/schema/parsing/dispatch; tool_index.py notes it was removed),
so the function and its stale frontend catalog entries (admin.js, assistant.js)
are deleted.

Tests: owner-scope test points at the new list_models location and drops the
moved tools from the dispatch_ai_tool parametrize; a new
test_model_interaction_registry covers registration, owner threading, and
registry dispatch.
2026-06-18 05:56:37 +00:00
pewdiepie-archdaemon d70c00e8d2 Merge branch 'main' of https://github.com/pewdiepie-archdaemon/odysseus 2026-06-17 12:28:24 +00:00
Matyas Gosztonyi 97a7f59fe7 fix(ui): share one z-order stack across Notes and modals (#3798)
* fix(notes): bring pane above active windows

* fix(notes): align tool window z-order handoff

---------

Co-authored-by: Matyas Fenyves <16389204+uhhgoat@users.noreply.github.com>
2026-06-17 12:15:48 +02:00
Afonso Coutinho 24ace44888 fix: canvasCoords crashes on empty touch list (mobile race) (#2045) 2026-06-17 10:25:39 +02:00
Kenny Van de Maele 93569b141b fix(security): allowlist manage_mcp 'add' to close the agent-path RCE (#4433)
* fix(security): allowlist manage_mcp 'add' to close the agent-path RCE

do_manage_mcp('add') passed model- and prompt-injection-controlled command,
args, and env straight to a stdio subprocess spawn with no validation, and it
persisted an enabled server row before connecting (so a payload also survived
to re-execute on restart). A string smuggled into a skill description, memory
entry, fetched page, or email body could register a server running arbitrary
code as the app UID, e.g. command='sh' args=['-c','...'].

Add _validate_mcp_command, applied on the agent path before any DB write or
spawn:
- Hard-deny interpreters, runtimes, package runners, shells, and exec-wrappers
  (even if an operator lists one in ODYSSEUS_MCP_ALLOWED_COMMANDS).
- Require a bare basename (no path components, no shell metacharacters) that is
  present in the operator allowlist (empty by default).
- Reject code-exec argv flags by prefix so glued forms are caught too
  (-c/-e/-m/--eval/--exec/--print/--module/--command/--require), remote-URL
  args, and env keys that inject code into the child (LD_PRELOAD, NODE_OPTIONS,
  PYTHONPATH, DYLD_*, PATH, ...).

A rejected registration returns an error, writes no row, and makes no
connection. The trusted admin route is unchanged. Mirrors the policy intent of
_validate_serve_cmd but inverted for the model-reachable surface.

Supersedes #438; incorporates the bypass forms found in its review (interpreter
script paths, -m pip, glued -c/-e, --eval=, eval subcommands, package runners,
remote URLs) and adds integration coverage on the real do_manage_mcp path.

Closes #2891

* fix(security): deny versioned/alias runtimes in manage_mcp allowlist

Addresses RaresKeY's review on #4433. The hard-deny matched command names
exactly, so versioned or alias runtime forms (python3.11, node18, pip3,
ruby3.2, java, javac, bunx, tsx, ts-node, pypy3, ...) slipped past and, if an
operator allowlisted one, re-opened the prompt-injection-controlled MCP
registration path.

- Canonicalize a trailing version suffix before the deny check so versioned
  forms collapse to the family (python3.11 -> python, node18 -> node, pip3 ->
  pip); both the raw basename and the canonical form are denied.
- Broaden the denied-family set (java/javac/jshell/jbang/kotlin/dotnet/mono/
  swift/osascript/tsx/ts-node/bunx/pypy/jruby/raku/luajit/wish/expect/iex).

Deny runs before the operator allowlist, so an alias cannot be allowlisted back
in. Canonicalization only feeds the deny check, so a legit name that ends in a
digit still reaches the normal allowlist check rather than being mis-denied.
Adds validator + integration regressions for versioned/alias runtimes asserting
no DB row and no connection, including the allowlisted-anyway case.
2026-06-16 14:34:53 +00:00
Catalin Iliescu 9a00401507 fix(hwfit): use CPU fallback for cpu_only speed estimates (#4397)
* fix(hwfit): use CPU fallback for cpu_only speed estimates

* fix(hwfit): preserve ARM fallback for cpu_only estimates

---------

Co-authored-by: Cata <cata@bigjohn.local>
2026-06-16 14:18:31 +00:00
Aura Rays Lab 76562ae31d Change host from 0.0.0.0 to 127.0.0.1 in CONTRIBUTING.md (#4422)
Updated the host address in the run command for clarity.
2026-06-16 13:40:47 +00:00
Christian Eriksson 497f455da6 fix(cookbook): open() no longer crashes when a task has a diagnosis (#4417)
_showDiagnosis referenced an undefined `body` (left over from the refactor
that moved the diagnosis text into the toolbar), throwing a ReferenceError
whenever a failed task rendered fix buttons. Because open() wraps its render
in try/finally with no catch, the throw escaped before the modal was
un-hidden, so the whole Cookbook silently failed to open.

- cookbook-diagnosis.js: append the fixes row to `diag` (the in-scope
  container) instead of the removed `body` element.
- cookbook.js: guard the render passes in open() so one broken task card
  can't leave the entire panel stuck hidden.

Fixes #4406
2026-06-16 13:35:51 +00:00
Ashvin dd20c2bc75 fix(tasks): offer shell/file tools to scheduled task agents by default (#4398)
The scheduled-task runner built the agent's tool set from RAG retrieval plus
ASSISTANT_ALWAYS_AVAILABLE. Neither includes bash/python (nor the file tools),
and no keyword hint force-includes them, so a task only saw the shell when the
tool-embedding index happened to surface it. On hosts where that index is empty
or degraded (e.g. a fresh Docker deploy), retrieval returns nothing and the task
agent never receives bash/python — telling the user the shell is disabled even
for an admin owner.

Offer the shell/file group to task agents by default, mirroring the chat agent
where these are on unless a privilege or global setting turns them off. The
existing blocked_tools_for_owner() gate in stream_agent_loop still strips the
whole group for non-admin multi-user owners and only admits it for admins and
single-user (AUTH_ENABLED=false) deployments, so this changes what is offered,
not who is allowed. A crew that defines an explicit enabled_tools allowlist
still has its restriction honored.

Also merge the operator's global disabled_tools setting into the scheduler's
disabled set before composing relevant_tools and before entering the agent
loop, matching what chat already does. Without it, the global tool-disable
contract did not reach unattended scheduled tasks: an admin or AUTH_ENABLED=false
task could still see and call shell/file tools the operator had turned off
globally, since the prompt/schema/execution gates only enforce the disabled
tools passed in.
2026-06-16 13:27:30 +00:00
Afonso Coutinho a36b423a4e Fix odysseus-calendar list dropping in-progress / multi-day events (#2065)
cmd_list filtered on the event START falling inside the window
(dtstart >= start AND dtstart < end). The canonical web route
(routes/calendar_routes.py) and the recurrence contract test use
OVERLAP semantics for non-recurring events: dtstart < end AND
dtend > start. So an event that began before the window but is still
ongoing inside it — e.g. a 09:00-17:00 conference listed at 14:00, or
any multi-day event spanning the window — was silently dropped by the
CLI even though the web UI shows it. Use overlap, matching the route.
dtend is NOT NULL in the schema, so no null-end regression.
2026-06-16 14:04:56 +02:00
Rudy Wolf 4e477741e7 harden(agent-loop): wrap non-native tool results as untrusted data (#1629)
The non-native (prompted) tool-call path fed tool output back to the model as a plain "[Tool execution results]" user message, bypassing the untrusted_context_message wrapper that THREAT_MODEL.md requires for tool output. That path is what models without native tool-calling (many smaller local models) use, so prompt-injection inside a tool result (fetched page, file read, MCP/email output) could be read as instructions there.

Wrap it via untrusted_context_message("tool execution results", ...), the same hardening already applied to skills (#788) and escalation traces (#275). Also update _recent_context_for_retrieval, which used the old "[Tool execution results]" prefix as a sentinel to keep tool envelopes out of the retrieval query, to recognise the wrapped envelope via metadata.trusted.

The native path keeps returning tool-role messages (a user-role wrapper would break the native tool-call contract); it is covered by UNTRUSTED_CONTEXT_POLICY. Adds tests/test_tool_output_prompt_injection.py.

Fixes #1627.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 13:35:07 +02:00
Kenny Van de Maele a2261c38c1 refactor(auth): centralize the internal-tool pseudo-username into a constant (#4333)
The in-process tool loopback stamps current_user = "internal-tool" and
require_admin grants admin to that sentinel; it is also a reserved username.
That security-sensitive string was hand-typed in ~7 places (stamp, admin gate,
RESERVED_USERNAMES, and standalone admin-equivalent checks in note/research/
shell/task routes), where a typo silently breaks an auth gate.

Add INTERNAL_TOOL_USER in core/middleware.py next to INTERNAL_TOOL_TOKEN/
INTERNAL_TOOL_HEADER and use it at every such site. A typo is now an
ImportError, not a silent mismatch. auth.py importing middleware is acyclic
(middleware imports no app modules). Behaviour is unchanged.

The multi-sentinel sets bundling internal-tool with api/demo/system
(assistant_routes, task_scheduler, research_routes) are a separate reserved-set
dedup, left for a follow-up.

Closes #4332
2026-06-16 13:13:00 +02:00
Alexandre Teixeira bf56010aad test: split provider classification tests (#4392) 2026-06-16 09:54:07 +00:00
Karl Jussila ee72d71872 fix(auth): centralize password and username validation constants (#4120)
Added PASSWORD_MIN_LENGTH and RESERVED_USERNAMES to src/constants.py as the
single source of truth. Previously PASSWORD_MIN_LENGTH was hardcoded as 8 in
four route handlers and all three JS validation paths; RESERVED_USERNAMES was
an inline frozenset duplicated in core/auth.py, routes/assistant_routes.py,
routes/research_routes.py, and src/task_scheduler.py.

Added GET /api/auth/policy (unauthenticated) so the frontend reads the real
values from the server instead of hardcoding them in JS.

Added missing empty-username guard to /setup and admin POST /users. Both
returned a misleading 500/409 on whitespace-only input. /signup already had the
check; this makes all three consistent.
2026-06-16 09:52:15 +02:00
RaresKeY 2b519bf355 fix(routes): normalize session owner fallback helpers (#4313)
* fix(memory): normalize import session fallback

* fix(chat): use token owner for compaction scope

* fix(background): honor session endpoint fallback
2026-06-16 06:07:42 +01:00
Kfir Sadeh d795d9a923 feat(launcher): add portable windows launcher (#976)
* feat(windows): add standalone portable executable, splash screen, and system tray

* test: fix test_get_wsl_windows_user_profile_falls_back_to_users_dir on Windows

* Refactor launcher: isolate desktop logic into launcher.py, clean app.py/requirements, update build scripts, and add tests

* chore: clean launcher test whitespace

---------

Co-authored-by: Alexandre Teixeira <alexandremagteixeira@gmail.com>
2026-06-16 04:58:16 +01:00
Tal.Yuan 648db61b45 docs(architecture): add Phase 0 runtime inventory document (#4148)
* docs(architecture): add Phase 0 runtime inventory document

Per #4082 requirements, this no-code planning document maps:
- Largest runtime modules (Python + frontend)
- Import dependency graph and cross-layer violations
- Route ownership grouped by feature domain
- Tool registry boundaries and split candidates
- Risk-ranked candidate slices with recommended first 3 PRs
- Safety guardrails and validation commands for follow-up work

Closes #4082

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(architecture): correct inventory metrics per review feedback

Address @alteixeira20 review on #4148 (CHANGES_REQUESTED):

- src/ flat .py: ~60 -> 91; routes/: 52 -> 54
- core/database.py importers: 49 -> 94; src/agent_loop.py: -> 21
- src/ -> routes/ import lines: ~20 -> 38
- src/ subdirs: 3 -> 2 (agent_tools/, search/); drop non-existent agent/
- move main.py and src/agent/ out of current-structure into new
  section 10 'Future Direction (NOT current state)'
- route grouping: frame as one domain per PR, not a broad
  reorganization (helper imports / registration / test path risk)

* docs(architecture): round-2 fixes — move to specs/, correct counts, frame as candidate

Per @alteixeira20 + @RaresKeY review on #4148:

- Move docs/architecture-runtime-inventory.md -> specs/ (docs/ is
  GitHub Pages public content, per @RaresKeY)
- src/ -> routes/ import lines: 38 -> 30 (direct grep of import lines
  referencing routes/, matching reviewer's count)
- self-caught count drift: tests 552 -> 544; routes->src 349 -> 351;
  src->core 49 -> 99
- frame section 6 (rankings/package shapes/split order/route grouping)
  and section 10 (future direction) as candidate proposals pending
  maintainer agreement, not a committed plan (per @RaresKeY)

* docs(architecture): round-3 reviewer fixes — fix tool categorization, counts, appendix

Self-review as reviewer found:
- §5.2 tool categories were wrong: listed filesystem/shell/email-sending
  tools that are NOT in tool_implementations.py (they live in src/agent_tools/).
  Rewrote to the actual 33 do_* functions grouped by domain
  (system/cookbook/calendar/notes/search/research/contacts/vault/image)
- §2.1 builtin_actions.py: 0 -> 2 classes, ~26 -> ~24 functions
- §5.1: '33+' -> '33' (exact count)
- Appendix A: 'Complete File Listing' -> 'File Listing'; src noted as
  '61 of 91 shown' (was claiming complete but listed 61)
- Last updated date refreshed

* docs(architecture): round-4 — verify remaining counts, soften §6.3 framing

- task_scheduler ~6 -> 5 funcs; tool_index ~580 -> 542 lines (verified vs dev)
- §6.3 'Recommended First 3 Slices' -> 'Candidate' (ownership unsettled, per review)
- verified §4 route-domain line counts, §2.2 frontend counts, mcp_servers=4
- full test suite: 3267 passed, 1 skipped, 0 failed

* docs(architecture): refresh Phase 0 inventory metrics + document counting method

Refresh every count against current dev (b58af42) per review on #4148:
- src/ flat .py: 91 -> 95; tests/test_*.py: 544 -> 583
- core.database importers: 94 -> 102; src.agent_loop importers: 21 -> 22
- src/ -> routes/ lines: 30 -> 31; routes/ -> src/: 351 -> 374; src/ -> core/: 99 -> 106
- Last updated: dev@9d7a3d6 -> dev@b58af42

Add a "How the metrics are computed" note under section 3.4 with the exact
grep/find command for each count, so the numbers are reproducible and future
dev drift is a one-command recheck instead of another review round (per the
request to note the counting method).

Documentation-only; no code changes.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(architecture): refresh remaining counts + add snapshot basis note

Reviewer self-audit of the previous refresh caught more stale counts after
the rebase onto dev@b58af42:
- tool_implementations importers: 18 -> 17 (§3.2, §6.2, Appendix B)
- core/database classes: 27 -> 28 (§2.1, §6.2)
- mcp_servers .py files: 4 -> 5 (§1.1)
- routes/ -> core/ import lines: 124 -> 126 (§3.4)

Line counts in §2.1/§2.2 also drifted over the rebased range but are left
as-is and covered by a new "Snapshot basis" note in the header: line counts
are a snapshot that drifts as dev moves (recompute with wc -l), while the
importer/file/import-line counts are the authoritative ones refreshed here.
This keeps the inventory honest about live metric vs structural snapshot, so
dev drift no longer triggers a review round.

Documentation-only; no code changes.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(architecture): fix missed tool_implementations importer count in §6.3

Follow-up to the previous refresh: §6.3 Slice 1 still read "18 importers"
after the 18->17 update elsewhere. Correct to 17 for consistency. Doc-only.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: yuandonghao <yuandonghao@cohl.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:57:24 +01:00
RaresKeY 260ce8ba59 fix(email): enforce MCP owner boundaries (#4335)
* fix(email): enforce MCP owner boundaries

* fix(email): fail closed for unowned MCP fallback
2026-06-16 04:31:24 +01:00
RaresKeY 2f9ae43a58 test(email): cover sender signature owner cache writes (#4278) 2026-06-16 04:21:11 +01:00
RaresKeY 293bbfabf4 test(hwfit): cover SSH target validation regressions (#4279) 2026-06-16 04:18:21 +01:00
Alexandre Teixeira 0086399656 test: add fire_and_forget to API chat webhook stub (#4383) 2026-06-16 03:15:14 +00:00
RaresKeY 9d2989f386 test(auth): cover reserved username sentinel gate (#4276) 2026-06-16 04:09:58 +01:00
RaresKeY b5edbd3df7 fix(devops): harden docker config defaults (#4349) 2026-06-16 04:03:43 +01:00
RaresKeY 33fe7276be fix(endpoints): normalize URL handling (#4338) 2026-06-16 03:59:18 +01:00
RaresKeY a031a94a2e fix(cookbook): harden remote serve host handling (#4345) 2026-06-16 03:46:32 +01:00
RaresKeY 4d10c16d02 fix(auth): clean up rename and null-owner ownership (#4340) 2026-06-16 03:33:02 +01:00
RaresKeY 745c10e0d7 fix(gallery): confine gallery image path resolution (#4352) 2026-06-16 03:28:09 +01:00
Alexandre Teixeira 6b7a4c1e70 test: add oversized test split plan (#3987)
* test: add oversized test split plan

* test: refresh oversized split plan
2026-06-16 02:28:03 +00:00
RaresKeY 422f23fb12 fix(mcp): scope memory server by owner (#4315) 2026-06-16 03:18:17 +01:00
TheDragonTail 0f966d6b9f fix(embeddings): fall back to default cache dir when FASTEMBED_CACHE_PATH is empty (#3434)
docker-compose.yml injects FASTEMBED_CACHE_PATH=${FASTEMBED_CACHE_PATH:-},
which sets the variable to an empty string when the host has not defined it.
FASTEMBED_CACHE_DIR used os.getenv("FASTEMBED_CACHE_PATH", default), and
os.getenv only returns the default when the variable is ABSENT -- so the empty
value won and FASTEMBED_CACHE_DIR became "". os.makedirs("") then raised
[Errno 2] No such file or directory: '', FastEmbed failed to initialise, and
every vector feature (RAG, semantic memory, tool index) silently degraded on
the default Docker stack.

Treat an empty value like an absent one via `os.getenv(...) or default`.
Add a regression test covering the empty, unset, and explicit cases.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 03:11:48 +01:00
Afonso Coutinho 7b09491557 fix: check-in calendar digest leaks every user's events (missing owner scope) (#1925)
* fix: check-in calendar digest leaks every user's events (no owner scope)

* Seed dtend on calendar events in digest test so the NOT NULL column is satisfied
2026-06-16 02:42:41 +01:00
Kenny Van de Maele fafaf089c5 refactor(search): centralize the web-scraping User-Agent into one constant (#4325)
The outbound UA for web_fetch / web_search was inlined in four places with
two different values and nothing keeping them current: content.py pinned a
mid-2021 Chrome 91 build, and providers.py sent a bare Mozilla/5.0 in three
spots. Some sites serve a degraded or blocked page to a UA that old.

Add WEB_FETCH_USER_AGENT to src/constants.py (env-overridable, matching the
existing Copilot/Kimi UA-constant pattern) and import it in content.py and
providers.py. Default to a current, common desktop UA so pages return their
normal HTML: the market-leading desktop OS (Windows; NT 10.0 covers Windows
10 and 11) and browser (Chrome) on a current stable build. The version is now
bumped in one place.

Service-specific self-identifying agents (Copilot, Kimi, webhooks, cookbook)
are intentionally left separate. Adds a regression pinning the constant shape,
the env override, and a guard against a new inline Mozilla literal in the
search sources.

Closes #4324
2026-06-16 01:33:47 +00:00
RaresKeY b58af4267b fix(companion): require chat scope for model inventory (#4319) 2026-06-16 01:15:05 +02:00
AkioKoneko 8ff76f083c fix(cookbook): avoid launching Ollama during Windows cache scan (#4368) 2026-06-16 01:00:40 +02:00
Wei Hong 2196869c86 fix(webhooks): route public emitters through fire_and_forget (#3964) (#4336)
The three public webhook emitters in chat_helpers and webhook_routes
schedule deliveries via asyncio.create_task(webhook_manager.fire(...)),
which bypasses WebhookManager._bg_tasks. asyncio only holds a weak
reference to the outer task, so the GC can collect it mid-delivery and
the webhook is silently dropped.

Route all three through webhook_manager.fire_and_forget() so the task
is tracked by _spawn_tracked() and the manager owns the full lifecycle.

Adds an AST-level guard test that scans routes/ for direct
asyncio.create_task wrapping webhook_manager.fire(...) to prevent
regressions.
2026-06-16 00:41:45 +02:00
holden093 dd2e23c9af fix(agent): report phone numbers from resolve_contact when a matched contact has no email (#4327)
When a CardDAV contact matched the search query but had no email
address (only phone numbers), the tool silently dropped it and
returned 'No contacts found'.  Fall back to the contact's phone
number(s) so the caller still receives usable information.

Refs: #4178 (the contacts-domain classifier fix that made the model
actually call resolve_contact for contacts queries, surfacing this
pre-existing gap)
2026-06-16 00:03:33 +02:00
Fahim facc50cb0f fix(api): attribute bearer-token actions to the token owner on owner-scoped routes (#4054)
* fix(api): attribute bearer-token actions to the token owner on owner-scoped routes

Owner-scoped chat, session, and upload routes called
get_current_user(), which resolves a bearer ody_ API token to the
sandboxed "api" pseudo-user. A paired API-token client (companion, CLI,
IDE extension) therefore saw and created a separate "api"-owned silo
instead of the owner's data.

effective_user() already exists for exactly this: it attributes a token's
actions to request.state.api_token_owner, is identical to
get_current_user() for cookie sessions, and falls back safely when a
token has no owner. session_routes.py was already migrated; this
completes the migration for the remaining owner-scoped routes:

- chat_helpers.py: chat-privilege enforcement, message attribution, prefs/context
- chat_routes.py: orphaned-endpoint owner, session-auth owner, message search
- upload_routes.py: upload owner attribution + access checks

The /api/models swap is intentionally omitted: #4292 already migrated it
to effective_user (plus the chat-scope gate and ownerless-token 403), so
this PR keeps dev's version of routes/model_routes.py unchanged.

chat_routes.py keeps importing get_current_user for the workspace owner
gate; session_routes.py drops the now-unused import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: target effective_user in auth monkeypatches and owner-scope assertion

The owner-scoped routes now call effective_user() instead of
get_current_user(), so the tests that stubbed get_current_user (or
asserted on it) follow suit:

- test_chat_helpers.py, test_review_regressions.py,
  test_kv_cache_invalidation_2927.py: monkeypatch effective_user
- test_session_endpoint_owner_scope.py: assert the owner-scope guard uses
  effective_user(request)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:56:22 +02:00
Kenny Van de Maele 074a1e6eff fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955)
* fix(search): add download budgets to web_fetch with truncation notice and hard ceiling

MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content
buffered and cached the entire response body first, so a large or hostile
URL could pull arbitrarily many bytes into memory and the content cache.

The fetch is now a capped streaming GET (SSRF redirect guard unchanged):
a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB), a per-call
override via full/max_bytes on the web_fetch tool, and a hard ceiling
(WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed.
When Content-Length already declares a body over the ceiling the fetch
is refused before any body bytes are buffered. Truncated results carry
truncated/fetched_bytes/total_bytes, the tool output leads with a
partial-content notice telling the model how to re-fetch with full=true,
and the tool schema documents the flag. A truncated PDF is reported as
a budget error since a cut PDF is unparseable. The effective cap is part
of the content-cache key so a truncated fetch is never served to a
full-budget request.

Existing tests that faked httpx.get or the old _get_public_url signature
are adapted to the streaming interface; behavior pins are unchanged.

Fixes #3812

* fix(search): close compressed-body cap bypass and protect the partial notice

Addresses RaresKeY's review on #3955:

- Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the
  wire bytes (and Content-Length) can be a fraction of the decoded body, so a
  tiny compressed response could pass the hard-cap preflight and then expand
  past the ceiling in a single decoded chunk before the streamed cap could
  slice it. Identity makes Content-Length the true body size and keeps each
  streamed chunk bounded by the network read, so the hard ceiling actually
  bounds memory.
- Lead web_fetch output with the partial-content notice and cap the page
  title. The notice is the user-facing contract for partial fetches, but the
  title is untrusted, uncapped page content; placed ahead of the notice a giant
  title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads
  and the title is capped as a second guard.

Adds regressions: the fetch advertises identity encoding, and a truncated
result with an oversized title still surfaces the partial notice.

* fix(search): reject compressed responses that ignore the identity request

Requesting Accept-Encoding: identity is not enough on its own: a server can
ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would
decode that, so a tiny compressed body could balloon into one decoded chunk
far past the hard cap before the streamed loop slices it (and Content-Length,
the compressed wire length, makes the preflight and size metadata unreliable).

Refuse a non-identity Content-Encoding before reading the body. Adds a
regression where the server ignores the identity request and returns gzip;
the fetch is refused before any body is decoded.
2026-06-15 17:38:09 +00:00
Kenny Van de Maele 2fab378c6a refactor(search): import REQUEST_TIMEOUT from constants in providers.py (#4331)
providers.py redefined REQUEST_TIMEOUT = 20 locally, shadowing the same
value in src/constants.py and risking drift if the constant is bumped.
Import it from src.constants and drop the local copy; same value, one
source of truth.

Closes #4329
2026-06-15 17:22:08 +00:00
Michael 5bafc30622 fix(api): normalize non-object JSON bodies to empty dict in token PATCH (#3976)
* fix(api): normalize non-object JSON bodies to empty dict in token PATCH

Valid non-dict JSON (e.g. [] or null) reaches payload.get(...) and
raises AttributeError. Normalize to {} so the route returns a controlled
response instead of an unhandled 500.

Fixes #3966

* test(api): add regression tests for PATCH with non-object JSON bodies

Covers array body ([]), null body, and normal object body as requested
in alteixeira20's review of #3976.

---------

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-15 18:05:15 +01:00
darius-f96 d6d2e17214 fix(hwfit): add GB10 unified-memory bandwidth so speed scores are real (#4270)
NVIDIA Grace Blackwell GB10 / DGX Spark was missing from GPU_BANDWIDTH, so
_lookup_bandwidth() returned None for it and _estimate_speed() fell through
to the crude FALLBACK_K path (k/active-params). That over-stated tok/s and
let speed scores saturate regardless of the box's real ~273 GB/s LPDDR5X
pool — distorting model ranking on these 128GB unified-memory rigs.

Add "gb10": 273 (GB/s). nvidia-smi reports the device name as "NVIDIA GB10",
which substring-matches the new key, so detected GB10 boxes now estimate
speed from the real bandwidth instead of the fallback.
2026-06-15 18:55:15 +02:00
Lucas Daniel f4e8990635 chore: add warnings to silent except Exception blocks (#3212)
* log(app): add warnings to silent except Exception blocks

- Internal tool auth header failure now logs a warning instead of
  silently passing, making auth bypass easier to spot in logs.
- Token last_used_at update failure now logs at DEBUG (fire-and-forget,
  non-critical, but useful when debugging token tracking issues).
- Image ownership verification failure now logs a warning so unexpected
  access-check errors surface instead of silently allowing the request.

* log(chat_routes): add warnings to silent except Exception blocks

- clear_orphaned_session_endpoint: log before rollback so failures
  appear in traces when users see stale/deleted model options.
- _endpoint_has_model (JSON parse): log malformed cached_models instead
  of silently treating endpoint as valid.
- _has_any_visible_model (JSON parse): log malformed cached_models
  instead of silently returning empty list.
- timezone header parse: log failure so time-zone-related tool bugs
  (wrong scheduled times, calendar events) are traceable.
- attachments JSON parse: log failure so silently-dropped attachments
  are visible in server logs.

* log(email_routes): add warnings to silent except Exception blocks

- Email alias resolution failure now logs a warning instead of silently
  returning an empty list, making broken account configs diagnosable.

* log(document_routes): add warnings to silent except Exception blocks

- Export ZIP request body parse failure now logs a warning so empty
  exports caused by malformed requests are diagnosable.
- clear_active_document failure on detach now logs a warning to help
  trace doc re-injection bugs like #1160.

* log(agent_loop): add warnings to silent except Exception blocks

- builtin tool overrides load failure now logs a warning so misconfigured
  settings don't silently fall back to defaults without a trace.
- Timezone context injection failure now logs a warning to help debug
  incorrect scheduled times in agent-created tasks.
- PDF form-backed document detection failure now logs a warning so
  broken form-doc UI is traceable to the root cause.

* log(llm_core): add warnings to silent except Exception blocks

- Malformed URL in _is_ollama_native_url now logs a warning so bad
  endpoint configs are traceable instead of silently returning False.
- Model list fetch failure now logs a warning with the endpoint URL so
  endpoints that silently vanish from the model picker are diagnosable.

* log: pass exception via exc_info instead of string interpolation

* fix(logging): avoid logging raw URLs in llm_core error paths

Drop the raw url/base_chat_url from the Ollama-detection and
model-list-fetch warning logs added by this sweep, since these values
can contain private hostnames, internal IPs, credentials, or other
deployment details.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 17:49:27 +01:00
Kfir Sadeh fc3a5e555e feat(paths): abstract runtime path logic for frozen distribution packages (#969)
* feat(core): abstract runtime path logic for frozen distribution packages

* Address review feedback: revert browser MCP check, persistent data dir default when frozen, and add path tests
2026-06-15 17:44:10 +01:00
Hriday Ranka 270b8570fc feat(email): add Google OAuth2 for Google Workspace / .edu IMAP & SMTP (#237)
* feat(email): add Google OAuth2 for Google Workspace / .edu IMAP & SMTP

Google deprecated basic-auth (password) access for Google Workspace
accounts in May 2025. This means any .edu or org Google email account
could no longer connect via IMAP/SMTP with a username + password —
the email feature was silently broken for a large class of users.

This PR adds full OAuth2 (XOAUTH2) support for Google accounts so
Workspace / .edu emails work out of the box.

## What changed

### Backend
- `core/database.py`: add `oauth_provider`, `oauth_access_token`,
  `oauth_refresh_token`, `oauth_token_expiry`, and `display_name`
  columns to `EmailAccount` + idempotent migration
- `routes/email_helpers.py`: XOAUTH2 auth in `_imap_connect()` and
  `_send_smtp_message()`, automatic token refresh, OAuth fields in
  `_get_email_config()`
- `routes/email_routes.py`: OAuth authorize + callback routes,
  `_smtp_ready()` fix, OAuth fields through `_deliver()` closure,
  `display_name` in `From:` header

### Frontend
- `static/js/settings.js`: "Google Workspace / .edu" provider preset,
  "Connect with Google" button, success/error banner, display name field
- `static/js/document.js`: `_accountCanSend()` recognises OAuth accounts
  as SMTP-capable

* security: sign OAuth state, scope callback by owner, fix quotes & logs

Addresses reviewer feedback on the email OAuth2 PR:

- OAuth state is now HMAC-SHA256 signed (keyed with the app secret from
  secret_storage) encoding account_id + owner + a random nonce, and is
  verified with constant-time comparison in the callback before any
  token write. Replaces the bare account_id state, closing the CSRF /
  state-guessing gap.
- Callback extracts the owner from the verified state and re-checks it
  against EmailAccount.owner before writing tokens, matching the
  ownership guards used elsewhere in the email routes. Single-user mode
  (owner == "") still accepts any account, consistent with
  _assert_owns_account.
- Replaced curly/smart quotes in the Name/Email/Display Name input rows
  with plain ASCII so getElementById lookups and event wiring work.
- Stripped account name, SMTP host/user, owner, and raw provider error
  text from send-config and OAuth logs; failures now surface as generic
  error codes in the redirect instead of raw exception strings.

* test(email): add OAuth2 state, _smtp_ready, and XOAUTH2 tests

Move the OAuth state sign/verify helpers out of the setup_email_routes
closure into module-level make_oauth_state/verify_oauth_state in
email_helpers.py so they can be unit-tested, then add tests/test_email_oauth.py:

- signed state round-trips account_id + owner, nonce is unique per call
- tampered account_id, forged signature, and garbage states are rejected
- _smtp_ready treats an OAuth account (no password) as send-capable, and
  still rejects host+user-only accounts with neither password nor OAuth
- _xoauth2_string / _xoauth2_bytes produce the correct SASL XOAUTH2 framing

14 new tests; existing test_security_regressions.py still passes (28).

* refactor(email): single XOAUTH2 frame helper, use RuntimeError

Polish from self-review before merge:

- Collapse the XOAUTH2 framing to one source of truth: _xoauth2_raw()
  returns the unencoded SASL string used by both the SMTP and IMAP auth
  callbacks (each library base64-encodes it), and _xoauth2_bytes() is
  just its .encode(). Removes the unused base64 _xoauth2_string helper
  and the duplicated inline frame in _send_smtp_message.
- Raise RuntimeError (not bare Exception) for the "OAuth token
  unavailable" path, matching the convention used across src/.
- Update tests accordingly.

All 14 OAuth tests + 28 security regressions pass; SMTP/IMAP XOAUTH2
verified live against a real Workspace account.

* tests(email-oauth): cover the security-sensitive OAuth paths before merge

The previous tests only exercised pure helpers (state signing, _smtp_ready,
XOAUTH2 framing). This adds coverage for the actual token-custody and
ownership behaviour, pinning the real route handlers rather than
re-implementations of their logic.

Real OAuth callback route (pulled live from setup_email_routes()):
- missing code -> generic missing_code redirect, no account id / owner in URL
- provider error -> generic google_error redirect, raw error not echoed
- tampered/invalid state -> invalid_state redirect, auth code never leaked
- signed state with owner mismatch -> token write refused (ownership_error),
  DB row left untouched
- signed state with matching owner -> tokens written encrypted, and only to
  the intended account (a second account stays untouched)

Real accounts-list route:
- exposes oauth_provider status but never the access/refresh token values,
  encrypted or otherwise

Token storage / refresh helpers (isolated in-memory SQLite, mocked HTTP):
- refreshed access token stored encrypted; expiry is a timestamp, not a token
- fresh token uses cache (no refresh call); expired token triggers refresh
- refresh HTTP failure returns None silently, no exception or secret surfaced
- missing client credentials short-circuits to None

Password-account regression:
- password IMAP accounts call conn.login(); OAuth accounts call XOAUTH2
  authenticate() and never login()

28 tests pass (14 prior + 14 new).

* fix(email-oauth): drop raw exception text from token-refresh log

Google token refresh failures now log the account id only, matching
the conservative logging used elsewhere on the OAuth path — no raw
provider/exception details surfacing in logs.

* fix(email-oauth): bring OAuth UI parity to the Integrations email form

The Google Workspace / .edu provider preset, Display Name field, and
Connect-with-Google flow were only wired into the Email-tab account
form. The Integrations-tab form (a separate code path for the same
account type) was missing all three, so the OAuth option was invisible
from that entry point. Mirrors the same PROVIDERS entry, OAuth section,
and connect handler so both forms behave identically.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-15 17:02:58 +01:00
Léo 0750486654 fix(notes): fail closed when an unauthenticated request reaches owner-scoped routes (#4062)
* fix(notes): fail closed when an unauthenticated request reaches owner-scoped routes

The notes CRUD routes resolved the acting user with bare get_current_user().
A request that reached them with no identity (auth-middleware regression,
SSRF from a sibling service) came through as user=None — which every query
treats as the single-user mode: list all accounts' notes, read/update/
delete/pin/archive any row, reorder globally.

Resolve the owner through require_user() instead, which already encodes the
right policy: 401 when auth is configured, while the documented anonymous
modes (AUTH_ENABLED=false, LOCALHOST_BYPASS on loopback, unconfigured
first-run) still resolve to the single-user path. fire-reminder in the same
file already gated this way; the CRUD routes now match, and the inline
require_user import there is folded into the module import.

Extracted from #2940 (stabilization slice).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(notes): drive fail-closed test via ASGITransport, not sync TestClient

The focused fail-closed test hung at `TestClient(app).get(...)` on some
environments. Starlette's sync TestClient runs the app in a background
event-loop thread (anyio blocking portal) and then dispatches each sync
endpoint onto a second worker thread; that handshake deadlocks on certain
anyio/httpx/platform combos. The identity injection also used
BaseHTTPMiddleware (@app.middleware("http")), the other known TestClient
deadlock source.

Switch to the repo's existing httpx.ASGITransport + AsyncClient idiom so the
whole request runs on the test's own event loop (no portal thread, no
BaseHTTPMiddleware). Identity now comes from a pure-ASGI shim that writes the
same request.state fields the real auth middleware sets, and a non-loopback
client peer keeps require_user's loopback fall-throughs out of the picture.
Same assertions and coverage; production code unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-15 17:43:28 +02:00
RaresKeY d38e2cbc07 fix(ci): avoid duplicate CodeQL setup (#4297) 2026-06-15 16:39:13 +01:00
Ashvin 7fd937fa57 fix(calendar): parse "mins"/"hrs" reminder offsets in manage_calendar (#4266)
_reminder_minutes matched the offset with (?:m|min|minute|minutes)\b and
(?:h|hr|hour|hours)\b. The trailing \b makes the common plural
abbreviations "mins"/"hrs" fail to match (after "min" the "s" is a word
char, so no boundary), so reminder_minutes "5 mins" or "2 hrs" returned
None and the event was created with no reminder, silently.

Widen the two unit regexes and the matching reminder_only description
regex to a strict superset that also accepts mins/hrs. The sibling
duration parser already accepts these forms (it has no \b), so this only
brings the reminder parser in line.
2026-06-15 17:37:28 +02:00
Catalin Iliescu c41caac438 fix(cookbook): only persist successfully stopped scheduled serves (#4267)
Co-authored-by: Cata <cata@bigjohn.local>
2026-06-15 17:30:18 +02:00
Kenny Van de Maele 1747c13133 test: align README presentation guards with the #4306 refresh (#4311)
* test: align README presentation guards with the #4306 refresh

The 'Refresh README presentation' change (#4306) swapped the ASCII banner
for a centered wordmark image and moved the native quickstart into
docs/setup.md, which left four base tests failing on dev and froze the
merge gate:

- test_security_regressions::test_readme_native_quickstart_uses_loopback
  now also accepts the loopback guidance from docs/setup.md, where the
  quickstart moved (no behaviour change; the guidance is intact there).
- test_readme_ascii_fenced guards the new wordmark title instead of the
  removed ASCII banner, and keeps a defensive check that any reintroduced
  box-drawing banner stays inside a code fence (the original #1390 mode).
- The five unreferenced demo gifs under docs/ (chat, compare, document,
  notes, research) are removed so test_docs_no_orphan_images passes; they
  were de-referenced by the refresh. Recoverable from history if a docs
  page wants to embed them again.

* chore: refresh PR checks

---------

Co-authored-by: Alexandre Teixeira <alexandremagteixeira@gmail.com>
2026-06-15 16:25:38 +01:00
RaresKeY ffd0aaf69b fix(cookbook): validate adopt host (#4282) 2026-06-15 16:44:24 +02:00
RaresKeY 81e7074d93 fix(gallery): confine replacement image path (#4285) 2026-06-15 16:42:41 +02:00
RaresKeY f66a23d19d fix(ai): validate generated image result URLs (#4289) 2026-06-15 16:40:49 +02:00
RaresKeY f602819523 fix(models): scope API-token model listing (#4292) 2026-06-15 16:38:41 +02:00
RaresKeY 85a773ea02 fix(personal): resolve upload delete path (#4291) 2026-06-15 16:38:37 +02:00
PewDiePie fb0a64fe4f Merge pull request #4306 from pewdiepie-archdaemon/readme-refresh-default-branch
docs: refresh README presentation
2026-06-15 23:28:20 +09:00
pewdiepie-archdaemon bcf46dafb9 Refresh README presentation 2026-06-15 23:26:10 +09:00
pewdiepie-archdaemon f3e4e071c7 Merge remote-tracking branch 'origin/dev' 2026-06-15 14:02:53 +00:00
pewdiepie-archdaemon ff7cf8e279 Merge remote-tracking branch 'origin/dev' 2026-06-15 14:00:54 +00:00
262 changed files with 18650 additions and 2798 deletions
+4
View File
@@ -15,6 +15,10 @@ build/
# at runtime — never baked into the image. Mirrored in .gitignore. # at runtime — never baked into the image. Mirrored in .gitignore.
secrets.env secrets.env
secrets.env.* secrets.env.*
secrets.env~
.secrets.env.swp
.secrets.env.swo
**/#secrets.env#
!secrets.env.example !secrets.env.example
/data/ /data/
/logs/ /logs/
+3 -3
View File
@@ -19,7 +19,7 @@ jobs:
name: Python syntax (compileall) name: Python syntax (compileall)
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0 - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
@@ -32,7 +32,7 @@ jobs:
name: JS syntax (node --check) name: JS syntax (node --check)
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
- uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0 - uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0
@@ -54,7 +54,7 @@ jobs:
# ROADMAP "fresh install smoke tests" item; make this required once green. # ROADMAP "fresh install smoke tests" item; make this required once green.
continue-on-error: true continue-on-error: true
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
fetch-depth: 0 fetch-depth: 0
persist-credentials: false persist-credentials: false
+1 -1
View File
@@ -37,7 +37,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
+2 -2
View File
@@ -52,7 +52,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
@@ -93,7 +93,7 @@ jobs:
security-events: write # upload SARIF to the Security tab security-events: write # upload SARIF to the Security tab
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
+2 -2
View File
@@ -36,7 +36,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
@@ -55,7 +55,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
+2 -2
View File
@@ -45,7 +45,7 @@ jobs:
arch: arm64 arch: arm64
runner: ubuntu-24.04-arm runner: ubuntu-24.04-arm
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
- name: Set up Buildx - name: Set up Buildx
@@ -86,7 +86,7 @@ jobs:
contents: read contents: read
packages: write packages: write
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
- name: Read APP_VERSION + short sha - name: Read APP_VERSION + short sha
@@ -14,7 +14,7 @@ jobs:
# Skip bots (Dependabot, release-drafter, etc.) # Skip bots (Dependabot, release-drafter, etc.)
if: ${{ github.event.issue.user.type != 'Bot' }} if: ${{ github.event.issue.user.type != 'Bot' }}
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
sparse-checkout: .github/scripts sparse-checkout: .github/scripts
persist-credentials: false persist-credentials: false
+1 -1
View File
@@ -23,7 +23,7 @@ jobs:
# Skip bots: they open PRs programmatically and have their own process. # Skip bots: they open PRs programmatically and have their own process.
if: github.event.pull_request.user.type != 'Bot' if: github.event.pull_request.user.type != 'Bot'
steps: steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 - uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
ref: ${{ github.base_ref }} ref: ${{ github.base_ref }}
sparse-checkout: .github/scripts sparse-checkout: .github/scripts
+1 -1
View File
@@ -35,7 +35,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
# Full history so a secret committed in an earlier commit (and later # Full history so a secret committed in an earlier commit (and later
# deleted) is still caught -- deletion does not remove it from Git. # deleted) is still caught -- deletion does not remove it from Git.
+2 -2
View File
@@ -36,7 +36,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
@@ -61,7 +61,7 @@ jobs:
contents: read contents: read
steps: steps:
- name: Checkout repository - name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3 uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with: with:
persist-credentials: false persist-credentials: false
+1
View File
@@ -86,6 +86,7 @@ Bundled in `static/fonts/`:
| [Fira Code](https://github.com/tonsky/FiraCode) | SIL Open Font License 1.1 | Nikita Prokopov & contributors | | [Fira Code](https://github.com/tonsky/FiraCode) | SIL Open Font License 1.1 | Nikita Prokopov & contributors |
| [Inter](https://github.com/rsms/inter) | SIL Open Font License 1.1 | Rasmus Andersson | | [Inter](https://github.com/rsms/inter) | SIL Open Font License 1.1 | Rasmus Andersson |
| [GohuFont](https://font.gohu.org/) (`fonts/custom/GohuFont.ttf`) | WTFPL | Hugo Chargois | | [GohuFont](https://font.gohu.org/) (`fonts/custom/GohuFont.ttf`) | WTFPL | Hugo Chargois |
| [OpenDyslexic](https://opendyslexic.org/) (`fonts/OpenDyslexic-{Regular,Bold}.woff2`) | SIL Open Font License 1.1 ([`licenses/OpenDyslexic-OFL.txt`](licenses/OpenDyslexic-OFL.txt)) | Abbie Gonzalez |
## Python dependencies ## Python dependencies
+1 -1
View File
@@ -37,7 +37,7 @@ Manual development uses Python 3.11+:
python3 -m venv venv python3 -m venv venv
source venv/bin/activate source venv/bin/activate
pip install -r requirements.txt pip install -r requirements.txt
python -m uvicorn app:app --host 0.0.0.0 --port 7000 python -m uvicorn app:app --host 127.0.0.1 --port 7000
``` ```
Windows is not actively tested. Docker on Linux or a Linux/macOS manual install is the safer path for now. Windows is not actively tested. Docker on Linux or a Linux/macOS manual install is the safer path for now.
+17
View File
@@ -20,6 +20,23 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
gosu \ gosu \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
# Docker CLI (client only — daemon stays on the host via the
# /var/run/docker.sock mount). The Debian `docker.io` package ships
# dockerd but not the client binary on slim, so grab the static client
# tarball from download.docker.com instead.
ARG DOCKER_CLI_VERSION=27.5.1
RUN ARCH="$(dpkg --print-architecture)" \
&& case "$ARCH" in \
amd64) DARCH=x86_64 ;; \
arm64) DARCH=aarch64 ;; \
*) echo "unsupported arch $ARCH"; exit 1 ;; \
esac \
&& curl -fsSL "https://download.docker.com/linux/static/stable/${DARCH}/docker-${DOCKER_CLI_VERSION}.tgz" \
-o /tmp/docker.tgz \
&& tar -xzf /tmp/docker.tgz -C /tmp \
&& install -m 0755 /tmp/docker/docker /usr/local/bin/docker \
&& rm -rf /tmp/docker /tmp/docker.tgz
WORKDIR /app WORKDIR /app
# Install Python deps first (layer cache). Optional extras (PyMuPDF AGPL, etc.) # Install Python deps first (layer cache). Optional extras (PyMuPDF AGPL, etc.)
+45
View File
@@ -0,0 +1,45 @@
# -*- mode: python ; coding: utf-8 -*-
a = Analysis(
['launcher.py'],
pathex=[],
binaries=[],
datas=[('static', 'static'), ('scripts', 'scripts'), ('mcp_servers', 'mcp_servers'), ('services/hwfit/data', 'services/hwfit/data'), ('config', 'config'), ('.env.example', '.env.example')],
hiddenimports=[],
hookspath=[],
hooksconfig={},
runtime_hooks=[],
excludes=[],
noarchive=False,
optimize=0,
)
pyz = PYZ(a.pure)
exe = EXE(
pyz,
a.scripts,
[],
exclude_binaries=True,
name='Odysseus',
debug=False,
bootloader_ignore_signals=False,
strip=False,
upx=True,
console=False,
disable_windowed_traceback=False,
argv_emulation=False,
target_arch=None,
codesign_identity=None,
entitlements_file=None,
icon=['static\\icon.ico'],
)
coll = COLLECT(
exe,
a.binaries,
a.datas,
strip=False,
upx=True,
upx_exclude=[],
name='Odysseus',
)
+2 -2
View File
@@ -1,5 +1,5 @@
<p align="center"> <p align="center">
<img src="docs/odysseus-wordmark.png" alt="Odysseus" width="280"> <img src="docs/odysseus-wordmark.png" alt="Odysseus" width="238">
</p> </p>
<p align="center"> <p align="center">
@@ -18,7 +18,7 @@
</p> </p>
<p align="center"> <p align="center">
<img src="docs/odysseus.jpg" alt="Odysseus interface"> <img src="docs/odysseus-browser.jpg" alt="Odysseus interface">
</p> </p>
--- ---
+24 -13
View File
@@ -1,6 +1,7 @@
# app.py — slim orchestrator # app.py — slim orchestrator
import mimetypes import mimetypes
import os import os
import sys
def register_static_mime_types() -> None: def register_static_mime_types() -> None:
@@ -38,7 +39,7 @@ load_dotenv(encoding="utf-8-sig")
import asyncio import asyncio
import logging import logging
import secrets import secrets
from datetime import datetime from datetime import datetime, timezone
from typing import Dict from typing import Dict
from contextlib import asynccontextmanager from contextlib import asynccontextmanager
@@ -113,12 +114,13 @@ app = FastAPI(
) )
# ========= CORS ========= # ========= CORS =========
CORS_ALLOW_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE"]
allowed_origins = os.getenv("ALLOWED_ORIGINS", "http://localhost,http://127.0.0.1").split(",") allowed_origins = os.getenv("ALLOWED_ORIGINS", "http://localhost,http://127.0.0.1").split(",")
app.add_middleware( app.add_middleware(
CORSMiddleware, CORSMiddleware,
allow_origins=allowed_origins, allow_origins=allowed_origins,
allow_credentials=True, allow_credentials=True,
allow_methods=["GET", "POST", "PUT", "DELETE"], allow_methods=CORS_ALLOW_METHODS,
allow_headers=[ allow_headers=[
"Accept", "Accept",
"Authorization", "Authorization",
@@ -316,7 +318,7 @@ if AUTH_ENABLED:
# (no admin cookie available in that context). Restricted to # (no admin cookie available in that context). Restricted to
# loopback clients + matching token to keep it locked down. # loopback clients + matching token to keep it locked down.
try: try:
from core.middleware import INTERNAL_TOOL_HEADER, INTERNAL_TOOL_TOKEN as _ITT from core.middleware import INTERNAL_TOOL_HEADER, INTERNAL_TOOL_TOKEN as _ITT, INTERNAL_TOOL_USER
_hdr = request.headers.get(INTERNAL_TOOL_HEADER) _hdr = request.headers.get(INTERNAL_TOOL_HEADER)
if _hdr and secrets.compare_digest(_hdr, _ITT) and _is_trusted_loopback(request): if _hdr and secrets.compare_digest(_hdr, _ITT) and _is_trusted_loopback(request):
# Impersonation: when the agent's loopback call sets # Impersonation: when the agent's loopback call sets
@@ -328,11 +330,11 @@ if AUTH_ENABLED:
if _impersonate and _impersonate in getattr(_auth_mgr, "users", {}): if _impersonate and _impersonate in getattr(_auth_mgr, "users", {}):
request.state.current_user = _impersonate request.state.current_user = _impersonate
else: else:
request.state.current_user = "internal-tool" request.state.current_user = INTERNAL_TOOL_USER
request.state.api_token = False request.state.api_token = False
return await call_next(request) return await call_next(request)
except Exception: except Exception as _e:
pass logger.warning("Internal tool auth header check failed", exc_info=_e)
# Allow DIRECT localhost requests (internal service calls from # Allow DIRECT localhost requests (internal service calls from
# heartbeats etc.). Tunnel/proxy-forwarded requests are excluded by # heartbeats etc.). Tunnel/proxy-forwarded requests are excluded by
# _is_trusted_loopback so LOCALHOST_BYPASS can't be abused over a # _is_trusted_loopback so LOCALHOST_BYPASS can't be abused over a
@@ -385,11 +387,10 @@ if AUTH_ENABLED:
_db.close() _db.close()
try: try:
await _asyncio.to_thread(_do) await _asyncio.to_thread(_do)
except Exception: except Exception as _e:
pass logger.debug("Failed to update token last_used_at", exc_info=_e)
_asyncio.create_task(_touch_last_used(matched_id)) _asyncio.create_task(_touch_last_used(matched_id))
# Keep bearer-token callers out of normal cookie/user # Keep bearer-token callers out of normal cookie/user
# routes. API-aware routes can read api_token_owner.
request.state.current_user = "api" request.state.current_user = "api"
request.state.api_token = True request.state.api_token = True
request.state.api_token_id = matched_id request.state.api_token_id = matched_id
@@ -438,7 +439,7 @@ class _RevalidatingStatic(StaticFiles):
return resp return resp
app.mount("/static", _RevalidatingStatic(directory="static"), name="static") app.mount("/static", _RevalidatingStatic(directory=STATIC_DIR), name="static")
# ========= GENERATED IMAGES ========= # ========= GENERATED IMAGES =========
@app.get("/api/generated-image/{filename}") @app.get("/api/generated-image/{filename}")
@@ -464,8 +465,8 @@ async def serve_generated_image(filename: str, request: Request):
_db.close() _db.close()
except HTTPException: except HTTPException:
raise raise
except Exception: except Exception as _e:
pass logger.warning("Image ownership verification failed for %r", filename, exc_info=_e)
ext = filename.rsplit('.', 1)[-1].lower() ext = filename.rsplit('.', 1)[-1].lower()
mime = { mime = {
"png": "image/png", "jpg": "image/jpeg", "jpeg": "image/jpeg", "png": "image/png", "jpg": "image/jpeg", "jpeg": "image/jpeg",
@@ -528,6 +529,7 @@ memory_vector = components.get("memory_vector")
upload_handler = components["upload_handler"] upload_handler = components["upload_handler"]
app.state.upload_handler = upload_handler app.state.upload_handler = upload_handler
personal_docs_mgr = components["personal_docs_manager"] personal_docs_mgr = components["personal_docs_manager"]
app.state.personal_docs_manager = personal_docs_mgr
api_key_manager = components["api_key_manager"] api_key_manager = components["api_key_manager"]
preset_manager = components["preset_manager"] preset_manager = components["preset_manager"]
chat_processor = components["chat_processor"] chat_processor = components["chat_processor"]
@@ -861,7 +863,7 @@ async def get_version():
@app.get("/api/health") @app.get("/api/health")
async def health_check() -> Dict[str, str]: async def health_check() -> Dict[str, str]:
return {"status": "healthy", "timestamp": datetime.utcnow().isoformat()} return {"status": "healthy", "timestamp": datetime.now(timezone.utc).isoformat()}
@app.get("/api/ready") @app.get("/api/ready")
async def readiness_check() -> JSONResponse: async def readiness_check() -> JSONResponse:
@@ -1171,3 +1173,12 @@ async def _shutdown_event():
except Exception as e: except Exception as e:
logger.warning(f"MCP shutdown error: {e}") logger.warning(f"MCP shutdown error: {e}")
logger.info("Application shutdown complete") logger.info("Application shutdown complete")
if __name__ == "__main__":
import uvicorn
bind_host = os.getenv("APP_BIND", "127.0.0.1")
bind_port = int(os.getenv("APP_PORT", "7000"))
uvicorn.run(app, host=bind_host, port=bind_port, log_level="info")
+72
View File
@@ -0,0 +1,72 @@
#Requires -Version 5.1
<#
Build a portable Windows distribution for Odysseus.
Output layout:
dist\Odysseus\Odysseus.exe
dist\Odysseus\static\...
dist\Odysseus\scripts\...
dist\Odysseus\mcp_servers\...
dist\Odysseus\services\hwfit\data\...
The app then keeps using its normal filesystem layout when frozen.
Usage:
powershell -ExecutionPolicy Bypass -File .\build-windows-portable.ps1
#>
$ErrorActionPreference = "Stop"
Set-Location -Path $PSScriptRoot
function Write-Step($msg) { Write-Host ""; Write-Host ("==> " + $msg) -ForegroundColor Cyan }
function Fail($msg) {
Write-Host ""
Write-Host ("ERROR: " + $msg) -ForegroundColor Red
exit 1
}
Write-Step "Checking for Python"
$pyExe = $null
if (Test-Path ".\.venv\Scripts\python.exe") {
$pyExe = (Resolve-Path ".\.venv\Scripts\python.exe").Path
} else {
foreach ($c in @("py", "python")) {
$cmd = Get-Command $c -ErrorAction SilentlyContinue
if ($cmd) { $pyExe = $cmd.Source; break }
}
if ($pyExe -like "*WindowsApps*python.exe") {
$pyCmd = Get-Command py -ErrorAction SilentlyContinue
if ($pyCmd) {
$pyExe = $pyCmd.Source
}
}
}
if (-not $pyExe) {
Fail "Python not found on PATH. Install Python 3.11+ first."
}
Write-Host ("Using Python: " + $pyExe)
Write-Step "Installing build dependencies"
& $pyExe -m pip install --upgrade pip --quiet
& $pyExe -m pip install -r requirements.txt pyinstaller pystray Pillow
if ($LASTEXITCODE -ne 0) { Fail "Dependency install failed." }
Write-Step "Building portable exe bundle"
Remove-Item -Recurse -Force build, dist -ErrorAction SilentlyContinue
$dataArgs = @(
"--add-data", "static;static",
"--add-data", "scripts;scripts",
"--add-data", "mcp_servers;mcp_servers",
"--add-data", "services/hwfit/data;services/hwfit/data",
"--add-data", "config;config",
"--add-data", ".env.example;.env.example"
)
& $pyExe -m PyInstaller --noconfirm --clean --onedir --noconsole --icon=static/icon.ico --name Odysseus @dataArgs launcher.py
if ($LASTEXITCODE -ne 0) { Fail "PyInstaller build failed." }
Write-Host ""
Write-Host "Build complete." -ForegroundColor Green
Write-Host "Portable app folder: $PSScriptRoot\dist\Odysseus" -ForegroundColor Green
Write-Host "Distribute the whole folder (or zip it) so static assets and scripts stay with the exe." -ForegroundColor Green
+17 -3
View File
@@ -5,8 +5,9 @@ offers and pair to it, without duplicating any LLM logic.
Auth is enforced globally by AuthMiddleware (app.py), so reaching a handler here Auth is enforced globally by AuthMiddleware (app.py), so reaching a handler here
means the caller is authenticated by either a cookie session or a Bearer `ody_` means the caller is authenticated by either a cookie session or a Bearer `ody_`
API token. The read endpoints (ping/info/models) accept either; the pairing API token. Ping/info accept either credential type, models requires a chat-
endpoints are admin-cookie only. scoped API token for bearer callers, and the pairing endpoints are admin-cookie
only.
Pairing CSRF posture: minting happens ONLY on POST. The session cookie is Pairing CSRF posture: minting happens ONLY on POST. The session cookie is
SameSite=Lax (routes/auth_routes.py), which a browser does not send on a SameSite=Lax (routes/auth_routes.py), which a browser does not send on a
@@ -18,7 +19,7 @@ on a GET would be unsafe (Lax cookies ride top-level GET navigations), so GET
import html import html
from fastapi import APIRouter, Request from fastapi import APIRouter, HTTPException, Request
from fastapi.responses import HTMLResponse from fastapi.responses import HTMLResponse
from core.middleware import require_admin from core.middleware import require_admin
@@ -52,6 +53,18 @@ def owner_can_see(row_owner, owner) -> bool:
return row_owner is None or row_owner == owner return row_owner is None or row_owner == owner
def require_models_scope(request: Request) -> None:
"""Require the companion chat scope for bearer-token model inventory."""
if not getattr(request.state, "api_token", False):
return
scopes = getattr(request.state, "api_token_scopes", None) or []
if isinstance(scopes, str):
scopes = [scope.strip() for scope in scopes.split(",")]
scope_set = {str(scope).strip() for scope in scopes if str(scope).strip()}
if _pairing.COMPANION_SCOPE not in scope_set:
raise HTTPException(403, "API token requires chat scope")
def mint_pairing_token(owner: str, invalidate=None) -> tuple[str, str]: def mint_pairing_token(owner: str, invalidate=None) -> tuple[str, str]:
"""Mint a pairing token AND invalidate the auth middleware's in-memory token """Mint a pairing token AND invalidate the auth middleware's in-memory token
cache, so the new token is accepted on the very next request without a server cache, so the new token is accepted on the very next request without a server
@@ -103,6 +116,7 @@ def setup_companion_routes() -> APIRouter:
rows -- the same rule as owner_filter. Read-only; never returns api_key rows -- the same rule as owner_filter. Read-only; never returns api_key
material. material.
""" """
require_models_scope(request)
import json as _json import json as _json
from core.database import SessionLocal, ModelEndpoint from core.database import SessionLocal, ModelEndpoint
+22 -8
View File
@@ -20,6 +20,7 @@ logger = logging.getLogger(__name__)
from core.atomic_io import atomic_write_json as _atomic_write_json # noqa: E402 from core.atomic_io import atomic_write_json as _atomic_write_json # noqa: E402
from core.middleware import INTERNAL_TOOL_USER # noqa: E402
DEFAULT_PRIVILEGES = { DEFAULT_PRIVILEGES = {
"can_use_agent": True, "can_use_agent": True,
@@ -47,7 +48,7 @@ ADMIN_PRIVILEGES["allowed_models_restricted"] = False
# backwards for this sentinel. # backwards for this sentinel.
ADMIN_PRIVILEGES["block_all_models"] = False ADMIN_PRIVILEGES["block_all_models"] = False
from src.constants import AUTH_FILE from src.constants import AUTH_FILE, PASSWORD_MIN_LENGTH
DEFAULT_AUTH_PATH = AUTH_FILE DEFAULT_AUTH_PATH = AUTH_FILE
TOKEN_TTL = 60 * 60 * 24 * 7 # 7 days TOKEN_TTL = 60 * 60 * 24 * 7 # 7 days
@@ -65,7 +66,7 @@ TOKEN_TTL = 60 * 60 * 24 * 7 # 7 days
# of those names would be denied an assistant and inconsistently owner-scoped. # of those names would be denied an assistant and inconsistently owner-scoped.
# Refuse to create or rename into any of them so the sentinels can't be # Refuse to create or rename into any of them so the sentinels can't be
# impersonated. (Keep this in sync with that synthetic-owner set.) # impersonated. (Keep this in sync with that synthetic-owner set.)
RESERVED_USERNAMES = frozenset({"internal-tool", "api", "demo", "system"}) RESERVED_USERNAMES = frozenset({INTERNAL_TOOL_USER, "api", "demo", "system"})
def normalize_known_username(users: Dict[str, Any], username: str | None) -> Optional[str]: def normalize_known_username(users: Dict[str, Any], username: str | None) -> Optional[str]:
@@ -243,6 +244,15 @@ class AuthManager:
def is_configured(self) -> bool: def is_configured(self) -> bool:
return len(self.users) > 0 return len(self.users) > 0
def policy(self) -> dict:
"""Return public auth policy constants for the frontend."""
return {
"password_min_length": PASSWORD_MIN_LENGTH,
"reserved_usernames": sorted(RESERVED_USERNAMES),
"signup_enabled": self.signup_enabled,
"session_days": TOKEN_TTL // 86400,
}
# ------------------------------------------------------------------ # ------------------------------------------------------------------
# Account management # Account management
# ------------------------------------------------------------------ # ------------------------------------------------------------------
@@ -573,16 +583,20 @@ class AuthManager:
return None return None
return self.create_session_trusted(username) return self.create_session_trusted(username)
def create_session_trusted(self, username: str) -> str: def create_session_trusted(self, username: str) -> Optional[str]:
"""Issue a session token for an already-verified user. """Issue a session token for an already-verified user.
Call only after verify_password (and TOTP if enabled) have passed.""" Call only after verify_password (and TOTP if enabled) have passed."""
username = username.strip().lower() username = username.strip().lower()
token = secrets.token_hex(32) token = secrets.token_hex(32)
with self._sessions_lock: with self._config_lock:
self._sessions[token] = { if username not in self.users:
"username": username, logger.warning("Refused to issue session for missing user '%s'", username)
"expiry": time.time() + TOKEN_TTL, return None
} with self._sessions_lock:
self._sessions[token] = {
"username": username,
"expiry": time.time() + TOKEN_TTL,
}
self._save_sessions() self._save_sessions()
return token return token
+49 -2
View File
@@ -2,12 +2,15 @@ import os
import logging import logging
import sqlite3 import sqlite3
from datetime import datetime, timezone from datetime import datetime, timezone
from pathlib import Path
from sqlalchemy import event, create_engine, Column, String, Text, Boolean, DateTime, Integer, ForeignKey, JSON, Index, func, text from sqlalchemy import event, create_engine, Column, String, Text, Boolean, DateTime, Integer, ForeignKey, JSON, Index, func, text
from sqlalchemy.engine import Engine from sqlalchemy.engine import Engine
from sqlalchemy.types import TypeDecorator from sqlalchemy.types import TypeDecorator
from sqlalchemy.ext.declarative import declarative_base, declared_attr from sqlalchemy.ext.declarative import declarative_base, declared_attr
from sqlalchemy.orm import relationship, sessionmaker, backref from sqlalchemy.orm import relationship, sessionmaker, backref
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
# Create base class for declarative models # Create base class for declarative models
@@ -29,9 +32,26 @@ class TimestampMixin:
def updated_at(cls): def updated_at(cls):
return Column(DateTime, default=utcnow_naive, onupdate=utcnow_naive, nullable=False) return Column(DateTime, default=utcnow_naive, onupdate=utcnow_naive, nullable=False)
# Get database URL from environment, default to SQLite in DATA_DIR # Ensure the writable data directory exists before SQLite connects.
from src.constants import DATA_DIR, AUTH_FILE, MEMORY_FILE, USER_PREFS_FILE, SETTINGS_FILE from src.constants import DATA_DIR, AUTH_FILE, MEMORY_FILE, USER_PREFS_FILE, SETTINGS_FILE
DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite:///{DATA_DIR}/app.db") Path(DATA_DIR).mkdir(parents=True, exist_ok=True)
def _default_database_url() -> str:
return f"sqlite:///{Path(DATA_DIR) / 'app.db'}"
def _normalize_sqlite_url(url: str) -> str:
if not url.startswith("sqlite:///"):
return url
db_path = url.replace("sqlite:///", "", 1)
if db_path == ":memory:" or os.path.isabs(db_path):
return url
return f"sqlite:///{(Path(get_app_root()) / db_path).resolve().as_posix()}"
# Get database URL from environment, default to SQLite in DATA_DIR
DATABASE_URL = _normalize_sqlite_url(os.getenv("DATABASE_URL", _default_database_url()))
# Create engine # Create engine
engine = create_engine( engine = create_engine(
@@ -324,6 +344,13 @@ class EmailAccount(TimestampMixin, Base):
smtp_password = Column(String, default="") smtp_password = Column(String, default="")
from_address = Column(String, default="") from_address = Column(String, default="")
display_name = Column(String, nullable=True) # "Hriday Ranka" — used in From: header
# OAuth2 (Google / Google Workspace). Tokens stored encrypted via secret_storage.
oauth_provider = Column(String, nullable=True) # "google" or None
oauth_access_token = Column(String, nullable=True) # encrypted
oauth_refresh_token = Column(String, nullable=True) # encrypted
oauth_token_expiry = Column(String, nullable=True) # unix timestamp string
__table_args__ = ( __table_args__ = (
Index('ix_email_accounts_owner_default', 'owner', 'is_default'), Index('ix_email_accounts_owner_default', 'owner', 'is_default'),
@@ -1427,6 +1454,25 @@ def _migrate_add_task_automation_columns():
except Exception as e: except Exception as e:
logging.getLogger(__name__).warning(f"task automation migration: {e}") logging.getLogger(__name__).warning(f"task automation migration: {e}")
def _migrate_add_email_oauth_columns():
"""Add Google OAuth and display_name columns to email_accounts if missing."""
try:
with engine.connect() as conn:
cols = [r[1] for r in conn.execute(text("PRAGMA table_info(email_accounts)"))]
for col, typedef in [
("oauth_provider", "TEXT"),
("oauth_access_token", "TEXT"),
("oauth_refresh_token", "TEXT"),
("oauth_token_expiry", "TEXT"),
("display_name", "TEXT"),
]:
if col not in cols:
conn.execute(text(f"ALTER TABLE email_accounts ADD COLUMN {col} {typedef}"))
conn.commit()
except Exception as e:
logging.getLogger(__name__).warning(f"email oauth columns migration: {e}")
def _migrate_add_oauth_config(): def _migrate_add_oauth_config():
"""Add oauth_config column to mcp_servers table if missing.""" """Add oauth_config column to mcp_servers table if missing."""
try: try:
@@ -1771,6 +1817,7 @@ def init_db():
_migrate_add_tidy_verdict() _migrate_add_tidy_verdict()
_migrate_add_doc_source_email_cols() _migrate_add_doc_source_email_cols()
_migrate_add_oauth_config() _migrate_add_oauth_config()
_migrate_add_email_oauth_columns()
_migrate_add_task_automation_columns() _migrate_add_task_automation_columns()
_migrate_add_disabled_tools() _migrate_add_disabled_tools()
_migrate_add_mcp_oauth_tokens_column() _migrate_add_mcp_oauth_tokens_column()
+27
View File
@@ -0,0 +1,27 @@
"""Helpers for keeping sensitive data out of logs.
Endpoint URLs configured by admins can embed credentials in the userinfo
(``https://user:pass@host``) or query string (``?api_key=...``). Logging them
raw leaks those secrets, so route/diagnostic logs run URLs through
``redact_url`` first. Reconstructing the URL without userinfo/query/fragment
also doubles as a sanitizer barrier for CodeQL's clear-text-logging query.
"""
from urllib.parse import urlparse, urlunparse
def redact_url(url: str) -> str:
"""Return a URL safe for logs by removing userinfo and query/fragment.
Keeps scheme, host, port and path so logs stay useful for debugging.
"""
try:
parsed = urlparse(url or "")
host = parsed.hostname or ""
if ":" in host: # IPv6 literal — re-bracket so host:port stays unambiguous
host = f"[{host}]"
if parsed.port:
host = f"{host}:{parsed.port}"
return urlunparse((parsed.scheme, host, parsed.path, "", "", ""))
except Exception:
return "<endpoint>"
+6 -7
View File
@@ -15,6 +15,8 @@ from starlette.responses import Response
# same value from this module. Never persisted or exposed externally. # same value from this module. Never persisted or exposed externally.
INTERNAL_TOOL_TOKEN = os.environ.get("ODYSSEUS_INTERNAL_TOKEN") or secrets.token_hex(32) INTERNAL_TOOL_TOKEN = os.environ.get("ODYSSEUS_INTERNAL_TOKEN") or secrets.token_hex(32)
INTERNAL_TOOL_HEADER = "X-Odysseus-Internal-Token" INTERNAL_TOOL_HEADER = "X-Odysseus-Internal-Token"
# Pseudo-username on in-process tool-loopback requests; require_admin trusts it and it is reserved.
INTERNAL_TOOL_USER = "internal-tool"
def is_cors_preflight(method: str, headers) -> bool: def is_cors_preflight(method: str, headers) -> bool:
@@ -39,7 +41,7 @@ def require_admin(request: Request):
hdr = request.headers.get(INTERNAL_TOOL_HEADER) hdr = request.headers.get(INTERNAL_TOOL_HEADER)
if hdr and secrets.compare_digest(hdr, INTERNAL_TOOL_TOKEN): if hdr and secrets.compare_digest(hdr, INTERNAL_TOOL_TOKEN):
return return
if getattr(request.state, "current_user", None) == "internal-tool": if getattr(request.state, "current_user", None) == INTERNAL_TOOL_USER:
return return
except Exception: except Exception:
pass pass
@@ -65,10 +67,9 @@ class SecurityHeadersMiddleware(BaseHTTPMiddleware):
response = await call_next(request) response = await call_next(request)
path = request.url.path path = request.url.path
# Tool render endpoints are served inside iframes — allow framing by self # Tool render endpoints
is_tool_render = path.startswith("/api/tools/") and path.endswith("/render") is_tool_render = path.startswith("/api/tools/") and path.endswith("/render")
# PDF previews are embedded by the in-app document library. Keep the # Document library PDF preview endpoint
# exception route-scoped so normal app pages remain unframeable.
is_document_pdf_preview = path.startswith("/api/document/") and path.endswith("/render-pdf") is_document_pdf_preview = path.startswith("/api/document/") and path.endswith("/render-pdf")
# Visual report pages are self-contained HTML — need inline scripts + external images # Visual report pages are self-contained HTML — need inline scripts + external images
is_report = path.startswith("/api/research/report/") is_report = path.startswith("/api/research/report/")
@@ -95,9 +96,7 @@ class SecurityHeadersMiddleware(BaseHTTPMiddleware):
"frame-ancestors 'none'" "frame-ancestors 'none'"
) )
elif is_tool_render: elif is_tool_render:
# Tool iframe content: skip all framing headers — the iframe's # Skip framing headers for tools.
# sandbox="allow-scripts" attribute provides isolation.
# Don't overwrite the route's own restrictive CSP either.
pass pass
elif is_document_pdf_preview: elif is_document_pdf_preview:
response.headers["X-Frame-Options"] = "SAMEORIGIN" response.headers["X-Frame-Options"] = "SAMEORIGIN"
+16
View File
@@ -28,6 +28,14 @@ services:
# land under /app/.local for the odysseus user. Persist them so a # land under /app/.local for the odysseus user. Persist them so a
# container recreate does not silently remove installed serve engines. # container recreate does not silently remove installed serve engines.
- ${APP_DATA_DIR:-./data}/local:/app/.local:z - ${APP_DATA_DIR:-./data}/local:/app/.local:z
# Docker socket — lets Cookbook launch commands like
# `docker exec ollama-rocm ollama show <tag>` reach the host's
# Docker daemon (and sibling containers like ollama-rocm /
# ollama-test). The in-container user needs to be in the
# socket's owning group — see `group_add` below; the GID
# there must match the host's `docker` group (defaults to 963
# on Debian, 999 on Ubuntu — override via env if yours differs).
- /var/run/docker.sock:/var/run/docker.sock
extra_hosts: extra_hosts:
# Lets the container reach local services on the Docker host, including # Lets the container reach local services on the Docker host, including
# Ollama at http://host.docker.internal:11434. # Ollama at http://host.docker.internal:11434.
@@ -60,6 +68,13 @@ services:
- ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1} - ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1}
- ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost} - ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost}
- ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760} - ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760}
- ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES:-104857600}
- ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_MEMORY_IMPORT_MAX_BYTES=${ODYSSEUS_MEMORY_IMPORT_MAX_BYTES:-10485760}
- ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES=${ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES=${ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_STT_MAX_AUDIO_BYTES=${ODYSSEUS_STT_MAX_AUDIO_BYTES:-26214400}
- ODYSSEUS_ICS_MAX_BYTES=${ODYSSEUS_ICS_MAX_BYTES:-10485760}
- DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-} - DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-}
- GOOGLE_API_KEY=${GOOGLE_API_KEY:-} - GOOGLE_API_KEY=${GOOGLE_API_KEY:-}
- GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-} - GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-}
@@ -86,6 +101,7 @@ services:
- /dev/kfd - /dev/kfd
- /dev/dri - /dev/dri
group_add: group_add:
- "${DOCKER_GID:-963}"
- video - video
- ${RENDER_GID:-render} - ${RENDER_GID:-render}
+17
View File
@@ -27,6 +27,16 @@ services:
# land under /app/.local for the odysseus user. Persist them so a # land under /app/.local for the odysseus user. Persist them so a
# container recreate does not silently remove installed serve engines. # container recreate does not silently remove installed serve engines.
- ${APP_DATA_DIR:-./data}/local:/app/.local:z - ${APP_DATA_DIR:-./data}/local:/app/.local:z
# Docker socket — lets Cookbook launch commands like
# `docker exec ollama-rocm ollama show <tag>` reach the host's
# Docker daemon (and sibling containers like ollama-rocm /
# ollama-test). The in-container user needs to be in the
# socket's owning group — see `group_add` below; the GID
# there must match the host's `docker` group (defaults to 963
# on Debian, 999 on Ubuntu — override via env if yours differs).
- /var/run/docker.sock:/var/run/docker.sock
group_add:
- "${DOCKER_GID:-963}"
extra_hosts: extra_hosts:
# Lets the container reach local services on the Docker host, including # Lets the container reach local services on the Docker host, including
# Ollama at http://host.docker.internal:11434. # Ollama at http://host.docker.internal:11434.
@@ -59,6 +69,13 @@ services:
- ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1} - ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1}
- ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost} - ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost}
- ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760} - ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760}
- ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES:-104857600}
- ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_MEMORY_IMPORT_MAX_BYTES=${ODYSSEUS_MEMORY_IMPORT_MAX_BYTES:-10485760}
- ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES=${ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES=${ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_STT_MAX_AUDIO_BYTES=${ODYSSEUS_STT_MAX_AUDIO_BYTES:-26214400}
- ODYSSEUS_ICS_MAX_BYTES=${ODYSSEUS_ICS_MAX_BYTES:-10485760}
- DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-} - DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-}
- GOOGLE_API_KEY=${GOOGLE_API_KEY:-} - GOOGLE_API_KEY=${GOOGLE_API_KEY:-}
- GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-} - GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-}
+17
View File
@@ -16,6 +16,16 @@ services:
# land under /app/.local for the odysseus user. Persist them so a # land under /app/.local for the odysseus user. Persist them so a
# container recreate does not silently remove installed serve engines. # container recreate does not silently remove installed serve engines.
- ${APP_DATA_DIR:-./data}/local:/app/.local:z - ${APP_DATA_DIR:-./data}/local:/app/.local:z
# Docker socket — lets Cookbook launch commands like
# `docker exec ollama-rocm ollama show <tag>` reach the host's
# Docker daemon (and sibling containers like ollama-rocm /
# ollama-test). The in-container user needs to be in the
# socket's owning group — see `group_add` below; the GID
# there must match the host's `docker` group (defaults to 963
# on Debian, 999 on Ubuntu — override via env if yours differs).
- /var/run/docker.sock:/var/run/docker.sock
group_add:
- "${DOCKER_GID:-963}"
extra_hosts: extra_hosts:
# Lets the container reach local services on the Docker host, including # Lets the container reach local services on the Docker host, including
# Ollama at http://host.docker.internal:11434. # Ollama at http://host.docker.internal:11434.
@@ -48,6 +58,13 @@ services:
- ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1} - ODYSSEUS_INPROCESS_TASKS=${ODYSSEUS_INPROCESS_TASKS:-1}
- ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost} - ODYSSEUS_SCRIPT_HOST=${ODYSSEUS_SCRIPT_HOST:-localhost}
- ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760} - ODYSSEUS_CHAT_UPLOAD_MAX_BYTES=${ODYSSEUS_CHAT_UPLOAD_MAX_BYTES:-10485760}
- ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES:-104857600}
- ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES=${ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_MEMORY_IMPORT_MAX_BYTES=${ODYSSEUS_MEMORY_IMPORT_MAX_BYTES:-10485760}
- ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES=${ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES=${ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES:-26214400}
- ODYSSEUS_STT_MAX_AUDIO_BYTES=${ODYSSEUS_STT_MAX_AUDIO_BYTES:-26214400}
- ODYSSEUS_ICS_MAX_BYTES=${ODYSSEUS_ICS_MAX_BYTES:-10485760}
- DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-} - DATA_BRAVE_API_KEY=${DATA_BRAVE_API_KEY:-}
- GOOGLE_API_KEY=${GOOGLE_API_KEY:-} - GOOGLE_API_KEY=${GOOGLE_API_KEY:-}
- GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-} - GOOGLE_PSE_CX=${GOOGLE_PSE_CX:-}
+74 -19
View File
@@ -13,6 +13,8 @@ set -e
PUID="${PUID:-1000}" PUID="${PUID:-1000}"
PGID="${PGID:-1000}" PGID="${PGID:-1000}"
GOSU_BIN="$(command -v gosu)"
PYTHON_BIN="$(command -v python)"
# Reuse an existing matching group/user if the host's UID/GID already # Reuse an existing matching group/user if the host's UID/GID already
# corresponds to one in /etc/passwd (e.g. when the image is rebuilt # corresponds to one in /etc/passwd (e.g. when the image is rebuilt
@@ -24,26 +26,78 @@ if ! getent passwd "$PUID" >/dev/null 2>&1; then
useradd -u "$PUID" -g "$PGID" -M -s /bin/sh -d /app odysseus useradd -u "$PUID" -g "$PGID" -M -s /bin/sh -d /app odysseus
fi fi
# Repair ownership on every writable path the app touches at runtime. ODY_USER="$(getent passwd "$PUID" | cut -d: -f1)"
# [ -z "$ODY_USER" ] && ODY_USER=odysseus
# Bind-mounted dirs (/app/data, /app/logs) are the obvious ones, but
# the app ALSO writes inside the image's own source tree at runtime: # Docker-socket group plumbing. When /var/run/docker.sock is bind-mounted
# - services/cache/{search,content}/* (search cache LRU) # (Cookbook uses docker exec to reach sibling containers), the socket is
# - services/search_analytics.json # owned by root:<host docker gid>. Add the app user to that group and later
# - services/search_engine_error.log # call gosu by username so supplementary groups are retained.
# - services/tts cache, etc. DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"
# These dirs were created as root during `docker build`, so dropping if [ -S "$DOCKER_SOCK" ]; then
# to PUID:PGID would otherwise crash on the first import that tries SOCK_GID="$(stat -c '%g' "$DOCKER_SOCK" 2>/dev/null || echo '')"
# to mkdir them. Chown the whole /app tree — fast (<1s on this size) if [ -n "$SOCK_GID" ] && [ "$SOCK_GID" != "0" ]; then
# and idempotent via the `-not -uid` filter so we only touch files if ! getent group "$SOCK_GID" >/dev/null 2>&1; then
# that need fixing. groupadd -g "$SOCK_GID" docker_host || true
for dir in /app /app/data /app/logs; do fi
SOCK_GROUP="$(getent group "$SOCK_GID" | cut -d: -f1)"
if [ -n "$SOCK_GROUP" ]; then
usermod -aG "$SOCK_GROUP" "$ODY_USER" 2>/dev/null || true
fi
fi
fi
mount_root_for() {
awk -v target="$1" '$5 == target { print $4; exit }' /proc/self/mountinfo 2>/dev/null || true
}
is_broad_mount_root() {
case "$1" in
/|/home|/srv|/var|/usr|/opt|/tmp|/mnt|/media)
return 0
;;
esac
return 1
}
repair_tree_ownership() {
dir="$1"
if [ -d "$dir" ]; then if [ -d "$dir" ]; then
# `find ... -not -uid` keeps this O(touched-files), not find "$dir" -xdev -not -uid "$PUID" -print0 2>/dev/null \
# O(everything), so terabyte-sized maildirs don't slow startup.
find "$dir" -not -uid "$PUID" -print0 2>/dev/null \
| xargs -0 -r chown "$PUID:$PGID" 2>/dev/null || true | xargs -0 -r chown "$PUID:$PGID" 2>/dev/null || true
fi fi
}
repair_app_tree_ownership() {
if [ -d /app ]; then
find /app -xdev \
\( -path /app/data -o -path /app/logs -o -path /app/.ssh -o -path /app/.cache -o -path /app/.local \) -prune \
-o -not -uid "$PUID" -print0 2>/dev/null \
| xargs -0 -r chown "$PUID:$PGID" 2>/dev/null || true
fi
}
repair_bind_mount_ownership() {
dir="$1"
if [ ! -d "$dir" ]; then
return
fi
mount_root="$(mount_root_for "$dir")"
if is_broad_mount_root "$mount_root"; then
echo "Skipping recursive ownership repair for $dir because it maps to broad host path $mount_root" >&2
chown "$PUID:$PGID" "$dir" 2>/dev/null || true
return
fi
repair_tree_ownership "$dir"
}
# Repair image-owned writable paths without walking into bind-mounted host
# trees, then repair the app-owned mount roots separately.
repair_app_tree_ownership
for dir in /app/data /app/logs /app/.ssh /app/.cache/huggingface /app/.local; do
repair_bind_mount_ownership "$dir"
done done
# Cookbook installs vllm/etc. via `pip install --user`, which pulls # Cookbook installs vllm/etc. via `pip install --user`, which pulls
@@ -70,6 +124,7 @@ for cu in \
break break
fi fi
done done
# Disable the FlashInfer JIT sampler unconditionally — it is sampler-only # Disable the FlashInfer JIT sampler unconditionally — it is sampler-only
# and has no impact on the attention path, but requires nvcc + matching # and has no impact on the attention path, but requires nvcc + matching
# CUDA headers at startup. Without this, vLLM crashes with "Could not find # CUDA headers at startup. Without this, vLLM crashes with "Could not find
@@ -83,9 +138,9 @@ export PATH="/app/.local/bin:$PATH"
# Run first-time setup as the app user so data/ files get the right ownership. # Run first-time setup as the app user so data/ files get the right ownership.
# setup.py is idempotent — skips auth.json / .env if they already exist. # setup.py is idempotent — skips auth.json / .env if they already exist.
# || true so a setup failure never prevents the container from starting. # || true so a setup failure never prevents the container from starting.
gosu "$PUID:$PGID" python /app/setup.py || true "$GOSU_BIN" "$ODY_USER" "$PYTHON_BIN" /app/setup.py || true
# Drop root and run the actual app. `gosu` is preferred over `su` / # Drop root and run the actual app. `gosu` is preferred over `su` /
# `sudo` because it cleans up the process tree (no extra shell layer) # `sudo` because it cleans up the process tree (no extra shell layer)
# so signals (SIGTERM from `docker stop`) reach uvicorn directly. # so signals (SIGTERM from `docker stop`) reach uvicorn directly.
exec gosu "$PUID:$PGID" "$@" exec "$GOSU_BIN" "$ODY_USER" "$@"
BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 3.0 MiB

BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 3.4 MiB

BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.1 MiB

BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 1003 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 185 KiB

BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 52 KiB

After

Width:  |  Height:  |  Size: 79 KiB

BIN
View File
Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.5 MiB

+14 -9
View File
@@ -1,14 +1,16 @@
# Security CI guide # Security CI guide
This project runs a set of automated security checks on every pull request and This project runs a set of automated security checks on pull requests and
on every push to `main`. This page explains what each one does, whether it can selected branch pushes. This page explains what each one does, whether it can
block a merge, and the few one-time settings you should turn on to get the full block a merge, and the few one-time settings you should turn on to get the full
benefit. benefit.
## What runs, and why ## What runs, and why
Each check lives in its own file under `.github/workflows/`. They run Most checks live in files under `.github/workflows/`. CodeQL is configured
automatically; you do not start them. through GitHub's code scanning default setup, so it appears as a dynamic GitHub
workflow instead of a checked-in workflow file. They run automatically; you do
not start them.
| Check | What it protects against | Blocks a merge? | | Check | What it protects against | Blocks a merge? |
|---|---|---| |---|---|---|
@@ -88,11 +90,14 @@ let the workflows run on one pull request first, then add them here.
2. Turn on **Dependency graph** (usually on by default for public repos) -- this 2. Turn on **Dependency graph** (usually on by default for public repos) -- this
powers Dependency review and Dependabot. powers Dependency review and Dependabot.
3. Turn on **Dependabot alerts** and **Dependabot security updates**. 3. Turn on **Dependabot alerts** and **Dependabot security updates**.
4. Under **Code scanning**, you have two ways to scan the app code with CodeQL: 4. Under **Code scanning**, use **Set up -> Default** for CodeQL. GitHub then
- The included `codeql.yml` workflow already scans `main` and runs weekly. runs CodeQL as a dynamic workflow without the fork-token limitations that
- To also scan **pull requests** (recommended, since most contributions come affect checked-in advanced workflows.
from forks), click **Set up -> Default** under Code scanning. GitHub then
runs CodeQL on pull requests for you, with no token limitations. Do not also add a checked-in CodeQL workflow while default setup is enabled:
GitHub rejects advanced CodeQL uploads when default setup is active. If the
project later needs an advanced CodeQL workflow, disable default setup first
and keep only one CodeQL publishing path active.
## Keeping it current ## Keeping it current
+14 -1
View File
@@ -15,7 +15,7 @@ On first setup, Odysseus creates an admin account (`admin` unless
For Docker installs, the same line is in `docker compose logs odysseus`. For Docker installs, the same line is in `docker compose logs odysseus`.
Use that for the first login, then change it in **Settings**. Use that for the first login, then change it in **Settings**.
Contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, testing, and Contributing? See [CONTRIBUTING.md](../CONTRIBUTING.md) for setup, testing, and
pull request guidelines. pull request guidelines.
### Docker (recommended) ### Docker (recommended)
@@ -250,6 +250,19 @@ python -m uvicorn app:app --host 127.0.0.1 --port 7000
If `python` points at an older interpreter, use `py -3.12` (or another installed If `python` points at an older interpreter, use `py -3.12` (or another installed
3.11+ version) for the venv step. 3.11+ version) for the venv step.
**Exposing on a LAN/Tailscale (Windows):** the launcher binds to `127.0.0.1` and
does **not** read `APP_BIND` / `ODYSSEUS_HOST` from `.env`, so editing `.env`
alone leaves the native Windows server on loopback. Pass the launcher's
`-BindHost` flag instead:
```powershell
powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1 -BindHost 0.0.0.0
```
The manual `uvicorn` command takes the same address as `--host 0.0.0.0`. Bind
outside loopback only for a trusted LAN/VPN such as Tailscale: keep
`AUTH_ENABLED=true` and do not expose the port directly to the public internet.
**Requirements:** Python 3.11+. The core app (chat, agent, memory, documents, **Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
email, calendar, deep research) runs fully native. For full **Cookbook** background email, calendar, deep research) runs fully native. For full **Cookbook** background
model downloads and the agent shell tool, also install model downloads and the agent shell tool, also install
+8
View File
@@ -105,6 +105,14 @@ if (-not $pyExe) {
} }
} }
if ($pyExe -like "*WindowsApps*python.exe") {
$pyCmd = Get-Command py -ErrorAction SilentlyContinue
if ($pyCmd) {
$pyExe = $pyCmd.Source
$pyArgs = @("-3.11")
}
}
if (-not $pyExe) { if (-not $pyExe) {
Fail "Couldn't find Python 3.11+ for Windows setup. Install Python 3.11+ (or open the Python launcher with 'py -3.11') from https://www.python.org/downloads/, then re-run this script." Fail "Couldn't find Python 3.11+ for Windows setup. Install Python 3.11+ (or open the Python launcher with 'py -3.11') from https://www.python.org/downloads/, then re-run this script."
} }
+142
View File
@@ -0,0 +1,142 @@
# launcher.py
"""Dedicated entrypoint for the standalone Windows portable launcher.
Handles:
- Immediate GUI splash screen creation using tkinter.
- Suppressing console stream crashes in windowed GUI mode via NullWriter.
- Spawning system tray icon via pystray and Pillow (lazy-loaded).
- Auto-opening default browser pointing to the running backend.
- Launching the FastAPI server (importing and running app.py).
"""
import os
import sys
import threading
import time
import webbrowser
# Define a dummy NullWriter to suppress standard stream crashes (isatty etc.) in GUI mode
class NullWriter:
def write(self, text):
pass
def flush(self):
pass
def isatty(self):
return False
if sys.stdout is None:
sys.stdout = NullWriter()
if sys.stderr is None:
sys.stderr = NullWriter()
splash_root = None
# If running from a frozen PyInstaller bundle, launch the splash screen IMMEDIATELY
if getattr(sys, 'frozen', False):
import tkinter as tk
def show_splash_instantly():
global splash_root
try:
splash_root = tk.Tk()
splash_root.title("Odysseus")
splash_root.overrideredirect(True)
splash_root.configure(bg="#1a1c23")
# Accented borders
splash_root.config(highlightbackground="#e06c75", highlightcolor="#e06c75", highlightthickness=1)
w, h = 360, 160
ws = splash_root.winfo_screenwidth()
hs = splash_root.winfo_screenheight()
x = (ws - w) // 2
y = (hs - h) // 2
splash_root.geometry(f"{w}x{h}+{x}+{y}")
tk.Label(splash_root, text="⛵ Odysseus", font=("Segoe UI", 22, "bold"), bg="#1a1c23", fg="#e06c75").pack(pady=(22, 2))
tk.Label(splash_root, text="Launching background services...", font=("Segoe UI", 10), bg="#1a1c23", fg="#d1d4e0").pack(pady=2)
tk.Label(splash_root, text="Please wait, this will take a few seconds.", font=("Segoe UI", 8, "italic"), bg="#1a1c23", fg="#5c6370").pack(pady=(12, 0))
splash_root.attributes("-topmost", True)
splash_root.mainloop()
except Exception:
pass
# Launch the GUI splash screen immediately on a background thread
threading.Thread(target=show_splash_instantly, daemon=True).start()
def create_tray_image():
# Generate a beautiful 64x64 icon matching Odysseus brand red accent (#e06c75)
from PIL import Image, ImageDraw
image = Image.new('RGBA', (64, 64), (0, 0, 0, 0))
dc = ImageDraw.Draw(image)
accent_red = (224, 108, 117, 255)
light_red = (224, 108, 117, 150)
# Draw premium sailing boat
dc.polygon([(32, 10), (32, 45), (12, 45)], fill=accent_red)
dc.polygon([(32, 18), (32, 45), (48, 45)], fill=light_red)
dc.polygon([(8, 48), (56, 48), (44, 56), (20, 56)], fill=accent_red)
return image
def on_open_browser(icon, item, url):
webbrowser.open(url)
def on_exit(icon, item):
icon.stop()
os._exit(0)
def setup_system_tray(url):
try:
import pystray
icon_img = create_tray_image()
menu = (
pystray.MenuItem('Open Odysseus', lambda icon, item: on_open_browser(icon, item, url), default=True),
pystray.MenuItem('Exit', on_exit)
)
tray_icon = pystray.Icon(
"Odysseus",
icon_img,
"Odysseus",
menu
)
tray_icon.run()
except Exception:
pass
def open_browser(url):
# Allow uvicorn and app lifecycles to complete warmups
time.sleep(3.5)
# Safely close the splash screen
try:
global splash_root
if splash_root:
splash_root.after(0, splash_root.destroy)
except Exception:
pass
webbrowser.open(url)
if __name__ == "__main__":
import uvicorn
# Import the FastAPI app from app.py
from app import app
bind_host = os.getenv("APP_BIND", "127.0.0.1")
bind_port = int(os.getenv("APP_PORT", "7000"))
url = f"http://{bind_host}:{bind_port}"
if getattr(sys, 'frozen', False):
# Start browser manager thread
threading.Thread(target=open_browser, args=(url,), daemon=True).start()
# Start system tray manager thread
threading.Thread(target=setup_system_tray, args=(url,), daemon=True).start()
uvicorn.run(app, host=bind_host, port=bind_port, log_level="info")
+94
View File
@@ -0,0 +1,94 @@
Copyright (c) 2019-07-29, Abbie Gonzalez (https://abbiecod.es|support@abbiecod.es),
with Reserved Font Name OpenDyslexic.
Copyright (c) 12/2012 - 2019
This Font Software is licensed under the SIL Open Font License, Version 1.1.
This license is copied below, and is also available with a FAQ at:
http://scripts.sil.org/OFL
-----------------------------------------------------------
SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
-----------------------------------------------------------
PREAMBLE
The goals of the Open Font License (OFL) are to stimulate worldwide
development of collaborative font projects, to support the font creation
efforts of academic and linguistic communities, and to provide a free and
open framework in which fonts may be shared and improved in partnership
with others.
The OFL allows the licensed fonts to be used, studied, modified and
redistributed freely as long as they are not sold by themselves. The
fonts, including any derivative works, can be bundled, embedded,
redistributed and/or sold with any software provided that any reserved
names are not used by derivative works. The fonts and derivatives,
however, cannot be released under any other type of license. The
requirement for fonts to remain under this license does not apply
to any document created using the fonts or their derivatives.
DEFINITIONS
"Font Software" refers to the set of files released by the Copyright
Holder(s) under this license and clearly marked as such. This may
include source files, build scripts and documentation.
"Reserved Font Name" refers to any names specified as such after the
copyright statement(s).
"Original Version" refers to the collection of Font Software components as
distributed by the Copyright Holder(s).
"Modified Version" refers to any derivative made by adding to, deleting,
or substituting -- in part or in whole -- any of the components of the
Original Version, by changing formats or by porting the Font Software to a
new environment.
"Author" refers to any designer, engineer, programmer, technical
writer or other person who contributed to the Font Software.
PERMISSION & CONDITIONS
Permission is hereby granted, free of charge, to any person obtaining
a copy of the Font Software, to use, study, copy, merge, embed, modify,
redistribute, and sell modified and unmodified copies of the Font
Software, subject to the following conditions:
1) Neither the Font Software nor any of its individual components,
in Original or Modified Versions, may be sold by itself.
2) Original or Modified Versions of the Font Software may be bundled,
redistributed and/or sold with any software, provided that each copy
contains the above copyright notice and this license. These can be
included either as stand-alone text files, human-readable headers or
in the appropriate machine-readable metadata fields within text or
binary files as long as those fields can be easily viewed by the user.
3) No Modified Version of the Font Software may use the Reserved Font
Name(s) unless explicit written permission is granted by the corresponding
Copyright Holder. This restriction only applies to the primary font name as
presented to the users.
4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
Software shall not be used to promote, endorse or advertise any
Modified Version, except to acknowledge the contribution(s) of the
Copyright Holder(s) and the Author(s) or with their explicit written
permission.
5) The Font Software, modified or unmodified, in part or in whole,
must be distributed entirely under this license, and must not be
distributed under any other license. The requirement for fonts to
remain under this license does not apply to any document created
using the Font Software.
TERMINATION
This license becomes null and void if any of the above conditions are
not met.
DISCLAIMER
THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE
COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
OTHER DEALINGS IN THE FONT SOFTWARE.
+89 -12
View File
@@ -23,6 +23,7 @@ import os.path
from pathlib import Path from pathlib import Path
from datetime import datetime, timedelta from datetime import datetime, timedelta
import uuid import uuid
from contextvars import ContextVar
from mcp.server import Server from mcp.server import Server
from mcp.server.stdio import stdio_server from mcp.server.stdio import stdio_server
@@ -55,6 +56,8 @@ def _uid_fetch_rows(data) -> list:
# flat keys when no DB row matches (legacy single-account behaviour). # flat keys when no DB row matches (legacy single-account behaviour).
_ACCOUNT_CACHE: dict = {} # key = normalized account selector -> config dict _ACCOUNT_CACHE: dict = {} # key = normalized account selector -> config dict
_MCP_OWNER_ARG = "_odysseus_owner"
_CURRENT_OWNER: ContextVar[str | None] = ContextVar("email_mcp_owner", default=None)
def _clean_header_value(value) -> str: def _clean_header_value(value) -> str:
@@ -68,6 +71,45 @@ def _db_path() -> Path:
return Path(APP_DB) return Path(APP_DB)
def _current_owner() -> str:
owner = _CURRENT_OWNER.get()
return str(owner or "").strip()
def _account_visible_to_owner(row: dict, owner: str) -> bool:
row_owner = str(row.get("owner") or "").strip()
if row_owner == owner:
return True
if row_owner:
return False
# Legacy ownerless accounts are only visible to a scoped caller when the
# mailbox itself matches the owner, mirroring the HTTP email route fallback.
owner_l = owner.lower()
return owner_l in {
str(row.get("imap_user") or "").strip().lower(),
str(row.get("from_address") or "").strip().lower(),
}
def _filter_accounts_for_owner(rows: list[dict]) -> list[dict]:
owner = _current_owner()
if owner:
return [r for r in rows if _account_visible_to_owner(r, owner)]
owners = {str(r.get("owner") or "").strip() for r in rows if str(r.get("owner") or "").strip()}
if len(owners) > 1:
return []
return rows
def _mcp_owner_required(rows: list[dict] | None = None) -> bool:
if _current_owner():
return False
rows = rows if rows is not None else _read_accounts_from_db()
owners = {str(r.get("owner") or "").strip() for r in rows if str(r.get("owner") or "").strip()}
return len(owners) > 1
def _load_email_writing_style() -> str: def _load_email_writing_style() -> str:
"""Return the existing Settings > Email > Writing Style value.""" """Return the existing Settings > Email > Writing Style value."""
try: try:
@@ -121,9 +163,8 @@ def _default_document_owner() -> str | None:
return None return None
def _list_accounts_raw() -> list: def _read_accounts_from_db() -> list:
"""Return list of dicts from the email_accounts table. Empty list if table """Return all enabled email account rows. Empty list if missing. Never raises."""
missing or empty. Never raises."""
path = _db_path() path = _db_path()
if not path.exists(): if not path.exists():
return [] return []
@@ -131,9 +172,10 @@ def _list_accounts_raw() -> list:
conn = sqlite3.connect(str(path)) conn = sqlite3.connect(str(path))
conn.row_factory = sqlite3.Row conn.row_factory = sqlite3.Row
columns = {r[1] for r in conn.execute("PRAGMA table_info(email_accounts)").fetchall()} columns = {r[1] for r in conn.execute("PRAGMA table_info(email_accounts)").fetchall()}
owner_select = "owner" if "owner" in columns else "NULL AS owner"
smtp_security_select = "smtp_security" if "smtp_security" in columns else "'' AS smtp_security" smtp_security_select = "smtp_security" if "smtp_security" in columns else "'' AS smtp_security"
rows = conn.execute(f""" rows = conn.execute(f"""
SELECT id, name, is_default, enabled, SELECT id, {owner_select}, name, is_default, enabled,
imap_host, imap_port, imap_user, imap_password, imap_starttls, imap_host, imap_port, imap_user, imap_password, imap_starttls,
smtp_host, smtp_port, {smtp_security_select}, smtp_user, smtp_password, from_address smtp_host, smtp_port, {smtp_security_select}, smtp_user, smtp_password, from_address
FROM email_accounts WHERE enabled = 1 FROM email_accounts WHERE enabled = 1
@@ -147,11 +189,15 @@ def _list_accounts_raw() -> list:
return [] return []
def _resolve_account(selector: str | None) -> dict | None: def _list_accounts_raw() -> list:
"""Return owner-visible email account rows for the active MCP call."""
return _filter_accounts_for_owner(_read_accounts_from_db())
def _resolve_account_from_rows(rows: list[dict], selector: str | None) -> dict | None:
"""Given a selector (None = default, or a name/user/id string), return the """Given a selector (None = default, or a name/user/id string), return the
matching row or None. Matching is case-insensitive substring on name + matching row or None. Matching is case-insensitive substring on name +
imap_user + from_address, plus exact id match.""" imap_user + from_address, plus exact id match."""
rows = _list_accounts_raw()
if not rows: if not rows:
return None return None
if not selector: if not selector:
@@ -186,6 +232,10 @@ def _resolve_account(selector: str | None) -> dict | None:
return None return None
def _resolve_account(selector: str | None) -> dict | None:
return _resolve_account_from_rows(_list_accounts_raw(), selector)
def _load_config(account: str | None = None) -> dict: def _load_config(account: str | None = None) -> dict:
"""Return the full config dict for the requested account (or default). """Return the full config dict for the requested account (or default).
@@ -194,7 +244,7 @@ def _load_config(account: str | None = None) -> dict:
2. env vars + settings.json flat keys (legacy) 2. env vars + settings.json flat keys (legacy)
3. hardcoded fallbacks (localhost:31143 etc.) 3. hardcoded fallbacks (localhost:31143 etc.)
""" """
cache_key = (account or "").strip().lower() or "__default__" cache_key = (_current_owner(), (account or "").strip().lower() or "__default__")
if cache_key in _ACCOUNT_CACHE: if cache_key in _ACCOUNT_CACHE:
return _ACCOUNT_CACHE[cache_key] return _ACCOUNT_CACHE[cache_key]
@@ -223,8 +273,11 @@ def _load_config(account: str | None = None) -> dict:
"account_name": None, "account_name": None,
} }
rows = _list_accounts_raw() raw_rows = _read_accounts_from_db()
row = _resolve_account(account) rows = _filter_accounts_for_owner(raw_rows)
row = _resolve_account_from_rows(rows, account)
if _current_owner() and raw_rows and not rows:
raise ValueError("No email account is configured for the authenticated owner")
if account and rows and not row: if account and rows and not row:
available = ", ".join( available = ", ".join(
f"{r.get('name') or r.get('imap_user')} <{r.get('imap_user') or r.get('from_address') or '?'}>" f"{r.get('name') or r.get('imap_user')} <{r.get('imap_user') or r.get('from_address') or '?'}>"
@@ -953,7 +1006,7 @@ def _stash_agent_draft(*, to, subject, body, in_reply_to=None, references=None,
now, now,
account or None, account or None,
"agent_draft", "agent_draft",
"", _current_owner(),
)) ))
conn.commit() conn.commit()
conn.close() conn.close()
@@ -1139,7 +1192,7 @@ def _create_email_draft_document(
doc_id = str(uuid.uuid4()) doc_id = str(uuid.uuid4())
ver_id = str(uuid.uuid4()) ver_id = str(uuid.uuid4())
doc_title = (title or subject or "Email draft").strip() or "Email draft" doc_title = (title or subject or "Email draft").strip() or "Email draft"
doc_owner = _default_document_owner() doc_owner = _current_owner() or _default_document_owner()
db = SessionLocal() db = SessionLocal()
try: try:
@@ -1925,10 +1978,22 @@ async def list_tools() -> list[Tool]:
@server.call_tool() @server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]: async def call_tool(name: str, arguments: dict) -> list[TextContent]:
arguments = dict(arguments) if isinstance(arguments, dict) else {}
owner = str(arguments.pop(_MCP_OWNER_ARG, "") or "").strip()
owner_token = _CURRENT_OWNER.set(owner or None)
try: try:
all_db_accounts = _read_accounts_from_db()
if _mcp_owner_required(all_db_accounts):
return [TextContent(
type="text",
text="Error: email MCP requires an authenticated owner when multiple email account owners are configured.",
)]
if name == "list_email_accounts": if name == "list_email_accounts":
rows = _list_accounts_raw() rows = _filter_accounts_for_owner(all_db_accounts)
if not rows: if not rows:
if all_db_accounts and owner:
return [TextContent(type="text", text="No email accounts configured for this owner.")]
return [TextContent(type="text", text="No email accounts configured. Legacy single-account mode active.")] return [TextContent(type="text", text="No email accounts configured. Legacy single-account mode active.")]
lines = [f"Found {len(rows)} email account(s):\n"] lines = [f"Found {len(rows)} email account(s):\n"]
for r in rows: for r in rows:
@@ -2108,6 +2173,16 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
bcc=arguments.get("bcc"), bcc=arguments.get("bcc"),
account=acct, account=acct,
) )
if "error" in result:
return [TextContent(type="text", text=f"Error: {result['error']}")]
if result.get("pending"):
return [TextContent(
type="text",
text=(
f"Draft staged for approval (pending id: {result.get('pending_id')}). "
"Nothing has been sent yet. Review and approve it in Odysseus before delivery."
),
)]
acct_note = f" (from {result['account']})" if result.get("account") else "" acct_note = f" (from {result['account']})" if result.get("account") else ""
return [TextContent(type="text", text=f"Sent email to {result['to']} with subject '{result['subject']}'{acct_note}.")] return [TextContent(type="text", text=f"Sent email to {result['to']} with subject '{result['subject']}'{acct_note}.")]
@@ -2283,6 +2358,8 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
except Exception as e: except Exception as e:
return [TextContent(type="text", text=f"Error: {e}")] return [TextContent(type="text", text=f"Error: {e}")]
finally:
_CURRENT_OWNER.reset(owner_token)
# ── Main ── # ── Main ──
+90 -29
View File
@@ -6,6 +6,7 @@ Imports MemoryManager and MemoryVectorStore from the Odysseus codebase.
""" """
import asyncio import asyncio
import os
import sys import sys
import time import time
from pathlib import Path from pathlib import Path
@@ -23,6 +24,55 @@ _memory_manager = None
_memory_vector = None _memory_vector = None
_initialized = False _initialized = False
_OWNER_ENV_KEYS = ("ODYSSEUS_MCP_MEMORY_OWNER", "ODYSSEUS_MEMORY_OWNER")
_OWNER_SCOPE_ERROR = (
"Error: Memory MCP owner is not configured for an owner-scoped memory store. "
"Set ODYSSEUS_MCP_MEMORY_OWNER for this server or use the owner-aware native memory tool."
)
def _configured_owner() -> str | None:
for key in _OWNER_ENV_KEYS:
owner = os.environ.get(key, "").strip()
if owner:
return owner
return None
def _entry_owner(entry: dict) -> str | None:
owner = entry.get("owner")
if owner is None:
return None
owner_text = str(owner).strip()
return owner_text or None
def _owner_scoped_store(entries: list[dict]) -> bool:
return any(_entry_owner(entry) for entry in entries if isinstance(entry, dict))
def _scope_entries() -> tuple[str | None, list[dict], list[dict], str | None]:
"""Return configured owner, all entries, visible entries, and optional error."""
entries = _memory_manager.load_all()
owner = _configured_owner()
if owner is None and _owner_scoped_store(entries):
return None, entries, [], _OWNER_SCOPE_ERROR
if owner is None:
visible = [
entry for entry in entries
if isinstance(entry, dict) and _entry_owner(entry) is None
]
else:
visible = [
entry for entry in entries
if isinstance(entry, dict) and _entry_owner(entry) == owner
]
return owner, entries, visible, None
def _text_result(text: str) -> list[TextContent]:
return [TextContent(type="text", text=text)]
def _ensure_init(): def _ensure_init():
"""Lazy-init memory managers on first use.""" """Lazy-init memory managers on first use."""
@@ -75,24 +125,26 @@ async def list_tools() -> list[Tool]:
@server.call_tool() @server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]: async def call_tool(name: str, arguments: dict) -> list[TextContent]:
if name != "manage_memory": if name != "manage_memory":
return [TextContent(type="text", text=f"Unknown tool: {name}")] return _text_result(f"Unknown tool: {name}")
_ensure_init() _ensure_init()
if not _memory_manager: if not _memory_manager:
return [TextContent(type="text", text="Error: Memory manager not available")] return _text_result("Error: Memory manager not available")
action = arguments.get("action", "") action = arguments.get("action", "")
if action == "list": if action == "list":
category_filter = arguments.get("category", "") category_filter = arguments.get("category", "")
memories = _memory_manager.load() _owner, _all_memories, memories, scope_error = _scope_entries()
if scope_error:
return _text_result(scope_error)
if category_filter: if category_filter:
memories = [m for m in memories if m.get("category", "").lower() == category_filter.lower()] memories = [m for m in memories if m.get("category", "").lower() == category_filter.lower()]
if not memories: if not memories:
msg = "No memories found" msg = "No memories found"
if category_filter: if category_filter:
msg += f" in category '{category_filter}'" msg += f" in category '{category_filter}'"
return [TextContent(type="text", text=msg + ".")] return _text_result(msg + ".")
lines = [f"Found {len(memories)} memory entries:\n"] lines = [f"Found {len(memories)} memory entries:\n"]
for m in memories: for m in memories:
@@ -102,15 +154,17 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
if len(text) > 150: if len(text) > 150:
text = text[:150] + "..." text = text[:150] + "..."
lines.append(f"- [{cat}] `{mid}` — {text}") lines.append(f"- [{cat}] `{mid}` — {text}")
return [TextContent(type="text", text="\n".join(lines))] return _text_result("\n".join(lines))
elif action == "add": elif action == "add":
text = arguments.get("text", "") text = arguments.get("text", "")
category = arguments.get("category", "fact") category = arguments.get("category", "fact")
if not text: if not text:
return [TextContent(type="text", text="Error: Memory text cannot be empty")] return _text_result("Error: Memory text cannot be empty")
entry = _memory_manager.add_entry(text, source="ai_agent", category=category) owner, memories, _visible, scope_error = _scope_entries()
memories = _memory_manager.load_all() if scope_error:
return _text_result(scope_error)
entry = _memory_manager.add_entry(text, source="ai_agent", category=category, owner=owner)
memories.append(entry) memories.append(entry)
_memory_manager.save(memories) _memory_manager.save(memories)
if _memory_vector and _memory_vector.healthy: if _memory_vector and _memory_vector.healthy:
@@ -118,25 +172,28 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
_memory_vector.add(entry["id"], text) _memory_vector.add(entry["id"], text)
except Exception: except Exception:
pass pass
return [TextContent(type="text", text=f"Memory added: [{category}] {text} (id: {entry['id'][:8]})")] return _text_result(f"Memory added: [{category}] {text} (id: {entry['id'][:8]})")
elif action == "edit": elif action == "edit":
memory_id = arguments.get("memory_id", "") memory_id = arguments.get("memory_id", "")
new_text = arguments.get("text", "") new_text = arguments.get("text", "")
if not memory_id or not new_text: if not memory_id or not new_text:
return [TextContent(type="text", text="Error: edit needs memory_id and text")] return _text_result("Error: edit needs memory_id and text")
memories = _memory_manager.load_all() _owner, memories, visible, scope_error = _scope_entries()
found = False if scope_error:
return _text_result(scope_error)
full_id = None full_id = None
for m in memories: for m in visible:
if m.get("id", "").startswith(memory_id): if m.get("id", "").startswith(memory_id):
m["text"] = new_text
m["timestamp"] = int(time.time())
found = True
full_id = m["id"] full_id = m["id"]
break break
if not found: if not full_id:
return [TextContent(type="text", text=f"Error: Memory '{memory_id}' not found")] return _text_result(f"Error: Memory '{memory_id}' not found")
for m in memories:
if m.get("id") == full_id:
m["text"] = new_text
m["timestamp"] = int(time.time())
break
_memory_manager.save(memories) _memory_manager.save(memories)
if _memory_vector and _memory_vector.healthy and full_id: if _memory_vector and _memory_vector.healthy and full_id:
try: try:
@@ -144,24 +201,26 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
_memory_vector.add(full_id, new_text) _memory_vector.add(full_id, new_text)
except Exception: except Exception:
pass pass
return [TextContent(type="text", text=f"Memory updated: {new_text}")] return _text_result(f"Memory updated: {new_text}")
elif action == "delete": elif action == "delete":
memory_id = arguments.get("memory_id", "") memory_id = arguments.get("memory_id", "")
if not memory_id: if not memory_id:
return [TextContent(type="text", text="Error: delete needs memory_id")] return _text_result("Error: delete needs memory_id")
memories = _memory_manager.load_all() _owner, memories, visible, scope_error = _scope_entries()
if scope_error:
return _text_result(scope_error)
full_id = None full_id = None
deleted_text = "" deleted_text = ""
deleted_category = "" deleted_category = ""
for m in memories: for m in visible:
if m.get("id", "").startswith(memory_id): if m.get("id", "").startswith(memory_id):
full_id = m["id"] full_id = m["id"]
deleted_text = m.get("text", "") deleted_text = m.get("text", "")
deleted_category = m.get("category", "") deleted_category = m.get("category", "")
break break
if not full_id: if not full_id:
return [TextContent(type="text", text=f"Error: Memory '{memory_id}' not found")] return _text_result(f"Error: Memory '{memory_id}' not found")
memories = [m for m in memories if m.get("id") != full_id] memories = [m for m in memories if m.get("id") != full_id]
_memory_manager.save(memories) _memory_manager.save(memories)
if _memory_vector and _memory_vector.healthy and full_id: if _memory_vector and _memory_vector.healthy and full_id:
@@ -171,30 +230,32 @@ async def call_tool(name: str, arguments: dict) -> list[TextContent]:
pass pass
cat = f"[{deleted_category}] " if deleted_category else "" cat = f"[{deleted_category}] " if deleted_category else ""
snippet = deleted_text if len(deleted_text) <= 120 else deleted_text[:117] + "..." snippet = deleted_text if len(deleted_text) <= 120 else deleted_text[:117] + "..."
return [TextContent(type="text", text=f"Memory deleted: {cat}{snippet} (id: {memory_id})")] return _text_result(f"Memory deleted: {cat}{snippet} (id: {memory_id})")
elif action == "search": elif action == "search":
query = arguments.get("text", "") query = arguments.get("text", "")
if not query: if not query:
return [TextContent(type="text", text="Error: search needs text (query)")] return _text_result("Error: search needs text (query)")
memories = _memory_manager.load() _owner, _all_memories, memories, scope_error = _scope_entries()
if scope_error:
return _text_result(scope_error)
if hasattr(_memory_manager, 'get_relevant_memories'): if hasattr(_memory_manager, 'get_relevant_memories'):
results = _memory_manager.get_relevant_memories(query, memories, threshold=0.05, max_items=20) results = _memory_manager.get_relevant_memories(query, memories, threshold=0.05, max_items=20)
else: else:
query_lower = query.lower() query_lower = query.lower()
results = [m for m in memories if query_lower in m.get("text", "").lower()][:20] results = [m for m in memories if query_lower in m.get("text", "").lower()][:20]
if not results: if not results:
return [TextContent(type="text", text=f"No memories found matching '{query}'.")] return _text_result(f"No memories found matching '{query}'.")
lines = [f"Found {len(results)} matching memories:\n"] lines = [f"Found {len(results)} matching memories:\n"]
for m in results: for m in results:
cat = m.get("category", "fact") cat = m.get("category", "fact")
mid = m.get("id", "?")[:8] mid = m.get("id", "?")[:8]
text = m.get("text", "") text = m.get("text", "")
lines.append(f"- [{cat}] `{mid}` — {text}") lines.append(f"- [{cat}] `{mid}` — {text}")
return [TextContent(type="text", text="\n".join(lines))] return _text_result("\n".join(lines))
else: else:
return [TextContent(type="text", text=f"Error: Unknown action '{action}'. Use: list, add, edit, delete, search")] return _text_result(f"Error: Unknown action '{action}'. Use: list, add, edit, delete, search")
async def run(): async def run():
+4 -78
View File
@@ -4,93 +4,19 @@
"requires": true, "requires": true,
"packages": { "packages": {
"": { "": {
"dependencies": {
"@anthropic-ai/sdk": "^0.104.1"
},
"devDependencies": { "devDependencies": {
"@antithesishq/bombadil": "^0.5.0" "@antithesishq/bombadil": "^0.6.1"
}
},
"node_modules/@anthropic-ai/sdk": {
"version": "0.104.1",
"resolved": "https://registry.npmjs.org/@anthropic-ai/sdk/-/sdk-0.104.1.tgz",
"integrity": "sha512-gGACa/+IaiXzRRmF96aOhamoBgapKRBiFWbmmTFP8aMkpaEcuStF+Q61bjo4vPxBM7gqWJNZqsngslRdnLHv0Q==",
"license": "MIT",
"dependencies": {
"json-schema-to-ts": "^3.1.1",
"standardwebhooks": "^1.0.0"
},
"bin": {
"anthropic-ai-sdk": "bin/cli"
},
"peerDependencies": {
"zod": "^3.25.0 || ^4.0.0"
},
"peerDependenciesMeta": {
"zod": {
"optional": true
}
} }
}, },
"node_modules/@antithesishq/bombadil": { "node_modules/@antithesishq/bombadil": {
"version": "0.5.0", "version": "0.6.1",
"resolved": "https://registry.npmjs.org/@antithesishq/bombadil/-/bombadil-0.5.0.tgz", "resolved": "https://registry.npmjs.org/@antithesishq/bombadil/-/bombadil-0.6.1.tgz",
"integrity": "sha512-s0zImmr0iyvSP6QcVLvf40CUiZYIdWBAxiq20uhzujwvfitYa3PGJN652k/pLtVccHM/JrGQxZdvLnihZpltHA==", "integrity": "sha512-d1iufG3MI7gSMSiSmMeNdcMW+qR0yQXL2zdkVynC3n3DYgFJYlYXKUQzygmqU12m4RWlR5iOdQU1hsx5UT6+IA==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"bin": { "bin": {
"bombadil": "bin/bombadil.js" "bombadil": "bin/bombadil.js"
} }
},
"node_modules/@babel/runtime": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/runtime/-/runtime-7.29.7.tgz",
"integrity": "sha512-Nq8OhGWiZIZGV6hLHoyAKLLcJihP/xFeBMGJoUrxTX2psI8dCifzLhZISFb+VWS3wFMRDmCGw5R+dOySCqPLhw==",
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@stablelib/base64": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/@stablelib/base64/-/base64-1.0.1.tgz",
"integrity": "sha512-1bnPQqSxSuc3Ii6MhBysoWCg58j97aUjuCSZrGSmDxNqtytIi0k8utUenAwTZN4V5mXXYGsVUI9zeBqy+jBOSQ==",
"license": "MIT"
},
"node_modules/fast-sha256": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/fast-sha256/-/fast-sha256-1.3.0.tgz",
"integrity": "sha512-n11RGP/lrWEFI/bWdygLxhI+pVeo1ZYIVwvvPkW7azl/rOy+F3HYRZ2K5zeE9mmkhQppyv9sQFx0JM9UabnpPQ==",
"license": "Unlicense"
},
"node_modules/json-schema-to-ts": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/json-schema-to-ts/-/json-schema-to-ts-3.1.1.tgz",
"integrity": "sha512-+DWg8jCJG2TEnpy7kOm/7/AxaYoaRbjVB4LFZLySZlWn8exGs3A4OLJR966cVvU26N7X9TWxl+Jsw7dzAqKT6g==",
"license": "MIT",
"dependencies": {
"@babel/runtime": "^7.18.3",
"ts-algebra": "^2.0.0"
},
"engines": {
"node": ">=16"
}
},
"node_modules/standardwebhooks": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/standardwebhooks/-/standardwebhooks-1.0.0.tgz",
"integrity": "sha512-BbHGOQK9olHPMvQNHWul6MYlrRTAOKn03rOe4A8O3CLWhNf4YHBqq2HJKKC+sfqpxiBY52pNeesD6jIiLDz8jg==",
"license": "MIT",
"dependencies": {
"@stablelib/base64": "^1.0.0",
"fast-sha256": "^1.3.0"
}
},
"node_modules/ts-algebra": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/ts-algebra/-/ts-algebra-2.0.0.tgz",
"integrity": "sha512-FPAhNPFMrkwz76P7cdjdmiShwMynZYN6SgOujD1urY4oNm80Ou9oMdmbR45LotcKOXoy7wSmHkRFE6Mxbrhefw==",
"license": "MIT"
} }
} }
} }
+1 -4
View File
@@ -4,9 +4,6 @@
"url": "https://github.com/pewdiepie-archdaemon/odysseus.git" "url": "https://github.com/pewdiepie-archdaemon/odysseus.git"
}, },
"devDependencies": { "devDependencies": {
"@antithesishq/bombadil": "^0.5.0" "@antithesishq/bombadil": "^0.6.1"
},
"dependencies": {
"@anthropic-ai/sdk": "^0.104.1"
} }
} }
+2
View File
@@ -160,6 +160,8 @@ def setup_api_token_routes() -> APIRouter:
payload = await request.json() payload = await request.json()
except Exception: except Exception:
payload = {} payload = {}
if not isinstance(payload, dict):
payload = {}
with get_db_session() as db: with get_db_session() as db:
token = db.query(ApiToken).filter(ApiToken.id == token_id).first() token = db.query(ApiToken).filter(ApiToken.id == token_id).first()
if not token: if not token:
+3 -2
View File
@@ -16,6 +16,7 @@ from pydantic import BaseModel
from core.database import SessionLocal, CrewMember, ScheduledTask from core.database import SessionLocal, CrewMember, ScheduledTask
from src.auth_helpers import get_current_user from src.auth_helpers import get_current_user
from core.auth import RESERVED_USERNAMES
from src.task_scheduler import compute_next_run from src.task_scheduler import compute_next_run
@@ -89,11 +90,11 @@ def setup_assistant_routes(task_scheduler) -> APIRouter:
# check-in tasks seeded. Hitting any /assistant route under one of these # check-in tasks seeded. Hitting any /assistant route under one of these
# used to seed a full CrewMember + Morning/Midday/Evening tasks under that # used to seed a full CrewMember + Morning/Midday/Evening tasks under that
# owner, which then double-fired alongside the real user's check-ins. # owner, which then double-fired alongside the real user's check-ins.
_SYNTHETIC_OWNERS = frozenset({"internal-tool", "api", "demo", "system", ""}) # RESERVED_USERNAMES covers the same set; the `not owner` guard handles "".
async def _get_or_create(owner: str) -> CrewMember: async def _get_or_create(owner: str) -> CrewMember:
"""Return the per-owner assistant CrewMember, creating it on demand.""" """Return the per-owner assistant CrewMember, creating it on demand."""
if not owner or owner in _SYNTHETIC_OWNERS: if not owner or owner in RESERVED_USERNAMES:
raise HTTPException(status_code=400, detail=f"Cannot seed assistant for {owner!r}") raise HTTPException(status_code=400, detail=f"Cannot seed assistant for {owner!r}")
db = SessionLocal() db = SessionLocal()
try: try:
+45 -11
View File
@@ -12,8 +12,8 @@ import re
from pathlib import Path from pathlib import Path
from core.atomic_io import atomic_write_json, atomic_write_text from core.atomic_io import atomic_write_json, atomic_write_text
from core.auth import AuthManager, SetAdminResult from core.auth import AuthManager, RESERVED_USERNAMES, SetAdminResult, TOKEN_TTL
from src.constants import DEEP_RESEARCH_DIR, MEMORY_FILE, SKILLS_DIR from src.constants import DEEP_RESEARCH_DIR, MEMORY_FILE, PASSWORD_MIN_LENGTH, SKILLS_DIR
from src.rate_limiter import RateLimiter from src.rate_limiter import RateLimiter
from src.settings_scrub import scrub_settings from src.settings_scrub import scrub_settings
from src.settings import ( from src.settings import (
@@ -102,8 +102,12 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
raise HTTPException(429, "Too many requests — try again later") raise HTTPException(429, "Too many requests — try again later")
if auth_manager.is_configured: if auth_manager.is_configured:
raise HTTPException(400, "Already configured") raise HTTPException(400, "Already configured")
if len(body.password) < 8: if len(body.password) < PASSWORD_MIN_LENGTH:
raise HTTPException(400, "Password must be at least 8 characters") raise HTTPException(400, f"Password must be at least {PASSWORD_MIN_LENGTH} characters")
if len(body.username.strip()) < 1:
raise HTTPException(400, "Username is required")
if body.username.lower() in RESERVED_USERNAMES:
raise HTTPException(403, "Username is reserved")
ok = await asyncio.to_thread(auth_manager.setup, body.username, body.password) ok = await asyncio.to_thread(auth_manager.setup, body.username, body.password)
if not ok: if not ok:
raise HTTPException(500, "Setup failed") raise HTTPException(500, "Setup failed")
@@ -118,10 +122,12 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
raise HTTPException(400, "Run setup first") raise HTTPException(400, "Run setup first")
if not auth_manager.signup_enabled: if not auth_manager.signup_enabled:
raise HTTPException(403, "Registration is disabled. Ask an admin for an account.") raise HTTPException(403, "Registration is disabled. Ask an admin for an account.")
if len(body.password) < 8: if len(body.password) < PASSWORD_MIN_LENGTH:
raise HTTPException(400, "Password must be at least 8 characters") raise HTTPException(400, f"Password must be at least {PASSWORD_MIN_LENGTH} characters")
if len(body.username.strip()) < 1: if len(body.username.strip()) < 1:
raise HTTPException(400, "Username is required") raise HTTPException(400, "Username is required")
if body.username.lower() in RESERVED_USERNAMES:
raise HTTPException(403, "Username is reserved")
ok = await asyncio.to_thread(auth_manager.create_user, body.username, body.password, is_admin=False) ok = await asyncio.to_thread(auth_manager.create_user, body.username, body.password, is_admin=False)
if not ok: if not ok:
raise HTTPException(409, "Username already taken") raise HTTPException(409, "Username already taken")
@@ -144,6 +150,8 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
raise HTTPException(401, "Invalid 2FA code") raise HTTPException(401, "Invalid 2FA code")
# All checks passed — create session (password already verified above) # All checks passed — create session (password already verified above)
token = await asyncio.to_thread(auth_manager.create_session_trusted, username) token = await asyncio.to_thread(auth_manager.create_session_trusted, username)
if not token:
raise HTTPException(401, "Invalid credentials")
cookie_kwargs = dict( cookie_kwargs = dict(
key=SESSION_COOKIE, key=SESSION_COOKIE,
value=token, value=token,
@@ -153,7 +161,7 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
path="/", path="/",
) )
if body.remember: if body.remember:
cookie_kwargs["max_age"] = 60 * 60 * 24 * 7 # 7 days cookie_kwargs["max_age"] = TOKEN_TTL
response.set_cookie(**cookie_kwargs) response.set_cookie(**cookie_kwargs)
return {"ok": True, "username": username} return {"ok": True, "username": username}
@@ -182,13 +190,18 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
pass pass
return result return result
@router.get("/policy")
async def auth_policy():
"""Return public auth policy constants for the frontend."""
return auth_manager.policy()
@router.post("/change-password") @router.post("/change-password")
async def change_password(body: ChangePasswordRequest, request: Request): async def change_password(body: ChangePasswordRequest, request: Request):
user = _get_current_user(request) user = _get_current_user(request)
if not user: if not user:
raise HTTPException(401, "Not authenticated") raise HTTPException(401, "Not authenticated")
if len(body.new_password) < 8: if len(body.new_password) < PASSWORD_MIN_LENGTH:
raise HTTPException(400, "Password must be at least 8 characters") raise HTTPException(400, f"Password must be at least {PASSWORD_MIN_LENGTH} characters")
current_token = request.cookies.get(SESSION_COOKIE) current_token = request.cookies.get(SESSION_COOKIE)
ok = await asyncio.to_thread(auth_manager.change_password, user, body.current_password, body.new_password) ok = await asyncio.to_thread(auth_manager.change_password, user, body.current_password, body.new_password)
if not ok: if not ok:
@@ -268,8 +281,12 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
user = _get_current_user(request) user = _get_current_user(request)
if not user or not auth_manager.is_admin(user): if not user or not auth_manager.is_admin(user):
raise HTTPException(403, "Admin only") raise HTTPException(403, "Admin only")
if len(body.password) < 8: if len(body.password) < PASSWORD_MIN_LENGTH:
raise HTTPException(400, "Password must be at least 8 characters") raise HTTPException(400, f"Password must be at least {PASSWORD_MIN_LENGTH} characters")
if len(body.username.strip()) < 1:
raise HTTPException(400, "Username is required")
if body.username.lower() in RESERVED_USERNAMES:
raise HTTPException(403, "Username is reserved")
ok = auth_manager.create_user(body.username, body.password, body.is_admin) ok = auth_manager.create_user(body.username, body.password, body.is_admin)
if not ok: if not ok:
raise HTTPException(409, "Username already taken") raise HTTPException(409, "Username already taken")
@@ -432,6 +449,23 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
except Exception as e: except Exception as e:
logger.warning("Failed to rename upload owner references %s -> %s: %s", old_username, new_username, e) logger.warning("Failed to rename upload owner references %s -> %s: %s", old_username, new_username, e)
# direct personal RAG uploads live in per-owner directories and the
# vector metadata also carries the username used for owner-filtered
# search. Keep both in sync with the auth rename.
try:
from routes.personal_routes import rename_personal_upload_owner
personal_docs_manager = getattr(request.app.state, "personal_docs_manager", None)
if personal_docs_manager is not None:
rag_manager = getattr(personal_docs_manager, "rag_manager", None)
rename_personal_upload_owner(
old_username,
new_username,
personal_docs_manager=personal_docs_manager,
rag_manager=rag_manager,
)
except Exception as e:
logger.warning("Failed to rename personal RAG upload owner references %s -> %s: %s", old_username, new_username, e)
# skills: SKILL.md frontmatter carries owner: <username>; the usage # skills: SKILL.md frontmatter carries owner: <username>; the usage
# sidecar (_usage.json) keys entries as owner::skill-name. Both must # sidecar (_usage.json) keys entries as owner::skill-name. Both must
# be updated or the renamed user's Skills panel goes empty. # be updated or the renamed user's Skills panel goes empty.
+64 -26
View File
@@ -14,7 +14,7 @@ from core.database import Session as DBSession, ModelEndpoint
from src.llm_core import normalize_model_id from src.llm_core import normalize_model_id
from src.endpoint_resolver import normalize_base from src.endpoint_resolver import normalize_base
from src.context_compactor import maybe_compact, trim_for_context from src.context_compactor import maybe_compact, trim_for_context
from src.auth_helpers import get_current_user from src.auth_helpers import effective_user
from src.prompt_security import untrusted_context_message from src.prompt_security import untrusted_context_message
from routes.prefs_routes import _load_for_user as load_prefs_for_user from routes.prefs_routes import _load_for_user as load_prefs_for_user
@@ -22,6 +22,47 @@ from fastapi import HTTPException
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
_CASUAL_OPENING_RE = re.compile(
r"^\s*(?:h+i+|hey+|hello+|yo+|sup+|what'?s up|wass?up|hiya|howdy|"
r"lol|lmao|haha+|hehe+|thanks?|thank you|ty|idk|dunno|meh|bruh|bro)\b(?P<tail>.*)$",
re.IGNORECASE,
)
_CASUAL_BLOCKLIST_RE = re.compile(
r"\b(?:cookbook|serve|serving|launch|start|vllm|sglang|llama\.?cpp|ollama|"
r"download|model|email|document|doc|note|calendar|task|search|web|research|"
r"file|folder|repo|git|settings?|endpoint|api|token|mcp)\b",
re.IGNORECASE,
)
def _is_casual_low_signal(text: str) -> bool:
"""Short greetings/slang should not pull memory, skills, RAG, or docs."""
s = str(text or "").strip()
m = _CASUAL_OPENING_RE.match(s)
if not m:
return False
tail = m.group("tail") or ""
if _CASUAL_BLOCKLIST_RE.search(tail):
return False
tail_words = re.findall(r"[A-Za-z0-9_'-]+", tail)
return len(tail_words) <= 2
# Strong references to in-flight fire-and-forget tasks scheduled from this
# module. asyncio only keeps weak references to tasks created via
# create_task, so without this the GC can collect a task mid-execution and
# the background work (extraction, auto-naming) silently never runs.
# Mirrors WebhookManager._spawn_tracked from src/webhook_manager.py.
_BG_TASKS: set[asyncio.Task] = set()
def _spawn_bg(coro) -> asyncio.Task:
"""Schedule a background task and hold a strong reference until it finishes."""
task = asyncio.create_task(coro)
_BG_TASKS.add(task)
task.add_done_callback(_BG_TASKS.discard)
return task
# ── Data containers ────────────────────────────────────────────────────── # # ── Data containers ────────────────────────────────────────────────────── #
@@ -78,7 +119,7 @@ def _enforce_chat_privileges(request, sess) -> None:
which means unrestricted allowed_models / zero cap -> no-op for them. which means unrestricted allowed_models / zero cap -> no-op for them.
""" """
try: try:
user = get_current_user(request) user = effective_user(request)
except Exception: except Exception:
user = None user = None
if not user: if not user:
@@ -159,17 +200,9 @@ async def auto_name_session(session_manager, sess):
return return
owner = getattr(sess, "owner", None) owner = getattr(sess, "owner", None)
t_url, t_model, t_headers = resolve_task_endpoint(owner=owner) t_url, t_model, t_headers = resolve_task_endpoint(
if not t_model: sess.endpoint_url, sess.model, sess.headers, owner=owner
# If no task/utility model is configured at all, fall back to )
# the session's own model so auto-naming still works even on
# minimal setups.
from src.endpoint_resolver import resolve_endpoint
_fallback = resolve_endpoint("default", owner=owner)
if _fallback and _fallback[1]:
t_url, t_model, t_headers = _fallback
else:
t_url, t_model, t_headers = sess.endpoint_url, sess.model, sess.headers
if not t_model: if not t_model:
logger.debug("[auto-name] No model provided, skipping") logger.debug("[auto-name] No model provided, skipping")
return return
@@ -346,11 +379,11 @@ def add_user_message(sess, chat_handler, preprocessed: PreprocessedMessage, inco
def fire_message_event(request, webhook_manager, session_id: str, sess, message: str, compare_mode: bool = False): def fire_message_event(request, webhook_manager, session_id: str, sess, message: str, compare_mode: bool = False):
"""Fire webhook and event_bus events for a new user message.""" """Fire webhook and event_bus events for a new user message."""
if webhook_manager and not compare_mode: if webhook_manager and not compare_mode:
asyncio.create_task(webhook_manager.fire("chat.message", { webhook_manager.fire_and_forget("chat.message", {
"session_id": session_id, "model": sess.model, "message": message[:2000], "session_id": session_id, "model": sess.model, "message": message[:2000],
})) })
from src.event_bus import fire_event from src.event_bus import fire_event
user = get_current_user(request) user = effective_user(request)
fire_event("message_sent", user) fire_event("message_sent", user)
@@ -576,9 +609,11 @@ async def build_chat_context(
if not incognito: if not incognito:
fire_message_event(request, webhook_manager, session_id, sess, message, compare_mode) fire_message_event(request, webhook_manager, session_id, sess, message, compare_mode)
# Resolve user prefs # Resolve owner-scoped prefs/context. Browser requests keep the cookie user;
user = get_current_user(request) # bearer-token chat requests use the token owner instead of the "api" sentinel.
user = effective_user(request)
uprefs = load_prefs_for_user(user) uprefs = load_prefs_for_user(user)
casual_low_signal = _is_casual_low_signal(message)
# Memory enabled? # Memory enabled?
mem_enabled = not incognito and not no_memory and uprefs.get("memory_enabled", True) mem_enabled = not incognito and not no_memory and uprefs.get("memory_enabled", True)
@@ -588,6 +623,9 @@ async def build_chat_context(
if not allow_tool_preprocessing: if not allow_tool_preprocessing:
mem_enabled = False mem_enabled = False
skills_enabled = False skills_enabled = False
if casual_low_signal:
mem_enabled = False
skills_enabled = False
logger.debug( logger.debug(
"Memory enabled=%s for user=%s (incognito=%s, no_memory=%s, pref=%s)", "Memory enabled=%s for user=%s (incognito=%s, no_memory=%s, pref=%s)",
mem_enabled, user, incognito, no_memory, uprefs.get("memory_enabled", "NOT_SET"), mem_enabled, user, incognito, no_memory, uprefs.get("memory_enabled", "NOT_SET"),
@@ -603,11 +641,11 @@ async def build_chat_context(
# Use RAG? # Use RAG?
use_rag_val = (str(use_rag).lower() != "false") if use_rag is not None else True use_rag_val = (str(use_rag).lower() != "false") if use_rag is not None else True
if incognito or not allow_tool_preprocessing or is_research_spinoff: if incognito or not allow_tool_preprocessing or is_research_spinoff or casual_low_signal:
use_rag_val = False use_rag_val = False
# If pre-fetched search context was provided (compare mode), skip live web search # If pre-fetched search context was provided (compare mode), skip live web search
skip_web = bool(search_context) or not allow_tool_preprocessing skip_web = bool(search_context) or not allow_tool_preprocessing or casual_low_signal
# Build context preface # Build context preface
# The stream path uses enhanced_message (with CoT/preprocessing applied), # The stream path uses enhanced_message (with CoT/preprocessing applied),
@@ -626,7 +664,7 @@ async def build_chat_context(
incognito=incognito, incognito=incognito,
use_skills=skills_enabled, use_skills=skills_enabled,
) )
if use_rag is not None or is_research_spinoff: if use_rag is not None or is_research_spinoff or casual_low_signal:
_preface_kwargs["use_rag"] = use_rag_val _preface_kwargs["use_rag"] = use_rag_val
preface, rag_sources, web_sources = chat_processor.build_context_preface(**_preface_kwargs) preface, rag_sources, web_sources = chat_processor.build_context_preface(**_preface_kwargs)
@@ -634,7 +672,7 @@ async def build_chat_context(
used_memories = getattr(chat_processor, '_last_used_memories', []) used_memories = getattr(chat_processor, '_last_used_memories', [])
# Inject pre-fetched search context (compare mode) # Inject pre-fetched search context (compare mode)
if search_context and allow_tool_preprocessing: if search_context and allow_tool_preprocessing and not casual_low_signal:
preface.append(untrusted_context_message("prefetched search context", search_context)) preface.append(untrusted_context_message("prefetched search context", search_context))
# YouTube transcripts # YouTube transcripts
@@ -1112,7 +1150,7 @@ def run_post_response_tasks(
))) )))
if _extraction_jobs: if _extraction_jobs:
asyncio.create_task(_run_extraction_jobs_sequentially(session_id, _extraction_jobs)) _spawn_bg(_run_extraction_jobs_sequentially(session_id, _extraction_jobs))
# Token accumulation # Token accumulation
if last_metrics: if last_metrics:
@@ -1120,11 +1158,11 @@ def run_post_response_tasks(
# Webhook # Webhook
if webhook_manager and not compare_mode: if webhook_manager and not compare_mode:
asyncio.create_task(webhook_manager.fire("chat.completed", { webhook_manager.fire_and_forget("chat.completed", {
"session_id": session_id, "model": sess.model, "session_id": session_id, "model": sess.model,
"user_message": message, "response": full_response[:2000], "user_message": message, "response": full_response[:2000],
})) })
# Auto-name # Auto-name
if needs_auto_name(sess.name): if needs_auto_name(sess.name):
asyncio.create_task(auto_name_session(session_manager, sess)) _spawn_bg(auto_name_session(session_manager, sess))
+25 -12
View File
@@ -23,12 +23,13 @@ from src.endpoint_resolver import normalize_base as _normalize_base, build_chat_
from src.session_search import search_session_messages from src.session_search import search_session_messages
from src.prompt_security import untrusted_context_message from src.prompt_security import untrusted_context_message
from core.exceptions import SessionNotFoundError from core.exceptions import SessionNotFoundError
from src.auth_helpers import get_current_user from src.auth_helpers import effective_user, get_current_user
from routes.session_routes import _verify_session_owner from routes.session_routes import _verify_session_owner
from routes.document_helpers import _owner_session_filter from routes.document_helpers import _owner_session_filter
from core.database import SessionLocal, get_session_mode, set_session_mode from core.database import SessionLocal, get_session_mode, set_session_mode
from core.database import Session as DBSession, ChatMessage as DBChatMessage from core.database import Session as DBSession, ChatMessage as DBChatMessage
from core.database import Document as DBDocument, ModelEndpoint from core.database import Document as DBDocument, ModelEndpoint
from core.log_safety import redact_url
from routes.research_routes import _resolve_research_endpoint from routes.research_routes import _resolve_research_endpoint
from routes.model_routes import _visible_models from routes.model_routes import _visible_models
from routes.chat_helpers import ( from routes.chat_helpers import (
@@ -126,7 +127,8 @@ def _clear_orphaned_session_endpoint(sess, owner: str | None = None) -> bool:
sess.model = "" sess.model = ""
sess.headers = {} sess.headers = {}
return True return True
except Exception: except Exception as e:
logger.warning("Failed to clear orphaned session endpoint", exc_info=e)
db.rollback() db.rollback()
return False return False
finally: finally:
@@ -144,7 +146,8 @@ def _endpoint_cache_contains_model(endpoint, model: str) -> bool:
return True return True
try: try:
models = json.loads(raw) if isinstance(raw, str) else raw models = json.loads(raw) if isinstance(raw, str) else raw
except Exception: except Exception as e:
logger.warning("Failed to parse cached models list, treating as containing model", exc_info=e)
return True return True
if not isinstance(models, list) or not models: if not isinstance(models, list) or not models:
return True return True
@@ -236,7 +239,8 @@ def _recover_empty_session_model(sess, session_id: str, owner: str | None = None
is_chatgpt_subscription = False is_chatgpt_subscription = False
try: try:
cached = json.loads(ep.cached_models) if isinstance(ep.cached_models, str) else (ep.cached_models or []) cached = json.loads(ep.cached_models) if isinstance(ep.cached_models, str) else (ep.cached_models or [])
except Exception: except Exception as e:
logger.warning("Failed to parse cached_models for endpoint %r", getattr(ep, "id", "?"), exc_info=e)
cached = [] cached = []
if not cached: if not cached:
visible = [] visible = []
@@ -360,7 +364,7 @@ def setup_chat_routes(
sess = session_manager.get_session(session) sess = session_manager.get_session(session)
except KeyError: except KeyError:
raise HTTPException(404, f"Session '{session}' not found") raise HTTPException(404, f"Session '{session}' not found")
owner = get_current_user(request) owner = effective_user(request)
if _clear_orphaned_session_endpoint(sess, owner=owner): if _clear_orphaned_session_endpoint(sess, owner=owner):
raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.") raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.")
@@ -600,7 +604,7 @@ def setup_chat_routes(
# but BEFORE loading. Prevents cross-user session hijack. # but BEFORE loading. Prevents cross-user session hijack.
_verify_session_owner(request, session) _verify_session_owner(request, session)
sess = session_manager.get_session(session) sess = session_manager.get_session(session)
owner = get_current_user(request) owner = effective_user(request)
if _clear_orphaned_session_endpoint(sess, owner=owner): if _clear_orphaned_session_endpoint(sess, owner=owner):
raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.") raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.")
# Issue #587: picker shows a model from the endpoint cache but # Issue #587: picker shows a model from the endpoint cache but
@@ -631,7 +635,7 @@ def setup_chat_routes(
_enforce_chat_privileges(request, sess) _enforce_chat_privileges(request, sess)
# Ensure session has auth headers # Ensure session has auth headers
resolve_session_auth(sess, session, owner=get_current_user(request)) resolve_session_auth(sess, session, owner=effective_user(request))
# Check for research_pending BEFORE mode persist overwrites it # Check for research_pending BEFORE mode persist overwrites it
do_research = str(use_research).lower() == "true" do_research = str(use_research).lower() == "true"
@@ -646,8 +650,8 @@ def setup_chat_routes(
elif attachments: elif attachments:
try: try:
att_ids = [str(x) for x in json.loads(attachments)] att_ids = [str(x) for x in json.loads(attachments)]
except Exception: except Exception as e:
pass logger.warning("Failed to parse attachments JSON, ignoring attachments", exc_info=e)
no_memory = str(form_data.get("no_memory", "")).lower() == "true" no_memory = str(form_data.get("no_memory", "")).lower() == "true"
pre_context_tool_policy = build_effective_tool_policy( pre_context_tool_policy = build_effective_tool_policy(
@@ -826,7 +830,11 @@ def setup_chat_routes(
from src.settings import get_setting from src.settings import get_setting
_global_disabled = get_setting("disabled_tools", []) _global_disabled = get_setting("disabled_tools", [])
if _global_disabled and isinstance(_global_disabled, list): if _global_disabled and isinstance(_global_disabled, list):
disabled_tools.update(_global_disabled) explicit_web_allowed = allow_web_search is not None and str(allow_web_search).lower() == "true"
if explicit_web_allowed:
disabled_tools.update(t for t in _global_disabled if t not in {"web_search", "web_fetch"})
else:
disabled_tools.update(_global_disabled)
# Light auto-escalation: the user is in chat mode and just expressed a # Light auto-escalation: the user is in chat mode and just expressed a
# notes/calendar/email intent. Grant the relevant managers but withhold # notes/calendar/email intent. Grant the relevant managers but withhold
@@ -923,7 +931,7 @@ def setup_chat_routes(
if effective_do_research: if effective_do_research:
_r_ep, _r_model, _r_headers = _resolve_research_endpoint(sess) _r_ep, _r_model, _r_headers = _resolve_research_endpoint(sess)
_auth_keys = list(_r_headers.keys()) if _r_headers else [] _auth_keys = list(_r_headers.keys()) if _r_headers else []
logger.info(f"Research endpoint resolved: model={_r_model}, endpoint={_r_ep}, auth_keys={_auth_keys}, sess_headers_keys={list(sess.headers.keys()) if isinstance(sess.headers, dict) else type(sess.headers)}") logger.info(f"Research endpoint resolved: model={_r_model}, endpoint={redact_url(_r_ep)}, auth_keys={_auth_keys}, sess_headers_keys={list(sess.headers.keys()) if isinstance(sess.headers, dict) else type(sess.headers)}")
# Clarification round: only for very short/vague queries on first research message. # Clarification round: only for very short/vague queries on first research message.
# Skip in compare mode — each pane is a fresh session, so every one would # Skip in compare mode — each pane is a fresh session, so every one would
@@ -1256,6 +1264,10 @@ def setup_chat_routes(
_max_rounds = _DEFAULT_ROUNDS _max_rounds = _DEFAULT_ROUNDS
_max_rounds = max(1, min(_max_rounds, 200)) _max_rounds = max(1, min(_max_rounds, 200))
_forced_tools = None
if allow_web_search is not None and str(allow_web_search).lower() == "true":
_forced_tools = {"web_search", "web_fetch"}
async for chunk in stream_agent_loop( async for chunk in stream_agent_loop(
sess.endpoint_url, sess.endpoint_url,
sess.model, sess.model,
@@ -1277,6 +1289,7 @@ def setup_chat_routes(
plan_mode=plan_mode, plan_mode=plan_mode,
approved_plan=approved_plan or None, approved_plan=approved_plan or None,
workspace=workspace or None, workspace=workspace or None,
forced_tools=_forced_tools,
): ):
if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"): if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"):
try: try:
@@ -1482,7 +1495,7 @@ def setup_chat_routes(
if not q or not q.strip(): if not q or not q.strip():
return [] return []
_user = get_current_user(request) _user = effective_user(request)
return [ return [
result.to_dict() result.to_dict()
for result in search_session_messages( for result in search_session_messages(
+11 -4
View File
@@ -46,8 +46,12 @@ def _ssh_prefix_for_task(task: dict) -> tuple[str, str]:
shell metacharacters in ``remoteHost`` is rejected with 400 rather than shell metacharacters in ``remoteHost`` is rejected with 400 rather than
injected. injected.
""" """
host = validate_remote_host((task.get("remoteHost") or "").strip() or None) or "" raw_host = task.get("remoteHost")
ssh_port = validate_ssh_port((task.get("sshPort") or "").strip() or None) or "" raw_port = task.get("sshPort")
host_value = str(raw_host).strip() if raw_host is not None else None
port_value = str(raw_port).strip() if raw_port is not None else None
host = validate_remote_host(host_value or None) or ""
ssh_port = validate_ssh_port(port_value or None) or ""
port_flag = f"-p {ssh_port} " if ssh_port and ssh_port != "22" else "" port_flag = f"-p {ssh_port} " if ssh_port and ssh_port != "22" else ""
return host, port_flag return host, port_flag
@@ -306,7 +310,10 @@ def setup_codex_routes(
@router.post("/emails/draft-document") @router.post("/emails/draft-document")
async def codex_email_draft_document(request: Request, body: dict[str, Any] = Body(default_factory=dict)): async def codex_email_draft_document(request: Request, body: dict[str, Any] = Body(default_factory=dict)):
owner = _scope_owner_all(request, {"email:draft", "documents:write"}) owner = _scope_owner(request, EMAIL_DRAFT_SCOPES)
docs_owner = _scope_owner_all(request, DOCS_WRITE_SCOPES)
if docs_owner != owner:
raise HTTPException(403, "API token owner mismatch")
if documents_create_endpoint is None: if documents_create_endpoint is None:
raise HTTPException(503, "Documents integration is not available") raise HTTPException(503, "Documents integration is not available")
from routes.document_routes import DocumentCreate from routes.document_routes import DocumentCreate
@@ -790,7 +797,7 @@ def setup_codex_routes(
norm = dict(body or {}) norm = dict(body or {})
sess = (norm.get("tmux_session") or norm.get("session_id") or "").strip() sess = (norm.get("tmux_session") or norm.get("session_id") or "").strip()
model = (norm.get("model") or norm.get("repo_id") or "").strip() model = (norm.get("model") or norm.get("repo_id") or "").strip()
host = (norm.get("host") or norm.get("remote_host") or "").strip() host = validate_remote_host((norm.get("host") or norm.get("remote_host") or "").strip() or None) or ""
port = norm.get("port") or 8000 port = norm.get("port") or 8000
import re as _re import re as _re
if not sess or not _re.fullmatch(r"[a-zA-Z0-9_-]+", sess): if not sess or not _re.fullmatch(r"[a-zA-Z0-9_-]+", sess):
+18 -8
View File
@@ -18,6 +18,7 @@ from pathlib import Path
from datetime import datetime from datetime import datetime
from urllib.parse import urljoin, urlparse, urlunparse from urllib.parse import urljoin, urlparse, urlunparse
from core.log_safety import redact_url
from fastapi import APIRouter, Query, Depends, Response, HTTPException from fastapi import APIRouter, Query, Depends, Response, HTTPException
from typing import List, Dict, Optional from typing import List, Dict, Optional
@@ -689,15 +690,24 @@ def _delete_contact(uid: str) -> bool:
url = _resolve_resource_url(uid) url = _resolve_resource_url(uid)
auth = (cfg["username"], cfg["password"]) if cfg["username"] else None auth = (cfg["username"], cfg["password"]) if cfg["username"] else None
r = httpx.delete(url, auth=auth, timeout=10) r = httpx.delete(url, auth=auth, timeout=10)
if r.status_code in (200, 204): if r.status_code in (200, 204, 404):
_contact_cache["fetched_at"] = None # Invalidate cache so the next fetch sees the server truth.
return True
if r.status_code == 404:
# Resource not found at the resolved URL. With href resolution
# this should be rare (genuinely already deleted). Invalidate
# the cache and report success so the UI doesn't keep a ghost.
logger.info(f"CardDAV DELETE 404 for {uid} — treating as already gone")
_contact_cache["fetched_at"] = None _contact_cache["fetched_at"] = None
# Verify: force a fresh fetch and check the UID is actually gone.
# A 404 on the guessed URL ({uid}.vcf) can mean the contact
# lives at a different resource URL — the DELETE missed it but
# we'd silently report success. This check catches that.
fresh = _fetch_contacts(force=True)
still_there = any(c.get("uid") == uid for c in fresh)
if still_there:
logger.warning(
f"CardDAV DELETE reported success for {uid} "
f"but UID still present after re-fetch — "
f"resource URL may differ from {redact_url(url)}"
)
return False
if r.status_code == 404:
logger.info(f"CardDAV DELETE 404 for {uid} — already gone")
return True return True
logger.warning(f"CardDAV DELETE returned {r.status_code}: {r.text[:200]}") logger.warning(f"CardDAV DELETE returned {r.status_code}: {r.text[:200]}")
return False return False
+182 -18
View File
@@ -505,6 +505,8 @@ def _cached_model_scan_script(model_dirs: list[str] | None = None, add_hf_cache:
" if u.startswith('KB'): return int(n * 1024)", " if u.startswith('KB'): return int(n * 1024)",
" return int(n)", " return int(n)",
"def scan_ollama():", "def scan_ollama():",
" if any(m.get('is_ollama') for m in models): return",
" if os.name == 'nt' and not os.environ.get('ODYSSEUS_ALLOW_OLLAMA_CLI_SCAN'): return",
" if not shutil.which('ollama'): return", " if not shutil.which('ollama'): return",
" try:", " try:",
" p = subprocess.run(['ollama', 'list'], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True, timeout=6)", " p = subprocess.run(['ollama', 'list'], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True, timeout=6)",
@@ -535,8 +537,8 @@ def _cached_model_scan_script(model_dirs: list[str] | None = None, add_hf_cache:
" models.append({'repo_id':name,'size_bytes':size_bytes,'nb_files':1,'has_incomplete':False,'path':'ollama','backend':'ollama','is_ollama':True})", " models.append({'repo_id':name,'size_bytes':size_bytes,'nb_files':1,'has_incomplete':False,'path':'ollama','backend':'ollama','is_ollama':True})",
" return", " return",
"for _hf_cache in hf_cache_paths(): scan_hf(_hf_cache)", "for _hf_cache in hf_cache_paths(): scan_hf(_hf_cache)",
"scan_ollama()",
"scan_ollama_api()", "scan_ollama_api()",
"scan_ollama()",
] ]
for model_dir in model_dirs or []: for model_dir in model_dirs or []:
lines.append(f"scan_dir(os.path.expanduser({model_dir!r}))") lines.append(f"scan_dir(os.path.expanduser({model_dir!r}))")
@@ -784,25 +786,149 @@ def _append_llama_cpp_linux_accel_build_lines(runner_lines: list[str]) -> None:
to hard-wire CUDA on Linux. That made ROCm hosts attempt a CUDA configure and to hard-wire CUDA on Linux. That made ROCm hosts attempt a CUDA configure and
fail with "CUDA Toolkit not found" instead of building with HIP. fail with "CUDA Toolkit not found" instead of building with HIP.
""" """
# Try a prebuilt binary from llama.cpp's GitHub releases FIRST — no
# cmake/build-essential/git/CUDA-headers needed at all. The from-source
# build below stays as a fallback (custom flags, esoteric arch, no
# internet, etc). 30 seconds vs 5+ minutes of compile, and removes
# every OS-package dep from the launch path. Sets _odysseus_have_prebuilt=1
# on success; the existing build-tier if/elif chain below is gated on
# that variable so we never compile twice or shadow the prebuilt symlink.
runner_lines.append(' _odysseus_have_prebuilt=""')
runner_lines.append(' _odysseus_arch="$(uname -m)"')
runner_lines.append(' _odysseus_prebuilt_url=""')
runner_lines.append(' if command -v curl >/dev/null 2>&1 && [ "$_odysseus_arch" = "x86_64" ]; then')
runner_lines.append(' _odysseus_pat=""')
runner_lines.append(' _odysseus_has_nv_inline() { command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L 2>/dev/null | grep -q "GPU "; }')
runner_lines.append(' _odysseus_has_vk_inline() { ldconfig -p 2>/dev/null | grep -q "libvulkan\\.so" || command -v vulkaninfo >/dev/null 2>&1 || [ -e /usr/lib/x86_64-linux-gnu/libvulkan.so.1 ]; }')
runner_lines.append(' _odysseus_has_vkdev_inline() { ls /dev/dri/renderD* >/dev/null 2>&1 || (lspci 2>/dev/null | grep -Ei \'VGA|3D|Display\' | grep -Eiq \'AMD|ATI|Radeon\'); }')
runner_lines.append(' if _odysseus_has_nv_inline; then')
runner_lines.append(' _odysseus_pat="ubuntu.*cuda"')
runner_lines.append(' elif _odysseus_has_vkdev_inline && _odysseus_has_vk_inline; then')
runner_lines.append(' _odysseus_pat="ubuntu.*vulkan"')
runner_lines.append(' else')
runner_lines.append(' _odysseus_pat="ubuntu-x64\\\\.zip"')
runner_lines.append(' fi')
runner_lines.append(' _odysseus_prebuilt_url="$(curl -fsSL --max-time 15 https://api.github.com/repos/ggml-org/llama.cpp/releases/latest 2>/dev/null | grep \'"browser_download_url"\' | cut -d\'"\' -f4 | grep -iE "$_odysseus_pat" | grep -iv "arm\\|aarch64" | head -1)"')
runner_lines.append(' fi')
# Accept any of unzip / bsdtar / python3 -m zipfile as the extractor.
# python3 is essentially always present on modern Linux, so this lets
# the prebuilt path work on minimal Ubuntu installs that lack `unzip`.
runner_lines.append(' if [ -n "$_odysseus_prebuilt_url" ] && (command -v unzip >/dev/null 2>&1 || command -v bsdtar >/dev/null 2>&1 || command -v python3 >/dev/null 2>&1); then')
runner_lines.append(' echo "[odysseus] Found prebuilt llama-server: $_odysseus_prebuilt_url"')
runner_lines.append(' mkdir -p ~/bin "$HOME/.cache/odysseus/llama-cpp-prebuilt" && cd "$HOME/.cache/odysseus/llama-cpp-prebuilt"')
runner_lines.append(' rm -f llama-cpp.zip')
runner_lines.append(' if curl -fsSL --max-time 120 "$_odysseus_prebuilt_url" -o llama-cpp.zip && [ -s llama-cpp.zip ]; then')
runner_lines.append(' rm -rf build && mkdir -p build')
runner_lines.append(' if command -v unzip >/dev/null 2>&1; then unzip -qq -o llama-cpp.zip -d build; elif command -v bsdtar >/dev/null 2>&1; then bsdtar -xf llama-cpp.zip -C build; else python3 -c "import zipfile; zipfile.ZipFile(\\"llama-cpp.zip\\").extractall(\\"build\\")"; fi')
runner_lines.append(' _odysseus_extracted="$(find build -type f -name llama-server 2>/dev/null | head -1)"')
runner_lines.append(' if [ -n "$_odysseus_extracted" ]; then')
runner_lines.append(' chmod +x "$_odysseus_extracted"')
runner_lines.append(' ln -sf "$_odysseus_extracted" ~/bin/llama-server')
runner_lines.append(' _odysseus_libdir="$(dirname "$_odysseus_extracted")"')
runner_lines.append(' mkdir -p ~/.config && echo "export LD_LIBRARY_PATH=\\"$_odysseus_libdir:\\${LD_LIBRARY_PATH:-}\\"" > ~/.config/odysseus-llama-cpp-env')
runner_lines.append(' _odysseus_have_prebuilt=1')
runner_lines.append(' echo "[odysseus] Prebuilt llama-server installed at $_odysseus_extracted"')
runner_lines.append(' fi')
runner_lines.append(' fi')
runner_lines.append(' [ -z "$_odysseus_have_prebuilt" ] && echo "[odysseus] Prebuilt download/extract failed — falling back to from-source build."')
runner_lines.append(' elif [ -z "$_odysseus_prebuilt_url" ]; then')
runner_lines.append(' echo "[odysseus] No matching prebuilt llama-server for this host (arch=$_odysseus_arch) — will build from source."')
runner_lines.append(' fi')
runner_lines.append(' if [ -z "$_odysseus_have_prebuilt" ]; then')
# Detect pip-installed nvcc (from vLLM/nvidia CUDA wheels) and put it on PATH # Detect pip-installed nvcc (from vLLM/nvidia CUDA wheels) and put it on PATH
# so cmake's CUDA configure can find it. We keep this after the ROCm/HIP # so cmake's CUDA configure can find it — BUT only when actual NVIDIA
# check — a machine with both stacks should honor the native HIP toolchain on # hardware is present. On AMD/Intel hosts the pip nvcc is a misleading
# AMD hosts instead of accidentally preferring a stray nvcc wheel. # leftover (no libcudart, no GPU it could target) and would otherwise
runner_lines.append(' for _cudir in ~/.local/lib/python*/site-packages/nvidia/cu13 ~/.local/lib/python*/site-packages/nvidia/cu12 ~/.local/lib/python*/site-packages/nvidia/cuda_nvcc; do') # send the build down the CUDA branch and fail with "CUDA Toolkit not
runner_lines.append(' [ -x "$_cudir/bin/nvcc" ] && export CUDA_HOME="$_cudir" && export PATH="$_cudir/bin:$PATH" && break') # found" instead of trying Vulkan.
runner_lines.append(' done') runner_lines.append(' _odysseus_has_nvidia_hw() {')
runner_lines.append(' command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L 2>/dev/null | grep -q "GPU " && return 0')
runner_lines.append(' ls /dev/nvidia* >/dev/null 2>&1 && return 0')
runner_lines.append(' lspci 2>/dev/null | grep -iE \'VGA|3D|Display\' | grep -iq nvidia && return 0')
runner_lines.append(' return 1')
runner_lines.append(' }')
runner_lines.append(' if _odysseus_has_nvidia_hw; then')
runner_lines.append(' for _cudir in ~/.local/lib/python*/site-packages/nvidia/cu13 ~/.local/lib/python*/site-packages/nvidia/cu12 ~/.local/lib/python*/site-packages/nvidia/cuda_nvcc; do')
runner_lines.append(' [ -x "$_cudir/bin/nvcc" ] && export CUDA_HOME="$_cudir" && export PATH="$_cudir/bin:$PATH" && break')
runner_lines.append(' done')
runner_lines.append(' fi')
# rm -rf build so a prior poisoned CMakeCache.txt (e.g. from a failed CUDA # rm -rf build so a prior poisoned CMakeCache.txt (e.g. from a failed CUDA
# or HIP attempt) doesn't cause the next configure to reuse stale settings. # or HIP attempt) doesn't cause the next configure to reuse stale settings.
runner_lines.append(' mkdir -p ~/bin') runner_lines.append(' mkdir -p ~/bin')
runner_lines.append(' cd ~/llama.cpp && rm -rf build') # Try to install cmake / build-essential / git automatically before the
# build, but ONLY via passwordless sudo (`sudo -n`) — interactive sudo
# would hang a tmux-backgrounded serve task waiting for a password. If
# sudo asks for a password the install is skipped silently and the
# diagnosis pattern (cookbook_routes.py / cookbook_helpers.py) surfaces
# an explicit "install cmake" suggestion in the Cookbook diagnosis
# toolbar after the inevitable build failure.
runner_lines.append(' _odysseus_apt_bootstrap() {')
runner_lines.append(' local _missing=""')
runner_lines.append(' command -v cmake >/dev/null 2>&1 || _missing="$_missing cmake"')
runner_lines.append(' command -v g++ >/dev/null 2>&1 || command -v gcc >/dev/null 2>&1 || _missing="$_missing build-essential"')
runner_lines.append(' command -v git >/dev/null 2>&1 || _missing="$_missing git"')
runner_lines.append(' [ -z "$_missing" ] && return 0')
runner_lines.append(' if command -v apt-get >/dev/null 2>&1 && sudo -n true 2>/dev/null; then')
runner_lines.append(' echo "[odysseus] Auto-installing missing build deps via apt:$_missing"')
runner_lines.append(' sudo -n env DEBIAN_FRONTEND=noninteractive apt-get update -qq 2>&1 | tail -3')
runner_lines.append(' sudo -n env DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends $_missing 2>&1 | tail -5 || true')
runner_lines.append(' elif command -v pacman >/dev/null 2>&1 && sudo -n true 2>/dev/null; then')
runner_lines.append(' echo "[odysseus] Auto-installing missing build deps via pacman:$_missing"')
runner_lines.append(' local _pacpkgs="$(echo "$_missing" | sed -e \'s/build-essential/base-devel/g\')"')
runner_lines.append(' sudo -n pacman -Sy --needed --noconfirm $_pacpkgs 2>&1 | tail -5 || true')
runner_lines.append(' elif command -v dnf >/dev/null 2>&1 && sudo -n true 2>/dev/null; then')
runner_lines.append(' echo "[odysseus] Auto-installing missing build deps via dnf:$_missing"')
runner_lines.append(' local _dnfpkgs="$(echo "$_missing" | sed -e \'s/build-essential/gcc gcc-c++ make/g\')"')
runner_lines.append(' sudo -n dnf install -y $_dnfpkgs 2>&1 | tail -5 || true')
runner_lines.append(' else')
runner_lines.append(' echo "[odysseus] WARNING: missing build deps ($_missing) — passwordless sudo is unavailable, cannot auto-install. Cookbook Diagnosis will explain the fix after the build fails."')
runner_lines.append(' fi')
runner_lines.append(' }')
runner_lines.append(' _odysseus_apt_bootstrap')
runner_lines.append(' _odysseus_missing_build_deps=""')
runner_lines.append(' command -v cmake >/dev/null 2>&1 || _odysseus_missing_build_deps="$_odysseus_missing_build_deps cmake"')
runner_lines.append(' command -v git >/dev/null 2>&1 || _odysseus_missing_build_deps="$_odysseus_missing_build_deps git"')
runner_lines.append(' command -v g++ >/dev/null 2>&1 || command -v gcc >/dev/null 2>&1 || _odysseus_missing_build_deps="$_odysseus_missing_build_deps build-essential"')
runner_lines.append(' if [ -n "$_odysseus_missing_build_deps" ]; then')
runner_lines.append(' echo "ERROR: llama.cpp source build needs missing packages:$_odysseus_missing_build_deps"')
runner_lines.append(' if command -v apt-get >/dev/null 2>&1; then')
runner_lines.append(' echo "Install on this host: sudo apt-get update && sudo apt-get install -y cmake build-essential git"')
runner_lines.append(' elif command -v pacman >/dev/null 2>&1; then')
runner_lines.append(' echo "Install on this host: sudo pacman -Sy --needed cmake base-devel git"')
runner_lines.append(' elif command -v dnf >/dev/null 2>&1; then')
runner_lines.append(' echo "Install on this host: sudo dnf install -y cmake gcc gcc-c++ make git"')
runner_lines.append(' fi')
runner_lines.append(' echo "Alternative: install a native llama-server on PATH, then relaunch."')
runner_lines.append(' ODYSSEUS_PREFLIGHT_EXIT=127')
runner_lines.append(' fi')
runner_lines.append(' cd ~/llama.cpp')
runner_lines.append(' _odysseus_has_vulkan() {')
runner_lines.append(' ldconfig -p 2>/dev/null | grep -q \'libvulkan\\.so\' && return 0')
runner_lines.append(' [ -e /usr/lib/libvulkan.so.1 ] && return 0')
runner_lines.append(' [ -e /usr/lib/x86_64-linux-gnu/libvulkan.so.1 ] && return 0')
runner_lines.append(' command -v vulkaninfo >/dev/null 2>&1 && return 0')
runner_lines.append(' return 1')
runner_lines.append(' }')
runner_lines.append(' _odysseus_has_vulkan_device() {')
runner_lines.append(' ls /dev/dri/renderD* >/dev/null 2>&1 && return 0')
runner_lines.append(' lspci 2>/dev/null | grep -Ei \'VGA|3D|Display\' | grep -Eiq \'AMD|ATI|Radeon\' && return 0')
runner_lines.append(' return 1')
runner_lines.append(' }')
# Backend preference: native ROCm/HIP > native CUDA > Vulkan > CPU.
# Vulkan is a portable fallback that works on AMD when ROCm isn't
# installed (e.g. Strix Halo) and on any vendor's discrete GPU, but
# it's ~30-40% slower than native HIP/CUDA for LLM inference — only
# pick it when no native toolchain is present.
runner_lines.append(' if command -v hipconfig &>/dev/null || [ -d /opt/rocm ] || [ -n "$ROCM_PATH" ] || [ -n "$HIP_PATH" ]; then') runner_lines.append(' if command -v hipconfig &>/dev/null || [ -d /opt/rocm ] || [ -n "$ROCM_PATH" ] || [ -n "$HIP_PATH" ]; then')
runner_lines.append(' rm -rf build')
runner_lines.append(' if command -v hipconfig &>/dev/null; then') runner_lines.append(' if command -v hipconfig &>/dev/null; then')
runner_lines.append(' export HIPCXX="${HIPCXX:-$(hipconfig -l)/clang}"') runner_lines.append(' export HIPCXX="${HIPCXX:-$(hipconfig -l)/clang}"')
runner_lines.append(' export HIP_PATH="${HIP_PATH:-$(hipconfig -R)}"') runner_lines.append(' export HIP_PATH="${HIP_PATH:-$(hipconfig -R)}"')
runner_lines.append(' fi') runner_lines.append(' fi')
runner_lines.append(' echo "[odysseus] ROCm/HIP detected — building llama-server with HIP support..."') runner_lines.append(' echo "[odysseus] ROCm/HIP detected — building llama-server with HIP support..."')
runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_HIP=ON && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server') runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_HIP=ON && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server')
runner_lines.append(' elif command -v nvcc &>/dev/null; then') runner_lines.append(' elif command -v nvcc &>/dev/null && _odysseus_has_nvidia_hw; then')
runner_lines.append(' rm -rf build')
# nvcc alone is not sufficient — pip-installed CUDA wheels or incomplete # nvcc alone is not sufficient — pip-installed CUDA wheels or incomplete
# tooling can expose nvcc without shipping libcudart, causing cmake to fail # tooling can expose nvcc without shipping libcudart, causing cmake to fail
# mid-build with "CUDA runtime library not found". Check cudart explicitly # mid-build with "CUDA runtime library not found". Check cudart explicitly
@@ -826,31 +952,50 @@ def _append_llama_cpp_linux_accel_build_lines(runner_lines: list[str]) -> None:
runner_lines.append(' echo "[odysseus] Ensure libcudart is installed (e.g. cuda-runtime package) and visible via ldconfig or CUDA_HOME."') runner_lines.append(' echo "[odysseus] Ensure libcudart is installed (e.g. cuda-runtime package) and visible via ldconfig or CUDA_HOME."')
runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server') runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server')
runner_lines.append(' fi') runner_lines.append(' fi')
runner_lines.append(' elif _odysseus_has_vulkan_device && _odysseus_has_vulkan; then')
runner_lines.append(' echo "[odysseus] Vulkan-capable GPU detected (no ROCm/CUDA toolchain installed) — building llama-server with Vulkan support..."')
runner_lines.append(' rm -rf build-vulkan')
runner_lines.append(' cmake -B build-vulkan -DCMAKE_BUILD_TYPE=Release -DGGML_VULKAN=ON && cmake --build build-vulkan -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build-vulkan/bin/llama-server ~/bin/llama-server')
runner_lines.append(' else') runner_lines.append(' else')
runner_lines.append(' echo "[odysseus] WARNING: no HIP/CUDA toolchain found — building llama-server for CPU only."') runner_lines.append(' echo "[odysseus] WARNING: no HIP/CUDA/Vulkan toolchain found — building llama-server for CPU only."')
runner_lines.append(' echo "[odysseus] GPU inference will not be available for this llama.cpp build."') runner_lines.append(' echo "[odysseus] GPU inference will not be available for this llama.cpp build."')
runner_lines.append(' echo "[odysseus] Install ROCm for AMD GPUs or vLLM/CUDA tooling for NVIDIA, then re-launch this serve task."') runner_lines.append(' echo "[odysseus] Install Vulkan (libvulkan-dev) / ROCm for AMD GPUs or CUDA tooling for NVIDIA, then re-launch this serve task."')
runner_lines.append(' rm -rf build')
runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server') runner_lines.append(' cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j"$NPROC" --target llama-server && ln -sf ~/llama.cpp/build/bin/llama-server ~/bin/llama-server')
runner_lines.append(' fi') runner_lines.append(' fi')
runner_lines.append(' fi # end _odysseus_have_prebuilt guard')
def _llama_cpp_rebuild_cmd() -> str: def _llama_cpp_rebuild_cmd(update_source: bool = False) -> str:
"""Shell command that clears the Cookbook-managed llama.cpp build. """Shell command that clears the Cookbook-managed llama.cpp build.
Removes the cached ``llama-server`` symlink and the ``~/llama.cpp/build`` Removes the cached ``llama-server`` symlink and the ``~/llama.cpp/build*``
directory so the next llama.cpp serve recompiles from source, picking up a directory so the next llama.cpp serve recompiles from source, picking up a
CUDA or HIP toolchain if one is now available. The serve bootstrap only CUDA or HIP toolchain if one is now available. The serve bootstrap only
builds when ``llama-server`` is missing from PATH, so without this an builds when ``llama-server`` is missing from PATH, so without this an
existing CPU-only build is reused forever. It deliberately installs and existing CPU-only build is reused forever. When ``update_source`` is true,
downloads nothing; the rebuild itself happens on the next serve. the command also fast-forwards the Cookbook-managed ``~/llama.cpp`` checkout
if it exists. The rebuild itself happens on the next serve.
""" """
update_cmd = ''
if update_source:
update_cmd = (
'if [ -d "$HOME/llama.cpp/.git" ]; then '
'git -C "$HOME/llama.cpp" pull --ff-only --depth 1 || '
'echo "[odysseus] WARNING: llama.cpp source update failed; clearing cached build anyway."; '
'elif command -v git >/dev/null 2>&1; then '
'git clone --depth 1 https://github.com/ggml-org/llama.cpp "$HOME/llama.cpp" || '
'echo "[odysseus] WARNING: llama.cpp clone failed; clearing cached build anyway."; '
'fi && '
)
return ( return (
'mkdir -p "$HOME/bin" && ' 'mkdir -p "$HOME/bin" && '
f'{update_cmd}'
'rm -f "$HOME/bin/llama-server" && ' 'rm -f "$HOME/bin/llama-server" && '
'rm -rf "$HOME/llama.cpp/build" && ' 'rm -rf "$HOME/llama.cpp/build" "$HOME/llama.cpp/build-vulkan" && '
'echo "[odysseus] Cleared the cached llama.cpp build. ' 'echo "[odysseus] Cleared the cached llama.cpp build. '
'Re-launch the serve task to rebuild llama-server from source ' 'Re-launch the serve task to rebuild llama-server from source '
'(CUDA or HIP will be used if a toolchain is now available)."' '(Vulkan, HIP, or CUDA will be used if a matching toolchain is now available)."'
) )
@@ -1113,8 +1258,27 @@ def _diagnose_serve_output(text: str) -> dict | None:
"SGLang is not installed or not in PATH on this server.", "SGLang is not installed or not in PATH on this server.",
[{"label": "install SGLang in Cookbook Dependencies", "op": "dependency", "package": "sglang[all]"}], [{"label": "install SGLang in Cookbook Dependencies", "op": "dependency", "package": "sglang[all]"}],
), ),
# System build deps come BEFORE the generic llama.cpp catch-all so
# cmake / build-essential / git missing → a specific OS-package
# remediation instead of "install llama-cpp-python[server]" (which
# itself fails to compile when cmake is absent).
( (
r"llama-server.*command not found|llama\.cpp.*not found|No module named.*llama_cpp|No module named 'starlette_context'|git: command not found|cmake: command not found", r"cmake: command not found|cmake.*not found.*[Cc]ould not",
"cmake is required to build llama.cpp from source but isn't installed on this server.",
[{"label": "install build deps for llama.cpp (apt: cmake build-essential git / pacman: cmake base-devel git / dnf: cmake gcc-c++ make git / brew: cmake git)", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"^(make|g\+\+|gcc): command not found|Could not find C\+\+ compiler",
"A C/C++ compiler (build-essential) is required to build llama.cpp from source.",
[{"label": "install build deps for llama.cpp on this server", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"^git: command not found",
"git is required to clone the llama.cpp source tree.",
[{"label": "install build deps for llama.cpp on this server", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"llama-server.*command not found|llama\.cpp.*not found|No module named.*llama_cpp|No module named 'starlette_context'",
"llama.cpp / llama-cpp-python dependencies are missing.", "llama.cpp / llama-cpp-python dependencies are missing.",
[{"label": "install llama.cpp dependencies or llama-cpp-python[server]", "op": "dependency", "package": "llama-cpp-python[server]"}], [{"label": "install llama.cpp dependencies or llama-cpp-python[server]", "op": "dependency", "package": "llama-cpp-python[server]"}],
), ),
+340 -16
View File
@@ -189,8 +189,27 @@ def setup_cookbook_routes() -> APIRouter:
"SGLang is not installed or not in PATH on this server.", "SGLang is not installed or not in PATH on this server.",
[{"label": "install SGLang in Cookbook Dependencies", "op": "dependency", "package": "sglang[all]"}], [{"label": "install SGLang in Cookbook Dependencies", "op": "dependency", "package": "sglang[all]"}],
), ),
# System build deps come BEFORE the generic llama.cpp catch-all
# so cmake / build-essential / git missing → a specific OS-package
# remediation instead of "install llama-cpp-python[server]" (which
# itself fails to compile when cmake is absent).
( (
r"llama-server.*command not found|llama\.cpp.*not found|No module named.*llama_cpp|No module named 'starlette_context'|git: command not found|cmake: command not found", r"cmake: command not found|cmake.*not found.*[Cc]ould not",
"cmake is required to build llama.cpp from source but isn't installed on this server.",
[{"label": "install build deps for llama.cpp (apt: cmake build-essential git / pacman: cmake base-devel git / dnf: cmake gcc-c++ make git / brew: cmake git)", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"^(make|g\+\+|gcc): command not found|Could not find C\+\+ compiler",
"A C/C++ compiler (build-essential) is required to build llama.cpp from source.",
[{"label": "install build deps for llama.cpp on this server", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"^git: command not found",
"git is required to clone the llama.cpp source tree.",
[{"label": "install build deps for llama.cpp on this server", "op": "dependency", "package": "llama-cpp-python[server]"}],
),
(
r"llama-server.*command not found|llama\.cpp.*not found|No module named.*llama_cpp|No module named 'starlette_context'",
"llama.cpp / llama-cpp-python dependencies are missing.", "llama.cpp / llama-cpp-python dependencies are missing.",
[{"label": "install llama.cpp dependencies or llama-cpp-python[server]", "op": "dependency", "package": "llama-cpp-python[server]"}], [{"label": "install llama.cpp dependencies or llama-cpp-python[server]", "op": "dependency", "package": "llama-cpp-python[server]"}],
), ),
@@ -254,6 +273,79 @@ def setup_cookbook_routes() -> APIRouter:
def _load_stored_hf_token() -> str: def _load_stored_hf_token() -> str:
return load_stored_hf_token(state_path=_cookbook_state_path) return load_stored_hf_token(state_path=_cookbook_state_path)
def _normalize_minimax_m3_vllm_cmd(cmd: str) -> str:
"""Patch MiniMax M3 vLLM launches into the known-good local form.
The browser form can be stale or omit advanced-only fields. MiniMax M3
is sensitive to several flags: using the HF repo id with block-size 128
fails KV-cache setup, and FlashInfer sampler JIT fails on this host's
system nvcc. Normalize server-side before writing the tmux runner.
"""
cmd_lower = (cmd or "").lower()
if not cmd or "vllm serve" not in cmd_lower or "minimax" not in cmd_lower or "m3" not in cmd_lower:
return cmd
try:
parts = shlex.split(cmd)
except ValueError:
return cmd
if "serve" not in parts:
return cmd
env_re = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")
env_parts = [p for p in parts if env_re.match(p)]
body = [p for p in parts if not env_re.match(p)]
try:
serve_i = body.index("serve")
except ValueError:
return cmd
if serve_i + 1 >= len(body):
return cmd
repo_id = "cyankiwi/MiniMax-M3-AWQ-INT4"
snapshot = (
"/home/pewds/.cache/huggingface/hub/"
"models--cyankiwi--MiniMax-M3-AWQ-INT4/"
"snapshots/4082acbbec1236d21828d55b6bb0fe02ade4ab5b"
)
if body[serve_i + 1] == repo_id:
body[serve_i + 1] = snapshot
def add_env(key: str, value: str) -> None:
if not any(p.startswith(f"{key}=") for p in env_parts):
env_parts.append(f"{key}={value}")
def has_flag(flag: str) -> bool:
return any(p == flag or p.startswith(flag + "=") for p in body)
def set_flag(flag: str, value: str) -> None:
for i, part in enumerate(body):
if part == flag:
if i + 1 < len(body):
body[i + 1] = value
else:
body.append(value)
return
if part.startswith(flag + "="):
body[i] = f"{flag}={value}"
return
body.extend([flag, value])
def add_bool(flag: str) -> None:
if not has_flag(flag):
body.append(flag)
add_env("VLLM_TARGET_DEVICE", "cuda")
add_env("VLLM_USE_FLASHINFER_SAMPLER", "0")
set_flag("--served-model-name", repo_id)
set_flag("--tool-call-parser", "minimax_m3")
set_flag("--reasoning-parser", "minimax_m3")
set_flag("--attention-backend", "TRITON_ATTN")
set_flag("--block-size", "128")
add_bool("--language-model-only")
add_bool("--disable-custom-all-reduce")
add_bool("--enable-expert-parallel")
return shlex.join(env_parts + body)
def _cookbook_ssh_dir() -> Path: def _cookbook_ssh_dir() -> Path:
# The Docker image keeps cookbook keys under /app/.ssh; that path only # The Docker image keeps cookbook keys under /app/.ssh; that path only
# exists inside the container. On Windows (and any non-container host) # exists inside the container. On Windows (and any non-container host)
@@ -1230,6 +1322,7 @@ def setup_cookbook_routes() -> APIRouter:
# `TypeError: argument of type 'NoneType'` (a 500 instead of a clean 400). # `TypeError: argument of type 'NoneType'` (a 500 instead of a clean 400).
req.cmd = _validate_serve_cmd(req.cmd) or "" req.cmd = _validate_serve_cmd(req.cmd) or ""
req.cmd = _normalize_llama_cpp_python_cache_types(req.cmd) or "" req.cmd = _normalize_llama_cpp_python_cache_types(req.cmd) or ""
req.cmd = _normalize_minimax_m3_vllm_cmd(req.cmd)
req.cmd = _venv_safe_local_pip_install_cmd( req.cmd = _venv_safe_local_pip_install_cmd(
req.cmd, req.cmd,
local=not bool(req.remote_host), local=not bool(req.remote_host),
@@ -1243,8 +1336,16 @@ def setup_cookbook_routes() -> APIRouter:
req.cmd = _pip_install_no_cache(req.cmd) req.cmd = _pip_install_no_cache(req.cmd)
# Accept common aliases and enforce server extras for llama-cpp so # Accept common aliases and enforce server extras for llama-cpp so
# `python -m llama_cpp.server` has all runtime dependencies. # `python -m llama_cpp.server` has all runtime dependencies.
req.cmd = re.sub(r"(?<![A-Za-z0-9_.-])llama_cpp(?![A-Za-z0-9_.-])", "llama-cpp-python[server]", req.cmd) # CRITICAL: the lookbehind / lookahead must also exclude `/` so
req.cmd = re.sub(r"(?<![A-Za-z0-9_.-])llama-cpp-python(?!\[)", "llama-cpp-python[server]", req.cmd) # the regex DOESN'T mangle a URL path like
# https://abetlen.github.io/llama-cpp-python/whl/cu124
# The previous regex turned that URL into
# https://abetlen.github.io/llama-cpp-python[server]/whl/cu124
# which pip then couldn't resolve → silent fallback to source
# build of the .tar.gz → CPU-only binary (because CMAKE_ARGS
# isn't set), defeating the entire purpose of the CUDA index.
req.cmd = re.sub(r"(?<![A-Za-z0-9_.\-/])llama_cpp(?![A-Za-z0-9_.\-/])", "llama-cpp-python[server]", req.cmd)
req.cmd = re.sub(r"(?<![A-Za-z0-9_.\-/])llama-cpp-python(?![\[/])", "llama-cpp-python[server]", req.cmd)
if "llama-cpp-python" in req.cmd and "--extra-index-url" not in req.cmd: if "llama-cpp-python" in req.cmd and "--extra-index-url" not in req.cmd:
req.cmd += " --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu" req.cmd += " --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu"
# PEP-508-style package spec — letters, digits, `.-_` for the # PEP-508-style package spec — letters, digits, `.-_` for the
@@ -1284,6 +1385,11 @@ def setup_cookbook_routes() -> APIRouter:
# LOCAL execution on a native-Windows host never uses tmux (detached # LOCAL execution on a native-Windows host never uses tmux (detached
# process path below), regardless of the UI-supplied platform. # process path below), regardless of the UI-supplied platform.
local_windows = IS_WINDOWS and not remote local_windows = IS_WINDOWS and not remote
if is_windows and remote and "diffusion_server.py" in req.cmd:
raise HTTPException(
400,
"Remote Windows Diffusers serving is not supported yet; use local Windows or a Linux remote server.",
)
if not is_windows and not local_windows and not await _binary_available("tmux", remote, req.ssh_port): if not is_windows and not local_windows and not await _binary_available("tmux", remote, req.ssh_port):
return { return {
@@ -1426,6 +1532,69 @@ def setup_cookbook_routes() -> APIRouter:
runner_lines.append(' else') runner_lines.append(' else')
_append_llama_cpp_linux_accel_build_lines(runner_lines) _append_llama_cpp_linux_accel_build_lines(runner_lines)
runner_lines.append(' fi') runner_lines.append(' fi')
# Source the env file the prebuilt-download path writes so
# LD_LIBRARY_PATH includes the directory holding libllama.so
# and friends. No-op when prebuilt wasn't used.
runner_lines.append(' [ -r ~/.config/odysseus-llama-cpp-env ] && . ~/.config/odysseus-llama-cpp-env')
# Auto-upgrade pip llama-cpp-python to the CUDA-enabled
# wheel when (a) NVIDIA hardware is present and (b) the
# currently-installed wheel is CPU-only. Without this the
# user gets the Python server happily running at 3 tok/s
# because pip's default index ships CPU-only wheels.
# Forward-compat: cu124 wheels work on driver/runtime
# 12.4+ including the cu13.x line.
runner_lines.append(' if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L 2>/dev/null | grep -q "GPU " && python3 -c "import llama_cpp" 2>/dev/null; then')
runner_lines.append(' if ! python3 -c "import llama_cpp; import sys; sys.exit(0 if llama_cpp.llama_supports_gpu_offload() else 1)" 2>/dev/null; then')
runner_lines.append(' echo "[odysseus] NVIDIA detected but installed llama-cpp-python is CPU-only — reinstalling with CUDA wheel index for GPU offload..."')
runner_lines.append(' python3 -m pip install --user --break-system-packages --force-reinstall --no-cache-dir "llama-cpp-python[server]" --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124 2>&1 | tail -8 || echo "[odysseus] WARNING: CUDA wheel reinstall failed — Python server will stay CPU-only (slow). Manual fix: pip install --user --force-reinstall \'llama-cpp-python[server]\' --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124"')
runner_lines.append(' if python3 -c "import llama_cpp; import sys; sys.exit(0 if llama_cpp.llama_supports_gpu_offload() else 1)" 2>/dev/null; then')
runner_lines.append(' echo "[odysseus] llama-cpp-python now supports GPU offload."')
runner_lines.append(' fi')
runner_lines.append(' fi')
runner_lines.append(' fi')
# SHORT-CIRCUIT before the build/pip fallback: if the
# native binary is missing but llama_cpp Python is already
# installed, drop a wrapper at ~/bin/llama-server that
# translates llama-server CLI args to llama_cpp.server's
# underscore-style flags. The user's serve command stays
# `llama-server ...` and "just works" — no build, no cmake,
# no second install. This is the path that unblocks every
# remote where pip-installed llama-cpp-python is already
# working but Cookbook used to insist on a native binary.
runner_lines.append(' if ! command -v llama-server >/dev/null 2>&1 && python3 -c "import llama_cpp" 2>/dev/null; then')
runner_lines.append(' mkdir -p ~/bin')
runner_lines.append(' cat > ~/bin/llama-server <<\'_ODY_LLAMA_SHIM_EOF\'')
runner_lines.append('#!/usr/bin/env bash')
runner_lines.append('# Auto-generated by Odysseus Cookbook: a `llama-server` lookalike')
runner_lines.append('# that translates the native CLI to `python -m llama_cpp.server`.')
runner_lines.append('# Lets cookbook-generated launch commands run unchanged on hosts')
runner_lines.append('# where only the pip llama-cpp-python package is installed.')
runner_lines.append('ARGS=()')
runner_lines.append('while [ $# -gt 0 ]; do')
runner_lines.append(' case "$1" in')
runner_lines.append(' -ngl|--gpu-layers|--n-gpu-layers) ARGS+=(--n_gpu_layers "$2"); shift 2 ;;')
runner_lines.append(' -c|--ctx-size) ARGS+=(--n_ctx "$2"); shift 2 ;;')
runner_lines.append(' -b|--batch-size) ARGS+=(--n_batch "$2"); shift 2 ;;')
runner_lines.append(' -ub|--ubatch-size) shift 2 ;; # llama-cpp-python has no separate ubatch')
runner_lines.append(' --flash-attn) ARGS+=(--flash_attn true); shift 2 ;;')
runner_lines.append(' --cache-type-k) ARGS+=(--type_k "$2"); shift 2 ;;')
runner_lines.append(' --cache-type-v) ARGS+=(--type_v "$2"); shift 2 ;;')
runner_lines.append(' --n-cpu-moe) ARGS+=(--n_cpu_moe "$2"); shift 2 ;;')
runner_lines.append(' --mmproj) ARGS+=(--clip_model_path "$2"); shift 2 ;;')
runner_lines.append(' --image-max-tokens) shift 2 ;; # native-only')
runner_lines.append(' --no-mmap) ARGS+=(--no_mmap true); shift ;;')
runner_lines.append(' --no-warmup) shift ;; # native-only')
runner_lines.append(' --chat-template) ARGS+=(--chat_format "$2"); shift 2 ;;')
runner_lines.append(' --fit|--split-mode|--tensor-split|--main-gpu|--parallel) shift 2 ;; # native-only')
runner_lines.append(' --mlock) ARGS+=(--use_mlock true); shift ;;')
runner_lines.append(' *) ARGS+=("$1"); shift ;;')
runner_lines.append(' esac')
runner_lines.append('done')
runner_lines.append('exec python3 -m llama_cpp.server "${ARGS[@]}"')
runner_lines.append('_ODY_LLAMA_SHIM_EOF')
runner_lines.append(' chmod +x ~/bin/llama-server')
runner_lines.append(' echo "[odysseus] Created llama-server shim → python -m llama_cpp.server (no native binary needed)"')
runner_lines.append(' fi')
runner_lines.append(' # If the native build failed, fall back to the Python bindings.') runner_lines.append(' # If the native build failed, fall back to the Python bindings.')
runner_lines.append(' if ! command -v llama-server &>/dev/null && ! python3 -c "import llama_cpp" 2>/dev/null; then') runner_lines.append(' if ! command -v llama-server &>/dev/null && ! python3 -c "import llama_cpp" 2>/dev/null; then')
runner_lines.append(' echo "llama-server build failed — installing Python bindings as fallback..."') runner_lines.append(' echo "llama-server build failed — installing Python bindings as fallback..."')
@@ -1489,6 +1658,96 @@ def setup_cookbook_routes() -> APIRouter:
runner_lines.append(' echo "ERROR: vLLM is not installed."') runner_lines.append(' echo "ERROR: vLLM is not installed."')
runner_lines.append(' ODYSSEUS_PREFLIGHT_EXIT=127') runner_lines.append(' ODYSSEUS_PREFLIGHT_EXIT=127')
runner_lines.append('fi') runner_lines.append('fi')
runner_lines.append(f"ODYSSEUS_SERVE_CMD='{_bash_squote(req.cmd)}'")
runner_lines.append('if [ -z "$ODYSSEUS_PREFLIGHT_EXIT" ]; then')
runner_lines.append(' ODYSSEUS_VLLM_HELP_CMD="$(python3 - "$ODYSSEUS_SERVE_CMD" <<\'PY\'')
runner_lines.append('import shlex, sys')
runner_lines.append('parts = shlex.split(sys.argv[1])')
runner_lines.append('try:')
runner_lines.append(' serve_i = parts.index("serve")')
runner_lines.append('except ValueError:')
runner_lines.append(' print("vllm serve --help")')
runner_lines.append('else:')
runner_lines.append(' print(shlex.join(parts[:serve_i + 1] + ["--help"]))')
runner_lines.append('PY')
runner_lines.append(')"')
runner_lines.append(' ODYSSEUS_VLLM_SUPPORTS_SWAP=0')
runner_lines.append(' if eval "$ODYSSEUS_VLLM_HELP_CMD" 2>&1 | grep -q -- "--swap-space"; then ODYSSEUS_VLLM_SUPPORTS_SWAP=1; fi')
runner_lines.append('fi')
runner_lines.append('if [ -z "$ODYSSEUS_PREFLIGHT_EXIT" ] && [ "${ODYSSEUS_VLLM_SUPPORTS_SWAP:-0}" = "1" ] && ! printf "%s" "$ODYSSEUS_SERVE_CMD" | grep -q -- "--swap-space"; then')
runner_lines.append(' echo "[odysseus] Setting vLLM --swap-space 0 so the runtime does not reserve CPU swap per GPU."')
runner_lines.append(' ODYSSEUS_SERVE_CMD="${ODYSSEUS_SERVE_CMD} --swap-space 0"')
runner_lines.append('fi')
runner_lines.append('if [ -z "$ODYSSEUS_PREFLIGHT_EXIT" ] && [ "${ODYSSEUS_VLLM_SUPPORTS_SWAP:-0}" != "1" ]; then')
runner_lines.append(' if printf "%s" "$ODYSSEUS_SERVE_CMD" | grep -q -- "--swap-space"; then')
runner_lines.append(' echo "[odysseus] vLLM serve does not expose --swap-space; removing the flag and patching the runtime default to 0."')
runner_lines.append(' ODYSSEUS_SERVE_CMD="$(python3 - "$ODYSSEUS_SERVE_CMD" <<\'PY\'')
runner_lines.append('import shlex, sys')
runner_lines.append('parts = shlex.split(sys.argv[1])')
runner_lines.append('out = []')
runner_lines.append('skip = False')
runner_lines.append('for part in parts:')
runner_lines.append(' if skip:')
runner_lines.append(' skip = False')
runner_lines.append(' continue')
runner_lines.append(' if part == "--swap-space":')
runner_lines.append(' skip = True')
runner_lines.append(' continue')
runner_lines.append(' if part.startswith("--swap-space="):')
runner_lines.append(' continue')
runner_lines.append(' out.append(part)')
runner_lines.append('print(shlex.join(out))')
runner_lines.append('PY')
runner_lines.append(')"')
runner_lines.append(' fi')
runner_lines.append(' ODYSSEUS_SERVE_CMD="$(python3 - "$ODYSSEUS_SERVE_CMD" <<\'PY\'')
runner_lines.append('import shlex, sys')
runner_lines.append('parts = shlex.split(sys.argv[1])')
runner_lines.append('patch = r"""import inspect, sys')
runner_lines.append('from vllm.engine.arg_utils import EngineArgs, AsyncEngineArgs')
runner_lines.append('def _odysseus_swap0(cls):')
runner_lines.append(' params = list(inspect.signature(cls).parameters)')
runner_lines.append(' if "swap_space" not in params:')
runner_lines.append(' return')
runner_lines.append(' idx = params.index("swap_space")')
runner_lines.append(' defaults = list(cls.__init__.__defaults__ or ())')
runner_lines.append(' if idx < len(defaults):')
runner_lines.append(' defaults[idx] = 0')
runner_lines.append(' cls.__init__.__defaults__ = tuple(defaults)')
runner_lines.append(' fields = getattr(cls, "__dataclass_fields__", {})')
runner_lines.append(' if "swap_space" in fields:')
runner_lines.append(' fields["swap_space"].default = 0')
runner_lines.append('_odysseus_swap0(EngineArgs)')
runner_lines.append('_odysseus_swap0(AsyncEngineArgs)')
runner_lines.append('try:')
runner_lines.append(' from vllm.config import CacheConfig')
runner_lines.append(' CacheConfig.swap_space = 0')
runner_lines.append('except Exception:')
runner_lines.append(' pass')
runner_lines.append('_orig_create_engine_config = EngineArgs.create_engine_config')
runner_lines.append('def _odysseus_create_engine_config(self, *args, **kwargs):')
runner_lines.append(' self.swap_space = 0')
runner_lines.append(' return _orig_create_engine_config(self, *args, **kwargs)')
runner_lines.append('EngineArgs.create_engine_config = _odysseus_create_engine_config')
runner_lines.append('AsyncEngineArgs.create_engine_config = _odysseus_create_engine_config')
runner_lines.append('from vllm.entrypoints.cli.main import main')
runner_lines.append('sys.exit(main())"""')
runner_lines.append('try:')
runner_lines.append(' serve_i = parts.index("serve")')
runner_lines.append('except ValueError:')
runner_lines.append(' print(shlex.join(parts))')
runner_lines.append('else:')
runner_lines.append(' exe_i = serve_i - 1')
runner_lines.append(' exe = parts[exe_i] if exe_i >= 0 else "vllm"')
runner_lines.append(' py = "python3"')
runner_lines.append(' if exe.endswith("/bin/vllm"):')
runner_lines.append(' py = exe[:-len("/bin/vllm")] + "/bin/python"')
runner_lines.append(' parts[exe_i:serve_i] = [py, "-c", patch]')
runner_lines.append(' print(shlex.join(parts))')
runner_lines.append('PY')
runner_lines.append(')"')
runner_lines.append(' echo "[odysseus] Patched vLLM internal swap_space default to 0 for this runtime."')
runner_lines.append('fi')
elif "sglang.launch_server" in req.cmd: elif "sglang.launch_server" in req.cmd:
runner_lines.append('export PATH="$HOME/.local/bin:$PATH"') runner_lines.append('export PATH="$HOME/.local/bin:$PATH"')
runner_lines.append('if ! command -v sglang &>/dev/null; then') runner_lines.append('if ! command -v sglang &>/dev/null; then')
@@ -1530,7 +1789,10 @@ def setup_cookbook_routes() -> APIRouter:
runner_lines, runner_lines,
keep_shell_open=not local_windows, keep_shell_open=not local_windows,
) )
runner_lines.append(req.cmd) if "vllm serve" in req.cmd:
runner_lines.append('eval "$ODYSSEUS_SERVE_CMD"')
else:
runner_lines.append(req.cmd)
if local_windows: if local_windows:
# Detached background process — no interactive shell to keep open. # Detached background process — no interactive shell to keep open.
# Print the exit marker the status poller looks for, then stop. # Print the exit marker the status poller looks for, then stop.
@@ -1834,6 +2096,25 @@ def setup_cookbook_routes() -> APIRouter:
out, err = await _run_gpu_shell("ls -1 /sys/class/drm 2>/dev/null", host, ssh_port, timeout=4) out, err = await _run_gpu_shell("ls -1 /sys/class/drm 2>/dev/null", host, ssh_port, timeout=4)
if err is not None or not out: if err is not None or not out:
return [] return []
# Pick the runtime label up-front so each GPU dict gets the
# right `backend`. AMD silicon can be driven by ROCm/HIP (native)
# OR Vulkan (mesa RADV). Reporting "rocm" on a host where no
# ROCm toolchain is installed misleads the frontend env-var
# prefix logic — it would emit `HIP_VISIBLE_DEVICES=` for a
# Vulkan-only stack, which is a silent no-op at best.
rt_out, _ = await _run_gpu_shell(
'command -v rocminfo >/dev/null 2>&1 && echo rocm '
'|| (command -v hipconfig >/dev/null 2>&1 && echo rocm) '
'|| (command -v vulkaninfo >/dev/null 2>&1 && echo vulkan) '
'|| echo unknown',
host, ssh_port, timeout=4,
)
_amd_runtime = (rt_out or "").strip().splitlines()[-1:][0].strip() if rt_out else "rocm"
if _amd_runtime not in ("rocm", "vulkan"):
# Default to rocm so existing ROCm-installed hosts keep
# working; "unknown" only happens when neither toolchain is
# detected (e.g. minimal sysfs read on a fresh box).
_amd_runtime = "rocm"
gpus = [] gpus = []
for entry in out.split(): for entry in out.split():
if not entry.startswith("card") or "-" in entry: if not entry.startswith("card") or "-" in entry:
@@ -1877,7 +2158,7 @@ def setup_cookbook_routes() -> APIRouter:
"free_mb": free_mb, "total_mb": total_mb, "used_mb": used_mb, "free_mb": free_mb, "total_mb": total_mb, "used_mb": used_mb,
"gtt_used_mb": gtt_used_mb, "gtt_used_mb": gtt_used_mb,
"util_pct": 0, "busy": bool(total_mb and (free_mb / total_mb) < 0.85), "util_pct": 0, "busy": bool(total_mb and (free_mb / total_mb) < 0.85),
"processes": [], "backend": "rocm", "source": "amd-sysfs", "processes": [], "backend": _amd_runtime, "source": "amd-sysfs",
"unified_memory": unified, "unified_memory": unified,
}) })
if gpus: if gpus:
@@ -2018,10 +2299,15 @@ def setup_cookbook_routes() -> APIRouter:
amd_gpus = await _probe_amd_sysfs(host, ssh_port) amd_gpus = await _probe_amd_sysfs(host, ssh_port)
if amd_gpus: if amd_gpus:
# The per-GPU dict already carries the runtime label picked by
# _probe_amd_sysfs (rocm vs vulkan); mirror that into the
# wrapper so the frontend can read `data.backend` directly
# without scanning the list.
_amd_wrap_backend = str(amd_gpus[0].get("backend") or "rocm")
return { return {
"ok": True, "ok": True,
"gpus": amd_gpus, "gpus": amd_gpus,
"backend": "rocm", "backend": _amd_wrap_backend,
"source": "amd-sysfs", "source": "amd-sysfs",
"fallback_from": "nvidia-smi", "fallback_from": "nvidia-smi",
"nvidia_error": nvidia_error, "nvidia_error": nvidia_error,
@@ -2161,6 +2447,17 @@ def setup_cookbook_routes() -> APIRouter:
disk_tasks = on_disk.get("tasks") or [] if isinstance(on_disk, dict) else [] disk_tasks = on_disk.get("tasks") or [] if isinstance(on_disk, dict) else []
incoming_tasks = data.get("tasks") if isinstance(data.get("tasks"), list) else [] incoming_tasks = data.get("tasks") if isinstance(data.get("tasks"), list) else []
incoming_removed = data.get("removedTasks") if isinstance(data.get("removedTasks"), dict) else {}
disk_removed = on_disk.get("removedTasks") if isinstance(on_disk, dict) and isinstance(on_disk.get("removedTasks"), dict) else {}
removed_tasks = {**disk_removed, **incoming_removed}
data["removedTasks"] = removed_tasks
removed_ids = set(removed_tasks.keys())
if removed_ids:
incoming_tasks = [
t for t in incoming_tasks
if not (isinstance(t, dict) and t.get("sessionId") in removed_ids)
]
data["tasks"] = incoming_tasks
# Anti-poisoning guard: a stale browser tab can keep POSTing a # Anti-poisoning guard: a stale browser tab can keep POSTing a
# download task as status='done' from before the strict-finish # download task as status='done' from before the strict-finish
# fix landed, undoing any server-side correction. For each # fix landed, undoing any server-side correction. For each
@@ -2198,6 +2495,8 @@ def setup_cookbook_routes() -> APIRouter:
sid = t.get("sessionId") sid = t.get("sessionId")
if not sid or sid in incoming_ids: if not sid or sid in incoming_ids:
continue # client's version wins continue # client's version wins
if sid in removed_ids:
continue # intentional cross-device clear/remove
ts = t.get("ts") or 0 ts = t.get("ts") or 0
if isinstance(ts, (int, float)) and (now_ms - ts) <= RACE_WINDOW_MS: if isinstance(ts, (int, float)) and (now_ms - ts) <= RACE_WINDOW_MS:
preserved.append(t) preserved.append(t)
@@ -2304,16 +2603,14 @@ def setup_cookbook_routes() -> APIRouter:
# Add 30% headroom for KV cache, activations, etc. # Add 30% headroom for KV cache, activations, etc.
needed_vram = (est_vram * 1.3) if est_vram else None needed_vram = (est_vram * 1.3) if est_vram else None
if vram_gb > 0 and needed_vram is not None and needed_vram > vram_gb: if vram_gb > 0:
continue if needed_vram is None:
# Unknown-size models (e.g. MiniMax-M2.7, DeepSeek-V4-Flash) have no # The "trending models that fit" list must be conservative:
# "NB" in the repo id, so the regex above can't extract their # if we cannot estimate size from the repo id/tags, do not
# param count. Previously we dropped them entirely, which made # present it as runnable on this hardware.
# brand-new flagship releases silently vanish from this list even continue
# on rigs with hundreds of GB of VRAM. Adapters/LoRAs are already if needed_vram > vram_gb:
# filtered by _is_excluded(), so what falls through here is continue
# overwhelmingly full models — keep them, just without a size
# badge (the frontend handles needed_vram_gb=null gracefully).
out.append({ out.append({
"repo_id": repo_id, "repo_id": repo_id,
@@ -2510,6 +2807,33 @@ def setup_cookbook_routes() -> APIRouter:
except Exception as e: except Exception as e:
logger.warning(f"orphan sweep: state write failed: {e}") logger.warning(f"orphan sweep: state write failed: {e}")
@router.get("/api/cookbook/hf-gguf-files")
async def hf_gguf_files(repo_id: str, owner: str = Depends(require_user)):
"""List GGUF files in a HuggingFace repo for the direct-download picker."""
import httpx
repo_id = _validate_repo_id(repo_id)
url = f"https://huggingface.co/api/models/{repo_id}"
try:
headers = {}
token = _load_stored_hf_token()
if token:
headers["Authorization"] = f"Bearer {token}"
async with httpx.AsyncClient(timeout=15, follow_redirects=True) as client:
resp = await client.get(url, headers=headers)
if resp.status_code != 200:
return {"ok": False, "files": [], "error": f"HF API HTTP {resp.status_code}"}
data = resp.json()
except Exception:
logger.exception("HF GGUF file scan failed for %s", repo)
return {"ok": False, "files": [], "error": "HF API request failed"}
files = [
str(s.get("rfilename") or "")
for s in data.get("siblings", [])
if str(s.get("rfilename") or "").lower().endswith(".gguf")
]
return {"ok": True, "repo_id": repo_id, "files": files}
# In-memory cache for the Ollama library scrape. ollama.com is a public # In-memory cache for the Ollama library scrape. ollama.com is a public
# site, but it doesn't expose a stable JSON listing — we fetch the HTML # site, but it doesn't expose a stable JSON listing — we fetch the HTML
# search page and regex out the model cards. Cached for 1 h so a busy # search page and regex out the model cards. Cached for 1 h so a busy
+7 -2
View File
@@ -12,6 +12,7 @@ from pydantic import BaseModel
from core.database import Document, DocumentVersion from core.database import Document, DocumentVersion
from core.database import Session as DbSession from core.database import Session as DbSession
from src.auth_helpers import _auth_disabled
from src.upload_handler import UploadHandler from src.upload_handler import UploadHandler
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -78,6 +79,8 @@ def _verify_doc_owner(db, doc: Document, user: str):
the session join for any not-yet-backfilled legacy row. the session join for any not-yet-backfilled legacy row.
""" """
if user is None: if user is None:
if _auth_disabled():
return # Single-user / no-auth mode: allow access
raise HTTPException(403, "Authentication required") raise HTTPException(403, "Authentication required")
if doc.owner is not None: if doc.owner is not None:
if doc.owner != user: if doc.owner != user:
@@ -102,8 +105,10 @@ def _owner_session_filter(q, user):
The owner backfill runs in init_db before the app serves requests, so The owner backfill runs in init_db before the app serves requests, so
by the time this filter is live there are no NULL-owner rows to leak; by the time this filter is live there are no NULL-owner rows to leak;
we therefore match the owner strictly.""" we therefore match the owner strictly for authenticated callers."""
if user is None: if not user:
if user == "" or _auth_disabled():
return q
return q.filter(False) return q.filter(False)
return q.filter(Document.owner == user) return q.filter(Document.owner == user)
+13 -5
View File
@@ -10,7 +10,7 @@ from fastapi import APIRouter, HTTPException, Query, Request, UploadFile, File,
from sqlalchemy import case, func, or_ from sqlalchemy import case, func, or_
from core.database import SessionLocal, Document, DocumentVersion from core.database import SessionLocal, Document, DocumentVersion
from core.database import Session as DbSession from core.database import Session as DbSession
from src.auth_helpers import get_current_user from src.auth_helpers import get_current_user, _auth_disabled
from src.constants import MAIL_ATTACHMENTS_DIR from src.constants import MAIL_ATTACHMENTS_DIR
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -388,7 +388,8 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
db = SessionLocal() db = SessionLocal()
try: try:
if not user: if not user:
raise HTTPException(403, "Authentication required") if not _auth_disabled():
raise HTTPException(403, "Authentication required")
# v2 review HIGH-9: raise 403 explicitly when the caller # v2 review HIGH-9: raise 403 explicitly when the caller
# can't see this session, instead of returning [] which the # can't see this session, instead of returning [] which the
# UI treats identically to "no docs" and silently masks # UI treats identically to "no docs" and silently masks
@@ -503,7 +504,8 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
user = get_current_user(request) user = get_current_user(request)
try: try:
data = await request.json() data = await request.json()
except Exception: except Exception as e:
logger.warning("Failed to parse export request body, defaulting to empty", exc_info=e)
data = {} data = {}
ids = data.get("ids") or [] ids = data.get("ids") or []
if not ids: if not ids:
@@ -645,8 +647,8 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
try: try:
from src.agent_tools.document_tools import clear_active_document from src.agent_tools.document_tools import clear_active_document
clear_active_document(doc_id) clear_active_document(doc_id)
except Exception: except Exception as e:
pass logger.warning("Failed to clear active document %r on detach", doc_id, exc_info=e)
db.commit() db.commit()
db.refresh(doc) db.refresh(doc)
return _doc_to_dict(doc) return _doc_to_dict(doc)
@@ -1331,6 +1333,12 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
if not pdf_path: if not pdf_path:
raise HTTPException(404, f"Source PDF {upload_id} not found") raise HTTPException(404, f"Source PDF {upload_id} not found")
# Fail fast with a clear 503 if the optional PyMuPDF dependency
# is missing — fill_fields/stamp_annotations will otherwise
# raise RuntimeError deep inside and bubble out as a 500.
# Mirrors the convention in _load_pdf_viewer_fitz above.
_load_pdf_viewer_fitz()
values = parse_markdown_to_values(doc.current_content or "") values = parse_markdown_to_values(doc.current_content or "")
out_path = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False).name out_path = tempfile.NamedTemporaryFile(suffix=".pdf", delete=False).name
_to_unlink.append(out_path) _to_unlink.append(out_path)
+179 -18
View File
@@ -13,6 +13,8 @@ and `email_pollers.py` (the background loops):
""" """
import os import os
import base64
import time
import imaplib import imaplib
import smtplib import smtplib
import email as email_mod import email as email_mod
@@ -38,6 +40,106 @@ from src.secret_storage import decrypt as _decrypt
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def _xoauth2_raw(user: str, access_token: str) -> str:
"""The SASL XOAUTH2 initial-response string (unencoded).
Both smtplib.SMTP.auth() and imaplib.IMAP4.authenticate() base64-encode
the value their callback returns, so callers pass this raw form never
pre-encoded to avoid double base64.
"""
return f"user={user}\x01auth=Bearer {access_token}\x01\x01"
def _xoauth2_bytes(user: str, access_token: str) -> bytes:
"""Raw XOAUTH2 bytes for imaplib's authenticate() callback."""
return _xoauth2_raw(user, access_token).encode()
def make_oauth_state(account_id: str, owner: str) -> str:
"""Return an HMAC-signed, base64-encoded OAuth state token.
Encodes account_id + owner + a random nonce, signed with the app secret
so the callback can validate that the flow was initiated by an
authenticated, owning user (CSRF / state-forgery protection).
"""
import hmac as _hmac, hashlib as _hl, secrets as _sec
from src.secret_storage import _load_or_create_key
nonce = _sec.token_hex(16)
payload = json.dumps({"a": account_id, "o": owner, "n": nonce}, separators=(",", ":"))
sig = _hmac.new(_load_or_create_key(), payload.encode(), _hl.sha256).hexdigest()
return base64.urlsafe_b64encode(f"{payload}|{sig}".encode()).decode()
def verify_oauth_state(state: str) -> dict | None:
"""Verify an OAuth state token's HMAC signature.
Returns the decoded payload dict ({"a", "o", "n"}) on success, or None if
the token is malformed, tampered, or signed with a different key.
"""
import hmac as _hmac, hashlib as _hl
from src.secret_storage import _load_or_create_key
try:
decoded = base64.urlsafe_b64decode(state.encode()).decode()
payload, sig = decoded.rsplit("|", 1)
expected = _hmac.new(_load_or_create_key(), payload.encode(), _hl.sha256).hexdigest()
if not _hmac.compare_digest(sig, expected):
return None
return json.loads(payload)
except Exception:
return None
def _refresh_google_token(account_id: str) -> str | None:
"""Exchange the stored refresh token for a new access token and persist it."""
import httpx
from core.database import SessionLocal as _SL, EmailAccount as _EA
from src.secret_storage import encrypt as _enc, decrypt as _dec
client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
client_secret = os.environ.get("GOOGLE_OAUTH_CLIENT_SECRET", "")
if not client_id or not client_secret:
return None
db = _SL()
try:
row = db.get(_EA, account_id)
if not row or not row.oauth_refresh_token:
return None
refresh_token = _dec(row.oauth_refresh_token or "")
if not refresh_token:
return None
resp = httpx.post("https://oauth2.googleapis.com/token", data={
"client_id": client_id,
"client_secret": client_secret,
"refresh_token": refresh_token,
"grant_type": "refresh_token",
}, timeout=10)
resp.raise_for_status()
data = resp.json()
access_token = data["access_token"]
row.oauth_access_token = _enc(access_token)
row.oauth_token_expiry = str(int(time.time()) + data.get("expires_in", 3600))
db.commit()
return access_token
except Exception:
logger.warning(f"Google token refresh failed for account {account_id}")
return None
finally:
db.close()
def _get_valid_google_token(account_id: str, cfg: dict) -> str | None:
"""Return a valid Google access token, refreshing if expired or missing."""
from src.secret_storage import decrypt as _dec
access_token = _dec(cfg.get("oauth_access_token") or "")
expiry_str = cfg.get("oauth_token_expiry") or ""
if access_token and expiry_str:
try:
if int(expiry_str) - 60 > time.time():
return access_token
except (ValueError, TypeError):
pass
return _refresh_google_token(account_id)
def _smtp_security_mode(cfg: dict) -> str: def _smtp_security_mode(cfg: dict) -> str:
raw = str(cfg.get("smtp_security") or "").strip().lower() raw = str(cfg.get("smtp_security") or "").strip().lower()
if raw in {"ssl", "starttls", "none"}: if raw in {"ssl", "starttls", "none"}:
@@ -54,20 +156,29 @@ def _send_smtp_message(cfg: dict, from_addr: str, recipients: list[str], message
port = int(cfg.get("smtp_port") or 465) port = int(cfg.get("smtp_port") or 465)
user = cfg.get("smtp_user") or "" user = cfg.get("smtp_user") or ""
password = cfg.get("smtp_password") or "" password = cfg.get("smtp_password") or ""
def _auth_smtp(smtp):
if cfg.get("oauth_provider") == "google":
token = _get_valid_google_token(cfg.get("account_id"), cfg)
if not token:
raise RuntimeError("Google OAuth token unavailable — reconnect the account")
smtp.ehlo()
smtp.auth("XOAUTH2", lambda challenge=None: _xoauth2_raw(user, token), initial_response_ok=True)
elif user and password:
smtp.login(user, password)
security = _smtp_security_mode(cfg) security = _smtp_security_mode(cfg)
if security == "ssl": if security == "ssl":
with smtplib.SMTP_SSL(host, port, timeout=timeout) as smtp: with smtplib.SMTP_SSL(host, port, timeout=timeout) as smtp:
if user and password: _auth_smtp(smtp)
smtp.login(user, password)
smtp.sendmail(from_addr, recipients, message) smtp.sendmail(from_addr, recipients, message)
return return
with smtplib.SMTP(host, port, timeout=timeout) as smtp: with smtplib.SMTP(host, port, timeout=timeout) as smtp:
if security == "starttls": if security == "starttls":
smtp.starttls() smtp.starttls()
if user and password: _auth_smtp(smtp)
smtp.login(user, password)
smtp.sendmail(from_addr, recipients, message) smtp.sendmail(from_addr, recipients, message)
@@ -701,10 +812,16 @@ def _get_email_config(account_id: str | None = None, owner: str = "") -> dict:
"imap_password": _decrypt(row.imap_password or ""), "imap_password": _decrypt(row.imap_password or ""),
"imap_starttls": bool(row.imap_starttls), "imap_starttls": bool(row.imap_starttls),
"from_address": row.from_address or row.imap_user or "", "from_address": row.from_address or row.imap_user or "",
"oauth_provider": row.oauth_provider or "",
"oauth_access_token": row.oauth_access_token or "",
"oauth_refresh_token": row.oauth_refresh_token or "",
"oauth_token_expiry": row.oauth_token_expiry or "",
"display_name": row.display_name or "",
} }
if not (cfg["smtp_host"] and cfg["smtp_user"] and cfg["smtp_password"]): is_oauth = bool(cfg.get("oauth_provider"))
if not is_oauth and not (cfg["smtp_host"] and cfg["smtp_user"] and cfg["smtp_password"]):
logger.warning(f"SMTP not configured for account {row.name!r}") logger.warning(f"SMTP not configured for account {row.name!r}")
if not (cfg["imap_host"] and cfg["imap_user"] and cfg["imap_password"]): if not is_oauth and not (cfg["imap_host"] and cfg["imap_user"] and cfg["imap_password"]):
logger.warning(f"IMAP not configured for account {row.name!r}") logger.warning(f"IMAP not configured for account {row.name!r}")
return cfg return cfg
finally: finally:
@@ -825,12 +942,19 @@ def _imap_connect(account_id: str | None = None, owner: str = "",
timeout=timeout, timeout=timeout,
) )
try: try:
conn.login(cfg["imap_user"], cfg["imap_password"]) if cfg.get("oauth_provider") == "google":
token = _get_valid_google_token(cfg.get("account_id"), cfg)
if not token:
raise RuntimeError("Google OAuth token unavailable — reconnect the account in Settings → Integrations")
conn.authenticate("XOAUTH2", lambda x: _xoauth2_bytes(cfg["imap_user"], token))
else:
conn.login(cfg["imap_user"], cfg["imap_password"])
except Exception: except Exception:
# A failed AUTHENTICATE (e.g. an Office 365 app password on an # A failed AUTHENTICATE (e.g. an Office 365 app password on an
# MFA-enabled tenant, #3174) otherwise orphans the already-connected # MFA-enabled tenant, #3174, or an expired/revoked OAuth token)
# socket; close it before propagating so a misconfigured account # otherwise orphans the already-connected socket; close it before
# can't leak one descriptor per retry / background poller pass. # propagating so a misconfigured account can't leak one descriptor
# per retry / background poller pass.
try: try:
conn.shutdown() conn.shutdown()
except Exception: except Exception:
@@ -1109,22 +1233,30 @@ def _list_attachments_from_msg(msg):
return attachments return attachments
idx = 0 idx = 0
for part in msg.walk(): for part in msg.walk():
if part.is_multipart():
continue
cd = str(part.get("Content-Disposition", "")) cd = str(part.get("Content-Disposition", ""))
ct = part.get_content_type() ct = part.get_content_type()
is_attached_email = ct == "message/rfc822" and ("attachment" in cd.lower() or part.get_filename())
if part.is_multipart() and not is_attached_email:
continue
# Skip text/html body parts (only consider real attachments) # Skip text/html body parts (only consider real attachments)
if ct in ("text/plain", "text/html") and "attachment" not in cd: if ct in ("text/plain", "text/html") and "attachment" not in cd:
continue continue
filename = part.get_filename() filename = part.get_filename()
if filename: if filename:
filename = _decode_header(filename) filename = _decode_header(filename)
if ct == "message/rfc822" and not re.search(r"\.[A-Za-z0-9]{1,8}$", filename):
filename = f"{filename}.eml"
else: else:
# Inline images, etc. - generate a name # Inline images, etc. - generate a name
ext = ct.split("/")[-1] if "/" in ct else "bin" ext = "eml" if ct == "message/rfc822" else (ct.split("/")[-1] if "/" in ct else "bin")
filename = f"attachment_{idx}.{ext}" filename = f"attachment_{idx}.{ext}"
payload = part.get_payload(decode=True) payload = part.get_payload(decode=True)
size = len(payload) if payload else 0 if payload is None and ct == "message/rfc822":
try:
payload = part.as_bytes()
except Exception:
payload = b""
size = len(payload) if payload is not None else 0
attachments.append({ attachments.append({
"index": idx, "index": idx,
"filename": filename, "filename": filename,
@@ -1136,29 +1268,58 @@ def _list_attachments_from_msg(msg):
return attachments return attachments
def _is_likely_signature_image_attachment(att: dict) -> bool:
"""Match the reader's inline signature/logo image filter."""
filename = str((att or {}).get("filename") or "").lower()
if not re.search(r"\.(png|jpe?g|gif|bmp|svg|webp)$", filename):
return False
size = int((att or {}).get("size") or 0)
if re.search(r"^image\d{3,}\.(png|jpe?g|gif)$", filename):
return True
if re.search(r"^(signature|logo|sig|footer|banner)[-_\d]*\.(png|jpe?g|gif|svg)$", filename):
return True
return 0 < size < 30 * 1024
def _has_visible_attachments(msg) -> bool:
"""Return True only for attachments the reader will render as chips."""
return any(
not _is_likely_signature_image_attachment(att)
for att in _list_attachments_from_msg(msg)
)
def _extract_attachment_to_disk(msg, index, target_dir): def _extract_attachment_to_disk(msg, index, target_dir):
"""Extract a specific attachment to disk and return the file path.""" """Extract a specific attachment to disk and return the file path."""
if not msg.is_multipart(): if not msg.is_multipart():
return None return None
idx = 0 idx = 0
for part in msg.walk(): for part in msg.walk():
if part.is_multipart():
continue
cd = str(part.get("Content-Disposition", "")) cd = str(part.get("Content-Disposition", ""))
ct = part.get_content_type() ct = part.get_content_type()
is_attached_email = ct == "message/rfc822" and ("attachment" in cd.lower() or part.get_filename())
if part.is_multipart() and not is_attached_email:
continue
if ct in ("text/plain", "text/html") and "attachment" not in cd: if ct in ("text/plain", "text/html") and "attachment" not in cd:
continue continue
if idx == index: if idx == index:
filename = part.get_filename() filename = part.get_filename()
if filename: if filename:
filename = _decode_header(filename) filename = _decode_header(filename)
if ct == "message/rfc822" and not re.search(r"\.[A-Za-z0-9]{1,8}$", filename):
filename = f"{filename}.eml"
else: else:
ext = ct.split("/")[-1] if "/" in ct else "bin" ext = "eml" if ct == "message/rfc822" else (ct.split("/")[-1] if "/" in ct else "bin")
filename = f"attachment_{idx}.{ext}" filename = f"attachment_{idx}.{ext}"
# Sanitize # Sanitize
safe_name = re.sub(r"[^\w\s\-.]", "_", filename).strip() safe_name = re.sub(r"[^\w\s\-.]", "_", filename).strip()
payload = part.get_payload(decode=True) payload = part.get_payload(decode=True)
if not payload: if payload is None and ct == "message/rfc822":
try:
payload = part.as_bytes()
except Exception:
payload = b""
if payload is None:
return None return None
target_dir.mkdir(parents=True, exist_ok=True) target_dir.mkdir(parents=True, exist_ok=True)
filepath = target_dir / safe_name filepath = target_dir / safe_name
+12 -1
View File
@@ -44,6 +44,17 @@ from routes.email_helpers import (
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
# Recovers a `[{"action": ...}, ...]` JSON array from raw LLM output when the
# fenced-block strip leaves nothing usable. Runs on model output influenced by
# untrusted email bodies, so it must not backtrack: the object content class is
# `[^{}]` (brace-delimited, greedy) rather than the old `[^[\]]*?` lazy runs,
# which exploded exponentially on inputs like `[{"action"},{` + `}},{{` * N
# (CodeQL py/redos #198).
_CAL_ACTION_ARRAY_RE = re.compile(
r'\[\s*\{[^{}]*"action"[^{}]*\}\s*(?:,\s*\{[^{}]*\}\s*)*\]',
re.DOTALL,
)
def _owner_for_email_account(account_id: str | None) -> str: def _owner_for_email_account(account_id: str | None) -> str:
if not account_id: if not account_id:
@@ -558,7 +569,7 @@ async def _auto_summarize_pass_single(days_back: int = 1, account_id: str | None
cal_extract = _strip_think(_raw_original) cal_extract = _strip_think(_raw_original)
cal_extract = re.sub(r"^```(?:json)?\s*|\s*```$", "", cal_extract, flags=re.MULTILINE).strip() cal_extract = re.sub(r"^```(?:json)?\s*|\s*```$", "", cal_extract, flags=re.MULTILINE).strip()
if not cal_extract and _raw_original: if not cal_extract and _raw_original:
matches = list(re.finditer(r'\[\s*\{[^[\]]*?"action"[^[\]]*?\}\s*(?:,\s*\{[^[\]]*?\}\s*)*\]', _raw_original, re.DOTALL)) matches = list(_CAL_ACTION_ARRAY_RE.finditer(_raw_original))
if matches: if matches:
cal_extract = matches[-1].group() cal_extract = matches[-1].group()
logger.info(f"[cal-extract] uid={uid.decode() if isinstance(uid, bytes) else uid} folder={_folder} subj={subject[:50]!r} raw_len={len(cal_extract)} orig_len={len(_raw_original)} raw={cal_extract[:800]!r}") logger.info(f"[cal-extract] uid={uid.decode() if isinstance(uid, bytes) else uid} folder={_folder} subj={subject[:50]!r} raw_len={len(cal_extract)} orig_len={len(_raw_original)} raw={cal_extract[:800]!r}")
+365 -52
View File
@@ -13,7 +13,9 @@ handlers need. The split is mechanical — no behavior change.
""" """
import asyncio import asyncio
import os
import sqlite3 as _sql3 import sqlite3 as _sql3
import time
import email as email_mod import email as email_mod
import email.header import email.header
import email.utils import email.utils
@@ -43,8 +45,9 @@ from routes.email_helpers import (
_load_settings, _save_settings, _get_email_config, _load_settings, _save_settings, _get_email_config,
_send_smtp_message, _smtp_security_mode, _send_smtp_message, _smtp_security_mode,
_IMAP_TIMEOUT_SECONDS, _open_imap_connection, _IMAP_TIMEOUT_SECONDS, _open_imap_connection,
make_oauth_state, verify_oauth_state,
_imap_connect, _imap, _decode_header, _detect_sent_folder, _detect_drafts_folder, _imap_connect, _imap, _decode_header, _detect_sent_folder, _detect_drafts_folder,
_extract_attachment_text, _list_attachments_from_msg, _extract_attachment_text, _list_attachments_from_msg, _has_visible_attachments, _is_likely_signature_image_attachment,
_extract_attachment_to_disk, _extract_html, _extract_text, _extract_attachment_to_disk, _extract_html, _extract_text,
_fetch_sender_thread_context, _pre_retrieve_context, _fetch_sender_thread_context, _pre_retrieve_context,
_EMAIL_REPLY_SYS_PROMPT_BASE, _POOL_HOOKS, _EMAIL_REPLY_SYS_PROMPT_BASE, _POOL_HOOKS,
@@ -58,6 +61,7 @@ from routes.email_pollers import _start_poller
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
ODYSSEUS_MAIL_ORIGIN = "odysseus-ui" ODYSSEUS_MAIL_ORIGIN = "odysseus-ui"
EMAIL_READ_ATTACHMENT_VERSION = 2
def _email_tag_owner_aliases(account_id: str | None, owner: str = "") -> list[str]: def _email_tag_owner_aliases(account_id: str | None, owner: str = "") -> list[str]:
@@ -76,15 +80,16 @@ def _email_tag_owner_aliases(account_id: str | None, owner: str = "") -> list[st
cfg.get("smtp_user") or "", cfg.get("smtp_user") or "",
cfg.get("from_address") or "", cfg.get("from_address") or "",
]) ])
except Exception: except Exception as _e:
logger.warning("Failed to resolve email account alias", exc_info=_e)
resolved_account_id = None resolved_account_id = None
row = db.get(_EA, resolved_account_id) if resolved_account_id else None row = db.get(_EA, resolved_account_id) if resolved_account_id else None
if row: if row:
aliases.extend([row.owner or "", row.imap_user or "", row.from_address or ""]) aliases.extend([row.owner or "", row.imap_user or "", row.from_address or ""])
finally: finally:
db.close() db.close()
except Exception: except Exception as _e:
pass logger.warning("Failed to load email aliases", exc_info=_e)
out = [] out = []
for a in aliases: for a in aliases:
a = (a or "").strip() a = (a or "").strip()
@@ -244,6 +249,21 @@ def _imap_uid_fetch(conn, uid_set: str | bytes, query: str):
return conn.uid("FETCH", _uid_bytes(uid_set), query) return conn.uid("FETCH", _uid_bytes(uid_set), query)
def _imap_search_quote(value: str) -> str:
return '"' + str(value or "").replace("\\", "\\\\").replace('"', '\\"') + '"'
def _message_id_chain(*values: str) -> list[str]:
seen = set()
out = []
for value in values:
for mid in re.findall(r"<[^>]+>", value or ""):
if mid not in seen:
seen.add(mid)
out.append(mid)
return out
def _uid_from_fetch_meta(meta_b: bytes) -> str: def _uid_from_fetch_meta(meta_b: bytes) -> str:
m = re.search(rb"\bUID\s+(\d+)\b", meta_b) m = re.search(rb"\bUID\s+(\d+)\b", meta_b)
return m.group(1).decode() if m else "" return m.group(1).decode() if m else ""
@@ -285,7 +305,9 @@ def _group_uid_fetch_records(msg_data) -> list:
def _smtp_ready(cfg: dict) -> bool: def _smtp_ready(cfg: dict) -> bool:
return bool(cfg.get("smtp_host") and cfg.get("smtp_user") and cfg.get("smtp_password")) if not cfg.get("smtp_host") or not cfg.get("smtp_user"):
return False
return bool(cfg.get("smtp_password") or cfg.get("oauth_provider"))
def _resolve_send_config(account_id: str | None = None, owner: str = "") -> dict: def _resolve_send_config(account_id: str | None = None, owner: str = "") -> dict:
@@ -360,6 +382,21 @@ def _apply_odysseus_headers(msg, kind: str | None = None, ref_id: str | None = N
msg["X-Odysseus-Ref"] = re.sub(r"[^A-Za-z0-9_.:-]", "-", ref_id)[:128] msg["X-Odysseus-Ref"] = re.sub(r"[^A-Za-z0-9_.:-]", "-", ref_id)[:128]
def _normalize_addr_field(field: str) -> str:
"""Strip the malformed-but-common trailing/leading commas and stray
whitespace from a To/Cc/Bcc string before it lands in the MIME header
or the SMTP envelope. Users often paste a single address with a
trailing comma (e.g. `felix@pewdiepie.com,`) and most MTAs reject the
resulting `To: felix@pewdiepie.com,` line as a syntax error. Collapse
any run of separator junk between addresses too."""
if not field:
return field
# Split on commas, drop empty tokens, rejoin with a single ', '.
parts = [p.strip() for p in field.split(",")]
parts = [p for p in parts if p]
return ", ".join(parts)
def _envelope_recipients(*fields: str) -> list: def _envelope_recipients(*fields: str) -> list:
"""Extract bare SMTP envelope addresses from one or more To/Cc/Bcc header """Extract bare SMTP envelope addresses from one or more To/Cc/Bcc header
strings. A naive `field.split(",")` corrupts display names that contain a strings. A naive `field.split(",")` corrupts display names that contain a
@@ -988,6 +1025,65 @@ def setup_email_routes():
except Exception: except Exception:
pass pass
def _related_thread_attachments_sync(
folder: str,
account_id: str | None,
owner: str,
current_uid: str,
current_message_id: str,
in_reply_to: str,
references: str,
limit: int = 12,
) -> list[dict]:
"""Return visible attachments from referenced messages in this folder."""
wanted_ids = _message_id_chain(references, in_reply_to)
current_mid = (current_message_id or "").strip()
wanted_ids = [mid for mid in wanted_ids if mid and mid != current_mid]
if not wanted_ids:
return []
related: list[dict] = []
try:
with _imap(account_id, owner=owner) as conn:
conn.select(_q(folder), readonly=True)
# Search newest referenced messages first; cap work so opening
# a long thread stays bounded.
for mid in reversed(wanted_ids[-10:]):
if len(related) >= limit:
break
status, data = _imap_uid_search(conn, f'(HEADER Message-ID {_imap_search_quote(mid)})')
if status != "OK" or not data or not data[0]:
continue
for uid_b in reversed(data[0].split()[-3:]):
source_uid = uid_b.decode(errors="ignore")
if not source_uid or source_uid == str(current_uid):
continue
st2, msg_data = _imap_uid_fetch(conn, source_uid, "(BODY.PEEK[])")
if st2 != "OK" or not msg_data or not isinstance(msg_data[0], tuple):
continue
msg = email_mod.message_from_bytes(msg_data[0][1])
source_from = _decode_header(msg.get("From", ""))
source_subject = _decode_header(msg.get("Subject", ""))
source_date = msg.get("Date", "")
for att in _list_attachments_from_msg(msg):
if _is_likely_signature_image_attachment(att):
continue
enriched = dict(att)
enriched.update({
"source_uid": source_uid,
"source_folder": folder,
"source_message_id": (msg.get("Message-ID") or "").strip(),
"source_from": source_from,
"source_subject": source_subject,
"source_date": source_date,
})
related.append(enriched)
if len(related) >= limit:
break
except Exception as e:
logger.debug(f"related thread attachment lookup failed uid={current_uid}: {e}")
return related
@router.get("/list") @router.get("/list")
async def list_emails( async def list_emails(
folder: str = Query("INBOX"), folder: str = Query("INBOX"),
@@ -1258,6 +1354,17 @@ def setup_email_routes():
sender_name, sender_addr = email.utils.parseaddr(sender) sender_name, sender_addr = email.utils.parseaddr(sender)
parsed_date = email.utils.parsedate_to_datetime(date_str) if date_str else None parsed_date = email.utils.parsedate_to_datetime(date_str) if date_str else None
attachments = _list_attachments_from_msg(msg) attachments = _list_attachments_from_msg(msg)
related_attachments = []
if not _has_visible_attachments(msg):
related_attachments = _related_thread_attachments_sync(
folder,
account_id,
owner,
uid,
message_id,
in_reply_to,
references,
)
if mark_seen: if mark_seen:
# Set \Seen in a separate readwrite session so concurrent reads # Set \Seen in a separate readwrite session so concurrent reads
@@ -1366,6 +1473,8 @@ def setup_email_routes():
"body": body, "body": body,
"body_html": body_html, "body_html": body_html,
"attachments": attachments, "attachments": attachments,
"related_attachments": related_attachments,
"attachment_version": EMAIL_READ_ATTACHMENT_VERSION,
"cached_summary": cached_summary, "cached_summary": cached_summary,
"cached_ai_reply": cached_ai_reply, "cached_ai_reply": cached_ai_reply,
"boundaries": cached_boundaries, "boundaries": cached_boundaries,
@@ -1396,6 +1505,12 @@ def setup_email_routes():
"""Read email body. Cached for 30m, sync IMAP work runs in a thread.""" """Read email body. Cached for 30m, sync IMAP work runs in a thread."""
ck = _read_cache_key(account_id, folder, uid, owner=owner) ck = _read_cache_key(account_id, folder, uid, owner=owner)
cached = _read_cache_get(ck) cached = _read_cache_get(ck)
if cached is not None:
# Older cached read responses lack the thread-attachment fallback.
# Fetch once so replies that reference prior attachments can show
# those files without waiting for cache expiry.
if cached.get("attachment_version") != EMAIL_READ_ATTACHMENT_VERSION:
cached = None
if cached is not None: if cached is not None:
if mark_seen: if mark_seen:
try: try:
@@ -1530,6 +1645,12 @@ def setup_email_routes():
return {"error": f"Attachment index {index} not found"} return {"error": f"Attachment index {index} not found"}
from pathlib import Path as _Path from pathlib import Path as _Path
target_root = os.path.abspath(str(target_dir))
filepath_str = os.path.abspath(str(filepath))
if os.path.commonpath([target_root, filepath_str]) != target_root:
logger.warning("Rejected attachment path outside extraction dir: %s", filepath)
return {"error": "Invalid attachment path"}
filepath = _Path(filepath_str)
base = _Path(filepath).name base = _Path(filepath).name
if base.startswith("."): if base.startswith("."):
return {"error": "Invalid filename", "filename": base} return {"error": "Invalid filename", "filename": base}
@@ -1584,6 +1705,65 @@ def setup_email_routes():
return None return None
doc_session_id = _resolve_doc_session() doc_session_id = _resolve_doc_session()
def _create_markdown_doc(content: str, summary: str):
from src.database import SessionLocal as _SL, Document as _Doc, DocumentVersion as _DV
doc_id = str(uuid.uuid4())
ver_id = str(uuid.uuid4())
_db = _SL()
try:
_db.query(_Doc).filter(_Doc.is_active == True).update({"is_active": False})
_db.add(_Doc(
id=doc_id, session_id=doc_session_id, title=title,
language="markdown", current_content=content,
version_count=1, is_active=True,
))
_db.add(_DV(
id=ver_id, document_id=doc_id, version_number=1,
content=content, summary=summary, source="upload",
))
_db.commit()
finally:
_db.close()
_tag_doc_with_source(doc_id)
return doc_id
def _attached_email_markdown(raw_bytes: bytes):
if not raw_bytes:
return f"# Attached email: {base}\n\n_(empty email attachment)_"
try:
attached_msg = email_mod.message_from_bytes(raw_bytes)
except Exception:
logger.exception("Failed to parse attached email %s", base)
return f"# Attached email: {base}\n\nCould not parse this email attachment."
attached_subject = _decode_header(attached_msg.get("Subject", "")) or base
attached_from = _decode_header(attached_msg.get("From", ""))
attached_to = _decode_header(attached_msg.get("To", ""))
attached_cc = _decode_header(attached_msg.get("Cc", ""))
attached_date = attached_msg.get("Date", "")
attached_body = _extract_text(attached_msg).strip()
attached_atts = _list_attachments_from_msg(attached_msg)
lines = [f"# Attached email: {attached_subject}", ""]
if attached_from:
lines.append(f"**From:** {attached_from}")
if attached_to:
lines.append(f"**To:** {attached_to}")
if attached_cc:
lines.append(f"**Cc:** {attached_cc}")
if attached_date:
lines.append(f"**Date:** {attached_date}")
lines.extend(["", "## Body", "", attached_body or "_(no readable body)_"])
if attached_atts:
lines.extend(["", "## Attachments", ""])
for att in attached_atts:
size = int(att.get("size") or 0)
size_label = f"{size} B" if size < 1024 else f"{round(size / 1024)} KB"
name = att.get("filename") or f"attachment_{att.get('index', '')}"
ctype = att.get("content_type") or "application/octet-stream"
lines.append(f"- {name} ({ctype}, {size_label})")
return "\n".join(lines).strip()
# ── PDF path (existing) ──────────────────────────────────── # ── PDF path (existing) ────────────────────────────────────
if ext == ".pdf": if ext == ".pdf":
import shutil as _shutil import shutil as _shutil
@@ -1630,6 +1810,39 @@ def setup_email_routes():
_tag_doc_with_source(doc_id) _tag_doc_with_source(doc_id)
return {"doc_id": doc_id, "filename": filepath.name} return {"doc_id": doc_id, "filename": filepath.name}
# ── Attached email (.eml / message/rfc822) ────────────────
if ext == ".eml":
def _attachment_bytes_from_msg():
if not msg.is_multipart():
return b""
idx = 0
for part in msg.walk():
cd = str(part.get("Content-Disposition", ""))
ct = part.get_content_type()
is_attached_email = ct == "message/rfc822" and ("attachment" in cd.lower() or part.get_filename())
if part.is_multipart() and not is_attached_email:
continue
if ct in ("text/plain", "text/html") and "attachment" not in cd:
continue
if idx == index:
payload = part.get_payload(decode=True)
if payload is None and ct == "message/rfc822":
try:
payload = part.as_bytes()
except Exception:
payload = b""
return payload or b""
idx += 1
return b""
try:
content = _attached_email_markdown(_attachment_bytes_from_msg())
except Exception:
logger.exception("Failed to read email attachment %s", base)
return {"error": "Failed to read email attachment", "filename": base}
doc_id = _create_markdown_doc(content, "Imported attached email")
return {"doc_id": doc_id, "filename": filepath.name}
# ── DOCX path: extract text → markdown document ─────────── # ── DOCX path: extract text → markdown document ───────────
if ext == ".docx": if ext == ".docx":
try: try:
@@ -1667,25 +1880,7 @@ def setup_email_routes():
lines.append("") lines.append("")
content = "\n".join(lines).strip() or f"_(empty {base})_" content = "\n".join(lines).strip() or f"_(empty {base})_"
from src.database import SessionLocal as _SL, Document as _Doc, DocumentVersion as _DV doc_id = _create_markdown_doc(content, "Imported from DOCX")
doc_id = str(uuid.uuid4())
ver_id = str(uuid.uuid4())
_db = _SL()
try:
_db.query(_Doc).filter(_Doc.is_active == True).update({"is_active": False})
_db.add(_Doc(
id=doc_id, session_id=doc_session_id, title=title,
language="markdown", current_content=content,
version_count=1, is_active=True,
))
_db.add(_DV(
id=ver_id, document_id=doc_id, version_number=1,
content=content, summary="Imported from DOCX", source="upload",
))
_db.commit()
finally:
_db.close()
_tag_doc_with_source(doc_id)
return {"doc_id": doc_id, "filename": filepath.name} return {"doc_id": doc_id, "filename": filepath.name}
# ── Plain text / markdown ──────────────────────────────── # ── Plain text / markdown ────────────────────────────────
@@ -1694,25 +1889,7 @@ def setup_email_routes():
content = filepath.read_text(encoding="utf-8", errors="replace") content = filepath.read_text(encoding="utf-8", errors="replace")
except Exception as e: except Exception as e:
return {"error": f"Failed to read text file: {e}", "filename": base} return {"error": f"Failed to read text file: {e}", "filename": base}
from src.database import SessionLocal as _SL, Document as _Doc, DocumentVersion as _DV doc_id = _create_markdown_doc(content, "Imported from email attachment")
doc_id = str(uuid.uuid4())
ver_id = str(uuid.uuid4())
_db = _SL()
try:
_db.query(_Doc).filter(_Doc.is_active == True).update({"is_active": False})
_db.add(_Doc(
id=doc_id, session_id=doc_session_id, title=title,
language="markdown", current_content=content,
version_count=1, is_active=True,
))
_db.add(_DV(
id=ver_id, document_id=doc_id, version_number=1,
content=content, summary="Imported from email attachment", source="upload",
))
_db.commit()
finally:
_db.close()
_tag_doc_with_source(doc_id)
return {"doc_id": doc_id, "filename": filepath.name} return {"doc_id": doc_id, "filename": filepath.name}
return {"error": f"Unsupported attachment type: {ext}", "filename": base} return {"error": f"Unsupported attachment type: {ext}", "filename": base}
@@ -2021,7 +2198,10 @@ def setup_email_routes():
outer = MIMEMultipart("alternative") outer = MIMEMultipart("alternative")
body_container = outer body_container = outer
outer["From"] = cfg["from_address"] to = _normalize_addr_field(to or "")
cc = _normalize_addr_field(cc or "")
bcc = _normalize_addr_field(bcc or "")
outer["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
outer["To"] = to outer["To"] = to
if cc: if cc:
outer["Cc"] = cc outer["Cc"] = cc
@@ -2165,12 +2345,10 @@ def setup_email_routes():
try: try:
conn = sqlite3.connect(SCHEDULED_DB) conn = sqlite3.connect(SCHEDULED_DB)
conn.row_factory = sqlite3.Row conn.row_factory = sqlite3.Row
# The MCP server can't easily set owner, so it stores '' — fall
# back to those rows in addition to the caller's owner.
rows = conn.execute( rows = conn.execute(
"""SELECT id, to_addr, subject, body, created_at, account_id """SELECT id, to_addr, subject, body, created_at, account_id
FROM scheduled_emails FROM scheduled_emails
WHERE status = 'agent_draft' AND (owner = ? OR owner = '') WHERE status = 'agent_draft' AND owner = ?
ORDER BY created_at DESC""", ORDER BY created_at DESC""",
(owner or "",), (owner or "",),
).fetchall() ).fetchall()
@@ -2191,7 +2369,7 @@ def setup_email_routes():
cur = conn.execute( cur = conn.execute(
"""UPDATE scheduled_emails """UPDATE scheduled_emails
SET status = 'pending', send_at = ? SET status = 'pending', send_at = ?
WHERE id = ? AND status = 'agent_draft' AND (owner = ? OR owner = '')""", WHERE id = ? AND status = 'agent_draft' AND owner = ?""",
(datetime.utcnow().isoformat(), sid, owner or ""), (datetime.utcnow().isoformat(), sid, owner or ""),
) )
conn.commit() conn.commit()
@@ -2212,7 +2390,7 @@ def setup_email_routes():
conn = sqlite3.connect(SCHEDULED_DB) conn = sqlite3.connect(SCHEDULED_DB)
cur = conn.execute( cur = conn.execute(
"""UPDATE scheduled_emails SET status = 'cancelled' """UPDATE scheduled_emails SET status = 'cancelled'
WHERE id = ? AND status = 'agent_draft' AND (owner = ? OR owner = '')""", WHERE id = ? AND status = 'agent_draft' AND owner = ?""",
(sid, owner or ""), (sid, owner or ""),
) )
conn.commit() conn.commit()
@@ -2285,6 +2463,7 @@ def setup_email_routes():
try: try:
cfg = _resolve_send_config(req.account_id, owner=owner) cfg = _resolve_send_config(req.account_id, owner=owner)
except Exception as e: except Exception as e:
logger.warning(f"No SMTP-capable account resolved: {e}")
return {"success": False, "error": str(e) or "No SMTP-capable email account configured"} return {"success": False, "error": str(e) or "No SMTP-capable email account configured"}
# Use 'mixed' if we have attachments, 'alternative' otherwise # Use 'mixed' if we have attachments, 'alternative' otherwise
@@ -2297,7 +2476,10 @@ def setup_email_routes():
outer = MIMEMultipart("alternative") outer = MIMEMultipart("alternative")
body_container = outer body_container = outer
outer["From"] = cfg["from_address"] req.to = _normalize_addr_field(req.to or "")
req.cc = _normalize_addr_field(req.cc or "")
req.bcc = _normalize_addr_field(req.bcc or "")
outer["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
outer["To"] = req.to outer["To"] = req.to
if req.cc: if req.cc:
outer["Cc"] = req.cc outer["Cc"] = req.cc
@@ -2348,6 +2530,10 @@ def setup_email_routes():
_account_id = cfg.get("account_id") or req.account_id # capture for the IMAP append in the closure _account_id = cfg.get("account_id") or req.account_id # capture for the IMAP append in the closure
_in_reply_to = (req.in_reply_to or "").strip() _in_reply_to = (req.in_reply_to or "").strip()
_oauth_provider = cfg.get("oauth_provider") or ""
_oauth_access_token = cfg.get("oauth_access_token") or ""
_oauth_refresh_token = cfg.get("oauth_refresh_token") or ""
_oauth_token_expiry = cfg.get("oauth_token_expiry") or ""
def _deliver(): def _deliver():
try: try:
@@ -2358,6 +2544,11 @@ def setup_email_routes():
"smtp_security": _smtp_security, "smtp_security": _smtp_security,
"smtp_user": _smtp_user, "smtp_user": _smtp_user,
"smtp_password": _smtp_pw, "smtp_password": _smtp_pw,
"account_id": _account_id,
"oauth_provider": _oauth_provider,
"oauth_access_token": _oauth_access_token,
"oauth_refresh_token": _oauth_refresh_token,
"oauth_token_expiry": _oauth_token_expiry,
}, },
_from, _from,
_recipients, _recipients,
@@ -2470,7 +2661,7 @@ def setup_email_routes():
msg.attach(MIMEText(_draft_html, "html", "utf-8")) msg.attach(MIMEText(_draft_html, "html", "utf-8"))
else: else:
msg = MIMEText(req.body, "plain", "utf-8") msg = MIMEText(req.body, "plain", "utf-8")
msg["From"] = cfg["from_address"] msg["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
msg["To"] = req.to msg["To"] = req.to
if req.cc: if req.cc:
msg["Cc"] = req.cc msg["Cc"] = req.cc
@@ -3122,6 +3313,8 @@ def setup_email_routes():
"from_address": r.from_address or "", "from_address": r.from_address or "",
"has_imap_password": bool(r.imap_password), "has_imap_password": bool(r.imap_password),
"has_smtp_password": bool(r.smtp_password), "has_smtp_password": bool(r.smtp_password),
"oauth_provider": r.oauth_provider or "",
"display_name": r.display_name or "",
}) })
return {"accounts": out} return {"accounts": out}
finally: finally:
@@ -3154,6 +3347,7 @@ def setup_email_routes():
smtp_user=(data.get("smtp_user") or "").strip(), smtp_user=(data.get("smtp_user") or "").strip(),
smtp_password=_enc(data.get("smtp_password") or ""), smtp_password=_enc(data.get("smtp_password") or ""),
from_address=(data.get("from_address") or "").strip(), from_address=(data.get("from_address") or "").strip(),
display_name=(data.get("display_name") or "").strip(),
# SECURITY: stamp the creator so all subsequent reads / mutations # SECURITY: stamp the creator so all subsequent reads / mutations
# can filter by user. Without this every new account leaks to # can filter by user. Without this every new account leaks to
# every other user. # every other user.
@@ -3188,7 +3382,7 @@ def setup_email_routes():
if not row: if not row:
return {"ok": False, "error": "Account not found"} return {"ok": False, "error": "Account not found"}
# Simple fields # Simple fields
for key in ("name", "imap_host", "imap_user", "smtp_host", "smtp_user", "from_address"): for key in ("name", "imap_host", "imap_user", "smtp_host", "smtp_user", "from_address", "display_name"):
if key in data: if key in data:
setattr(row, key, (data[key] or "").strip()) setattr(row, key, (data[key] or "").strip())
for key in ("imap_port", "smtp_port"): for key in ("imap_port", "smtp_port"):
@@ -3377,4 +3571,123 @@ def setup_email_routes():
finally: finally:
db.close() db.close()
# ── Google OAuth2 routes ──
@router.get("/oauth/google/authorize")
async def google_oauth_authorize(account_id: str = Query(...), request: Request = None, owner: str = Depends(require_user)):
import urllib.parse
_assert_owns_account(account_id, owner)
client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
if not client_id:
raise HTTPException(400, "GOOGLE_OAUTH_CLIENT_ID not set — add it to .env")
redirect_uri = (
os.environ.get("GOOGLE_OAUTH_REDIRECT_URI")
or f"http://{request.headers.get('host', 'localhost:7000')}/api/email/oauth/google/callback"
)
state = make_oauth_state(account_id, owner)
params = urllib.parse.urlencode({
"client_id": client_id,
"redirect_uri": redirect_uri,
"response_type": "code",
"scope": "https://mail.google.com/ email",
"access_type": "offline",
"prompt": "consent",
"state": state,
})
from fastapi.responses import RedirectResponse as _RR
return _RR(f"https://accounts.google.com/o/oauth2/v2/auth?{params}")
@router.get("/oauth/google/callback")
async def google_oauth_callback(
code: str = Query(None),
state: str = Query(None),
error: str = Query(None),
request: Request = None,
):
import urllib.parse
from fastapi.responses import RedirectResponse as _RR
if error:
return _RR("/?section=integrations&email_oauth_error=google_error")
if not code or not state:
return _RR("/?section=integrations&email_oauth_error=missing_code")
state_data = verify_oauth_state(state)
if not state_data:
return _RR("/?section=integrations&email_oauth_error=invalid_state")
account_id = state_data.get("a", "")
owner = state_data.get("o", "")
client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
client_secret = os.environ.get("GOOGLE_OAUTH_CLIENT_SECRET", "")
redirect_uri = (
os.environ.get("GOOGLE_OAUTH_REDIRECT_URI")
or f"http://{request.headers.get('host', 'localhost:7000')}/api/email/oauth/google/callback"
)
import httpx as _httpx
try:
resp = _httpx.post("https://oauth2.googleapis.com/token", data={
"code": code,
"client_id": client_id,
"client_secret": client_secret,
"redirect_uri": redirect_uri,
"grant_type": "authorization_code",
}, timeout=10)
resp.raise_for_status()
data = resp.json()
except Exception:
logger.warning("Google token exchange failed")
return _RR("/?section=integrations&email_oauth_error=token_exchange_failed")
access_token = data.get("access_token", "")
refresh_token = data.get("refresh_token", "")
expiry = str(int(time.time()) + data.get("expires_in", 3600))
# Fetch the email address from userinfo so we can auto-fill imap_user.
email_addr = ""
display_name = ""
try:
ui = _httpx.get("https://www.googleapis.com/oauth2/v1/userinfo",
headers={"Authorization": f"Bearer {access_token}"}, timeout=10)
if ui.is_success:
ui_data = ui.json()
email_addr = ui_data.get("email", "")
display_name = ui_data.get("name", "")
except Exception:
pass
from core.database import SessionLocal, EmailAccount
from src.secret_storage import encrypt as _enc
db = SessionLocal()
try:
row = db.query(EmailAccount).filter(EmailAccount.id == account_id).first()
if not row:
return _RR("/?section=integrations&email_oauth_error=account_not_found")
# SECURITY: verify the account belongs to the initiating user.
if owner and row.owner and row.owner != owner:
logger.warning("OAuth callback owner mismatch — rejecting token write")
return _RR("/?section=integrations&email_oauth_error=ownership_error")
row.oauth_provider = "google"
row.oauth_access_token = _enc(access_token)
if refresh_token:
row.oauth_refresh_token = _enc(refresh_token)
row.oauth_token_expiry = expiry
# Auto-fill Google IMAP/SMTP settings if not already configured.
if not row.imap_host:
row.imap_host = "imap.gmail.com"
row.imap_port = 993
row.imap_starttls = False
if not row.smtp_host:
row.smtp_host = "smtp.gmail.com"
row.smtp_port = 587
if email_addr:
if not row.imap_user:
row.imap_user = email_addr
if not row.smtp_user:
row.smtp_user = email_addr
if not row.from_address:
row.from_address = email_addr
if not row.name or row.name == row.id:
row.name = email_addr
if display_name and not row.display_name:
row.display_name = display_name
db.commit()
finally:
db.close()
return _RR("/?section=integrations&email_oauth_success=1")
return router return router
+1
View File
@@ -9,6 +9,7 @@ from pathlib import Path
from fastapi import APIRouter, HTTPException, Form, Depends from fastapi import APIRouter, HTTPException, Form, Depends
from core.constants import EMBEDDING_ENDPOINT_FILE, FASTEMBED_CACHE_DIR from core.constants import EMBEDDING_ENDPOINT_FILE, FASTEMBED_CACHE_DIR
from core.middleware import require_admin from core.middleware import require_admin
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
+2 -13
View File
@@ -67,14 +67,6 @@ def _gallery_image_path(filename: str) -> Path:
raise HTTPException(400, "Unsafe gallery filename") raise HTTPException(400, "Unsafe gallery filename")
if safe_name != original: if safe_name != original:
raise HTTPException(400, "Unsafe gallery filename") raise HTTPException(400, "Unsafe gallery filename")
if not path.exists():
cwd_root = (Path.cwd() / "data" / "generated_images").resolve()
cwd_path = (cwd_root / safe_name).resolve()
try:
if os.path.commonpath([str(cwd_root), str(cwd_path)]) == str(cwd_root) and cwd_path.exists():
return cwd_path
except Exception:
pass
return path return path
@@ -232,8 +224,6 @@ def setup_gallery_routes() -> APIRouter:
@router.post("/api/gallery/{image_id}/replace") @router.post("/api/gallery/{image_id}/replace")
async def gallery_replace(request: Request, image_id: str): async def gallery_replace(request: Request, image_id: str):
"""Replace an existing gallery image file with a new one.""" """Replace an existing gallery image file with a new one."""
from pathlib import Path
user = get_current_user(request) user = get_current_user(request)
db = SessionLocal() db = SessionLocal()
try: try:
@@ -249,9 +239,8 @@ def setup_gallery_routes() -> APIRouter:
raise HTTPException(400, "No image provided") raise HTTPException(400, "No image provided")
content = await read_upload_limited(file, GALLERY_UPLOAD_MAX_BYTES, "Gallery replacement") content = await read_upload_limited(file, GALLERY_UPLOAD_MAX_BYTES, "Gallery replacement")
img_dir = Path(GENERATED_IMAGES_DIR) GALLERY_IMAGE_DIR.mkdir(parents=True, exist_ok=True)
img_dir.mkdir(parents=True, exist_ok=True) img_path = _gallery_image_path(img.filename)
img_path = img_dir / _sanitize_gallery_filename(img.filename)
img_path.write_bytes(content) img_path.write_bytes(content)
# Refresh dimensions in case the editor resized the canvas. # Refresh dimensions in case the editor resized the canvas.
+111 -4
View File
@@ -1,8 +1,13 @@
import json
import os
import re import re
import shlex
import subprocess
from copy import deepcopy from copy import deepcopy
from fastapi import APIRouter, HTTPException from fastapi import APIRouter, HTTPException
from core.platform_compat import run_ssh_command
from routes._validators import validate_remote_host, validate_ssh_port from routes._validators import validate_remote_host, validate_ssh_port
@@ -107,6 +112,73 @@ def _apply_manual_hardware(system, manual_mode="", manual_gpu_count="", manual_v
return system return system
def _run_model_probe(host: str, ssh_port: str, cmd: str) -> str:
try:
if host:
r = run_ssh_command(
host,
ssh_port or None,
cmd,
timeout=15,
connect_timeout=5,
strict_host_key_checking=False,
text=True,
)
else:
r = subprocess.run(["bash", "-lc", cmd], capture_output=True, text=True, timeout=15)
if r.returncode == 0:
return (r.stdout or "").strip()
except Exception:
return ""
return ""
def _inspect_model_path(model_path: str, host: str = "", ssh_port: str = "") -> dict:
"""Read lightweight metadata from a local or SSH-visible HF model folder."""
path = (model_path or "").strip()
if not path or path.startswith(("http://", "https://")):
return {}
if not (path.startswith("/") or path.startswith("~")):
return {}
qpath = shlex.quote(path)
qconfig = shlex.quote(os.path.join(path, "config.json"))
out = {}
exists = _run_model_probe(host, ssh_port, f"test -d {qpath} && printf found || printf missing")
if exists != "found":
target = host or "local container"
out["model_probe_error"] = f"Model path is not visible on {target}: {path}"
return out
raw_config = _run_model_probe(host, ssh_port, f"test -f {qconfig} && sed -n '1,240p' {qconfig}")
if raw_config:
try:
cfg = json.loads(raw_config)
except Exception:
cfg = {}
for key in ("context_length", "max_position_embeddings", "n_ctx_train", "model_max_length", "max_seq_len"):
value = cfg.get(key)
if isinstance(value, (int, float)) and value > 0:
out["model_ctx_max"] = int(value)
break
else:
out["model_probe_error"] = f"config.json not found in model path: {path}"
size_cmd = (
f"find {qpath} -type f \\( -name '*.safetensors' -o -name '*.bin' -o -name '*.gguf' \\) "
"-printf '%s\\n' 2>/dev/null | awk '{s+=$1} END {if (s>0) printf \"%.6f\", s/1073741824}'"
)
weights = _run_model_probe(host, ssh_port, size_cmd)
try:
weights_gb = float(weights)
except Exception:
weights_gb = 0.0
if weights_gb > 0:
out["model_weights_gb"] = round(weights_gb, 3)
elif "model_probe_error" not in out:
out["model_probe_error"] = f"No model weight files found in: {path}"
return out
def setup_hwfit_routes(): def setup_hwfit_routes():
router = APIRouter(prefix="/api/hwfit", tags=["hwfit"]) router = APIRouter(prefix="/api/hwfit", tags=["hwfit"])
@@ -235,7 +307,7 @@ def setup_hwfit_routes():
return {"system": system, "models": results} return {"system": system, "models": results}
@router.get("/profiles") @router.get("/profiles")
def get_serve_profiles(model: str = "", host: str = "", ssh_port: str = "", platform: str = "", fresh: bool = False, serve_weights_gb: float = 0.0, serve_quant: str = ""): def get_serve_profiles(model: str = "", model_path: str = "", host: str = "", ssh_port: str = "", platform: str = "", fresh: bool = False, serve_weights_gb: float = 0.0, serve_quant: str = ""):
"""Compute llama.cpp serve profiles (Quality/Balanced/Speed) for `model` """Compute llama.cpp serve profiles (Quality/Balanced/Speed) for `model`
against the detected hardware on `host` (or local). Returns concrete against the detected hardware on `host` (or local). Returns concrete
flags (n_gpu_layers, n_cpu_moe, cache_type, ctx) the serve UI can apply. flags (n_gpu_layers, n_cpu_moe, cache_type, ctx) the serve UI can apply.
@@ -260,8 +332,23 @@ def setup_hwfit_routes():
# "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct". # "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct".
s = (s or "").lower().strip() s = (s or "").lower().strip()
s = s.split("/")[-1] # drop org prefix s = s.split("/")[-1] # drop org prefix
s = re.sub(r"[-_.]?gguf$", "", s) # drop trailing gguf marker for suffix in ("-gguf", "_gguf", ".gguf", "gguf"):
s = re.sub(r"[-_.](q\d[^/]*|iq\d[^/]*|fp8|bf16|f16|awq[^/]*|gptq[^/]*)$", "", s) if s.endswith(suffix):
s = s[: -len(suffix)]
break
cut_at = None
for idx, ch in enumerate(s):
if ch not in "-_." or idx + 1 >= len(s):
continue
suffix = s[idx + 1:]
if (
suffix in {"fp8", "bf16", "f16"}
or suffix.startswith(("awq", "gptq", "iq"))
or (suffix.startswith("q") and len(suffix) > 1 and suffix[1].isdigit())
):
cut_at = idx
if cut_at is not None:
s = s[:cut_at]
return s return s
m = catalog.get(model) m = catalog.get(model)
@@ -272,8 +359,16 @@ def setup_hwfit_routes():
if nn and (nn == want or want.endswith(nn) or nn.endswith(want)): if nn and (nn == want or want.endswith(nn) or nn.endswith(want)):
m = entry m = entry
break break
path_meta = _inspect_model_path(model_path or model, host=host, ssh_port=ssh_port)
if m is None: if m is None:
return {"system": system, "profiles": [], "error": "model not in catalog"} return {
"system": system,
"profiles": [],
"error": "model not in catalog",
"model_ctx_max": int(path_meta.get("model_ctx_max") or 0),
"model_weights_gb": float(path_meta.get("model_weights_gb") or 0),
"model_probe_error": path_meta.get("model_probe_error") or "",
}
# Surface the model's trained context limit so the serve UI can clamp a # Surface the model's trained context limit so the serve UI can clamp a
# user-typed context down to it (asking for ctx > n_ctx_train overflows # user-typed context down to it (asking for ctx > n_ctx_train overflows
# and, with a quantized KV cache, can crash the GPU). # and, with a quantized KV cache, can crash the GPU).
@@ -283,6 +378,16 @@ def setup_hwfit_routes():
if isinstance(v, (int, float)) and v > 0: if isinstance(v, (int, float)) and v > 0:
model_ctx_max = int(v) model_ctx_max = int(v)
break break
path_ctx_max = int(path_meta.get("model_ctx_max") or 0)
if path_ctx_max > 0:
model_ctx_max = max(model_ctx_max, path_ctx_max)
model_weights_gb = float(path_meta.get("model_weights_gb") or 0)
if model_weights_gb <= 0:
for k in ("min_vram_gb", "required_gb", "size_gb", "recommended_ram_gb", "min_ram_gb"):
v = m.get(k)
if isinstance(v, (int, float)) and v > 0:
model_weights_gb = float(v)
break
return { return {
"system": system, "system": system,
"profiles": compute_serve_profiles( "profiles": compute_serve_profiles(
@@ -291,6 +396,8 @@ def setup_hwfit_routes():
serve_quant=(serve_quant or None), serve_quant=(serve_quant or None),
), ),
"model_ctx_max": model_ctx_max, "model_ctx_max": model_ctx_max,
"model_weights_gb": model_weights_gb,
"model_probe_error": path_meta.get("model_probe_error") or "",
} }
@router.get("/image-models") @router.get("/image-models")
+33 -58
View File
@@ -273,65 +273,30 @@ def setup_memory_routes(memory_manager: MemoryManager, session_manager: SessionM
async def api_audit_memories(request: Request, session: str = Form(None)): async def api_audit_memories(request: Request, session: str = Form(None)):
"""Deduplicate and consolidate memories via LLM. """Deduplicate and consolidate memories via LLM.
Uses the default model from settings, or falls back to a session's model. Uses task/utility/default settings through the shared resolver, with
the active session as fallback when no task or utility model is set.
Returns before and after memory counts. Returns before and after memory counts.
""" """
from routes.model_routes import _load_settings, _normalize_base, build_chat_url
from core.database import ModelEndpoint
import json as _json
endpoint_url = model = None
headers = {}
# Try utility model from settings first — memory audit is a background
# task and should prefer the lighter utility model over the main chat model.
from src.task_endpoint import resolve_task_endpoint
user = _owner(request) user = _owner(request)
t_url, t_model, t_headers = resolve_task_endpoint(owner=user) fallback_url = fallback_model = None
if t_url and t_model: fallback_headers = None
endpoint_url, model, headers = t_url, t_model, t_headers if session:
else: try:
# Fall back to default model if no task/utility model configured sess = session_manager.get_session(session)
settings = _load_settings() _assert_session_owner(sess, user)
ep_id = settings.get("default_endpoint_id", "") fallback_url = sess.endpoint_url
default_model = settings.get("default_model", "") fallback_model = sess.model
if ep_id: fallback_headers = sess.headers
db = SessionLocal() except KeyError:
try: pass
ep = db.query(ModelEndpoint).filter(
ModelEndpoint.id == ep_id, ModelEndpoint.is_enabled == True
).first()
if ep:
base = _normalize_base(ep.base_url)
endpoint_url = build_chat_url(base)
model = default_model
if not model and ep.models:
try:
models = _json.loads(ep.models) if isinstance(ep.models, str) else ep.models
if models:
model = models[0]
except Exception:
pass
if ep.api_key:
headers = {"Authorization": f"Bearer {ep.api_key}"}
finally:
db.close()
# Fall back to session model if no default configured endpoint_url, model, headers = resolve_task_endpoint(
if not endpoint_url and session: fallback_url, fallback_model, fallback_headers, owner=user
try: )
sess = session_manager.get_session(session)
_assert_session_owner(sess, _owner(request))
endpoint_url = sess.endpoint_url
model = sess.model
headers = sess.headers
except KeyError:
pass
if not endpoint_url or not model: if not endpoint_url or not model:
raise HTTPException(400, "No default model configured — set one in Settings") raise HTTPException(400, "No default model configured — set one in Settings")
user = _owner(request)
result = await audit_memories( result = await audit_memories(
memory_manager, memory_manager,
memory_vector, memory_vector,
@@ -369,18 +334,28 @@ def setup_memory_routes(memory_manager: MemoryManager, session_manager: SessionM
model = None model = None
headers = {} headers = {}
user = _owner(request)
if session: if session:
try: try:
sess = session_manager.get_session(session) sess = session_manager.get_session(session)
_assert_session_owner(sess, _owner(request)) _assert_session_owner(sess, user)
endpoint_url, model, headers = resolve_task_endpoint(
sess.endpoint_url, sess.model, sess.headers, owner=_owner(request)
)
except KeyError: except KeyError:
logger.warning("Session %s not found, falling back to utility endpoint", session) sess = None
endpoint_url, model, headers = resolve_endpoint("utility", owner=_owner(request)) except HTTPException as exc:
if exc.status_code != 404:
raise
sess = None
if sess is None:
logger.warning("Session %s not found or inaccessible, falling back to utility endpoint", session)
endpoint_url, model, headers = resolve_endpoint("utility", owner=user)
else:
endpoint_url, model, headers = resolve_task_endpoint(
sess.endpoint_url, sess.model, sess.headers, owner=user
)
else: else:
endpoint_url, model, headers = resolve_task_endpoint(owner=_owner(request)) endpoint_url, model, headers = resolve_task_endpoint(owner=user)
if not endpoint_url or not model: if not endpoint_url or not model:
raise HTTPException(400, "No LLM model configured. Set a default model in Settings.") raise HTTPException(400, "No LLM model configured. Set a default model in Settings.")
+114 -35
View File
@@ -5,6 +5,7 @@ import re
import uuid import uuid
import json import json
import hashlib import hashlib
import ipaddress
import socket import socket
import time as _time import time as _time
import logging import logging
@@ -16,6 +17,7 @@ from fastapi import APIRouter, HTTPException, Form, Query, Body, Request, Respon
from pydantic import BaseModel from pydantic import BaseModel
from fastapi.responses import StreamingResponse from fastapi.responses import StreamingResponse
from core.database import SessionLocal, ModelEndpoint, Session as DbSession from core.database import SessionLocal, ModelEndpoint, Session as DbSession
from core.log_safety import redact_url as _redact_url_for_log
from core.middleware import require_admin from core.middleware import require_admin
from src.llm_core import _detect_provider, _host_match, ANTHROPIC_MODELS from src.llm_core import _detect_provider, _host_match, ANTHROPIC_MODELS
from src.tls_overrides import llm_verify from src.tls_overrides import llm_verify
@@ -26,7 +28,7 @@ from src.endpoint_resolver import (
build_models_url, build_models_url,
build_headers, build_headers,
) )
from src.auth_helpers import _auth_disabled, owner_filter from src.auth_helpers import _auth_disabled, effective_user, owner_filter
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -405,8 +407,11 @@ def _endpoint_refresh_timeout(ep: Any, category: str) -> float:
except Exception: except Exception:
val = 0 val = 0
if val > 0: if val > 0:
return float(max(1, min(30, val))) return float(max(1, min(60, val)))
return 2.5 if category == "local" else 2.0 # llama.cpp and other local OpenAI-compatible servers can block briefly
# while warming/loading. A 2s local timeout makes working endpoints flicker
# offline before /v1/models is ready.
return 10.0 if category == "local" else 2.0
def _manual_refresh_timeout(ep: Any, category: str, requested: Any = None) -> float: def _manual_refresh_timeout(ep: Any, category: str, requested: Any = None) -> float:
@@ -473,7 +478,7 @@ def _explicit_model_list_timeout(base_url: str, endpoint_kind: str = "auto", req
category = _classify_endpoint(base_url, kind) category = _classify_endpoint(base_url, kind)
if kind in ("api", "proxy") or category == "api": if kind in ("api", "proxy") or category == "api":
return 30.0 return 30.0
return 3.0 if _is_ollama_base(base_url) else 2.0 return 15.0 if category == "local" else (3.0 if _is_ollama_base(base_url) else 2.0)
def _cached_model_ids(ep: Any) -> List[str]: def _cached_model_ids(ep: Any) -> List[str]:
@@ -562,6 +567,8 @@ def _safe_build_models_url(base_url: str) -> str:
"""Build a /models URL without letting optional provider imports break probes.""" """Build a /models URL without letting optional provider imports break probes."""
try: try:
return build_models_url(base_url) return build_models_url(base_url)
except ValueError:
raise
except Exception as exc: except Exception as exc:
logger.debug("Model URL detection failed for %s: %s", base_url, exc) logger.debug("Model URL detection failed for %s: %s", base_url, exc)
return f"{(base_url or '').rstrip('/')}/models" return f"{(base_url or '').rstrip('/')}/models"
@@ -633,7 +640,7 @@ def _probe_single_model(base: str, api_key: str, model_id: str, timeout: int = 1
try: try:
t0 = _time.time() t0 = _time.time()
r = httpx.post(target_url, headers=h, json=payload, timeout=timeout) r = httpx.post(target_url, headers=h, json=payload, timeout=timeout, verify=llm_verify())
latency = round((_time.time() - t0) * 1000) latency = round((_time.time() - t0) * 1000)
if r.is_success: if r.is_success:
return {"status": "ok", "latency_ms": latency} return {"status": "ok", "latency_ms": latency}
@@ -659,13 +666,20 @@ def _probe_single_model(base: str, api_key: str, model_id: str, timeout: int = 1
# Hostnames / IP prefixes that indicate a local endpoint # Hostnames / IP prefixes that indicate a local endpoint
_LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"} _LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}
_PRIVATE_PREFIXES = ("10.", "172.16.", "172.17.", "172.18.", "172.19.", _PRIVATE_NETWORKS = (
"172.20.", "172.21.", "172.22.", "172.23.", "172.24.", ipaddress.ip_network("10.0.0.0/8"),
"172.25.", "172.26.", "172.27.", "172.28.", "172.29.", ipaddress.ip_network("172.16.0.0/12"),
"172.30.", "172.31.", "192.168.") ipaddress.ip_network("192.168.0.0/16"),
)
_TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")
_TAILSCALE_RE = re.compile(r"^100\.(6[4-9]|[7-9]\d|1[01]\d|12[0-7])\.") def _local_ip_literal(host: str) -> bool:
try:
ip = ipaddress.ip_address(host)
except ValueError:
return False
return any(ip in network for network in _PRIVATE_NETWORKS) or ip in _TAILSCALE_CGNAT
def _classify_endpoint(base_url: str, endpoint_kind: str = "auto") -> str: def _classify_endpoint(base_url: str, endpoint_kind: str = "auto") -> str:
@@ -679,9 +693,7 @@ def _classify_endpoint(base_url: str, endpoint_kind: str = "auto") -> str:
return "api" return "api"
try: try:
host = urlparse(base_url).hostname or "" host = urlparse(base_url).hostname or ""
if host in _LOCAL_HOSTS or host.startswith(_PRIVATE_PREFIXES): if host in _LOCAL_HOSTS or _local_ip_literal(host):
return "local"
if _TAILSCALE_RE.match(host):
return "local" return "local"
except Exception: except Exception:
pass pass
@@ -703,6 +715,16 @@ def _effective_endpoint_kind(ep: Any, base_url: str) -> str:
return "auto" return "auto"
def _is_loading_model_response(resp: Any) -> bool:
if getattr(resp, "status_code", None) != 503:
return False
try:
body = resp.text or ""
except Exception:
body = ""
return "loading model" in body.lower()
def _probe_endpoint(base_url: str, api_key: str = None, timeout: int = 5) -> List[str]: def _probe_endpoint(base_url: str, api_key: str = None, timeout: int = 5) -> List[str]:
"""Probe a base URL's /models endpoint and return list of model IDs. """Probe a base URL's /models endpoint and return list of model IDs.
@@ -767,16 +789,19 @@ def _probe_endpoint(base_url: str, api_key: str = None, timeout: int = 5) -> Lis
models.append(_e) models.append(_e)
return [m for m in models if _is_chat_model(m)] return [m for m in models if _is_chat_model(m)]
except httpx.HTTPStatusError as e: except httpx.HTTPStatusError as e:
if e.response is not None and _is_loading_model_response(e.response):
logger.info("Endpoint still loading model at %s", _redact_url_for_log(url))
return []
if api_key: if api_key:
status = e.response.status_code if e.response is not None else "unknown" status = e.response.status_code if e.response is not None else "unknown"
logger.warning(f"Failed to probe {url} with API key: HTTP {status}") logger.warning("Failed to probe %s with API key: HTTP %s", _redact_url_for_log(url), status)
return [] return []
logger.warning(f"Failed to probe {url}: {e}") logger.warning("Failed to probe %s: %s", _redact_url_for_log(url), e)
except Exception as e: except Exception as e:
if api_key: if api_key:
logger.warning(f"Failed to probe {url} with API key: {e}") logger.warning("Failed to probe %s with API key: %s", _redact_url_for_log(url), e)
return [] return []
logger.warning(f"Failed to probe {url}: {e}") logger.warning("Failed to probe %s: %s", _redact_url_for_log(url), e)
# Older Ollama builds and some proxies expose native /api/tags even when # Older Ollama builds and some proxies expose native /api/tags even when
# the OpenAI-compatible /v1/models path is unavailable. # the OpenAI-compatible /v1/models path is unavailable.
@@ -816,6 +841,15 @@ def _ping_endpoint(base_url: str, api_key: str = None, timeout: float = 1.5) ->
or "ollama" in (parsed_base.hostname or "").lower() or "ollama" in (parsed_base.hostname or "").lower()
) )
def _is_loading_model_response(r) -> bool:
if getattr(r, "status_code", None) != 503:
return False
try:
body = r.text or ""
except Exception:
body = ""
return "loading model" in body.lower()
def _result_from_response(r) -> Dict[str, Any]: def _result_from_response(r) -> Dict[str, Any]:
if 300 <= r.status_code < 400: if 300 <= r.status_code < 400:
loc = r.headers.get("location", "") loc = r.headers.get("location", "")
@@ -832,6 +866,13 @@ def _ping_endpoint(base_url: str, api_key: str = None, timeout: float = 1.5) ->
"status_code": r.status_code, "status_code": r.status_code,
"error": None, "error": None,
} }
if _is_loading_model_response(r):
return {
"reachable": True,
"loading": True,
"status_code": r.status_code,
"error": "Loading model",
}
return {"reachable": False, "status_code": r.status_code, "error": f"HTTP {r.status_code}"} return {"reachable": False, "status_code": r.status_code, "error": f"HTTP {r.status_code}"}
last_error: Optional[str] = None last_error: Optional[str] = None
@@ -864,7 +905,7 @@ def _ping_endpoint(base_url: str, api_key: str = None, timeout: float = 1.5) ->
if 400 <= sc < 500 and sc not in (401, 403): if 400 <= sc < 500 and sc not in (401, 403):
models_url = _safe_build_models_url(base) models_url = _safe_build_models_url(base)
try: try:
r2 = httpx.get(models_url, headers=headers, timeout=timeout, verify=llm_verify()) r2 = httpx.get(models_url, headers=headers,timeout=timeout, verify=llm_verify())
result2 = _result_from_response(r2) result2 = _result_from_response(r2)
if result2["reachable"]: if result2["reachable"]:
return result2 return result2
@@ -1048,9 +1089,11 @@ def setup_model_routes(model_discovery):
except Exception: except Exception:
return 0.0 return 0.0
def _failure_delay(fails: int) -> float: def _failure_delay(fails: int, *, empty_local: bool = False) -> float:
if fails <= 0: if fails <= 0:
return 0.0 return 0.0
if empty_local:
return min(5.0 * (2 ** max(0, fails - 1)), 30.0)
return min(_REFRESH_FAILURE_BASE * (2 ** max(0, fails - 1)), _REFRESH_FAILURE_MAX) return min(_REFRESH_FAILURE_BASE * (2 ** max(0, fails - 1)), _REFRESH_FAILURE_MAX)
def _should_refresh_endpoint(ep: Any, now: float, force: bool = False) -> tuple[bool, Dict[str, Any]]: def _should_refresh_endpoint(ep: Any, now: float, force: bool = False) -> tuple[bool, Dict[str, Any]]:
@@ -1081,7 +1124,12 @@ def setup_model_routes(model_discovery):
fails = int(state.get("fail_count") or 0) fails = int(state.get("fail_count") or 0)
if fails and not force: if fails and not force:
last_failure = float(state.get("last_failure") or 0.0) last_failure = float(state.get("last_failure") or 0.0)
if now - last_failure < _failure_delay(fails): empty_local = (
not cached
and category == "local"
and str(getattr(ep, "id", "") or "").startswith("local-")
)
if now - last_failure < _failure_delay(fails, empty_local=empty_local):
return False, info return False, info
if cached and not force: if cached and not force:
interval = _endpoint_refresh_interval(ep, category) interval = _endpoint_refresh_interval(ep, category)
@@ -1255,13 +1303,16 @@ def setup_model_routes(model_discovery):
# Require auth; "" is the unconfigured single-user mode, treated as # Require auth; "" is the unconfigured single-user mode, treated as
# "see everything" by _fetch_models. # "see everything" by _fetch_models.
try: try:
from src.auth_helpers import get_current_user as _gcu if getattr(request.state, "api_token", False):
owner = _gcu(request) or "" scopes = set(getattr(request.state, "api_token_scopes", []) or [])
except Exception: if "chat" not in scopes:
owner = "" raise HTTPException(403, "API token is not scoped for chat")
# Reject anonymous in configured deployments — no leaking the model if not getattr(request.state, "api_token_owner", None):
# list to unauthenticated callers. raise HTTPException(403, "API token has no owner")
try: owner = effective_user(request) or ""
# Reject anonymous in configured deployments — no leaking the model
# list to unauthenticated callers.
auth_mgr = getattr(request.app.state, "auth_manager", None) auth_mgr = getattr(request.app.state, "auth_manager", None)
if not owner and not _auth_disabled() and auth_mgr is not None and getattr(auth_mgr, "is_configured", False): if not owner and not _auth_disabled() and auth_mgr is not None and getattr(auth_mgr, "is_configured", False):
raise HTTPException(401, "Not authenticated") raise HTTPException(401, "Not authenticated")
@@ -1393,7 +1444,7 @@ def setup_model_routes(model_discovery):
t0 = _time.time() t0 = _time.time()
ping = _ping_endpoint(base, ep.api_key, timeout=1.5) ping = _ping_endpoint(base, ep.api_key, timeout=1.5)
entry["latency_ms"] = round((_time.time() - t0) * 1000) entry["latency_ms"] = round((_time.time() - t0) * 1000)
entry["status"] = "online" if ping.get("reachable") or cached_count else "offline" entry["status"] = "loading" if ping.get("loading") else ("online" if ping.get("reachable") or cached_count else "offline")
entry["error"] = ping.get("error") entry["error"] = ping.get("error")
entry["model_count"] = cached_count or (len(ANTHROPIC_MODELS) if provider == "anthropic" else 0) entry["model_count"] = cached_count or (len(ANTHROPIC_MODELS) if provider == "anthropic" else 0)
except Exception as e: except Exception as e:
@@ -1567,9 +1618,37 @@ def setup_model_routes(model_discovery):
# "everything's already cached" path because this branch only # "everything's already cached" path because this branch only
# runs for endpoints with an empty cached_models. # runs for endpoints with an empty cached_models.
if not all_models and not pinned and r.is_enabled: if not all_models and not pinned and r.is_enabled:
ping = _ping_endpoint(r.base_url, r.api_key, timeout=3.5) base_for_ping = _normalize_base(r.base_url)
kind_for_ping = _effective_endpoint_kind(r, base_for_ping)
ping_timeout = 10.0 if _classify_endpoint(base_for_ping, kind_for_ping) == "local" else 3.5
ping = _ping_endpoint(r.base_url, r.api_key, timeout=ping_timeout)
if ping.get("reachable"): if ping.get("reachable"):
status = "empty" status = "loading" if ping.get("loading") else "empty"
if ping.get("loading"):
base = _normalize_base(r.base_url)
kind = _effective_endpoint_kind(r, base)
results.append({
"id": r.id,
"name": r.name,
"base_url": r.base_url,
"has_key": bool(r.api_key),
"api_key_fingerprint": _api_key_fingerprint(r.api_key),
"is_enabled": r.is_enabled,
"models": visible,
"pinned_models": pinned,
"hidden_count": len(hidden),
"online": True,
"status": status,
"ping_error": (ping or {}).get("error") if ping else None,
"model_type": getattr(r, "model_type", None) or "llm",
"supports_tools": getattr(r, "supports_tools", None),
"endpoint_kind": kind,
"category": _classify_endpoint(base, kind),
"model_refresh_mode": _endpoint_refresh_mode(r, kind),
"model_refresh_interval": getattr(r, "model_refresh_interval", None),
"model_refresh_timeout": getattr(r, "model_refresh_timeout", None),
})
continue
# Best-effort: if the probe came back reachable, try # Best-effort: if the probe came back reachable, try
# to populate cached_models in the background so the # to populate cached_models in the background so the
# NEXT picker load shows "online" instead of "empty". # NEXT picker load shows "online" instead of "empty".
@@ -1577,7 +1656,7 @@ def setup_model_routes(model_discovery):
# "empty" status, and the existing background refresh # "empty" status, and the existing background refresh
# path will eventually fill it in too. # path will eventually fill it in too.
try: try:
probed = _probe_endpoint(r.base_url, r.api_key, timeout=5) probed = _probe_endpoint(r.base_url, r.api_key, timeout=max(5, int(ping_timeout)))
if probed: if probed:
r.cached_models = json.dumps(probed) r.cached_models = json.dumps(probed)
db.commit() db.commit()
@@ -1755,7 +1834,7 @@ def setup_model_routes(model_discovery):
model_ids = _probe_endpoint(base_url, api_key.strip() or None, timeout=explicit_timeout) if should_probe else [] model_ids = _probe_endpoint(base_url, api_key.strip() or None, timeout=explicit_timeout) if should_probe else []
ping = {"reachable": False, "error": None} ping = {"reachable": False, "error": None}
if (should_probe or requested_kind in ("api", "proxy")) and not model_ids: if (should_probe or requested_kind in ("api", "proxy")) and not model_ids:
ping = _ping_endpoint(base_url, api_key.strip() or None, timeout=min(explicit_timeout, 2.0)) ping = _ping_endpoint(base_url, api_key.strip() or None, timeout=min(explicit_timeout, 10.0))
if require_model_list and not model_ids: if require_model_list and not model_ids:
raise HTTPException(400, _model_endpoint_error_message(base_url, ping)) raise HTTPException(400, _model_endpoint_error_message(base_url, ping))
@@ -1822,7 +1901,7 @@ def setup_model_routes(model_discovery):
"models": _merge_model_ids(model_ids, _pinned), "models": _merge_model_ids(model_ids, _pinned),
"pinned_models": _pinned, "pinned_models": _pinned,
"online": bool(model_ids) or bool(_pinned) or bool(ping.get("reachable")), "online": bool(model_ids) or bool(_pinned) or bool(ping.get("reachable")),
"status": "online" if (model_ids or _pinned) else ("empty" if ping.get("reachable") else "offline"), "status": "online" if (model_ids or _pinned) else ("loading" if ping.get("loading") else ("empty" if ping.get("reachable") else "offline")),
"ping_error": ping.get("error") if ping else None, "ping_error": ping.get("error") if ping else None,
"endpoint_kind": requested_kind, "endpoint_kind": requested_kind,
"category": _classify_endpoint(base_url, requested_kind), "category": _classify_endpoint(base_url, requested_kind),
@@ -1847,11 +1926,11 @@ def setup_model_routes(model_discovery):
configured_timeout = _parse_positive_int(model_refresh_timeout, minimum=1, maximum=60) configured_timeout = _parse_positive_int(model_refresh_timeout, minimum=1, maximum=60)
probe_timeout = _explicit_model_list_timeout(base_url, requested_kind, configured_timeout) probe_timeout = _explicit_model_list_timeout(base_url, requested_kind, configured_timeout)
models = _probe_endpoint(base_url, api_key.strip() or None, timeout=probe_timeout) models = _probe_endpoint(base_url, api_key.strip() or None, timeout=probe_timeout)
ping = {"reachable": True, "error": None} if models else _ping_endpoint(base_url, api_key.strip() or None, timeout=min(probe_timeout, 2.0)) ping = {"reachable": True, "error": None} if models else _ping_endpoint(base_url, api_key.strip() or None, timeout=min(probe_timeout, 10.0))
return { return {
"base_url": base_url, "base_url": base_url,
"online": bool(models) or bool(ping.get("reachable")), "online": bool(models) or bool(ping.get("reachable")),
"status": "online" if models else ("empty" if ping.get("reachable") else "offline"), "status": "online" if models else ("loading" if ping.get("loading") else ("empty" if ping.get("reachable") else "offline")),
"ping_error": ping.get("error") if ping else None, "ping_error": ping.get("error") if ping else None,
"models": models, "models": models,
"count": len(models), "count": len(models),
+19 -9
View File
@@ -10,7 +10,8 @@ from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel from pydantic import BaseModel
from core.database import SessionLocal, Note from core.database import SessionLocal, Note
from src.auth_helpers import get_current_user from core.middleware import INTERNAL_TOOL_USER
from src.auth_helpers import require_user
from src.constants import DATA_DIR from src.constants import DATA_DIR
from sqlalchemy.orm.attributes import flag_modified from sqlalchemy.orm.attributes import flag_modified
@@ -334,10 +335,11 @@ async def dispatch_reminder(
# Loud diagnostic so we can see WHY a reminder didn't send (the # Loud diagnostic so we can see WHY a reminder didn't send (the
# previous "silently no-op when cfg has no smtp_host" was invisible). # previous "silently no-op when cfg has no smtp_host" was invisible).
logger.info( logger.info(
f"dispatch_reminder[email] note_id={note_id} owner={owner!r} " "dispatch_reminder[email] note_id=%s owner=%r "
f"smtp_host={cfg.get('smtp_host')!r} smtp_user={cfg.get('smtp_user')!r} " "has_smtp_host=%s has_smtp_user=%s has_from=%s has_recipient=%s",
f"from={from_addr!r} recipient={recipient!r} " note_id, owner,
f"account_name={cfg.get('account_name')!r}" bool(cfg.get("smtp_host")), bool(cfg.get("smtp_user")),
bool(from_addr), bool(recipient),
) )
missing = [] missing = []
if not cfg.get("smtp_host"): if not cfg.get("smtp_host"):
@@ -570,10 +572,19 @@ def setup_note_routes(task_scheduler=None):
router = APIRouter(prefix="/api/notes", tags=["notes"]) router = APIRouter(prefix="/api/notes", tags=["notes"])
def _owner(request: Request) -> Optional[str]: def _owner(request: Request) -> Optional[str]:
return get_current_user(request) # require_user, not bare get_current_user: a request that reaches
# these owner-scoped routes with NO identity (auth-middleware
# regression, SSRF from a sibling service) must fail closed (401)
# when auth is configured — not be treated as the single-user mode
# and handed blanket access to every account's notes. The documented
# anonymous modes (AUTH_ENABLED=false, LOCALHOST_BYPASS on loopback,
# unconfigured first-run) still resolve to None, the single-user
# path. fire_reminder below already gated this way; the CRUD routes
# did not.
return require_user(request) or None
def _is_admin_or_single_user(request: Request, user: str | None) -> bool: def _is_admin_or_single_user(request: Request, user: str | None) -> bool:
if user == "internal-tool": if user == INTERNAL_TOOL_USER:
return True return True
if not user: if not user:
# require_user() already admitted this request, which only happens # require_user() already admitted this request, which only happens
@@ -805,8 +816,7 @@ def setup_note_routes(task_scheduler=None):
Returns {synthesis, email_sent}. Returns {synthesis, email_sent}.
""" """
# Gate against anonymous callers — LLM synthesis can burn tokens. # Gate against anonymous callers — LLM synthesis can burn tokens.
from src.auth_helpers import require_user as _ru user = require_user(request)
user = _ru(request)
body = await request.json() body = await request.json()
note_id = str(body.get("note_id") or "").strip() note_id = str(body.get("note_id") or "").strip()
if not note_id: if not note_id:
+91 -6
View File
@@ -2,8 +2,9 @@
"""Routes for personal documents management.""" """Routes for personal documents management."""
import os import os
import logging import logging
import shutil
import uuid import uuid
from typing import List, Tuple from typing import Any, Dict, List, Tuple
from fastapi import APIRouter, HTTPException, Query, Request, UploadFile, File, Depends from fastapi import APIRouter, HTTPException, Query, Request, UploadFile, File, Depends
from src.request_models import DirectoryRequest from src.request_models import DirectoryRequest
from core.constants import BASE_DIR, PERSONAL_DIR, PERSONAL_UPLOADS_DIR from core.constants import BASE_DIR, PERSONAL_DIR, PERSONAL_UPLOADS_DIR
@@ -18,14 +19,15 @@ UPLOADS_DIR = PERSONAL_UPLOADS_DIR
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def _personal_upload_dir_for_owner(owner: str | None) -> str: def _personal_upload_dir_for_owner(owner: str | None, *, create: bool = True) -> str:
"""Return the per-owner upload directory used for direct RAG uploads.""" """Return the per-owner upload directory used for direct RAG uploads."""
owner_segment = secure_filename((owner or "local").strip())[:80] or "local" owner_segment = secure_filename((owner or "local").strip())[:80] or "local"
upload_dir = os.path.abspath(os.path.join(UPLOADS_DIR, owner_segment)) upload_dir = os.path.abspath(os.path.join(UPLOADS_DIR, owner_segment))
base_abs = os.path.abspath(UPLOADS_DIR) base_abs = os.path.abspath(UPLOADS_DIR)
if os.path.commonpath([upload_dir, base_abs]) != base_abs: if os.path.commonpath([upload_dir, base_abs]) != base_abs:
raise ValueError("Unsafe upload owner path") raise ValueError("Unsafe upload owner path")
os.makedirs(upload_dir, exist_ok=True) if create:
os.makedirs(upload_dir, exist_ok=True)
return upload_dir return upload_dir
@@ -44,6 +46,87 @@ def _unique_personal_upload_path(upload_dir: str, original_name: str | None) ->
raise ValueError("Unsafe upload filename") raise ValueError("Unsafe upload filename")
return file_path, filename, safe_name return file_path, filename, safe_name
def _unique_existing_target(path: str) -> str:
"""Return a non-existing sibling path for rename collision handling."""
if not os.path.exists(path):
return path
stem, ext = os.path.splitext(path)
while True:
candidate = f"{stem}-{uuid.uuid4().hex[:10]}{ext}"
if not os.path.exists(candidate):
return candidate
def _remove_empty_tree(path: str) -> None:
"""Best-effort removal of empty directories under ``path``."""
if not os.path.isdir(path):
return
for root, dirs, _files in os.walk(path, topdown=False):
for dirname in dirs:
candidate = os.path.join(root, dirname)
try:
os.rmdir(candidate)
except OSError:
pass
try:
os.rmdir(path)
except OSError:
pass
def rename_personal_upload_owner(
old_owner: str,
new_owner: str,
*,
personal_docs_manager: Any = None,
rag_manager: Any = None,
) -> Dict[str, Any]:
"""Move direct personal uploads and rewrite RAG owner metadata on user rename."""
old_dir = _personal_upload_dir_for_owner(old_owner, create=False)
new_dir = _personal_upload_dir_for_owner(new_owner, create=False)
path_map: Dict[str, str] = {}
moved_files = 0
if os.path.isdir(old_dir) and old_dir != new_dir:
os.makedirs(new_dir, exist_ok=True)
for root, _dirs, files in os.walk(old_dir):
rel_root = os.path.relpath(root, old_dir)
target_root = new_dir if rel_root == "." else os.path.join(new_dir, rel_root)
os.makedirs(target_root, exist_ok=True)
for filename in files:
source = os.path.abspath(os.path.join(root, filename))
target = _unique_existing_target(os.path.abspath(os.path.join(target_root, filename)))
shutil.move(source, target)
path_map[source] = target
moved_files += 1
_remove_empty_tree(old_dir)
if personal_docs_manager is not None:
rename_directory = getattr(personal_docs_manager, "rename_directory", None)
if callable(rename_directory):
rename_directory(old_dir, new_dir, path_map=path_map)
rag_result = None
if rag_manager is not None:
rename_owner = getattr(rag_manager, "rename_owner", None)
if callable(rename_owner):
rag_result = rename_owner(
old_owner,
new_owner,
path_map=path_map,
path_prefixes=[(old_dir, new_dir)],
)
return {
"old_dir": old_dir,
"new_dir": new_dir,
"moved_files": moved_files,
"path_map": path_map,
"rag_result": rag_result,
}
def setup_personal_routes(personal_docs_manager, rag_manager, rag_available): def setup_personal_routes(personal_docs_manager, rag_manager, rag_available):
""" """
Setup personal documents related routes. Setup personal documents related routes.
@@ -275,11 +358,13 @@ def setup_personal_routes(personal_docs_manager, rag_manager, rag_available):
except Exception as e: except Exception as e:
logger.warning(f"RAG removal failed for {filepath}: {e}") logger.warning(f"RAG removal failed for {filepath}: {e}")
# Delete file from disk if it's in uploads dir # Delete file from disk if it's in the caller's own uploads dir.
# Scope to the per-owner subdir, not the shared uploads root, so one
# admin can't delete another user's personal files by path.
deleted_from_disk = False deleted_from_disk = False
try: try:
abs_target = os.path.abspath(filepath) abs_target = os.path.realpath(filepath)
base_abs = os.path.abspath(UPLOADS_DIR) base_abs = os.path.realpath(_personal_upload_dir_for_owner(owner, create=False))
in_uploads = ( in_uploads = (
abs_target == base_abs abs_target == base_abs
or os.path.commonpath([abs_target, base_abs]) == base_abs or os.path.commonpath([abs_target, base_abs]) == base_abs
+4 -2
View File
@@ -12,8 +12,10 @@ from typing import Optional
from fastapi import APIRouter, HTTPException, Query, Request from fastapi import APIRouter, HTTPException, Query, Request
from fastapi.responses import HTMLResponse, StreamingResponse from fastapi.responses import HTMLResponse, StreamingResponse
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
from core.middleware import INTERNAL_TOOL_USER
from src.endpoint_resolver import resolve_endpoint from src.endpoint_resolver import resolve_endpoint
from src.auth_helpers import _auth_disabled, get_current_user from src.auth_helpers import _auth_disabled, get_current_user
from core.auth import RESERVED_USERNAMES
from src.constants import DEEP_RESEARCH_DIR from src.constants import DEEP_RESEARCH_DIR
_SESSION_ID_RE = re.compile(r"^[a-zA-Z0-9-]{1,128}$") _SESSION_ID_RE = re.compile(r"^[a-zA-Z0-9-]{1,128}$")
@@ -385,9 +387,9 @@ def setup_research_routes(research_handler, session_manager=None) -> APIRouter:
"""Launch a research job from the dedicated panel.""" """Launch a research job from the dedicated panel."""
from src.auth_helpers import require_privilege from src.auth_helpers import require_privilege
user = require_privilege(request, "can_use_research") user = require_privilege(request, "can_use_research")
if user == "internal-tool": if user == INTERNAL_TOOL_USER:
tool_owner = (request.headers.get("X-Odysseus-Owner") or "").strip() tool_owner = (request.headers.get("X-Odysseus-Owner") or "").strip()
if tool_owner and tool_owner not in {"internal-tool", "api", "demo", "system"}: if tool_owner and tool_owner not in RESERVED_USERNAMES:
auth_mgr = getattr(request.app.state, "auth_manager", None) auth_mgr = getattr(request.app.state, "auth_manager", None)
if auth_mgr is not None and getattr(auth_mgr, "is_configured", False): if auth_mgr is not None and getattr(auth_mgr, "is_configured", False):
try: try:
+16 -5
View File
@@ -11,7 +11,7 @@ from core.session_manager import SessionManager
from core.models import ChatMessage from core.models import ChatMessage
from src.request_models import SessionResponse from src.request_models import SessionResponse
from core.database import Session as DbSession, SessionLocal, Document, GalleryImage, utcnow_naive from core.database import Session as DbSession, SessionLocal, Document, GalleryImage, utcnow_naive
from src.auth_helpers import get_current_user, effective_user, _auth_disabled, owner_filter from src.auth_helpers import effective_user, _auth_disabled, owner_filter
from src.session_actions import is_session_recently_active from src.session_actions import is_session_recently_active
@@ -328,7 +328,7 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
endpoint_id: str = Form(""), endpoint_id: str = Form(""),
): ):
skip_val = str(skip_validation).lower() == "true" skip_val = str(skip_validation).lower() == "true"
user = get_current_user(request) user = effective_user(request)
endpoint_api_key = "" endpoint_api_key = ""
endpoint_base_url = "" endpoint_base_url = ""
_reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url) _reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url)
@@ -477,7 +477,7 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
db.close() db.close()
# Switch model/endpoint mid-session # Switch model/endpoint mid-session
if model is not None and endpoint_url is not None: if model is not None and endpoint_url is not None:
user = get_current_user(request) user = effective_user(request)
_reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url) _reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url)
endpoint_api_key = "" endpoint_api_key = ""
endpoint_base_url = "" endpoint_base_url = ""
@@ -1004,6 +1004,7 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
""" """
from src.llm_core import llm_call from src.llm_core import llm_call
user = effective_user(request) user = effective_user(request)
single_user_mode = not user and _auth_disabled()
user_sessions = session_manager.get_sessions_for_user(user) user_sessions = session_manager.get_sessions_for_user(user)
# Delete empty and throwaway sessions before sorting # Delete empty and throwaway sessions before sorting
@@ -1022,7 +1023,12 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
} }
_THROWAWAY_MAX_MESSAGES = 4 # only delete if <= this many messages _THROWAWAY_MAX_MESSAGES = 4 # only delete if <= this many messages
try: try:
rows = db.query(DbSession).filter(DbSession.archived == False, DbSession.owner == user).limit(2000).all() rows_q = db.query(DbSession).filter(DbSession.archived == False)
if user:
rows_q = rows_q.filter(DbSession.owner == user)
elif not single_user_mode:
rows_q = rows_q.filter(DbSession.owner == user)
rows = rows_q.limit(2000).all()
folder_map = {r.id: r.folder for r in rows} folder_map = {r.id: r.folder for r in rows}
# Precompute per-session message counts in TWO aggregate queries # Precompute per-session message counts in TWO aggregate queries
# instead of 13 queries PER session — with many chats the per-row # instead of 13 queries PER session — with many chats the per-row
@@ -1242,7 +1248,12 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
db = SessionLocal() db = SessionLocal()
try: try:
for sid, folder_name in assignments.items(): for sid, folder_name in assignments.items():
db_session = db.query(DbSession).filter(DbSession.id == sid, DbSession.owner == user).first() db_session_q = db.query(DbSession).filter(DbSession.id == sid)
if user:
db_session_q = db_session_q.filter(DbSession.owner == user)
elif not single_user_mode:
db_session_q = db_session_q.filter(DbSession.owner == user)
db_session = db_session_q.first()
if db_session: if db_session:
db_session.folder = folder_name db_session.folder = folder_name
db_session.updated_at = datetime.utcnow() db_session.updated_at = datetime.utcnow()
+360 -10
View File
@@ -15,6 +15,7 @@ from collections import namedtuple
from pathlib import Path from pathlib import Path
from typing import Dict, Any from typing import Dict, Any
from core.platform_compat import IS_APPLE_SILICON, which_tool from core.platform_compat import IS_APPLE_SILICON, which_tool
from core.middleware import INTERNAL_TOOL_USER
from src.optional_deps import prepare_optional_dependency_import from src.optional_deps import prepare_optional_dependency_import
# POSIX-only: `pty`/`fcntl` transitively import `termios`, which does NOT exist # POSIX-only: `pty`/`fcntl` transitively import `termios`, which does NOT exist
@@ -55,7 +56,7 @@ def _require_admin(request: Request):
# In-process tool loopback. The AuthMiddleware already validated the # In-process tool loopback. The AuthMiddleware already validated the
# internal token + loopback client before setting this marker, so # internal token + loopback client before setting this marker, so
# honour it here as admin-equivalent. # honour it here as admin-equivalent.
if user == "internal-tool": if user == INTERNAL_TOOL_USER:
return return
if not user or user == "api": if not user or user == "api":
raise HTTPException(403, "Admin only") raise HTTPException(403, "Admin only")
@@ -330,6 +331,9 @@ def add_user_install_bins_to_path():
candidates.append(os.path.join(site.USER_BASE, 'bin')) candidates.append(os.path.join(site.USER_BASE, 'bin'))
except Exception: except Exception:
pass pass
candidates.append(os.path.expanduser('~/bin'))
candidates.append(os.path.expanduser('~/llama.cpp/build/bin'))
candidates.append(os.path.expanduser('~/llama.cpp/build-vulkan/bin'))
candidates.append(os.path.expanduser('~/.local/bin')) candidates.append(os.path.expanduser('~/.local/bin'))
parts = os.environ.get('PATH', '').split(os.pathsep) if os.environ.get('PATH') else [] parts = os.environ.get('PATH', '').split(os.pathsep) if os.environ.get('PATH') else []
changed = False changed = False
@@ -961,12 +965,84 @@ def setup_shell_routes() -> APIRouter:
return StreamingResponse(generate(), media_type="text/event-stream") return StreamingResponse(generate(), media_type="text/event-stream")
def _os_id_from_release(text: str) -> str:
"""Map /etc/os-release contents to a canonical family for our matrix."""
if not text:
return ""
ids = []
for line in text.splitlines():
line = line.strip()
if line.startswith("ID=") or line.startswith("ID_LIKE="):
ids += line.split("=", 1)[1].strip().strip('"').split()
ids = [i.lower() for i in ids]
if any(x in ids for x in ("debian", "ubuntu", "linuxmint", "pop", "elementary")):
return "debian"
if any(x in ids for x in ("arch", "manjaro", "endeavouros", "cachyos", "garuda")):
return "arch"
if any(x in ids for x in ("fedora", "rhel", "centos", "rocky", "almalinux", "ol")):
return "fedora"
if "alpine" in ids:
return "alpine"
if any(x in ids for x in ("suse", "opensuse", "opensuse-leap", "opensuse-tumbleweed", "sles")):
return "suse"
return ""
# Matrix lookup keyed on (os_family, backend) → (pkg_mgr_cmd_template, pkg_list_per_dep).
# Each `system_prereqs` name resolves to a list of OS-specific package
# names that get joined into the final `sudo apt install -y …` etc.
# command. Backend-specific extras (CUDA toolkit, ROCm, Vulkan headers)
# are added only when the detected backend needs them.
_PKG_NAMES = {
# canonical-name → {os_id: [actual_pkg_names_on_this_os]}
"cmake": {"debian": ["cmake"], "arch": ["cmake"], "fedora": ["cmake"], "alpine": ["cmake"], "suse": ["cmake"], "macos": ["cmake"]},
"build-essential": {"debian": ["build-essential"], "arch": ["base-devel"], "fedora": ["gcc", "gcc-c++", "make"], "alpine": ["build-base"], "suse": ["gcc-c++", "make"], "macos": []},
"g++": {"debian": ["g++"], "arch": ["gcc"], "fedora": ["gcc-c++"], "alpine": ["g++"], "suse": ["gcc-c++"], "macos": []},
"gcc": {"debian": ["gcc"], "arch": ["gcc"], "fedora": ["gcc"], "alpine": ["gcc"], "suse": ["gcc"], "macos": []},
"make": {"debian": ["make"], "arch": ["make"], "fedora": ["make"], "alpine": ["make"], "suse": ["make"], "macos": []},
"git": {"debian": ["git"], "arch": ["git"], "fedora": ["git"], "alpine": ["git"], "suse": ["git"], "macos": ["git"]},
"tmux": {"debian": ["tmux"], "arch": ["tmux"], "fedora": ["tmux"], "alpine": ["tmux"], "suse": ["tmux"], "macos": ["tmux"]},
}
_BACKEND_EXTRAS = {
"cuda": {"debian": ["nvidia-cuda-toolkit"], "arch": ["cuda"], "fedora": ["cuda-toolkit"], "alpine": [], "suse": ["cuda"], "macos": []},
"rocm": {"debian": ["rocm-dev"], "arch": ["rocm-hip-sdk"], "fedora": ["rocm-devel"], "alpine": [], "suse": ["rocm-dev"], "macos": []},
"vulkan": {"debian": ["libvulkan-dev", "vulkan-tools"], "arch": ["vulkan-headers", "vulkan-tools"], "fedora": ["vulkan-headers", "vulkan-tools"], "alpine": ["vulkan-loader-dev", "vulkan-tools"], "suse": ["vulkan-devel", "vulkan-tools"], "macos": []},
}
_PKG_MGR = {
"debian": "sudo apt install -y {pkgs}",
"arch": "sudo pacman -S --needed {pkgs}",
"fedora": "sudo dnf install -y {pkgs}",
"alpine": "sudo apk add {pkgs}",
"suse": "sudo zypper install -n {pkgs}",
"macos": "brew install {pkgs}",
}
def _install_cmd_for_target(os_id: str, backend: str, missing: list[str]) -> str:
"""Build a single OS+backend-aware install command for the missing prereqs."""
if not os_id or os_id not in _PKG_MGR:
return ""
pkgs: list[str] = []
seen: set[str] = set()
for m in missing:
for p in _PKG_NAMES.get(m, {}).get(os_id, []):
if p not in seen:
pkgs.append(p); seen.add(p)
# Add backend-specific extras only when the build would actually
# consume them (a CUDA toolkit isn't useful on a Vulkan box).
backend = (backend or "").lower()
for p in _BACKEND_EXTRAS.get(backend, {}).get(os_id, []):
if p not in seen:
pkgs.append(p); seen.add(p)
if not pkgs:
return ""
return _PKG_MGR[os_id].format(pkgs=" ".join(pkgs))
@router.get("/api/cookbook/packages") @router.get("/api/cookbook/packages")
async def list_packages( async def list_packages(
request: Request, request: Request,
host: str | None = None, host: str | None = None,
ssh_port: str | None = None, ssh_port: str | None = None,
venv: str | None = None, venv: str | None = None,
backend: str | None = None,
): ):
"""Check which optional packages are installed. """Check which optional packages are installed.
@@ -1015,6 +1091,12 @@ def setup_shell_routes() -> APIRouter:
"kind": "system", "kind": "system",
"install_hint": "Install Docker on the selected server and allow this user to run docker.", "install_hint": "Install Docker on the selected server and allow this user to run docker.",
}, },
# Note: cmake / gcc / git are not separate dependency rows —
# they're declared as `system_prereqs` on llama_cpp (and any
# other engine that compiles from source) so they appear as
# an inline status note on that engine's row instead of
# cluttering the panel with raw OS package names that aren't
# meaningful product-level dependencies on their own.
# ── LLM ── installs on GPU servers for model serving/downloading # ── LLM ── installs on GPU servers for model serving/downloading
{ {
"name": "hf_transfer", "name": "hf_transfer",
@@ -1026,9 +1108,16 @@ def setup_shell_routes() -> APIRouter:
{ {
"name": "llama_cpp", "name": "llama_cpp",
"pip": "llama-cpp-python[server]", "pip": "llama-cpp-python[server]",
"desc": "Serve GGUF models via llama.cpp", "desc": "Great for single-GPU or CPU inference with GGUF models",
"category": "LLM", "category": "LLM",
"target": "remote", "target": "remote",
# Build-toolchain prereqs. Cookbook's launch bootstrap
# compiles llama-server from source when no prebuilt
# binary is present; without these the build aborts
# with `cmake: command not found`. Surfaced inline on
# this row so the user doesn't have to chase three
# separate OS-package rows.
"system_prereqs": ["cmake", "g++", "git"],
}, },
{ {
"name": "sglang", "name": "sglang",
@@ -1040,7 +1129,7 @@ def setup_shell_routes() -> APIRouter:
{ {
"name": "vllm", "name": "vllm",
"pip": "vllm", "pip": "vllm",
"desc": "High-throughput LLM serving engine", "desc": "Great for high-throughput multi-GPU inference",
"category": "LLM", "category": "LLM",
"target": "remote", "target": "remote",
}, },
@@ -1103,6 +1192,7 @@ def setup_shell_routes() -> APIRouter:
# venv over SSH so a remote `pip install` actually reflects here. # venv over SSH so a remote `pip install` actually reflects here.
remote_status: dict = {} remote_status: dict = {}
remote_details: dict = {} remote_details: dict = {}
remote_probe_error = ""
remote_names = [ remote_names = [
p["name"] p["name"]
for p in packages for p in packages
@@ -1141,16 +1231,56 @@ def setup_shell_routes() -> APIRouter:
break break
except ValueError as e: except ValueError as e:
raise HTTPException(400, str(e)) raise HTTPException(400, str(e))
except Exception: except Exception as e:
remote_status = {} remote_status = {}
if host and remote_system_names: remote_probe_error = f"SSH package probe failed: {str(e)[:160]}"
if "llama_cpp" in remote_names:
try:
inner = (
'export PATH="$HOME/.local/bin:$HOME/bin:'
'$HOME/llama.cpp/build/bin:$HOME/llama.cpp/build-vulkan/bin:$PATH"; '
"command -v llama-server 2>/dev/null || true"
)
argv = _ssh_base_argv(host, ssh_port) + [inner]
proc = await asyncio.create_subprocess_exec(
*argv,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
out, _err = await asyncio.wait_for(proc.communicate(), timeout=8)
llama_server_path = out.decode("utf-8", errors="replace").strip().splitlines()
llama_server_path = llama_server_path[-1].strip() if llama_server_path else ""
if llama_server_path:
remote_status["llama_cpp"] = True
probe = remote_details.setdefault("llama_cpp", {})
if isinstance(probe, dict):
probe.setdefault("binaries", {})["llama-server"] = llama_server_path
except Exception as e:
if not remote_probe_error:
remote_probe_error = f"SSH llama-server probe failed: {str(e)[:160]}"
pass
# Union of system_names + every package's system_prereqs. Probing
# the prereqs alongside the main system deps in a single SSH call
# avoids a second round-trip per Cookbook → Dependencies refresh.
prereq_names: set[str] = set()
for p in packages:
for pr in p.get("system_prereqs") or []:
prereq_names.add(str(pr))
all_system_names = list(set(remote_system_names) | prereq_names)
# Detect the target's OS family + read /etc/os-release in the same
# SSH round-trip as the prereq probe — used downstream to render a
# single OS-specific install command per row instead of dumping
# every distro's syntax onto the user.
target_os_id: str = ""
if host and all_system_names:
try: try:
checks = [] checks = []
for name in remote_system_names: for name in all_system_names:
qn = shlex.quote(name) qn = shlex.quote(name)
checks.append( checks.append(
f"if command -v {qn} >/dev/null 2>&1; then echo {qn}=1; else echo {qn}=0; fi" f"if command -v {qn} >/dev/null 2>&1; then echo {qn}=1; else echo {qn}=0; fi"
) )
checks.append("echo '---OSREL---'; cat /etc/os-release 2>/dev/null || true")
inner = " ; ".join(checks) inner = " ; ".join(checks)
argv = _ssh_base_argv(host, ssh_port) + [inner] argv = _ssh_base_argv(host, ssh_port) + [inner]
proc = await asyncio.create_subprocess_exec( proc = await asyncio.create_subprocess_exec(
@@ -1160,20 +1290,45 @@ def setup_shell_routes() -> APIRouter:
) )
out, _err = await asyncio.wait_for(proc.communicate(), timeout=12) out, _err = await asyncio.wait_for(proc.communicate(), timeout=12)
txt = out.decode("utf-8", errors="replace").strip() txt = out.decode("utf-8", errors="replace").strip()
_section, _osrel_lines = "probe", []
for line in txt.splitlines(): for line in txt.splitlines():
if line.strip() == "---OSREL---":
_section = "osrel"; continue
if _section == "osrel":
_osrel_lines.append(line)
continue
name, sep, value = line.strip().partition("=") name, sep, value = line.strip().partition("=")
if sep and name in remote_system_names: if sep and name in all_system_names:
remote_status[name] = value == "1" remote_status[name] = value == "1"
target_os_id = _os_id_from_release("\n".join(_osrel_lines))
except ValueError as e: except ValueError as e:
raise HTTPException(400, str(e)) raise HTTPException(400, str(e))
except Exception: except Exception as e:
if not remote_probe_error:
remote_probe_error = f"SSH system probe failed: {str(e)[:160]}"
pass pass
elif not host:
# Local target — probe in-process so the inline install command
# still appears in the dep panel when the cookbook container
# itself is the selected server.
try:
with open("/etc/os-release", encoding="utf-8") as f:
target_os_id = _os_id_from_release(f.read())
except Exception:
target_os_id = ""
if sys.platform == "darwin":
target_os_id = "macos"
for pkg in packages: for pkg in packages:
on_remote = bool(host and pkg.get("target") == "remote") on_remote = bool(host and pkg.get("target") == "remote")
probe = None probe = None
if on_remote: if on_remote:
pkg["installed"] = bool(remote_status.get(pkg["name"], False)) if remote_probe_error and pkg["name"] not in remote_status:
pkg["installed"] = None
pkg["probe_error"] = remote_probe_error
pkg["status_note"] = remote_probe_error
else:
pkg["installed"] = bool(remote_status.get(pkg["name"], False))
probe = remote_details.get(pkg["name"]) probe = remote_details.get(pkg["name"])
if isinstance(probe, dict): if isinstance(probe, dict):
pkg["details"] = probe pkg["details"] = probe
@@ -1229,6 +1384,104 @@ def setup_shell_routes() -> APIRouter:
# 500 the entire packages panel; report it as not usable. # 500 the entire packages panel; report it as not usable.
pkg["installed"] = False pkg["installed"] = False
# llama_cpp partial-state probe: when the package is installed
# but the wheel was built CPU-only AND the target has NVIDIA
# hardware, mark the row as partial (yellow/orange) with a
# one-click upgrade to the CUDA wheel. Without this the row
# reads "ready" green while inference runs at 3 tok/s on GPU
# silicon — actively misleading.
if pkg["name"] == "llama_cpp" and pkg.get("installed"):
_native_llama_server = bool(
isinstance(probe, dict)
and isinstance(probe.get("binaries"), dict)
and probe["binaries"].get("llama-server")
)
_gpu_capable = False
_has_nvidia_target = False
if _native_llama_server:
# Native llama-server is the launcher path Cookbook now
# prefers. Do not mark this as a CPU-only Python wheel just
# because llama-cpp-python is absent from the selected venv.
_gpu_capable = True
elif on_remote and host:
try:
# Activate the configured venv FIRST so the probe
# runs against the same python the launch script
# would activate. Without this prefix, bare
# `python3` was checked — which can disagree with
# the venv's wheel (e.g. user-site has CUDA wheel
# but venv has CPU-only), and the dep panel then
# showed "ready" green while every launch fell to
# CPU.
_vp = _venv_activate_prefix(venv)
probe = (
f'{_vp}python3 -c "import llama_cpp; import sys; '
'sys.exit(0 if llama_cpp.llama_supports_gpu_offload() else 1)" '
'&& echo llama_cpp_gpu=1 || echo llama_cpp_gpu=0; '
'command -v nvidia-smi >/dev/null 2>&1 '
'&& nvidia-smi -L 2>/dev/null | grep -q "GPU " '
'&& echo nvidia=1 || echo nvidia=0'
)
argv = _ssh_base_argv(host, ssh_port) + [probe]
proc = await asyncio.create_subprocess_exec(
*argv, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE,
)
out, _ = await asyncio.wait_for(proc.communicate(), timeout=8)
txt = out.decode("utf-8", errors="replace")
if "llama_cpp_gpu=1" in txt:
_gpu_capable = True
if "nvidia=1" in txt:
_has_nvidia_target = True
except Exception:
pass
else:
try:
import llama_cpp as _lcp # type: ignore
_gpu_capable = bool(_lcp.llama_supports_gpu_offload())
except Exception:
_gpu_capable = False
_has_nvidia_target = shutil.which("nvidia-smi") is not None
if (not _gpu_capable) and _has_nvidia_target:
pkg["partial"] = True
pkg["partial_reason"] = "Installed but CPU-only wheel — GPU detected on this target. Upgrade to a CUDA wheel for ~10× faster inference."
pkg["partial_action"] = "reinstall_llama_cpp_cuda"
# Attach per-package system_prereqs status. We probed each
# prereq name above; surface "Missing build deps: …" ONLY
# when the package itself is not installed — if the package
# works (e.g. llama-cpp-python already imports cleanly), the
# build toolchain is irrelevant and surfacing it as a red
# flag confuses users ("ready" + "missing" on the same row).
_prereqs = list(pkg.get("system_prereqs") or [])
if _prereqs:
if on_remote:
_pr_present = {n: bool(remote_status.get(n)) for n in _prereqs}
else:
_pr_present = {n: shutil.which(n) is not None for n in _prereqs}
pkg["system_prereqs_status"] = _pr_present
_missing = [n for n, ok in _pr_present.items() if not ok]
# Suppress the "missing build deps" hint when the package
# itself is installed — build deps are only relevant if
# the user would need to recompile from source.
if pkg.get("installed"):
_missing = []
if _missing:
# Build a target-specific install command from the
# (os_family, backend) matrix when we know both. Fall
# back to the multi-distro hint only when the target's
# OS can't be classified (e.g. ssh probe failed).
_resolved_os = target_os_id or "debian" # safest default
_cmd = _install_cmd_for_target(_resolved_os, backend or "", _missing)
if _cmd and target_os_id:
_hint = "Missing build deps for this target: " + ", ".join(_missing)
pkg["install_cmd_for_target"] = _cmd
pkg["install_cmd_os"] = target_os_id
pkg["install_cmd_backend"] = (backend or "").lower()
else:
_hint = "Missing build deps: " + ", ".join(_missing) + ". Install via apt: cmake build-essential git / pacman: cmake base-devel git / dnf: cmake gcc-c++ make git / brew: cmake git."
_existing_note = pkg.get("status_note") or ""
pkg["status_note"] = (_existing_note + "" + _hint) if _existing_note else _hint
pkg["build_deps_missing"] = _missing
if pkg.get("installed"): if pkg.get("installed"):
update_status = _package_pip_update_status(pkg, probe) update_status = _package_pip_update_status(pkg, probe)
pkg["pip_update_available"] = update_status.available pkg["pip_update_available"] = update_status.available
@@ -1288,6 +1541,102 @@ def setup_shell_routes() -> APIRouter:
return {"ok": True, "output": stdout.decode()[-200:]} return {"ok": True, "output": stdout.decode()[-200:]}
return {"ok": False, "error": stderr.decode()[-300:]} return {"ok": False, "error": stderr.decode()[-300:]}
@router.post("/api/cookbook/install-system-deps")
async def install_system_deps(request: Request):
"""Install OS-level system packages (cmake/build-essential/git/tmux)
on a remote target or in the local container. Admin only.
Bounded by a per-package allowlist anything outside the catalog
is rejected so the route can't be coerced into installing arbitrary
OS packages. Uses `sudo -n` (passwordless) so the call returns a
clear "needs sudo password" error instead of hanging when interactive
sudo is required.
"""
_require_admin(request)
body = await request.json()
raw = body.get("packages") or []
host = (body.get("remote_host") or "").strip()
ssh_port = body.get("ssh_port")
# Names users can request — must match canonical names used in the
# deps catalog's `system_prereqs` field and on the System rows.
ALLOWED = {"cmake", "build-essential", "g++", "gcc", "git", "tmux", "make"}
pkgs = [str(p).strip() for p in raw if str(p).strip() in ALLOWED]
if not pkgs:
return {"ok": False, "error": "no installable packages requested (allowlist: " + ", ".join(sorted(ALLOWED)) + ")"}
# Re-map to the right package name per OS. apt/dpkg use the names
# as-is; pacman has base-devel for build-essential, etc.
def _apt(names): return list(names)
def _pacman(names):
return ["base-devel" if n == "build-essential" else n for n in names]
def _dnf(names):
out = []
for n in names:
if n == "build-essential": out += ["gcc", "gcc-c++", "make"]
elif n == "g++": out += ["gcc-c++"]
else: out.append(n)
return out
def _brew(names):
return [n for n in names if n not in ("build-essential", "g++", "gcc", "make")]
# Build a single shell snippet that detects the package manager and
# runs the right install. Non-interactive sudo (-n) only — if sudo
# asks for a password the script reports it instead of hanging.
apt_pkgs = " ".join(shlex.quote(p) for p in _apt(pkgs))
pac_pkgs = " ".join(shlex.quote(p) for p in _pacman(pkgs))
dnf_pkgs = " ".join(shlex.quote(p) for p in _dnf(pkgs))
brew_pkgs = " ".join(shlex.quote(p) for p in _brew(pkgs))
# Error messages go to stderr (>&2) so the route's error field
# gets populated. Without the redirect, `echo "ERROR…"` on stdout
# left stderr empty and the frontend toast fell through to a
# bare "HTTP 200" instead of surfacing the real reason.
script = (
'set -e; '
'if ! sudo -n true 2>/dev/null; then '
' echo "ERROR: passwordless sudo unavailable on this target. Run once: sudo apt install -y ' + " ".join(pkgs) + ' (or your distro equivalent: pacman -S, dnf install, brew install). After that, Cookbook can install the rest." >&2; exit 2; fi; '
'if command -v apt-get >/dev/null 2>&1; then '
f' sudo -n env DEBIAN_FRONTEND=noninteractive apt-get update -qq && sudo -n env DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends {apt_pkgs}; '
'elif command -v pacman >/dev/null 2>&1; then '
f' sudo -n pacman -Sy --needed --noconfirm {pac_pkgs}; '
'elif command -v dnf >/dev/null 2>&1; then '
f' sudo -n dnf install -y {dnf_pkgs}; '
'elif command -v brew >/dev/null 2>&1; then '
f' brew install {brew_pkgs}; '
'else '
' echo "ERROR: no supported package manager (apt/pacman/dnf/brew) on this target." >&2; exit 3; fi'
)
try:
if host:
argv = _ssh_base_argv(host, ssh_port) + [script]
else:
argv = ["bash", "-lc", script]
except ValueError as e:
raise HTTPException(400, str(e))
try:
proc = await asyncio.create_subprocess_exec(
*argv, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE
)
out, err = await asyncio.wait_for(proc.communicate(), timeout=180)
except asyncio.TimeoutError:
return {"ok": False, "error": "Install timed out after 180s"}
ok = (proc.returncode == 0)
# Combine stderr + (last lines of stdout) into a single error
# blob when ok=False — some package managers print useful failure
# context to stdout, and a script that exits via `echo ...; exit N`
# without `>&2` would otherwise hand back an empty error string
# and force the frontend to show a bare "HTTP 200".
err_txt = err.decode("utf-8", errors="replace").strip()
out_txt = out.decode("utf-8", errors="replace").strip()
if not ok:
tail_out = out_txt[-500:] if out_txt else ""
combined = err_txt or tail_out or f"exit code {proc.returncode}"
else:
combined = None
return {
"ok": ok,
"exit_code": proc.returncode,
"output": out_txt[-1000:],
"error": combined,
}
@router.post("/api/cookbook/rebuild-engine") @router.post("/api/cookbook/rebuild-engine")
async def rebuild_engine(request: Request): async def rebuild_engine(request: Request):
"""Clear the cached llama.cpp build so the next serve recompiles. """Clear the cached llama.cpp build so the next serve recompiles.
@@ -1308,7 +1657,8 @@ def setup_shell_routes() -> APIRouter:
return {"ok": False, "error": f"Unsupported engine: {engine}"} return {"ok": False, "error": f"Unsupported engine: {engine}"}
host = str(body.get("remote_host") or "").strip() host = str(body.get("remote_host") or "").strip()
ssh_port = body.get("ssh_port") ssh_port = body.get("ssh_port")
cmd = _llama_cpp_rebuild_cmd() update_source = bool(body.get("update_source"))
cmd = _llama_cpp_rebuild_cmd(update_source=update_source)
try: try:
argv = ( argv = (
(_ssh_base_argv(host, ssh_port) + [cmd]) (_ssh_base_argv(host, ssh_port) + [cmd])
+2 -1
View File
@@ -11,6 +11,7 @@ from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel from pydantic import BaseModel
from core.database import SessionLocal, ScheduledTask, TaskRun from core.database import SessionLocal, ScheduledTask, TaskRun
from core.middleware import INTERNAL_TOOL_USER
from core.constants import internal_api_base from core.constants import internal_api_base
from src.auth_helpers import get_current_user from src.auth_helpers import get_current_user
from src.constants import DATA_DIR, EMAIL_URGENCY_CACHE_DIR from src.constants import DATA_DIR, EMAIL_URGENCY_CACHE_DIR
@@ -427,7 +428,7 @@ def setup_task_routes(task_scheduler) -> APIRouter:
# In-process tool-loopback marker — AuthMiddleware validated # In-process tool-loopback marker — AuthMiddleware validated
# the internal token + loopback client before stamping this, # the internal token + loopback client before stamping this,
# so treat as admin-equivalent. # so treat as admin-equivalent.
if user == "internal-tool": if user == INTERNAL_TOOL_USER:
return True return True
try: try:
from core.auth import AuthManager from core.auth import AuthManager
+80 -7
View File
@@ -3,11 +3,16 @@ import os
import time import time
import json import json
import asyncio import asyncio
import shutil
import uuid
from pathlib import Path
from fastapi import APIRouter, Request, File, UploadFile, HTTPException from fastapi import APIRouter, Request, File, UploadFile, HTTPException
from typing import List from typing import List
import logging import logging
from core.middleware import require_admin from core.middleware import require_admin
from src.auth_helpers import get_current_user from core.database import SessionLocal, GalleryImage
from src.auth_helpers import effective_user
from src.constants import GENERATED_IMAGES_DIR
from src.upload_handler import count_recent_uploads from src.upload_handler import count_recent_uploads
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -50,6 +55,69 @@ def setup_upload_routes(upload_handler):
raise HTTPException(404, "File not found") raise HTTPException(404, "File not found")
raise HTTPException(404, "File not found") raise HTTPException(404, "File not found")
def _promote_chat_image_to_gallery(meta: dict, owner: str | None) -> str | None:
"""Make chat-uploaded images visible in Gallery without changing chat storage."""
is_image_file = getattr(upload_handler, "is_image_file", None)
if not callable(is_image_file):
return None
if not is_image_file(meta.get("name", ""), meta.get("mime", "")):
return None
source_path = meta.get("path")
if not source_path or not os.path.isfile(source_path):
return None
db = SessionLocal()
try:
file_hash = meta.get("hash")
if file_hash:
q = db.query(GalleryImage).filter(
GalleryImage.file_hash == file_hash,
GalleryImage.is_active == True, # noqa: E712
)
if owner:
q = q.filter(GalleryImage.owner == owner)
existing = q.first()
if existing:
return existing.id
image_dir = Path(GENERATED_IMAGES_DIR)
image_dir.mkdir(parents=True, exist_ok=True)
ext = Path(meta.get("name") or source_path).suffix.lower()
if ext not in {".png", ".jpg", ".jpeg", ".webp", ".gif"}:
mime_ext = {
"image/png": ".png",
"image/jpeg": ".jpg",
"image/jpg": ".jpg",
"image/webp": ".webp",
"image/gif": ".gif",
}.get(meta.get("mime", ""))
ext = mime_ext or ".png"
filename = f"{uuid.uuid4().hex[:12]}{ext}"
dest_path = image_dir / filename
shutil.copy2(source_path, dest_path)
image_id = str(uuid.uuid4())
db.add(GalleryImage(
id=image_id,
filename=filename,
prompt=meta.get("name") or "Chat upload",
model="chat-upload",
owner=owner,
file_hash=file_hash,
width=meta.get("width"),
height=meta.get("height"),
file_size=meta.get("size"),
))
db.commit()
return image_id
except Exception as e:
db.rollback()
logger.warning("Failed to add chat image upload to gallery: %s", e)
return None
finally:
db.close()
@router.post("") @router.post("")
async def api_upload(request: Request, files: List[UploadFile] = File(...)): async def api_upload(request: Request, files: List[UploadFile] = File(...)):
@@ -78,8 +146,10 @@ def setup_upload_routes(upload_handler):
for u in files: for u in files:
try: try:
meta = upload_handler.save_upload(u, client_ip, owner=get_current_user(request)) owner = effective_user(request)
out.append({ meta = upload_handler.save_upload(u, client_ip, owner=owner)
gallery_id = _promote_chat_image_to_gallery(meta, owner)
item = {
"id": meta["id"], "id": meta["id"],
"name": meta["name"], "name": meta["name"],
"mime": meta["mime"], "mime": meta["mime"],
@@ -89,7 +159,10 @@ def setup_upload_routes(upload_handler):
"width": meta.get("width"), "width": meta.get("width"),
"height": meta.get("height"), "height": meta.get("height"),
"is_duplicate": meta.get("is_duplicate", False) "is_duplicate": meta.get("is_duplicate", False)
}) }
if gallery_id:
item["gallery_id"] = gallery_id
out.append(item)
except HTTPException: except HTTPException:
raise raise
except Exception as e: except Exception as e:
@@ -138,7 +211,7 @@ def setup_upload_routes(upload_handler):
original_name = info.get("name", file_id) original_name = info.get("name", file_id)
auth_mgr = getattr(request.app.state, "auth_manager", None) auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured) auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request) current_user = effective_user(request)
file_owner = info.get("owner") if info else None file_owner = info.get("owner") if info else None
if auth_configured: if auth_configured:
if not current_user: if not current_user:
@@ -204,7 +277,7 @@ def setup_upload_routes(upload_handler):
info = _load_upload_info(file_id) info = _load_upload_info(file_id)
auth_mgr = getattr(request.app.state, "auth_manager", None) auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured) auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request) current_user = effective_user(request)
file_owner = info.get("owner") if info else None file_owner = info.get("owner") if info else None
if auth_configured: if auth_configured:
if not current_user: if not current_user:
@@ -247,7 +320,7 @@ def setup_upload_routes(upload_handler):
raise HTTPException(404, "File not found") raise HTTPException(404, "File not found")
auth_mgr = getattr(request.app.state, "auth_manager", None) auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured) auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request) current_user = effective_user(request)
file_owner = info.get("owner") file_owner = info.get("owner")
if auth_configured: if auth_configured:
if not current_user: if not current_user:
+2 -3
View File
@@ -1,6 +1,5 @@
"""Webhook, API Token, and sync chat routes.""" """Webhook, API Token, and sync chat routes."""
import asyncio
import uuid import uuid
import logging import logging
from typing import Optional from typing import Optional
@@ -385,10 +384,10 @@ def setup_webhook_routes(
sess.add_message(ChatMessage("assistant", reply)) sess.add_message(ChatMessage("assistant", reply))
session_manager.save_sessions() session_manager.save_sessions()
asyncio.create_task(webhook_manager.fire("chat.completed", { webhook_manager.fire_and_forget("chat.completed", {
"session_id": session_id, "model": sess.model, "session_id": session_id, "model": sess.model,
"user_message": message[:2000], "response": reply[:2000], "user_message": message[:2000], "response": reply[:2000],
})) })
return {"response": reply, "session_id": session_id, "model": sess.model} return {"response": reply, "session_id": session_id, "model": sess.model}
+5 -1
View File
@@ -103,9 +103,13 @@ def cmd_list(args) -> None:
end = _parse_dt(args.end) if args.end else (start + timedelta(days=30)) end = _parse_dt(args.end) if args.end else (start + timedelta(days=30))
db = SessionLocal() db = SessionLocal()
try: try:
# Overlap semantics, matching the web route (routes/calendar_routes.py)
# and the recurring-expansion contract: an event is in the window when
# it starts before the window end AND ends after the window start. This
# includes multi-day / in-progress events that began before `start`.
q = db.query(CalendarEvent).filter( q = db.query(CalendarEvent).filter(
CalendarEvent.dtstart >= start,
CalendarEvent.dtstart < end, CalendarEvent.dtstart < end,
CalendarEvent.dtend > start,
) )
if args.calendar: if args.calendar:
cal = db.query(CalendarCal).filter(CalendarCal.name == args.calendar).first() cal = db.query(CalendarCal).filter(CalendarCal.name == args.calendar).first()
+47
View File
@@ -19,6 +19,10 @@ GPU_BANDWIDTH = {
"6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224, "6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224,
"mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229, "mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229,
"9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322, "9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322,
# NVIDIA GB10 Grace-Blackwell superchip (DGX Spark). Unified LPDDR5X memory,
# not Apple Silicon, so it lives in the generic GPU table — the Apple-only
# lookup never matches it (its name carries no "apple").
"gb10": 273,
} }
# Pre-sort keys by length descending for correct substring matching # Pre-sort keys by length descending for correct substring matching
@@ -126,6 +130,44 @@ def _lookup_bandwidth(system):
return None return None
def _canonical_cpu_backend(system):
"""Return the canonical CPU backend for cpu_only speed estimation.
Normalizes CPU-architecture aliases separately from the GPU backend, and
overrides GPU-only backends (CUDA/ROCm/Metal) so they do not inherit a
discrete-GPU fallback constant when the model is actually running on CPU.
"""
backend = (system.get("backend") or "").lower().strip()
cpu_arch = (system.get("cpu_arch") or "").lower().strip()
cpu_name = (system.get("cpu_name") or "").lower()
gpu_name = (system.get("gpu_name") or "").lower()
# Already-canonical CPU backends
if backend in ("cpu_x86", "cpu_arm"):
return backend
# Raw CPU-architecture aliases. Treat plain "arm" as 32-bit ARM, not the
# ARM64-class CPU fallback used for Apple Silicon/aarch64 machines.
if backend in ("x86_64", "amd64", "i386", "i686"):
return "cpu_x86"
if backend in ("arm64", "aarch64"):
return "cpu_arm"
# Prefer an explicit CPU architecture field when present
if cpu_arch:
if cpu_arch in ("x86_64", "amd64", "x86", "i386", "i686"):
return "cpu_x86"
if cpu_arch in ("arm64", "aarch64"):
return "cpu_arm"
# Apple Silicon enters ranking as backend="metal"; its CPU path is ARM.
if backend in ("metal", "mps", "apple") or "apple" in cpu_name or "apple" in gpu_name:
return "cpu_arm"
# Conservative default for CUDA/ROCm/discrete GPU backends and unknowns.
return "cpu_x86"
def _estimate_speed(model, quant, run_mode, system, offload_frac=0.0): def _estimate_speed(model, quant, run_mode, system, offload_frac=0.0):
"""Estimate tok/s. Uses active params for MoE (only active experts run per token). """Estimate tok/s. Uses active params for MoE (only active experts run per token).
@@ -143,6 +185,11 @@ def _estimate_speed(model, quant, run_mode, system, offload_frac=0.0):
bw = _lookup_bandwidth(system) bw = _lookup_bandwidth(system)
backend = system.get("backend", "cpu_x86") backend = system.get("backend", "cpu_x86")
# CPU-only inference must never inherit a GPU backend's fallback constant,
# even if the detected system happens to report a CUDA/Metal/ROCm backend.
if run_mode == "cpu_only":
backend = _canonical_cpu_backend(system)
if bw and run_mode in ("gpu", "cpu_offload"): if bw and run_mode in ("gpu", "cpu_offload"):
bpp = QUANT_BYTES_PER_PARAM.get(quant, 0.5) bpp = QUANT_BYTES_PER_PARAM.get(quant, 0.5)
model_gb = pb * bpp model_gb = pb * bpp
+74 -13
View File
@@ -282,7 +282,17 @@ def _detect_amd():
"gpus": cards, "gpus": cards,
"gpu_groups": groups, "gpu_groups": groups,
"homogeneous": len(groups) <= 1, "homogeneous": len(groups) <= 1,
"backend": "rocm", # Pick the actual runtime label: ROCm/HIP only when its
# toolchain is installed, otherwise Vulkan if vulkaninfo is
# present (mesa RADV works fine on RDNA/CDNA when ROCm
# packages are absent — see Strix Halo where ROCm support
# is still backporting). Reporting "rocm" on a Vulkan-only
# host misleads downstream env-var pinning
# (HIP_VISIBLE_DEVICES is a no-op there).
"backend": (
"rocm" if (_run(["which", "rocminfo"]) or _run(["which", "hipconfig"]))
else ("vulkan" if _run(["which", "vulkaninfo"]) else "rocm")
),
"unified_memory": is_apu, "unified_memory": is_apu,
# AMD ISA/family so downstream can tell datacenter Instinct (CDNA, # AMD ISA/family so downstream can tell datacenter Instinct (CDNA,
# where vLLM/SGLang run AWQ/GPTQ reliably) from consumer Radeon # where vLLM/SGLang run AWQ/GPTQ reliably) from consumer Radeon
@@ -320,7 +330,7 @@ def _detect_apple_silicon():
# Only Apple Silicon (arm64) has a Metal GPU worth serving LLMs on; Intel # Only Apple Silicon (arm64) has a Metal GPU worth serving LLMs on; Intel
# Macs fall through to the CPU path. # Macs fall through to the CPU path.
if "arm" not in arch and "aarch64" not in arch: if _canonical_cpu_arch(arch) != "arm64":
return None return None
# Chip name, e.g. "Apple M4 Max" — carries the Pro/Max/Ultra variant that # Chip name, e.g. "Apple M4 Max" — carries the Pro/Max/Ultra variant that
@@ -503,12 +513,57 @@ def _get_cpu_count():
return os.cpu_count() or 1 return os.cpu_count() or 1
def _canonical_cpu_arch(value):
arch = str(value or "").lower().strip().replace("-", "_")
if arch in ("x86_64", "amd64", "x64"):
return "x86_64"
if arch in ("i386", "i686", "x86"):
return "x86"
if arch in ("arm64", "aarch64"):
return "arm64"
if arch == "arm" or arch.startswith("armv"):
return "arm"
return arch
def _get_cpu_arch():
if _remote_host:
return _canonical_cpu_arch(_run(["uname", "-m"]) or "")
return _canonical_cpu_arch(platform.machine())
def _powershell_exe(): def _powershell_exe():
"""Pick the best PowerShell executable for LOCAL execution: prefer pwsh """Pick the best PowerShell executable for LOCAL execution: prefer pwsh
(PowerShell 7+), fall back to Windows PowerShell 5.1. Returns an absolute (PowerShell 7+), fall back to Windows PowerShell 5.1. Returns an absolute
path so we don't depend on a particular PATH ordering.""" path so we don't depend on a particular PATH ordering."""
return shutil.which("pwsh") or shutil.which("powershell") or "powershell" return shutil.which("pwsh") or shutil.which("powershell") or "powershell"
def _powershell_encoded_for_ssh(script: str):
"""Run a PowerShell script on a remote Windows host over SSH.
Nested quotes in powershell -Command break when passed through Windows
OpenSSH's cmd wrapper; -EncodedCommand avoids that.
"""
import base64
encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
return _run(f"powershell -NoProfile -EncodedCommand {encoded}")
def _probe_remote_platform():
"""Best-effort OS detection over SSH when the caller didn't pass platform."""
out = _run("echo %OS%")
if out and "Windows_NT" in out:
return "windows"
uname = (_run(["uname", "-s"]) or "").strip().lower()
if uname == "darwin":
# Mac uses the linux detection path (_detect_apple_silicon over SSH).
return "linux"
if uname == "linux":
out = _run("test -d /data/data/com.termux && echo termux || echo linux")
if out and "termux" in out:
return "termux"
return "linux"
def _detect_windows(): def _detect_windows():
"""Detect Windows hardware via PowerShell/WMI. """Detect Windows hardware via PowerShell/WMI.
@@ -528,6 +583,7 @@ def _detect_windows():
$r.cpu_name = $cpu.Name $r.cpu_name = $cpu.Name
$r.cpu_cores = (Get-CimInstance Win32_Processor | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum $r.cpu_cores = (Get-CimInstance Win32_Processor | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum
$r.arch = $cpu.AddressWidth $r.arch = $cpu.AddressWidth
$r.cpu_arch = if ($env:PROCESSOR_ARCHITEW6432) { $env:PROCESSOR_ARCHITEW6432 } else { $env:PROCESSOR_ARCHITECTURE }
# GPU detection via nvidia-smi (fastest) or WMI fallback # GPU detection via nvidia-smi (fastest) or WMI fallback
try { try {
$nv = nvidia-smi --query-gpu=memory.total,name --format=csv,noheader,nounits 2>$null $nv = nvidia-smi --query-gpu=memory.total,name --format=csv,noheader,nounits 2>$null
@@ -570,9 +626,8 @@ def _detect_windows():
""" """
) )
if _remote_host: if _remote_host:
# Remote: ship a single command string over SSH. The remote shell parses # Remote: use -EncodedCommand so OpenSSH/cmd quoting does not break the script.
# the quoting; PowerShell on the far side runs the -Command payload. out = _powershell_encoded_for_ssh(ps_cmd.strip())
out = _run(f'powershell -Command "{ps_cmd}"')
else: else:
# Local: pass a LIST argv straight to subprocess so the OS hands ps_cmd # Local: pass a LIST argv straight to subprocess so the OS hands ps_cmd
# to PowerShell verbatim — no fragile string-level quote escaping. Prefer # to PowerShell verbatim — no fragile string-level quote escaping. Prefer
@@ -599,6 +654,7 @@ def _detect_windows():
"available_ram_gb": d.get("avail_gb", 0), "available_ram_gb": d.get("avail_gb", 0),
"cpu_cores": _as_int(d.get("cpu_cores"), 1), "cpu_cores": _as_int(d.get("cpu_cores"), 1),
"cpu_name": _cpu_name, "cpu_name": _cpu_name,
"cpu_arch": _canonical_cpu_arch(d.get("cpu_arch")),
"has_gpu": bool(d.get("gpu_name")), "has_gpu": bool(d.get("gpu_name")),
"gpu_name": d.get("gpu_name"), "gpu_name": d.get("gpu_name"),
"gpu_vram_gb": d.get("gpu_vram_gb"), "gpu_vram_gb": d.get("gpu_vram_gb"),
@@ -742,6 +798,13 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
""" """
global _remote_host, _remote_port, _remote_platform global _remote_host, _remote_port, _remote_platform
if host and not platform:
_remote_host = host
_remote_port = ssh_port or None
platform = _probe_remote_platform()
_remote_host = None
_remote_port = None
cache_key = _cache_key(host, ssh_port, platform) cache_key = _cache_key(host, ssh_port, platform)
now = time.time() now = time.time()
if not fresh and cache_key in _cache_by_host: if not fresh and cache_key in _cache_by_host:
@@ -762,8 +825,8 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
_remote_platform = None _remote_platform = None
_cache_by_host[cache_key] = (now, result) _cache_by_host[cache_key] = (now, result)
return result return result
# If Windows detection failed, return error # SSH may work while the PowerShell hardware probe still fails.
result = {"error": f"Cannot connect to {host}", "host": host} result = {"error": f"Windows hardware probe failed for {host}", "host": host}
_remote_host = None _remote_host = None
_remote_platform = None _remote_platform = None
_cache_by_host[cache_key] = (now, result) _cache_by_host[cache_key] = (now, result)
@@ -794,6 +857,7 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
available_ram = round(_get_available_ram_gb(), 1) available_ram = round(_get_available_ram_gb(), 1)
cpu_cores = _get_cpu_count() cpu_cores = _get_cpu_count()
cpu_name = _get_cpu_name() cpu_name = _get_cpu_name()
cpu_arch = _get_cpu_arch()
gpu_info = _detect_apple_silicon() or _detect_nvidia() or _detect_amd() gpu_info = _detect_apple_silicon() or _detect_nvidia() or _detect_amd()
@@ -803,6 +867,7 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
"available_ram_gb": available_ram, "available_ram_gb": available_ram,
"cpu_cores": cpu_cores, "cpu_cores": cpu_cores,
"cpu_name": cpu_name, "cpu_name": cpu_name,
"cpu_arch": cpu_arch,
"has_gpu": True, "has_gpu": True,
"gpu_name": gpu_info["gpu_name"], "gpu_name": gpu_info["gpu_name"],
"gpu_vram_gb": gpu_info["gpu_vram_gb"], "gpu_vram_gb": gpu_info["gpu_vram_gb"],
@@ -817,17 +882,13 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
"unified_memory": gpu_info.get("unified_memory", False), "unified_memory": gpu_info.get("unified_memory", False),
} }
else: else:
if _remote_host: backend = "cpu_arm" if cpu_arch == "arm64" else "cpu_x86"
arch_out = _run(["uname", "-m"]) or ""
else:
import platform as _platform
arch_out = _platform.machine().lower()
backend = "cpu_arm" if "aarch64" in arch_out or "arm" in arch_out else "cpu_x86"
result = { result = {
"total_ram_gb": total_ram, "total_ram_gb": total_ram,
"available_ram_gb": available_ram, "available_ram_gb": available_ram,
"cpu_cores": cpu_cores, "cpu_cores": cpu_cores,
"cpu_name": cpu_name, "cpu_name": cpu_name,
"cpu_arch": cpu_arch,
"has_gpu": False, "has_gpu": False,
"gpu_name": None, "gpu_name": None,
"gpu_vram_gb": None, "gpu_vram_gb": None,
+163 -14
View File
@@ -15,6 +15,8 @@ from urllib.parse import urljoin, urlparse
import httpx import httpx
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from src.constants import WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES, WEB_FETCH_USER_AGENT
from .analytics import RateLimitError, error_logger from .analytics import RateLimitError, error_logger
from .cache import ( from .cache import (
CONTENT_CACHE_DIR, CONTENT_CACHE_DIR,
@@ -89,18 +91,128 @@ def _public_http_url(url: str) -> bool:
return False return False
def _get_public_url(url: str, headers: dict, timeout: int, max_redirects: int = 5) -> httpx.Response: class BodyTooLargeError(Exception):
"""The server declared a body larger than the hard fetch ceiling."""
def __init__(self, url: str, declared_bytes: int):
self.url = url
self.declared_bytes = declared_bytes
super().__init__(
f"response body is {declared_bytes:,} bytes, over the "
f"{WEB_FETCH_HARD_MAX_BYTES:,}-byte hard cap"
)
class _CappedFetch:
"""Result of a size-capped streaming GET.
Carries just what fetch_webpage_content needs from an httpx.Response,
plus the cap bookkeeping: the (possibly truncated) body, whether the
cap cut it short, and the size the server declared via Content-Length
(wire bytes; None when absent).
"""
__slots__ = ("status_code", "headers", "content", "truncated",
"declared_bytes", "encoding", "url")
def __init__(self, status_code, headers, content, truncated,
declared_bytes, encoding, url):
self.status_code = status_code
self.headers = headers
self.content = content
self.truncated = truncated
self.declared_bytes = declared_bytes
self.encoding = encoding
self.url = url
@property
def text(self) -> str:
return self.content.decode(self.encoding or "utf-8", errors="replace")
def raise_for_status(self):
if self.status_code >= 400:
request = httpx.Request("GET", self.url)
raise httpx.HTTPStatusError(
f"HTTP {self.status_code} for {self.url}",
request=request,
response=httpx.Response(self.status_code, request=request),
)
def _get_public_url(url: str, headers: dict, timeout: int, max_redirects: int = 5,
max_bytes: int = None) -> "_CappedFetch":
"""Capped streaming GET with SSRF-guarded manual redirects.
The body is streamed and buffering stops at ``max_bytes`` (default: the
soft cap), so an oversized resource cannot be pulled into memory or the
content cache in full. When Content-Length already declares a body over
the hard ceiling, the fetch is refused before any body bytes are read.
"""
cap = min(max_bytes or WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES)
current = url current = url
for _ in range(max_redirects + 1): for _ in range(max_redirects + 1):
if not _public_http_url(current): if not _public_http_url(current):
raise httpx.RequestError("Blocked private/internal URL", request=httpx.Request("GET", current)) raise httpx.RequestError("Blocked private/internal URL", request=httpx.Request("GET", current))
response = httpx.get(current, headers=headers, timeout=timeout, follow_redirects=False) # Force identity transfer-encoding. With gzip/deflate the wire bytes
if response.status_code not in (301, 302, 303, 307, 308): # (and Content-Length) can be a small fraction of the decoded body, so
return response # a tiny compressed response could pass the hard-cap preflight and then
location = response.headers.get("location") # expand past the ceiling in a single decoded chunk before the streamed
if not location: # cap below can slice it. Identity makes Content-Length the true body
return response # size and keeps each streamed chunk bounded by the network read.
current = urljoin(str(response.url), location) req_headers = dict(headers or {})
req_headers["Accept-Encoding"] = "identity"
with httpx.stream("GET", current, headers=req_headers, timeout=timeout,
follow_redirects=False) as response:
if response.status_code in (301, 302, 303, 307, 308):
location = response.headers.get("location")
if not location:
return _CappedFetch(response.status_code, response.headers, b"",
False, None, response.encoding, str(response.url))
current = urljoin(str(response.url), location)
continue
# A server can ignore the identity request and still return a
# compressed body; httpx.iter_bytes would then decode it, and a tiny
# gzip can balloon into one decoded chunk far past the cap before we
# slice. Refuse a compressed Content-Encoding so the streamed cap
# stays a real memory bound (Content-Length is the compressed wire
# length here, so the preflight and size metadata are unreliable too).
enc = (response.headers.get("content-encoding") or "").strip().lower()
if enc and enc != "identity":
raise httpx.RequestError(
f"Refusing compressed response (Content-Encoding: {enc}) after "
"requesting identity: cannot bound decoded body size",
request=httpx.Request("GET", current),
)
declared = None
raw_len = response.headers.get("content-length")
if raw_len and raw_len.isdigit():
declared = int(raw_len)
# Refuse before buffering anything when the server already tells
# us the body exceeds the absolute ceiling (Content-Length is wire
# bytes; the decompressed body can only be larger).
if declared is not None and declared > WEB_FETCH_HARD_MAX_BYTES:
raise BodyTooLargeError(current, declared)
chunks = []
read = 0
truncated = False
# We requested identity above, so iter_bytes yields the raw body in
# network-read-sized chunks (no decompression expansion); the cap
# therefore bounds what we actually buffer.
for chunk in response.iter_bytes():
read += len(chunk)
if read > cap:
keep = cap - (read - len(chunk))
if keep > 0:
chunks.append(chunk[:keep])
truncated = True
break
chunks.append(chunk)
return _CappedFetch(response.status_code, response.headers,
b"".join(chunks), truncated, declared,
response.encoding, str(response.url))
raise httpx.RequestError("Too many redirects", request=httpx.Request("GET", current)) raise httpx.RequestError("Too many redirects", request=httpx.Request("GET", current))
# PDF extraction (optional dependency) # PDF extraction (optional dependency)
@@ -222,9 +334,19 @@ def _empty_result(url: str, error: str = "") -> dict:
# ---------------------------------------------------------------------- # ----------------------------------------------------------------------
# Main content fetcher # Main content fetcher
# ---------------------------------------------------------------------- # ----------------------------------------------------------------------
def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) -> dict: def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0,
"""Fetch and extract meaningful content from a webpage with caching.""" max_bytes: int = None) -> dict:
cache_key = generate_cache_key(url) """Fetch and extract meaningful content from a webpage with caching.
``max_bytes`` raises the download budget per call (clamped to the hard
cap); the default is the soft cap. When the body is cut short the result
carries ``truncated``/``fetched_bytes``/``total_bytes`` so callers can
tell the model the content is partial (#3812).
"""
effective_cap = min(max_bytes or WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES)
# The cap is part of the cache identity: a truncated soft-cap fetch must
# not be served to a later full-budget request for the same URL.
cache_key = generate_cache_key(f"{url}#cap={effective_cap}")
cache_file = CONTENT_CACHE_DIR / f"{cache_key}.cache" cache_file = CONTENT_CACHE_DIR / f"{cache_key}.cache"
# Check cache # Check cache
@@ -247,18 +369,24 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
# Fetch # Fetch
try: try:
headers = { headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36", "User-Agent": WEB_FETCH_USER_AGENT,
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.5", "Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate", # identity so the streamed size cap in _get_public_url stays honest
# (a compressed body can decode to far more than Content-Length).
"Accept-Encoding": "identity",
"Connection": "keep-alive", "Connection": "keep-alive",
} }
response = _get_public_url(url, headers=headers, timeout=timeout) response = _get_public_url(url, headers=headers, timeout=timeout,
max_bytes=effective_cap)
if response.status_code == 429: if response.status_code == 429:
raise RateLimitError(f"Rate limit hit for {url} (attempt {retry_attempt})") raise RateLimitError(f"Rate limit hit for {url} (attempt {retry_attempt})")
response.raise_for_status() response.raise_for_status()
except BodyTooLargeError as e:
error_logger.warning(f"Refused oversized body for {url}: {e}")
return _empty_result(url, f"TooLarge: {e}")
except httpx.HTTPStatusError as e: except httpx.HTTPStatusError as e:
error_logger.warning(f"HTTP {e.response.status_code} fetching {url}: {e}") error_logger.warning(f"HTTP {e.response.status_code} fetching {url}: {e}")
return _empty_result(url, f"HTTP {e.response.status_code}: {e}") return _empty_result(url, f"HTTP {e.response.status_code}: {e}")
@@ -269,9 +397,27 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
error_logger.error(str(e)) error_logger.error(str(e))
return _empty_result(url, str(e)) return _empty_result(url, str(e))
# Size bookkeeping shared by every content branch below. getattr keeps
# plain httpx.Response stand-ins (tests) working without the cap fields.
_size_fields = {
"truncated": getattr(response, "truncated", False),
"fetched_bytes": len(response.content),
"total_bytes": getattr(response, "declared_bytes", None),
}
# PDF handling # PDF handling
content_type = response.headers.get("Content-Type", "").lower() content_type = response.headers.get("Content-Type", "").lower()
if "application/pdf" in content_type or url.lower().endswith(".pdf"): if "application/pdf" in content_type or url.lower().endswith(".pdf"):
if _size_fields["truncated"]:
# A PDF cut mid-stream is not parseable; unlike text there is no
# useful partial result, so report the budget problem instead.
_declared = _size_fields["total_bytes"]
return _empty_result(
url,
f"TooLarge: PDF exceeds the {effective_cap:,}-byte fetch budget"
+ (f" (size {_declared:,} bytes)" if _declared else "")
+ "; retry with a larger budget if it fits under the hard cap",
)
if pdf_extract_text is None: if pdf_extract_text is None:
logger.error("pdfminer.six is not installed; cannot extract PDF text.") logger.error("pdfminer.six is not installed; cannot extract PDF text.")
pdf_text = "" pdf_text = ""
@@ -295,6 +441,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
"js_message": "", "js_message": "",
"success": bool(pdf_text), "success": bool(pdf_text),
"error": "" if pdf_text else "Failed to extract PDF text", "error": "" if pdf_text else "Failed to extract PDF text",
**_size_fields,
} }
_cache_result(cache_file, cache_key, result, url) _cache_result(cache_file, cache_key, result, url)
return result return result
@@ -329,6 +476,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
"js_message": "", "js_message": "",
"success": bool(text_body), "success": bool(text_body),
"error": "" if text_body else "Empty response body", "error": "" if text_body else "Empty response body",
**_size_fields,
} }
_cache_result(cache_file, cache_key, result, url) _cache_result(cache_file, cache_key, result, url)
return result return result
@@ -391,6 +539,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
"js_message": js_message, "js_message": js_message,
"success": True, "success": True,
"error": "", "error": "",
**_size_fields,
} }
_cache_result(cache_file, cache_key, result, url) _cache_result(cache_file, cache_key, result, url)
return result return result
+4 -6
View File
@@ -9,14 +9,12 @@ from urllib.parse import urljoin, urlparse, parse_qs
import httpx import httpx
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
from src.constants import SEARXNG_INSTANCE from src.constants import SEARXNG_INSTANCE, REQUEST_TIMEOUT, WEB_FETCH_USER_AGENT
from .analytics import RateLimitError, error_logger from .analytics import RateLimitError, error_logger
from .query import build_enhanced_query from .query import build_enhanced_query
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
REQUEST_TIMEOUT = 20
# Provider registry — maps setting value to (label, needs_key, needs_url) # Provider registry — maps setting value to (label, needs_key, needs_url)
PROVIDER_INFO = { PROVIDER_INFO = {
"searxng": ("SearXNG", False, True), "searxng": ("SearXNG", False, True),
@@ -140,7 +138,7 @@ def searxng_search_api(query: str, count: Optional[int] = None, categories: str
count = count if count is not None else _get_result_count() count = count if count is not None else _get_result_count()
instance = _get_search_instance() instance = _get_search_instance()
api_key = "" api_key = ""
headers = {"User-Agent": "Mozilla/5.0"} headers = {"User-Agent": WEB_FETCH_USER_AGENT}
if api_key: if api_key:
headers["Authorization"] = f"Bearer {api_key}" headers["Authorization"] = f"Bearer {api_key}"
# News/fresh queries do badly in the 'general' category — it favours # News/fresh queries do badly in the 'general' category — it favours
@@ -252,7 +250,7 @@ def searxng_search(query, max_results=10):
"""Search using SearXNG instance - parsing HTML.""" """Search using SearXNG instance - parsing HTML."""
instance = _get_search_instance() instance = _get_search_instance()
api_key = "" api_key = ""
req_headers = {"User-Agent": "Mozilla/5.0"} req_headers = {"User-Agent": WEB_FETCH_USER_AGENT}
if api_key: if api_key:
req_headers["Authorization"] = f"Bearer {api_key}" req_headers["Authorization"] = f"Bearer {api_key}"
try: try:
@@ -391,7 +389,7 @@ def duckduckgo_search(query: str, count: Optional[int] = None, time_filter: Opti
response = httpx.get( response = httpx.get(
"https://html.duckduckgo.com/html/", "https://html.duckduckgo.com/html/",
params={"q": query, "kp": _safesearch_for("duckduckgo_html")}, params={"q": query, "kp": _safesearch_for("duckduckgo_html")},
headers={"User-Agent": "Mozilla/5.0"}, headers={"User-Agent": WEB_FETCH_USER_AGENT},
timeout=REQUEST_TIMEOUT, timeout=REQUEST_TIMEOUT,
) )
response.raise_for_status() response.raise_for_status()
+20 -6
View File
@@ -16,8 +16,9 @@ sys.path.insert(0, BASE_DIR)
from src.constants import ( from src.constants import (
DATA_DIR, AUTH_FILE, UPLOAD_DIR, PERSONAL_DIR, PERSONAL_UPLOADS_DIR, DATA_DIR, AUTH_FILE, UPLOAD_DIR, PERSONAL_DIR, PERSONAL_UPLOADS_DIR,
TTS_CACHE_DIR, GENERATED_IMAGES_DIR, DEEP_RESEARCH_DIR, CHROMA_DIR, TTS_CACHE_DIR, GENERATED_IMAGES_DIR, DEEP_RESEARCH_DIR, CHROMA_DIR,
RAG_DIR, MEMORY_VECTORS_DIR, RAG_DIR, MEMORY_VECTORS_DIR, PASSWORD_MIN_LENGTH,
) )
from core.auth import RESERVED_USERNAMES
DIRS = [ DIRS = [
DATA_DIR, DATA_DIR,
@@ -59,15 +60,23 @@ def _prompt_admin_credentials():
print(" (Press Enter to accept defaults)") print(" (Press Enter to accept defaults)")
print() print()
username = input(" Username [admin]: ").strip().lower() while True:
if not username: username = input(" Username [admin]: ").strip().lower()
username = "admin" if not username:
username = "admin"
if username in RESERVED_USERNAMES:
print(f" '{username}' is a reserved username. Choose another.")
continue
break
while True: while True:
password = getpass.getpass(" Password: ") password = getpass.getpass(" Password: ")
if not password: if not password:
print(" Password cannot be empty.") print(" Password cannot be empty.")
continue continue
if len(password) < PASSWORD_MIN_LENGTH:
print(f" Password must be at least {PASSWORD_MIN_LENGTH} characters.")
continue
confirm = getpass.getpass(" Confirm password: ") confirm = getpass.getpass(" Confirm password: ")
if password != confirm: if password != confirm:
print(" Passwords don't match. Try again.") print(" Passwords don't match. Try again.")
@@ -93,8 +102,13 @@ def create_default_admin():
password = os.getenv("ODYSSEUS_ADMIN_PASSWORD", "").strip() password = os.getenv("ODYSSEUS_ADMIN_PASSWORD", "").strip()
if username and password: if username and password:
# Both provided via env — use them directly # Both provided via env — validate before using
pass if username in RESERVED_USERNAMES:
print(f" [error] ODYSSEUS_ADMIN_USER '{username}' is a reserved username")
return "failed"
if len(password) < PASSWORD_MIN_LENGTH:
print(f" [error] ODYSSEUS_ADMIN_PASSWORD must be at least {PASSWORD_MIN_LENGTH} characters")
return "failed"
elif sys.stdin.isatty() and not os.getenv("ODYSSEUS_SKIP_ADMIN_PROMPT"): elif sys.stdin.isatty() and not os.getenv("ODYSSEUS_SKIP_ADMIN_PROMPT"):
# Interactive terminal — ask the user # Interactive terminal — ask the user
username, password = _prompt_admin_credentials() username, password = _prompt_admin_credentials()
+412
View File
@@ -0,0 +1,412 @@
# Architecture Runtime Inventory
> **Purpose**: Phase 0 planning baseline for codebase readability improvements (#4071).
> **Parent issue**: [#4082](https://github.com/pewdiepie-archdaemon/odysseus/issues/4082)
> **Last updated**: dev@b58af42 | 2026-06-16
> **Status**: Draft — to be reviewed before follow-up slices open.
> **Snapshot basis**: Importer / file / import-line counts are refreshed to `dev@b58af42` (2026-06-16) and are recomputable via the commands in §3.4. **Line counts** in §2.1 / §2.2 are a snapshot from an earlier baseline and drift as `dev` moves — recompute any of them with `wc -l <file>`. This inventory tracks structure and risk, not live metrics.
This document maps the current runtime module structure, identifies high-risk boundaries, and recommends safe first refactor slices. It does **not** move files, change imports, or alter runtime behavior.
---
## 1. Current Structure Overview
### 1.1 Top-Level Layout
```
odysseus/
├── app.py # FastAPI app entrypoint (1,145 lines)
├── conf/ # Configuration (config.py, settings.py, settings_scrub.py)
├── src/ # 95 flat .py files + 2 subdirectories
│ ├── agent_tools/ # Tool helpers: document, filesystem, subprocess, web
│ └── search/ # Search subsystem
├── routes/ # 54 flat .py files — HTTP route handlers
├── core/ # 10 files — database models, auth, middleware, session
├── mcp_servers/ # 5 files — MCP server implementations
├── scripts/ # CLI tools and one-shot scripts
├── static/ # Frontend HTML/CSS/JS
├── tests/ # 583 test files (~54,800 lines)
└── services/ # (exists as needed)
```
### 1.2 Directory Flatness Metric
| Directory | Flat `.py` Files | Subdirectories | Concern |
|-----------|-----------------|----------------|---------|
| `src/` | **95** | 2 (`agent_tools/`, `search/`) | No domain grouping; 95 files in one directory |
| `routes/` | **54** | 0 | All route handlers in one flat directory |
| `core/` | 10 | 0 | Manageable, but `database.py` is oversized |
---
## 2. Largest Runtime Modules
### 2.1 Python Backend
| Rank | File | Lines | Classes | Functions | Risk |
|------|------|-------|---------|-----------|------|
| 1 | `src/tool_implementations.py` | **4,032** | 0 | ~48 | **HIGH** |
| 2 | `routes/email_routes.py` | **3,245** | — | — | **MEDIUM** |
| 3 | `routes/cookbook_routes.py` | **2,969** | — | — | **MEDIUM** |
| 4 | `src/agent_loop.py` | **2,961** | 0 | ~24 | **HIGH** |
| 5 | `src/task_scheduler.py` | **2,330** | — | 5 | MEDIUM |
| 6 | `routes/model_routes.py` | **2,266** | — | — | MEDIUM |
| 7 | `core/database.py` | **2,265** | 28 | ~59 helpers | **HIGH** |
| 8 | `src/builtin_actions.py` | **2,262** | 2 | ~24 | MEDIUM |
| 9 | `src/llm_core.py` | **2,164** | — | — | MEDIUM |
| 10 | `mcp_servers/email_server.py` | 2,197 | — | — | LOW (separate process) |
| 11 | `src/visual_report.py` | 1,918 | — | — | LOW |
| 12 | `routes/gallery_routes.py` | 1,896 | — | — | LOW |
| 13 | `src/ai_interaction.py` | 1,846 | — | — | MEDIUM |
| 14 | `routes/document_routes.py` | 1,717 | — | — | LOW |
| 15 | `routes/skills_routes.py` | 1,648 | — | — | LOW |
**Heuristic**: Files > 2,000 lines with 20+ public symbols and many importers are the highest-risk splits. Files 1,0002,000 lines are medium-risk if tightly coupled.
### 2.2 Frontend
| File | Lines | Concern |
|------|-------|---------|
| `static/style.css` | **36,653** | Entire app CSS in one file (tracked separately in #2617) |
| `static/js/document.js` | **9,776** | Single JS file for document functionality |
| `static/js/slashCommands.js` | 6,498 | |
| `static/js/settings.js` | 5,266 | |
| `static/js/emailLibrary.js` | 5,217 | |
| `static/js/notes.js` | 5,124 | |
| `static/js/chat.js` | 4,985 | |
| `static/app.js` | 4,090 | |
**Note**: Frontend modularization is tracked separately in #2617 (CSS) and is not the focus of this Phase 0 inventory. Frontend is listed here for completeness but follow-up slices should target Python backend boundaries first.
---
## 3. Import Dependency Graph
### 3.1 Who Depends on `core/database.py`
**102 files** import from `core.database` — this is the most depended-upon module:
- All route handlers (`routes/*.py`)
- Most `src/*.py` files
- `core/session_manager.py`, `core/auth.py`
- Multiple test files
**Implication**: Any split of `core/database.py` is the highest-risk refactor. It should be tackled **last**, never first.
### 3.2 Who Depends on `src/tool_implementations.py`
**17 files** import from `src.tool_implementations`:
- `src/agent_loop.py`, `src/builtin_actions.py`, `src/tool_index.py`
- `src/task_scheduler.py`, `src/tool_policy.py`
- Various tests
### 3.3 Who Depends on `src/agent_loop.py`
**22 files** import from `src.agent_loop`:
- `src/tool_policy.py`, `src/teacher_escalation.py`, `src/bg_monitor.py`
- `src/task_scheduler.py`
- Multiple test files
### 3.4 Cross-Layer Import Violations
**`src/` importing from `routes/`** (backwards dependency — domain logic depending on HTTP layer):
```
src/tool_implementations.py ──→ routes/calendar_routes.py
src/tool_implementations.py ──→ routes/cookbook_helpers.py
src/tool_implementations.py ──→ routes/email_helpers.py
src/tool_implementations.py ──→ routes/email_pollers.py
src/tool_implementations.py ──→ routes/email_routes.py
src/tool_implementations.py ──→ routes/model_routes.py
src/tool_implementations.py ──→ routes/note_routes.py
src/tool_implementations.py ──→ routes/prefs_routes.py
```
> These are **runtime imports** (inside function bodies, not at module top), which mitigates circular import risk but indicates fuzzy layer boundaries. Function-level inline imports from the HTTP layer into business logic are a code smell.
**Import counts (top-level)**:
| Direction | Count | Notes |
|-----------|-------|-------|
| `routes/``src/` | **374** | Expected: HTTP handlers call domain logic |
| `routes/``core/` | **126** | Expected: handlers access DB models |
| `src/``routes/` | **31** | **Unexpected**: domain logic reaching into HTTP layer (direct grep of import lines referencing `routes/`) |
| `src/``core/` | **106** | Acceptable but could be reduced with a data-access layer |
> **How the metrics in this document are computed** — recompute against current `dev` before treating any count as authoritative (the tree drifts; these numbers are a snapshot, not a live value):
> - `src/` flat `.py` files: `find src -maxdepth 1 -name '*.py' | wc -l`
> - `tests/` test files: `find tests -name 'test_*.py' | wc -l`
> - `core.database` importers: `grep -rlE '(from|import) +core\.database' --include='*.py' . | grep -v core/database.py | wc -l`
> - `src.agent_loop` importers: `grep -rlE '(from|import) +src\.agent_loop' --include='*.py' . | grep -v src/agent_loop.py | wc -l`
> - Cross-layer import lines: `grep -rhE '(from|import) +<pkg>' --include='*.py' <dir>/ | wc -l` (e.g. `(from|import) +routes` over `src/`)
---
## 4. Route Ownership Map
Routes can be grouped into logical feature domains. Current flat structure obscures these boundaries:
| Domain | Route Files | Total Lines | Review Complexity |
|--------|-------------|-------------|-------------------|
| **Email** | `email_routes.py`, `email_helpers.py`, `email_pollers.py` | 5,936 | HIGH — most complex domain |
| **Chat / Agent** | `chat_routes.py`, `chat_helpers.py`, `shell_routes.py`, `codex_routes.py`, `skills_routes.py` | 6,365 | HIGH — core interaction surface |
| **Cookbook** | `cookbook_routes.py`, `cookbook_helpers.py`, `cookbook_output.py` | 4,110 | MEDIUM |
| **Model / LLM** | `model_routes.py`, `assistant_routes.py`, `copilot_routes.py` | 2,764 | MEDIUM |
| **Calendar / Contacts** | `calendar_routes.py`, `contacts_routes.py` | 2,336 | MEDIUM |
| **Documents** | `document_routes.py`, `document_helpers.py` | 1,954 | LOW |
| **Auth** | `auth_routes.py`, `api_token_routes.py`, `device_flow.py` | 1,171 | LOW |
| **Tasks** | `task_routes.py` (standalone) | 1,157 | LOW |
| **Session** | `session_routes.py` (standalone) | 1,287 | LOW |
| **Gallery** | `gallery_routes.py`, `gallery_helpers.py` | 1,896 | LOW |
| **Memory** | `memory_routes.py` | — | LOW |
| **Research** | `research_routes.py` | — | LOW |
| **MCP** | `mcp_routes.py` | — | LOW |
| **Notes** | `note_routes.py` | — | LOW |
| **Other** | `prefs_routes.py`, `upload_routes.py`, `vault_routes.py`, `webhook_routes.py`, `workspace_routes.py`, `search_routes.py`, `history_routes.py`, `hwfit_routes.py`, `preset_routes.py`, `signature_routes.py`, `backup_routes.py`, `cleanup_routes.py`, `diagnostics_routes.py`, `embedding_routes.py`, `emoji_routes.py`, `font_routes.py`, `stt_routes.py`, `tts_routes.py`, `compare_routes.py`, `personal_routes.py`, `editor_draft_routes.py`, `admin_wipe_routes.py`, `chatgpt_subscription_routes.py` | 2,000+ | LOW individual, HIGH cumulative |
---
## 5. Tool Registry & Implementation Boundaries
### 5.1 Current Tool Architecture
| Component | File | Lines | Role |
|-----------|------|-------|------|
| Tool schemas | `src/tool_schemas.py` | 1,392 | JSON Schema tool definitions (Duck-TypedDict) |
| Tool index | `src/tool_index.py` | 542 | RAG-based tool retrieval from ChromaDB |
| Tool implementations | `src/tool_implementations.py` | 4,032 | 33 `do_*` functions — all tool execution logic |
| Tool security | `src/tool_security.py` | — | Owner-scoped tool blocking |
| Tool policy | `src/tool_policy.py` | — | Guide-only directive, plan-mode disabled tools |
| Tool utils | `src/tool_utils.py` | — | Shared tool helpers |
### 5.2 Tool Implementation Categories
The 33 `do_*` functions in `tool_implementations.py` fall into natural domain groups — the basis for slice 1's split in §6.2:
| Category | `do_*` functions | Count |
|----------|------------------|-------|
| **System / config** | `do_manage_skills`, `do_manage_tasks`, `do_manage_endpoints`, `do_manage_mcp`, `do_manage_webhooks`, `do_manage_tokens`, `do_manage_settings`, `do_api_call`, `do_app_api` | 9 |
| **Cookbook / model serving** | `do_download_model`, `do_serve_model`, `do_list_served_models`, `do_stop_served_model`, `do_tail_serve_output`, `do_list_downloads`, `do_cancel_download`, `do_search_hf_models`, `do_adopt_served_model`, `do_list_cookbook_servers`, `do_list_serve_presets`, `do_serve_preset`, `do_list_cached_models` | 13 |
| **Notes** | `do_manage_notes` | 1 |
| **Calendar** | `do_manage_calendar` | 1 |
| **Search** | `do_search_chats` | 1 |
| **Research** | `do_manage_research`, `do_trigger_research` | 2 |
| **Contacts** | `do_resolve_contact`, `do_manage_contact` | 2 |
| **Vault** | `do_vault_search`, `do_vault_get`, `do_vault_unlock` | 3 |
| **Image** | `do_edit_image` | 1 |
| | **Total** | **33** |
> Low-level tools (filesystem, subprocess, web fetch, document parsing) live in `src/agent_tools/`, **not** in `tool_implementations.py` — out of scope for this split.
---
## 6. Risk Assessment & Candidate Slice Ranking
> **Candidate proposals, not a committed plan.** The rankings, package shapes (e.g. `src/pkg/`, `src/domain/`, `src/infra/`, `src/api/`), split ordering, and route-grouping strategy below are **options for maintainer discussion**. Per #4082/#4071, slice ownership and order are settled by maintainers before any follow-up PR. §1–§3 above are the factual current-state inventory.
### 6.1 Risk Scale
| Level | Criteria |
|-------|----------|
| **LOW** | File has ≤3 importers AND ≤500 lines, OR is a pure refactor with clear boundaries |
| **MEDIUM** | File has 415 importers OR 5001,500 lines |
| **HIGH** | File has 16+ importers OR >2,000 lines, OR has cross-layer import violations |
### 6.2 Ranked Split Candidates
| Priority | Target | Risk | Rationale |
|----------|--------|------|-----------|
| **1** | `src/tool_implementations.py``src/tools/*.py` | **MEDIUM** | 4,032 lines → ~10 files by tool category. Already has natural boundaries. 17 importers, tracked in #3629. Use `__init__.py` shim to keep existing imports working. |
| **2** | `routes/` → domain subdirectories (one domain per PR) | **MEDIUM** | 54 flat files. Done **one domain at a time** (e.g. a standalone PR for the email domain, then chat, …), not a broad reorganization — route modules carry helper imports, registration assumptions, and test import paths. |
| **3** | `src/agent_loop.py``src/agent/loop.py` + submodules | **MEDIUM-HIGH** | 2,961 lines, 24 functions. Can extract prompt building, classification, verification, and runaway detection. Tracked in #3266. |
| **4** | `src/``src/pkg/`, `src/domain/`, `src/infra/`, `src/api/` | **MEDIUM** | Structural reorganization. Split flat `src/` into layered packages. Must come after routes and tools are stable. |
| **5** | `routes/email_*.py` consolidation | **LOW** | Already grouped by filename prefix. Low-risk cleanup within the email domain. |
| **6** | `core/database.py``src/infra/database/models/*.py` | **HIGH** | 28 classes, 102 importers. Highest-risk split. Must be **last** in any sequence. Requires careful import shim strategy. |
| **7** | Frontend CSS modularization | **MEDIUM** | 36,653 lines. Tracked in #2617. Separate timeline from backend work. |
| **8** | Frontend JS modularization | **MEDIUM** | 9,776 lines in `document.js`. Introduce ES modules at minimum. |
### 6.3 Candidate First 3 Behavior-Preserving Slices
**Slice 1: Split `tool_implementations.py`** (Lowest-risk high-impact)
- Create `src/tools/` package with one file per tool category
- Add `src/tools/__init__.py` re-exporting all symbols with current names
- Update 17 importers to use new paths (can be deferred via shim)
- Validation: `python -m pytest tests/ -x -q` + manual smoke test of tool execution
- Reference: #3629
**Slice 2: Group `routes/` by domain** (one domain per PR, not a broad sweep)
Route modules carry helper imports, router registration assumptions, and test import paths, so this must be done **one domain at a time** rather than as a single reorganization PR. Example sequence (each its own PR):
- PR 2a: move the **email** domain (`email_routes.py`, `email_helpers.py`, `email_pollers.py`) → `routes/email/` + shim
- PR 2b: move the **chat/agent** domain → `routes/chat/` + shim
- PR 2c: move the **cookbook** domain → `routes/cookbook/` + shim
- …and so on per domain from §4
Each PR: add `__init__.py` re-exporting old names, update `app.py` router imports, validation `python app.py` starts clean. **No behavior change** — pure file reorganization.
**Slice 3: Extract `agent_loop.py` submodules** (Improve reviewability)
- Move prompt assembly → `src/agent/prompt.py`
- Move request classification → `src/agent/classifier.py`
- Move sub-agent verification → `src/agent/verifier.py`
- Move runaway detection → `src/agent/runaway.py`
- Move context management → `src/agent/context.py`
- Keep `src/agent/loop.py` as the main orchestration module
- Validation: `python -m pytest tests/test_agent_loop.py tests/test_loop_breaker_runaway.py -v`
---
## 7. Safety Guardrails for Follow-Up Work
Per maintainer guidance in #4082 and #4071:
- [ ] **One domain/slice per PR** — never mix multiple reorganizations
- [ ] **No behavior changes** mixed with file moves — pure reorganization only
- [ ] **Keep compatibility shims**`__init__.py` re-exports for all existing import paths
- [ ] **Add or identify focused tests** before risky splits
- [ ] **Do not start with `core/database.py`** or broad route movement unless this inventory shows a safe boundary
- [ ] **Prefer small, reviewable slices** over large restructures
- [ ] **No packaging/runtime/tooling migration** mixed into file moves
- [ ] **No frontend framework migration** inside this stabilization lane
- [ ] **Validate with `python -m compileall`** — every PR must pass CI checks
- [ ] **Validate with `pytest`** — run the full test suite before opening each PR
---
## 8. Validation Commands
Each follow-up PR should be verifiable with these commands before submission:
```bash
# Syntax check — must pass with zero errors
python -m compileall src/ routes/ core/ conf/
# Full test suite — must match baseline pass rate
python -m pytest tests/ -x -q
# Import shim verification — existing import paths must still work
python -c "from src.tool_implementations import do_search_chats; print('OK')"
# App startup smoke test (if backend touched)
timeout 5 python app.py 2>&1 | head -5 || true
```
---
## 9. Open Questions
1. Is `#2538` (specs ground truth) the canonical behavior map baseline, and should this inventory be kept in sync with those specs once merged?
2. Should route grouping follow the domain map proposed here, or is there a different taxonomy preferred by maintainers?
3. For the `tool_implementations.py` split (#3629), is the tool categorization in §5.2 acceptable, or should it follow a different grouping?
4. Should compatibility shims (`__init__.py`) be temporary (removed in a follow-up wave) or permanent?
5. Should an ADR (Architecture Decision Record) document be started to track decisions made during this process?
---
## 10. Future Direction (NOT current state)
The following are **future refactor targets** (candidate directions **pending maintainer agreement**, not committed), recorded here so this inventory does not imply they exist today. None of them are present in the current `dev` tree:
- `main.py` — proposed rename of the `app.py` entrypoint. Today the app boots via `app.py`.
- `src/agent/` — proposed package to hold `agent_loop.py` submodules (prompt/classifier/verifier/runaway/context). Today `agent_loop.py` is a single flat file in `src/`.
- `src/infra/`, `src/domain/`, `src/pkg/`, `src/api/` — proposed layered reorganization of the flat `src/` directory (slice 4 in §6).
These become real only when the corresponding slices land.
---
## Appendix A: File Listing
### `src/` (95 files — 61 shown; run `ls src/*.py` for the full list)
```
agent_loop.py tool_implementations.py tool_schemas.py
tool_index.py tool_security.py tool_policy.py
tool_utils.py builtin_actions.py task_scheduler.py
llm_core.py model_context.py model_discovery.py
session_search.py context_budget.py context_compactor.py
ai_interaction.py action_intents.py agent_runs.py
app_helpers.py app_initializer.py config.py
database.py memory.py memory_provider.py
secret_storage.py prompt_security.py url_security.py
url_safety.py rate_limiter.py cleanup_service.py
readiness.py service_health.py exceptions.py
request_models.py assistant_log.py bg_monitor.py
builtin_mcp.py chat_helpers.py chroma_client.py
document_processor.py embedding_lanes.py deep_research.py
research_handler.py research_utils.py personal_docs.py
rag_manager.py rag_singleton.py topic_analyzer.py
visual_report.py youtube_handler.py pdf_forms.py
pdf_form_doc.py pdf_runtime.py caldav_writeback.py
email_thread_parser.py text_helpers.py user_time.py
teacher_escalation.py cookbook_serve_lifecycle.py
chatgpt_subscription.py mcp_manager.py
```
### `routes/` (54 files)
```
__init__.py _validators.py
auth_routes.py api_token_routes.py device_flow.py
chat_routes.py chat_helpers.py shell_routes.py
codex_routes.py skills_routes.py
email_routes.py email_helpers.py email_pollers.py
cookbook_routes.py cookbook_helpers.py cookbook_output.py
model_routes.py assistant_routes.py copilot_routes.py
calendar_routes.py contacts_routes.py
document_routes.py document_helpers.py
gallery_routes.py gallery_helpers.py
task_routes.py session_routes.py
note_routes.py memory_routes.py research_routes.py
mcp_routes.py search_routes.py history_routes.py
webhook_routes.py workspace_routes.py upload_routes.py
vault_routes.py prefs_routes.py preset_routes.py
signature_routes.py personal_routes.py hwfit_routes.py
backup_routes.py cleanup_routes.py diagnostics_routes.py
embedding_routes.py emoji_routes.py font_routes.py
stt_routes.py tts_routes.py compare_routes.py
editor_draft_routes.py chatgpt_subscription_routes.py admin_wipe_routes.py
```
### `core/` (10 files)
```
__init__.py constants.py database.py models.py
auth.py middleware.py session_manager.py exceptions.py
atomic_io.py platform_compat.py
```
---
## Appendix B: Key Import Relationships
```
core/database.py ←── 102 importers (routes/*, src/*, core/*, tests/*)
├── routes/auth_routes.py
├── routes/email_routes.py
├── src/builtin_actions.py
├── src/task_scheduler.py
├── src/tool_implementations.py (inline)
└── ...97 more
src/tool_implementations.py ←── 17 importers
├── src/agent_loop.py
├── src/builtin_actions.py
├── src/tool_index.py
├── src/task_scheduler.py
├── src/tool_policy.py
└── ...12 more (mostly tests)
src/agent_loop.py ←── 22 importers
├── src/tool_policy.py
├── src/teacher_escalation.py
├── src/bg_monitor.py
├── src/task_scheduler.py
└── 18 more (incl. tests)
```
+361 -38
View File
@@ -267,6 +267,10 @@ _DOMAIN_RULES = {
- Use `resolve_contact` to look up a contact's email or phone number by name. Searches the CardDAV address book and sent email history. - Use `resolve_contact` to look up a contact's email or phone number by name. Searches the CardDAV address book and sent email history.
- Use `manage_contact` to list, add, update, or delete contacts in the address book. - Use `manage_contact` to list, add, update, or delete contacts in the address book.
- Do NOT use `manage_memory` for contact lookups contact details live in the address book, not memory.""", - Do NOT use `manage_memory` for contact lookups contact details live in the address book, not memory.""",
"integrations": """\
## Integration/API rules
- To query or control a configured service integration (Home Assistant, Miniflux, Gitea, Linkding, Jellyfin, or any other registered service), use `api_call` with the integration name, HTTP method, path, and optional JSON body.
- Do not use shell, curl, or `app_api` to reach a user's connected integration when `api_call` is available.""",
} }
_DOMAIN_TOOL_MAP = { _DOMAIN_TOOL_MAP = {
@@ -277,9 +281,10 @@ _DOMAIN_TOOL_MAP = {
"notes_calendar_tasks": {"manage_notes", "manage_calendar", "manage_tasks"}, "notes_calendar_tasks": {"manage_notes", "manage_calendar", "manage_tasks"},
"ui": {"ui_control"}, "ui": {"ui_control"},
"sessions": {"create_session", "list_sessions", "manage_session", "send_to_session", "search_chats"}, "sessions": {"create_session", "list_sessions", "manage_session", "send_to_session", "search_chats"},
"files": {"bash", "python", "read_file", "write_file", "edit_file", "grep", "glob", "ls", "get_workspace"}, "files": {"bash", "python", "read_file", "write_file", "edit_file", "grep", "glob", "ls", "get_workspace", "manage_bg_jobs"},
"settings": {"manage_settings", "manage_endpoints", "manage_mcp", "manage_webhooks", "manage_tokens", "app_api"}, "settings": {"manage_settings", "manage_endpoints", "manage_mcp", "manage_webhooks", "manage_tokens", "app_api"},
"contacts": {"resolve_contact", "manage_contact"}, "contacts": {"resolve_contact", "manage_contact"},
"integrations": {"api_call"},
} }
def _domain_rules_for_tools(tool_names: set) -> list[str]: def _domain_rules_for_tools(tool_names: set) -> list[str]:
@@ -524,7 +529,7 @@ def get_builtin_overrides() -> dict:
ov = get_setting("builtin_tool_overrides", {}) ov = get_setting("builtin_tool_overrides", {})
return ov if isinstance(ov, dict) else {} return ov if isinstance(ov, dict) else {}
except Exception as e: except Exception as e:
logger.warning('Failed to load builtin tool overrides: %s', e) logger.warning("Failed to load builtin tool overrides, using defaults", exc_info=e)
return {} return {}
@@ -536,17 +541,44 @@ def _section_text(name: str, default: str) -> str:
return val if isinstance(val, str) and val.strip() else default return val if isinstance(val, str) and val.strip() else default
def _compact_tool_line(name: str, section: str) -> str:
"""One-line fenced-tool usage hint for compact/local prompts."""
text = (section or "").strip()
if not text:
return f"- `{name}`"
if text.startswith("- "):
return text
lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
usage = []
in_fence = False
for ln in lines:
if ln.startswith("```"):
usage.append(ln)
in_fence = not in_fence
if len(usage) >= 3:
break
continue
if in_fence and len(usage) < 3:
usage.append(ln)
if usage:
return f"- `{name}` — " + " ".join(usage)
return f"- `{name}` — " + lines[0][:160]
def _assemble_prompt(tool_names: set, disabled_tools: set = None, compact: bool = False) -> str: def _assemble_prompt(tool_names: set, disabled_tools: set = None, compact: bool = False) -> str:
"""Build the system prompt with only the specified tools included.""" """Build the system prompt with only the specified tools included."""
disabled = disabled_tools or set() disabled = disabled_tools or set()
included = tool_names - disabled included = tool_names - disabled
if compact: if compact:
tool_list = ", ".join(sorted(included)) if included else "none" tool_lines = []
for name, _default_section in TOOL_SECTIONS.items():
if name in included:
tool_lines.append(_compact_tool_line(name, _section_text(name, _default_section)))
parts = [ parts = [
"You are an AI assistant with tool access.", _AGENT_PREAMBLE,
f"Available tools: {tool_list}.", "## Available tools\n" + ("\n".join(tool_lines) if tool_lines else "none"),
_API_AGENT_RULES, _AGENT_RULES,
] ]
parts.extend(_domain_rules_for_tools(included)) parts.extend(_domain_rules_for_tools(included))
return "\n\n".join(parts) return "\n\n".join(parts)
@@ -612,11 +644,6 @@ _API_HOSTS = frozenset([
"api.perplexity.ai", "api.x.ai", "api.perplexity.ai", "api.x.ai",
"ollama.com", "api.venice.ai", "api.kimi.com", "ollama.com", "api.venice.ai", "api.kimi.com",
"api.githubcopilot.com", "api.githubcopilot.com",
# Local OpenAI-compatible endpoints (llama.cpp, vLLM, LM Studio, etc.).
# Without these, `_is_api_model` falls back to keyword sniffing on the
# model name, so well-behaved local servers don't get native tool
# schemas and the agent silently degrades to fenced-block parsing.
"localhost", "127.0.0.1", "host.docker.internal",
]) ])
_MCP_KEYWORDS = frozenset(["mcp", "browse", "browser", "website", "calendar", "event", "email", _MCP_KEYWORDS = frozenset(["mcp", "browse", "browser", "website", "calendar", "event", "email",
"gmail", "screenshot", "navigate", "click", "miniflux", "rss", "feed"]) "gmail", "screenshot", "navigate", "click", "miniflux", "rss", "feed"])
@@ -644,6 +671,28 @@ def _is_ollama_openai_compat_url(endpoint_url: str) -> bool:
return parsed.port == 11434 and (path == "/v1" or path.startswith("/v1/")) return parsed.port == 11434 and (path == "/v1" or path.startswith("/v1/"))
def _is_local_openai_compat_url(endpoint_url: str) -> bool:
try:
parsed = urlparse(endpoint_url or "")
except Exception:
return False
host = (parsed.hostname or "").lower()
path = (parsed.path or "").rstrip("/")
if not (path == "/v1" or path.startswith("/v1/")):
return False
if host in {"localhost", "127.0.0.1", "0.0.0.0", "host.docker.internal"}:
return True
if host.startswith("192.168.") or host.startswith("10."):
return True
if host.startswith("172."):
try:
second = int(host.split(".")[1])
return 16 <= second <= 31
except Exception:
return False
return False
def _endpoint_lookup_keys(endpoint_url: str) -> List[str]: def _endpoint_lookup_keys(endpoint_url: str) -> List[str]:
"""Candidate ModelEndpoint.base_url keys for a runtime chat URL.""" """Candidate ModelEndpoint.base_url keys for a runtime chat URL."""
raw = (endpoint_url or "").strip() raw = (endpoint_url or "").strip()
@@ -707,6 +756,17 @@ def _extract_last_user_message(messages: List[Dict]) -> str:
_LOW_SIGNAL_RE = re.compile(r"^[\W_]*$", re.UNICODE) _LOW_SIGNAL_RE = re.compile(r"^[\W_]*$", re.UNICODE)
_CASUAL_OPENING_RE = re.compile(
r"^\s*(?:h+i+|hey+|hello+|yo+|sup+|what'?s up|wass?up|hiya|howdy|"
r"lol|lmao|haha+|hehe+|thanks?|thank you|ty|idk|dunno|meh|bruh|bro)\b(?P<tail>.*)$",
re.IGNORECASE,
)
_CASUAL_BLOCKLIST_RE = re.compile(
r"\b(?:cookbook|serve|serving|launch|start|vllm|sglang|llama\.?cpp|ollama|"
r"download|model|email|document|doc|note|calendar|task|search|web|research|"
r"file|folder|repo|git|settings?|endpoint|api|token|mcp)\b",
re.IGNORECASE,
)
_EXPLICIT_CONTINUATION_RE = re.compile( _EXPLICIT_CONTINUATION_RE = re.compile(
r"^\s*(?:" r"^\s*(?:"
r"yes|y|yeah|yep|ok|okay|sure|do it|go ahead|continue|carry on|" r"yes|y|yeah|yep|ok|okay|sure|do it|go ahead|continue|carry on|"
@@ -716,6 +776,17 @@ _EXPLICIT_CONTINUATION_RE = re.compile(
r")\s*[.!?]*\s*$", r")\s*[.!?]*\s*$",
re.IGNORECASE, re.IGNORECASE,
) )
_RETRY_CONTINUATION_RE = re.compile(
r"\b(?:try again|retry|again|rerun|re-run|run it again|launch it again|"
r"start it again|failed|fails?|died|crashed|broke|insta|instantly)\b",
re.IGNORECASE,
)
_COOKBOOK_CONTEXT_RE = re.compile(
r"\b(?:cookbook|serve|serving|served|launch|start|preset|vllm|sglang|"
r"llama\.?cpp|ollama|download|cached models?|model servers?|running models?|"
r"gpu box|ajax|qwen|gemma|llama|mistral|minimax)\b",
re.IGNORECASE,
)
def _is_explicit_continuation(text: str) -> bool: def _is_explicit_continuation(text: str) -> bool:
@@ -723,6 +794,37 @@ def _is_explicit_continuation(text: str) -> bool:
return bool(_EXPLICIT_CONTINUATION_RE.match(str(text or "").strip())) return bool(_EXPLICIT_CONTINUATION_RE.match(str(text or "").strip()))
def _is_casual_low_signal(text: str) -> bool:
"""True for short greetings/slang that should not inherit stale context."""
s = str(text or "").strip()
m = _CASUAL_OPENING_RE.match(s)
if not m:
return False
tail = m.group("tail") or ""
if _CASUAL_BLOCKLIST_RE.search(tail):
return False
# Allow a short vocative/address after the opener without hardcoding the
# address term itself: "hey man", "yo dude", "sup <name>". Longer tails are
# more likely to be an actual request and should get normal context/tooling.
tail_words = re.findall(r"[A-Za-z0-9_'-]+", tail)
return len(tail_words) <= 2
def _is_contextual_retry_continuation(messages: List[Dict], text: str) -> bool:
"""Treat "try again / it failed" as a continuation only for active tool work.
These follow-ups are common after Cookbook launches: the latest user turn
says only "try again it failed", while the actionable model/host/command
details live one or two turns back. Keep this intentionally narrow so
ordinary chat does not inherit stale Cookbook context.
"""
latest = str(text or "").strip()
if not latest or not _RETRY_CONTINUATION_RE.search(latest):
return False
recent = _recent_context_for_retrieval(messages, max_user=5, max_chars=1200)
return bool(_COOKBOOK_CONTEXT_RE.search(recent))
def _assistant_requested_followup(messages: List[Dict]) -> bool: def _assistant_requested_followup(messages: List[Dict]) -> bool:
"""True when the previous assistant turn asked for missing task details. """True when the previous assistant turn asked for missing task details.
@@ -764,11 +866,12 @@ def _classify_agent_request(messages: List[Dict], last_user: str) -> Dict[str, o
which domain rule packs get appended to the system prompt. which domain rule packs get appended to the system prompt.
""" """
text = str(last_user or "").strip() text = str(last_user or "").strip()
continuation = _is_explicit_continuation(text) or _assistant_requested_followup(messages) retry_continuation = _is_contextual_retry_continuation(messages, text)
continuation = _is_explicit_continuation(text) or _assistant_requested_followup(messages) or retry_continuation
retrieval_query = _recent_context_for_retrieval(messages) if continuation else text retrieval_query = _recent_context_for_retrieval(messages) if continuation else text
q = retrieval_query.lower() q = retrieval_query.lower()
if not text or bool(_LOW_SIGNAL_RE.match(text)): if not text or bool(_LOW_SIGNAL_RE.match(text)) or _is_casual_low_signal(text):
return { return {
"low_signal": True, "low_signal": True,
"continuation": False, "continuation": False,
@@ -811,10 +914,25 @@ def _classify_agent_request(messages: List[Dict], last_user: str) -> Dict[str, o
domains.add("sessions") domains.add("sessions")
if has(r"\b(file|folder|directory|repo|git|grep|find in files|read file|edit file|shell|terminal|bash|python)\b"): if has(r"\b(file|folder|directory|repo|git|grep|find in files|read file|edit file|shell|terminal|bash|python)\b"):
domains.add("files") domains.add("files")
# Managing detached bash jobs: "kill the background job", "stop the job",
# "kill that job", "check the job output", "is the bg job done".
if (has(r"\b(background|bg)\s+(jobs?|task)\b")
or has(r"\b(kill|stop|cancel|terminate|check|tail|show|list)\b.{0,16}\bjobs?\b")
or has(r"\bjobs?\b.{0,16}\b(output|status|done|finished|running)\b")):
domains.add("files")
if has(r"\b(endpoint|api token|mcp|webhook|preference|configure|config|setting)\b"): if has(r"\b(endpoint|api token|mcp|webhook|preference|configure|config|setting)\b"):
domains.add("settings") domains.add("settings")
if has(r"\b(contact|contacts|phone|phone number|address book|vcard)\b"): if has(r"\b(contact|contacts|phone|phone number|address book|vcard)\b"):
domains.add("contacts") domains.add("contacts")
# API-integration intent — calling a configured service via the api_call
# tool. Without this the #3794 repro ("Use the api_call tool to call Home
# Assistant GET /api/states") matched no domain, classified as low-signal,
# and the tool never reached the schema filter. Detect it explicitly so the
# "integrations" domain seeds api_call deterministically (see
# _DOMAIN_TOOL_MAP), independent of embedding retrieval.
if has(r"\bapi[ _]call\b", r"\bintegrations?\b",
r"\b(?:home ?assistant|miniflux|gitea|linkding|jellyfin)\b"):
domains.add("integrations")
low_signal = not continuation and not domains low_signal = not continuation and not domains
return { return {
@@ -843,8 +961,11 @@ def _recent_context_for_retrieval(messages: List[Dict], max_user: int = 3, max_c
if isinstance(content, list): if isinstance(content, list):
content = " ".join(b.get("text", "") for b in content if isinstance(b, dict)) content = " ".join(b.get("text", "") for b in content if isinstance(b, dict))
content = (content or "").strip() content = (content or "").strip()
# Skip injected tool-result envelopes — role=user but not human intent. # Skip injected envelopes — role=user but not human intent. Tool results
if not content or content.startswith("[Tool execution results]"): # are now wrapped via untrusted_context_message (metadata.trusted=False);
# keep the legacy "[Tool execution results]" prefix for older histories.
meta = msg.get("metadata") or {}
if not content or meta.get("trusted") is False or content.startswith("[Tool execution results]"):
continue continue
collected.append(content) collected.append(content)
if len(collected) >= max_user: if len(collected) >= max_user:
@@ -863,6 +984,7 @@ def _build_system_prompt(
compact: bool = False, compact: bool = False,
owner: Optional[str] = None, owner: Optional[str] = None,
suppress_local_context: bool = False, suppress_local_context: bool = False,
suppress_skills: bool = False,
active_email: Optional[Dict[str, str]] = None, active_email: Optional[Dict[str, str]] = None,
) -> List[Dict]: ) -> List[Dict]:
"""Build agent system prompt, inject MCP/document context, merge consecutive system msgs.""" """Build agent system prompt, inject MCP/document context, merge consecutive system msgs."""
@@ -880,7 +1002,7 @@ def _build_system_prompt(
_ov_sig = _hl.sha256(_json.dumps(get_builtin_overrides() or {}, sort_keys=True).encode()).hexdigest() _ov_sig = _hl.sha256(_json.dumps(get_builtin_overrides() or {}, sort_keys=True).encode()).hexdigest()
except Exception: except Exception:
_ov_sig = "" _ov_sig = ""
cache_key = (frozenset(disabled_tools or []), bool(mcp_mgr), needs_admin, _rt_key, compact, _ov_sig, owner, suppress_local_context) cache_key = (frozenset(disabled_tools or []), bool(mcp_mgr), needs_admin, _rt_key, compact, _ov_sig, owner, suppress_local_context, suppress_skills)
if _cached_base_prompt and _cached_base_prompt_key == cache_key and not active_document: if _cached_base_prompt and _cached_base_prompt_key == cache_key and not active_document:
agent_prompt = _cached_base_prompt agent_prompt = _cached_base_prompt
# Skill index is user-editable (name + description), so it must never # Skill index is user-editable (name + description), so it must never
@@ -890,6 +1012,7 @@ def _build_system_prompt(
disabled_tools, mcp_mgr, needs_admin, relevant_tools, disabled_tools, mcp_mgr, needs_admin, relevant_tools,
mcp_disabled_map=mcp_disabled_map, compact=compact, owner=owner, mcp_disabled_map=mcp_disabled_map, compact=compact, owner=owner,
suppress_local_context=suppress_local_context, suppress_local_context=suppress_local_context,
suppress_skills=suppress_skills,
) )
else: else:
agent_prompt, _skill_index_block = _build_base_prompt( agent_prompt, _skill_index_block = _build_base_prompt(
@@ -901,6 +1024,7 @@ def _build_system_prompt(
compact=compact, compact=compact,
owner=owner, owner=owner,
suppress_local_context=suppress_local_context, suppress_local_context=suppress_local_context,
suppress_skills=suppress_skills,
) )
if not active_document: if not active_document:
_cached_base_prompt = agent_prompt _cached_base_prompt = agent_prompt
@@ -929,8 +1053,8 @@ def _build_system_prompt(
try: try:
from src.user_time import current_datetime_context_message from src.user_time import current_datetime_context_message
_datetime_message = current_datetime_context_message() _datetime_message = current_datetime_context_message()
except Exception: except Exception as e:
pass logger.warning("Failed to build datetime context message", exc_info=e)
# Document context is kept as a SEPARATE message (not merged into the tool # Document context is kept as a SEPARATE message (not merged into the tool
# prompt) so the context trimmer doesn't destroy it when truncating the # prompt) so the context trimmer doesn't destroy it when truncating the
@@ -973,8 +1097,8 @@ def _build_system_prompt(
try: try:
from src.pdf_form_doc import find_source_upload_id from src.pdf_form_doc import find_source_upload_id
_is_form_backed = bool(find_source_upload_id(active_document.current_content or "")) _is_form_backed = bool(find_source_upload_id(active_document.current_content or ""))
except Exception: except Exception as e:
pass logger.warning("Failed to detect if document is form-backed, assuming plain", exc_info=e)
if _is_form_backed: if _is_form_backed:
doc_ctx = ( doc_ctx = (
@@ -1184,7 +1308,7 @@ def _build_system_prompt(
# few. If the teacher wrote a procedure for "open my X chat" last # few. If the teacher wrote a procedure for "open my X chat" last
# time the student failed, this is where the student finds it # time the student failed, this is where the student finds it
# before deciding which tool to call. # before deciding which tool to call.
if not suppress_local_context: if not suppress_local_context and not suppress_skills:
try: try:
last_user = _extract_last_user_message(messages) last_user = _extract_last_user_message(messages)
# Respect the user's skills-enabled toggle (mirrors memory_enabled). # Respect the user's skills-enabled toggle (mirrors memory_enabled).
@@ -1351,6 +1475,7 @@ def _build_base_prompt(
compact: bool = False, compact: bool = False,
owner: Optional[str] = None, owner: Optional[str] = None,
suppress_local_context: bool = False, suppress_local_context: bool = False,
suppress_skills: bool = False,
): ):
"""Build the agent prompt with only relevant tools included. """Build the agent prompt with only relevant tools included.
@@ -1403,7 +1528,7 @@ def _build_base_prompt(
# The caller wraps it in untrusted_context_message and ships it as a # The caller wraps it in untrusted_context_message and ships it as a
# user-role message — same treatment as the matched-skills block. # user-role message — same treatment as the matched-skills block.
skill_index_block = "" skill_index_block = ""
if not suppress_local_context: if not suppress_local_context and not suppress_skills:
try: try:
from services.memory.skills import SkillsManager from services.memory.skills import SkillsManager
from src.constants import DATA_DIR from src.constants import DATA_DIR
@@ -1562,8 +1687,14 @@ def _append_tool_results(
if round_reasoning: if round_reasoning:
msg["reasoning_content"] = round_reasoning msg["reasoning_content"] = round_reasoning
messages.append(msg) messages.append(msg)
# Tool output (shell/python stdout, file reads, fetched pages, email
# bodies, MCP results) is sourced from outside the server. Wrap it as
# untrusted data so prompt-injection inside a tool result is treated as
# data, not instructions — same hardening as skills (#788) and the
# web/RAG context. THREAT_MODEL.md lists tool output as a surface that
# must go through untrusted_context_message.
messages.append( messages.append(
{"role": "user", "content": f"[Tool execution results]\n\n{tool_output_text}"} untrusted_context_message("tool execution results", tool_output_text)
) )
@@ -1822,6 +1953,7 @@ async def stream_agent_loop(
approved_plan: Optional[str] = None, approved_plan: Optional[str] = None,
tool_policy: Optional[ToolPolicy] = None, tool_policy: Optional[ToolPolicy] = None,
workspace: Optional[str] = None, workspace: Optional[str] = None,
forced_tools: Optional[Set[str]] = None,
_is_teacher_run: bool = False, _is_teacher_run: bool = False,
) -> AsyncGenerator[str, None]: ) -> AsyncGenerator[str, None]:
"""Streaming agent loop generator. """Streaming agent loop generator.
@@ -1861,6 +1993,20 @@ async def stream_agent_loop(
_needs_admin = _detect_admin_intent(messages) _needs_admin = _detect_admin_intent(messages)
_last_user = _extract_last_user_message(messages) _last_user = _extract_last_user_message(messages)
_intent = _classify_agent_request(messages, _last_user) _intent = _classify_agent_request(messages, _last_user)
_low_signal_turn = bool(_intent.get("low_signal"))
_casual_low_signal_turn = _is_casual_low_signal(_last_user)
_direct_low_signal = (
_low_signal_turn
and not bool(_intent.get("continuation"))
and not plan_mode
and not approved_plan
and not guide_only
and (_casual_low_signal_turn or active_document is None)
and (_casual_low_signal_turn or not active_email)
and (_casual_low_signal_turn or not workspace)
and not forced_tools
and not relevant_tools
)
# Tool retrieval uses the latest message by default. It may inherit recent # Tool retrieval uses the latest message by default. It may inherit recent
# user turns only for explicit continuations ("yes", "do it", "1"). # user turns only for explicit continuations ("yes", "do it", "1").
_retrieval_query = str(_intent.get("retrieval_query") or _last_user) _retrieval_query = str(_intent.get("retrieval_query") or _last_user)
@@ -1868,11 +2014,86 @@ async def stream_agent_loop(
"[agent-intent] latest=%r continuation=%s low_signal=%s domains=%s retrieval_query=%r", "[agent-intent] latest=%r continuation=%s low_signal=%s domains=%s retrieval_query=%r",
_last_user[:120], _last_user[:120],
bool(_intent.get("continuation")), bool(_intent.get("continuation")),
bool(_intent.get("low_signal")), _low_signal_turn,
sorted(_intent.get("domains") or []), sorted(_intent.get("domains") or []),
_retrieval_query[:200], _retrieval_query[:200],
) )
_mcp_disabled_map = _load_mcp_disabled_map() if mcp_mgr else {} _mcp_disabled_map = _load_mcp_disabled_map() if mcp_mgr else {}
if _direct_low_signal:
logger.info("[agent] direct low-signal reply path for latest=%r", _last_user[:80])
direct_messages = [{"role": "user", "content": _last_user}]
direct_response = ""
direct_start = time.time()
direct_actual_model = model
real_input_tokens = 0
real_output_tokens = 0
try:
async for chunk in stream_llm_with_fallback(
[(endpoint_url, model, headers)] + list(fallbacks or []),
direct_messages,
temperature=temperature,
max_tokens=min(max_tokens or 128, 128),
prompt_type=None,
tools=None,
timeout=int(get_setting("agent_stream_timeout_seconds", 300) or 300),
session_id=session_id,
):
if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"):
try:
data = json.loads(chunk[6:])
except json.JSONDecodeError:
yield chunk
continue
if data.get("type") == "usage":
usage = data.get("data", {}) or {}
direct_actual_model = usage.get("model") or direct_actual_model
real_input_tokens += usage.get("input_tokens", 0) or 0
real_output_tokens += usage.get("output_tokens", 0) or 0
continue
if data.get("type") == "model_actual":
direct_actual_model = data.get("model") or direct_actual_model
data["requested_model"] = model
yield f"data: {json.dumps(data)}\n\n"
continue
if data.get("type") == "fallback":
direct_actual_model = data.get("answered_by") or direct_actual_model
yield chunk
continue
if "delta" in data:
if not data.get("thinking"):
direct_response += data.get("delta", "")
yield chunk
continue
yield chunk
elif chunk.startswith("event: "):
yield chunk
except Exception as _direct_err:
logger.warning("[agent] direct low-signal path failed: %s", _direct_err)
fallback = "Hey."
direct_response += fallback
yield f"data: {json.dumps({'delta': fallback})}\n\n"
if not direct_response.strip():
fallback = "Hey."
direct_response = fallback
yield f"data: {json.dumps({'delta': fallback})}\n\n"
duration = time.time() - direct_start
metrics = {
"model": direct_actual_model,
"requested_model": model,
"input_tokens": real_input_tokens or estimate_tokens(direct_messages),
"output_tokens": real_output_tokens or max(len(direct_response) // 4, 1),
"total_time": round(duration, 2),
"response_time": round(duration, 2),
"agent_rounds": 0,
"tool_calls": 0,
"direct_low_signal": True,
}
yield f"data: {json.dumps({'type': 'metrics', 'data': metrics})}\n\n"
yield "data: [DONE]\n\n"
return
if plan_mode and mcp_mgr: if plan_mode and mcp_mgr:
# Allow read-only MCP tools to investigate, block write/unknown ones: # Allow read-only MCP tools to investigate, block write/unknown ones:
# hide them from the schemas AND reject them at runtime by qualified name. # hide them from the schemas AND reject them at runtime by qualified name.
@@ -1884,11 +2105,11 @@ async def stream_agent_loop(
# RAG-based tool selection: retrieve relevant tools for this query. # RAG-based tool selection: retrieve relevant tools for this query.
# If caller provided a pre-computed set (e.g. task_scheduler), use that. # If caller provided a pre-computed set (e.g. task_scheduler), use that.
_relevant_tools = set() if guide_only else relevant_tools _relevant_tools = relevant_tools
_t1 = time.time() _t1 = time.time()
if _relevant_tools: if _relevant_tools:
logger.info(f"[tool-rag] Using caller-provided relevant_tools ({len(_relevant_tools)} tools)") logger.info(f"[tool-rag] Using caller-provided relevant_tools ({len(_relevant_tools)} tools)")
if not guide_only and not _relevant_tools and bool(_intent.get("low_signal")): if not guide_only and not _relevant_tools and _low_signal_turn:
from src.tool_index import ALWAYS_AVAILABLE from src.tool_index import ALWAYS_AVAILABLE
if workspace: if workspace:
# An active workspace IS the file-work signal: a vague "look at the # An active workspace IS the file-work signal: a vague "look at the
@@ -1979,6 +2200,15 @@ async def stream_agent_loop(
if _relevant_tools is not None and active_document is not None: if _relevant_tools is not None and active_document is not None:
_relevant_tools.update({"edit_document", "update_document", "suggest_document"}) _relevant_tools.update({"edit_document", "update_document", "suggest_document"})
# Per-request UI toggles are stronger than retrieval. If the user turns on
# Search, the model must see the search tools even when the latest text is a
# typo or otherwise low-signal for tool RAG.
if not guide_only and forced_tools:
if _relevant_tools is None:
from src.tool_index import ALWAYS_AVAILABLE
_relevant_tools = set(ALWAYS_AVAILABLE)
_relevant_tools.update(t for t in forced_tools if t not in disabled_tools)
# The skill index injected by _build_system_prompt tells the model to # The skill index injected by _build_system_prompt tells the model to
# call `manage_skills action=view`, and Jaccard-matched skills are pasted # call `manage_skills action=view`, and Jaccard-matched skills are pasted
# into the prompt as procedures to follow — but neither path goes through # into the prompt as procedures to follow — but neither path goes through
@@ -1986,7 +2216,7 @@ async def stream_agent_loop(
# (grep, read_file, ...) that aren't in its schema list. Keep the schemas # (grep, read_file, ...) that aren't in its schema list. Keep the schemas
# in lockstep: manage_skills is callable whenever any skill is indexed, # in lockstep: manage_skills is callable whenever any skill is indexed,
# and a matched skill's declared requires_toolsets ride along with it. # and a matched skill's declared requires_toolsets ride along with it.
if not guide_only and _relevant_tools is not None: if not guide_only and _relevant_tools is not None and not _low_signal_turn:
try: try:
from services.memory.skills import SkillsManager from services.memory.skills import SkillsManager
from src.constants import DATA_DIR from src.constants import DATA_DIR
@@ -2051,7 +2281,7 @@ async def stream_agent_loop(
_model_supports_tools = any(kw in _model_lc for kw in ( _model_supports_tools = any(kw in _model_lc for kw in (
"gpt-4", "gpt-5", "gpt-o", "claude", "gemini", "gemma", "gpt-4", "gpt-5", "gpt-o", "claude", "gemini", "gemma",
"qwen3", "qwen2.5", "mixtral", "mistral", "llama-3.1", "llama-3.2", "qwen3", "qwen2.5", "mixtral", "mistral", "llama-3.1", "llama-3.2",
"llama-3.3", "llama-4", "llama-3.3", "llama-4", "llama3.1", "llama3.2", "llama3.3", "llama4",
# Local-served models that follow OpenAI-style function calling # Local-served models that follow OpenAI-style function calling
# via vLLM's `--enable-auto-tool-choice`. Belt-and-suspenders # via vLLM's `--enable-auto-tool-choice`. Belt-and-suspenders
# with the per-endpoint flag above. # with the per-endpoint flag above.
@@ -2093,13 +2323,15 @@ async def stream_agent_loop(
_is_api_model = False _is_api_model = False
else: else:
_is_api_model = any(h in endpoint_url for h in _API_HOSTS) or _model_supports_tools _is_api_model = any(h in endpoint_url for h in _API_HOSTS) or _model_supports_tools
_compact_agent_prompt = _is_api_model or _is_ollama_native or _ollama_openai_compat
messages, mcp_schemas = _build_system_prompt( messages, mcp_schemas = _build_system_prompt(
messages, model, active_document, mcp_mgr, disabled_tools, messages, model, active_document, mcp_mgr, disabled_tools,
needs_admin=_needs_admin, relevant_tools=_relevant_tools, needs_admin=_needs_admin, relevant_tools=_relevant_tools,
mcp_disabled_map=_mcp_disabled_map, mcp_disabled_map=_mcp_disabled_map,
compact=_is_api_model, compact=_compact_agent_prompt,
owner=owner, owner=owner,
suppress_local_context=guide_only, suppress_local_context=guide_only,
suppress_skills=_low_signal_turn,
active_email=active_email, active_email=active_email,
) )
if plan_mode and not guide_only: if plan_mode and not guide_only:
@@ -2185,6 +2417,14 @@ async def stream_agent_loop(
# Strip internal metadata keys before sending to the LLM API # Strip internal metadata keys before sending to the LLM API
messages = [{k: v for k, v in msg.items() if k != "_protected"} for msg in messages] messages = [{k: v for k, v in msg.items() if k != "_protected"} for msg in messages]
agent_prompt_tokens = estimate_tokens(messages)
logger.info(
"[agent-timing] prep_done model=%s prompt_tokens=%s context_length=%s prep=%s",
model,
agent_prompt_tokens,
context_length,
{k: round(v, 3) for k, v in prep_timings.items()},
)
yield f"data: {json.dumps({'type': 'agent_prep', 'data': {k: round(v, 3) for k, v in prep_timings.items()}})}\n\n" yield f"data: {json.dumps({'type': 'agent_prep', 'data': {k: round(v, 3) for k, v in prep_timings.items()}})}\n\n"
full_response = "" full_response = ""
@@ -2329,6 +2569,19 @@ async def stream_agent_loop(
# complementary cap for the rare stream that trickles bytes forever and # complementary cap for the rare stream that trickles bytes forever and
# so never trips the inactivity timeout. Generous — only catches runaway. # so never trips the inactivity timeout. Generous — only catches runaway.
_round_deadline = time.time() + max(agent_stream_timeout * 4, 1200) _round_deadline = time.time() + max(agent_stream_timeout * 4, 1200)
_round_start = time.time()
_round_first_event_logged = False
_round_first_token_logged = False
logger.info(
"[agent-timing] round_start round=%s model=%s endpoint=%s prompt_tokens=%s tools=%s native_tools=%s timeout=%s",
round_num,
model,
endpoint_url,
estimate_tokens(messages),
len(_tool_names_sent),
bool(all_tool_schemas),
agent_stream_timeout,
)
async for chunk in stream_llm_with_fallback( async for chunk in stream_llm_with_fallback(
_candidates, _candidates,
messages, messages,
@@ -2339,11 +2592,30 @@ async def stream_agent_loop(
timeout=agent_stream_timeout, timeout=agent_stream_timeout,
session_id=session_id, session_id=session_id,
): ):
if not _round_first_event_logged:
_round_first_event_logged = True
logger.info(
"[agent-timing] first_event round=%s elapsed=%.3fs kind=%s",
round_num,
time.time() - _round_start,
"error" if chunk.startswith("event: error") else "data",
)
if time.time() > _round_deadline: if time.time() > _round_deadline:
logger.warning(f"[agent] round {round_num} stream exceeded wall-clock deadline; cutting off") logger.warning(
"[agent-timing] round_deadline round=%s elapsed=%.3fs deadline_s=%s",
round_num,
time.time() - _round_start,
max(agent_stream_timeout * 4, 1200),
)
break break
# Forward error events from stream_llm to the frontend # Forward error events from stream_llm to the frontend
if chunk.startswith("event: error"): if chunk.startswith("event: error"):
logger.warning(
"[agent-timing] stream_error round=%s elapsed=%.3fs chunk=%r",
round_num,
time.time() - _round_start,
chunk[:500],
)
yield chunk yield chunk
continue continue
if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"): if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"):
@@ -2423,6 +2695,15 @@ async def stream_agent_loop(
if not first_token_received: if not first_token_received:
time_to_first_token = time.time() - total_start time_to_first_token = time.time() - total_start
first_token_received = True first_token_received = True
if not _round_first_token_logged:
_round_first_token_logged = True
logger.info(
"[agent-timing] first_visible_token round=%s elapsed=%.3fs total_elapsed=%.3fs thinking=%s",
round_num,
time.time() - _round_start,
time.time() - total_start,
bool(data.get("thinking")),
)
# Keep reasoning deltas in a separate accumulator so # Keep reasoning deltas in a separate accumulator so
# we can echo them back via `reasoning_content` on the # we can echo them back via `reasoning_content` on the
# next request (DeepSeek requires this; harmless for # next request (DeepSeek requires this; harmless for
@@ -2492,7 +2773,21 @@ async def stream_agent_loop(
yield chunk yield chunk
# Intercept [DONE] — don't forward until all rounds finish # Intercept [DONE] — don't forward until all rounds finish
tool_blocks, used_native = _resolve_tool_blocks(round_response, native_tool_calls, round_num, is_api_model=_is_api_model) logger.info(
"[agent-timing] round_stream_done round=%s elapsed=%.3fs text_chars=%s tool_calls=%s first_event=%s first_token=%s",
round_num,
time.time() - _round_start,
len(round_response),
len(native_tool_calls),
_round_first_event_logged,
_round_first_token_logged,
)
tool_blocks, used_native = _resolve_tool_blocks(
round_response,
native_tool_calls,
round_num,
is_api_model=(_is_api_model and not guide_only),
)
# Force-answer round: we told the model to STOP calling tools and # Force-answer round: we told the model to STOP calling tools and
# answer. If it ignored that and emitted a (possibly DSML) tool # answer. If it ignored that and emitted a (possibly DSML) tool
@@ -2576,7 +2871,7 @@ async def stream_agent_loop(
# model with no real native_tool_calls) must not be stripped from the # model with no real native_tool_calls) must not be stripped from the
# persisted text either — otherwise it streams once and then disappears # persisted text either — otherwise it streams once and then disappears
# on reload (#3222 follow-up). # on reload (#3222 follow-up).
cleaned_round = strip_tool_blocks(round_response, skip_fenced=(_is_api_model and not used_native)).strip() cleaned_round = strip_tool_blocks(round_response, skip_fenced=(_is_api_model and not used_native and not guide_only)).strip()
round_texts.append(cleaned_round) round_texts.append(cleaned_round)
if not tool_blocks: if not tool_blocks:
@@ -2648,6 +2943,15 @@ async def stream_agent_loop(
_intent_nudge_count += 1 _intent_nudge_count += 1
_matched_phrase = _intent_match.group(0).strip() _matched_phrase = _intent_match.group(0).strip()
logger.info(f"[agent] intent-without-action nudge #{_intent_nudge_count} on round {round_num}: {_matched_phrase!r}") logger.info(f"[agent] intent-without-action nudge #{_intent_nudge_count} on round {round_num}: {_matched_phrase!r}")
_lower_phrase = _matched_phrase.lower()
_cookbook_log_hint = ""
if any(_word in _lower_phrase for _word in ("log", "logs", "output", "tail", "status")):
_cookbook_log_hint = (
" If this is about a Cookbook/model serve, the concrete calls are: "
"`list_served_models` first, then `tail_serve_output` with the "
"session_id from the serve/list result. Never answer with "
"\"check logs\" when those tools are available."
)
messages.append({ messages.append({
"role": "system", "role": "system",
"content": ( "content": (
@@ -2656,6 +2960,7 @@ async def stream_agent_loop(
"see you announced the action but didn't run it, which " "see you announced the action but didn't run it, which "
"is the most frustrating thing you can do. " "is the most frustrating thing you can do. "
"DO IT NOW: emit the actual function call this turn. " "DO IT NOW: emit the actual function call this turn. "
f"{_cookbook_log_hint}"
"If you decided not to do it after all, say so plainly in " "If you decided not to do it after all, say so plainly in "
"one sentence instead of restating the plan." "one sentence instead of restating the plan."
), ),
@@ -2910,9 +3215,12 @@ async def stream_agent_loop(
f'data: {json.dumps({"type": "ui_control", "data": result})}\n\n' f'data: {json.dumps({"type": "ui_control", "data": result})}\n\n'
) )
# ask_user: the agent posed a multiple-choice question. Emit it so the # ask_user: remember the payload now, but emit the interactive event
# frontend renders clickable options, then end the turn (below) and # only *after* tool_output below. Emitting it before tool_output let
# wait — the user's pick becomes the next message. # the subsequent tool-card rewrite/scroll push the choices out of
# view. The payload is also copied into the persisted tool event so
# history reload can reconstruct an unanswered card.
_pending_ask_user_event = None
if "ask_user" in result: if "ask_user" in result:
# The question lives in the tool args. ChatMessage.to_dict() # The question lives in the tool args. ChatMessage.to_dict()
# replays only role+content to the model next turn — tool_event # replays only role+content to the model next turn — tool_event
@@ -2927,9 +3235,7 @@ async def stream_agent_loop(
_auq_delta = ("\n\n" if full_response.strip() else "") + _auq_q _auq_delta = ("\n\n" if full_response.strip() else "") + _auq_q
full_response += _auq_delta full_response += _auq_delta
yield 'data: ' + json.dumps({"delta": _auq_delta}) + '\n\n' yield 'data: ' + json.dumps({"delta": _auq_delta}) + '\n\n'
yield ( _pending_ask_user_event = _auq
f'data: {json.dumps({"type": "ask_user", "data": result["ask_user"]})}\n\n'
)
_awaiting_user = True _awaiting_user = True
# update_plan: agent wrote back to the plan (ticked a step / revised). # update_plan: agent wrote back to the plan (ticked a step / revised).
@@ -2984,6 +3290,10 @@ async def stream_agent_loop(
# Emit tool_output (include ui_event data if present) # Emit tool_output (include ui_event data if present)
tool_output_data = {"type": "tool_output", "tool": block.tool_type, "command": cmd_display, "output": output_text, "exit_code": result.get("exit_code")} tool_output_data = {"type": "tool_output", "tool": block.tool_type, "command": cmd_display, "output": output_text, "exit_code": result.get("exit_code")}
if _pending_ask_user_event:
# Keep enough state in the streamed tool result for alternate
# clients to render the prompt without depending on event order.
tool_output_data["ask_user"] = _pending_ask_user_event
if "ui_event" in result: if "ui_event" in result:
tool_output_data["ui_event"] = result["ui_event"] tool_output_data["ui_event"] = result["ui_event"]
for k in ( for k in (
@@ -3014,6 +3324,14 @@ async def stream_agent_loop(
tool_output_data["diff"] = result["diff"] tool_output_data["diff"] = result["diff"]
yield f'data: {json.dumps(tool_output_data)}\n\n' yield f'data: {json.dumps(tool_output_data)}\n\n'
# This must be the final UI event for ask_user: the frontend appends
# the card below the now-settled tool node and cancels any between-
# round spinner. The turn ends after the current tool batch.
if _pending_ask_user_event:
yield (
f'data: {json.dumps({"type": "ask_user", "data": _pending_ask_user_event})}\n\n'
)
# Native document tools open in the editor + carry the REAL doc id. # Native document tools open in the editor + carry the REAL doc id.
# Emit a doc_update so the frontend opens/activates it and sends it # Emit a doc_update so the frontend opens/activates it and sends it
# back as active_doc_id next turn (otherwise the agent can't "see" # back as active_doc_id next turn (otherwise the agent can't "see"
@@ -3071,6 +3389,11 @@ async def stream_agent_loop(
# this the diff shows live but vanishes from saved history. # this the diff shows live but vanishes from saved history.
if result.get("diff"): if result.get("diff"):
tool_event["diff"] = result["diff"] tool_event["diff"] = result["diff"]
if _pending_ask_user_event:
# Persist the structured question with the tool event. On a
# reload, chatRenderer can restore the card; a later user
# message removes it as answered.
tool_event["ask_user"] = _pending_ask_user_event
tool_events.append(tool_event) tool_events.append(tool_event)
if block.tool_type in _VERIFIER_EFFECTFUL_TOOLS: if block.tool_type in _VERIFIER_EFFECTFUL_TOOLS:
_effectful_used = True _effectful_used = True
+13 -1
View File
@@ -174,8 +174,20 @@ async def subscribe(session_id: str) -> AsyncGenerator[str, None]:
next_seq += 1 next_seq += 1
if run.status != "running": if run.status != "running":
return return
heartbeat_idx = 0
while True: while True:
seq, ev = await q.get() try:
seq, ev = await asyncio.wait_for(q.get(), timeout=10.0)
except asyncio.TimeoutError:
# Keep slow local models/proxies alive while they prefill before
# the first token. SSE comments are ignored by the UI but reset
# browser/proxy idle timers, which prevents "empty response"
# disconnects on llama.cpp first-token latencies of 30s+.
if run.status == "running":
heartbeat_idx += 1
yield f": heartbeat {heartbeat_idx}\n\n"
continue
seq, ev = (None, None)
if seq is None: # end sentinel if seq is None: # end sentinel
while next_seq < len(run.buffer): # flush any tail the sentinel raced while next_seq < len(run.buffer): # flush any tail the sentinel raced
yield run.buffer[next_seq] yield run.buffer[next_seq]
+12 -1
View File
@@ -22,6 +22,9 @@ from .subprocess_tools import BashTool, PythonTool
from .web_tools import WebSearchTool, WebFetchTool from .web_tools import WebSearchTool, WebFetchTool
from .filesystem_tools import ReadFileTool, WriteFileTool, EditFileTool, LsTool, GlobTool, GrepTool, GetWorkspaceTool from .filesystem_tools import ReadFileTool, WriteFileTool, EditFileTool, LsTool, GlobTool, GrepTool, GetWorkspaceTool
from .document_tools import CreateDocumentTool, UpdateDocumentTool, EditDocumentTool, SuggestDocumentTool, ManageDocumentTool from .document_tools import CreateDocumentTool, UpdateDocumentTool, EditDocumentTool, SuggestDocumentTool, ManageDocumentTool
from .model_interaction_tools import ChatWithModelTool, AskTeacherTool, ListModelsTool
from .bg_job_tools import ManageBgJobsTool
from .session_tools import CreateSessionTool, ListSessionsTool, SendToSessionTool, ManageSessionTool
TOOL_HANDLERS = { TOOL_HANDLERS = {
"bash": BashTool().execute, "bash": BashTool().execute,
@@ -40,6 +43,14 @@ TOOL_HANDLERS = {
"suggest_document": SuggestDocumentTool().execute, "suggest_document": SuggestDocumentTool().execute,
"manage_documents": ManageDocumentTool().execute, "manage_documents": ManageDocumentTool().execute,
"get_workspace": GetWorkspaceTool().execute, "get_workspace": GetWorkspaceTool().execute,
"chat_with_model": ChatWithModelTool().execute,
"ask_teacher": AskTeacherTool().execute,
"list_models": ListModelsTool().execute,
"manage_bg_jobs": ManageBgJobsTool().execute,
"create_session": CreateSessionTool().execute,
"list_sessions": ListSessionsTool().execute,
"send_to_session": SendToSessionTool().execute,
"manage_session": ManageSessionTool().execute,
} }
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -52,7 +63,7 @@ PYTHON_TIMEOUT = 30
# Tool types that trigger execution # Tool types that trigger execution
TOOL_TAGS = {"bash", "python", "web_search", "web_fetch", "read_file", "write_file", "edit_file", TOOL_TAGS = {"bash", "python", "web_search", "web_fetch", "read_file", "write_file", "edit_file",
"grep", "glob", "ls", "get_workspace", "grep", "glob", "ls", "get_workspace", "manage_bg_jobs",
"create_document", "update_document", "edit_document", "create_document", "update_document", "edit_document",
"search_chats", "search_chats",
"chat_with_model", "create_session", "list_sessions", "chat_with_model", "create_session", "list_sessions",
+98
View File
@@ -0,0 +1,98 @@
"""Agent tool to inspect and control detached background `bash` jobs.
`bash` blocks prefixed with a `#!bg` marker run detached via `src.bg_jobs`; the
agent is auto-re-invoked with the output when they finish. This tool covers the
gaps in that flow: list the jobs in the current chat, read a still-running job's
output on demand, and kill a runaway job instead of waiting out its max-runtime.
Registry tool (`TOOL_HANDLERS["manage_bg_jobs"]`). Jobs are scoped to the chat
that launched them, so every action requires the caller's `session_id` and a job
from another session is treated as not found.
"""
import json
import time
from typing import Any, Dict, List
_LIST_ACTIONS = {"list", "ls", "jobs"}
_OUTPUT_ACTIONS = {"output", "get", "read", "tail", "status", "show"}
_KILL_ACTIONS = {"kill", "stop", "cancel", "terminate"}
def _age(rec: Dict[str, Any]) -> str:
start = rec.get("started_at")
if not start:
return "?"
secs = int(time.time() - start)
if secs < 60:
return f"{secs}s"
if secs < 3600:
return f"{secs // 60}m"
return f"{secs // 3600}h{(secs % 3600) // 60}m"
def _status_label(rec: Dict[str, Any]) -> str:
status = rec.get("status", "?")
if rec.get("killed"):
return "killed"
if rec.get("timed_out"):
return "timed out"
if rec.get("died"):
return "died"
if status in ("done", "failed"):
return f"{status} (exit {rec.get('exit_code')})"
return status
def _row(rec: Dict[str, Any]) -> str:
cmd = (rec.get("command") or "").strip().splitlines()[0][:80]
return f"[{rec.get('id')}] {_status_label(rec)} | {_age(rec)} | {cmd}"
class ManageBgJobsTool:
async def execute(self, content: str, ctx: dict) -> dict:
from src import bg_jobs
session_id = ctx.get("session_id")
raw = (content or "").strip()
try:
args = json.loads(raw) if raw else {}
except (ValueError, TypeError):
args = {}
if not isinstance(args, dict):
args = {}
action = str(args.get("action", "list")).strip().lower()
job_id = str(args.get("job_id") or args.get("id") or "").strip()
if not session_id:
return {"error": "manage_bg_jobs: no active chat session; background jobs are scoped to a chat.", "exit_code": 1}
if action in _LIST_ACTIONS:
jobs: List[Dict[str, Any]] = bg_jobs.list_for_session(session_id)
if not jobs:
return {"output": "No background jobs in this chat.", "exit_code": 0}
jobs.sort(key=lambda r: r.get("started_at") or 0, reverse=True)
lines = "\n".join(_row(r) for r in jobs)
return {"output": f"{len(jobs)} background job(s):\n{lines}", "exit_code": 0}
if action in _OUTPUT_ACTIONS or action in _KILL_ACTIONS:
if not job_id:
return {"error": f"manage_bg_jobs: action '{action}' requires a job_id (see action='list').", "exit_code": 1}
rec = bg_jobs.get(job_id)
# Scope: only the chat that launched a job may see or control it.
if rec is None or rec.get("session_id") != session_id:
return {"error": f"manage_bg_jobs: no background job '{job_id}' in this chat.", "exit_code": 1}
if action in _KILL_ACTIONS:
if rec.get("status") != "running":
return {"output": f"Job `{job_id}` already {_status_label(rec)}; nothing to kill.", "exit_code": 0}
killed = bg_jobs.kill(job_id)
return {"output": f"Killed background job `{job_id}` ({(killed or {}).get('command', '').splitlines()[0][:80]}).", "exit_code": 0}
out = rec.get("output") or "(no output yet)"
return {
"output": f"Job `{job_id}` [{_status_label(rec)}, {_age(rec)}]\nCommand: {rec.get('command')}\n\nOutput:\n{out}",
"exit_code": 0,
}
return {"error": f"manage_bg_jobs: unknown action '{action}'. Use list, output, or kill.", "exit_code": 1}
+54 -13
View File
@@ -1,6 +1,7 @@
import asyncio import asyncio
import json import json
import os import os
import re
import difflib import difflib
import fnmatch import fnmatch
import shutil import shutil
@@ -16,6 +17,31 @@ _CODENAV_SKIP_DIRS = frozenset({
_CODENAV_MAX_HITS = 200 _CODENAV_MAX_HITS = 200
_CODENAV_MAX_LINE = 400 _CODENAV_MAX_LINE = 400
def _glob_to_regex(pat: str) -> "re.Pattern":
"""Translate a forward-slash glob (**, *, ?) into a compiled regex.
`**/` matches zero or more complete directories.
`*` matches within a single path segment (does not cross /).
"""
i, n, out = 0, len(pat), []
while i < n:
if pat[i : i + 3] == "**/":
out.append("(?:[^/]+/)*")
i += 3
elif pat[i : i + 2] == "**":
out.append(".*")
i += 2
elif pat[i] == "*":
out.append("[^/]*")
i += 1
elif pat[i] == "?":
out.append("[^/]")
i += 1
else:
out.append(re.escape(pat[i]))
i += 1
return re.compile("".join(out))
def _unified_diff(old: str, new: str, path: str) -> Optional[Dict[str, Any]]: def _unified_diff(old: str, new: str, path: str) -> Optional[Dict[str, Any]]:
if old == new: if old == new:
return None return None
@@ -259,23 +285,38 @@ class GlobTool:
return {"error": f"glob: {e}", "exit_code": 1} return {"error": f"glob: {e}", "exit_code": 1}
def _glob(): def _glob():
from pathlib import Path base = os.path.abspath(root)
base = Path(root) if not os.path.isdir(base):
if not base.is_dir():
return None, f"glob: {root}: not a directory" return None, f"glob: {root}: not a directory"
norm_pat = pattern.replace("\\", "/")
# Fast path: literal pattern (no wildcards) → direct path lookup.
if not any(c in norm_pat for c in "*?["):
cand = os.path.normpath(os.path.join(base, norm_pat))
if os.path.exists(cand):
return [cand], None
# Literal not at exact path — fall through to walk so
# e.g. "foo.py" still matches at any depth (like rglob).
# Compile glob to regex: * stays within one segment, **/ spans dirs.
regex = _glob_to_regex(norm_pat)
matched = [] matched = []
cap = _CODENAV_MAX_HITS * 5
try: try:
for p in base.rglob(pattern): for dp, dns, fns in os.walk(base):
if set(p.relative_to(base).parts) & _CODENAV_SKIP_DIRS: # Prune skipped dirs before descending (unlike rglob which
continue # descends first then filters — fatal on large node_modules).
try: dns[:] = [d for d in dns if d not in _CODENAV_SKIP_DIRS]
mtime = p.stat().st_mtime for name in fns + dns:
except OSError: full = os.path.join(dp, name)
mtime = 0 rel = os.path.relpath(full, base).replace(os.sep, "/")
matched.append((mtime, str(p))) if regex.fullmatch(rel) or regex.fullmatch(name):
if len(matched) > _CODENAV_MAX_HITS * 5: try:
mtime = os.stat(full).st_mtime
except OSError:
mtime = 0
matched.append((mtime, full))
if len(matched) > cap:
break break
except (OSError, ValueError) as _e: except OSError as _e:
return None, f"glob: {_e}" return None, f"glob: {_e}"
matched.sort(key=lambda t: t[0], reverse=True) matched.sort(key=lambda t: t[0], reverse=True)
return [pth for _, pth in matched[:_CODENAV_MAX_HITS]], None return [pth for _, pth in matched[:_CODENAV_MAX_HITS]], None
+208
View File
@@ -0,0 +1,208 @@
"""model_interaction_tools.py - agent tools for talking to other models.
Owns the model-interaction tool implementations (chat_with_model, ask_teacher,
list_models) and their handler classes, registered in ``TOOL_HANDLERS``. Part
of the tool -> registry migration (#3629): the implementations were moved here
out of ``src.ai_interaction`` so dispatch flows through the registry instead of
the elif chain / dispatch_ai_tool in tool_execution.py.
Shared helpers that still live in ``src.ai_interaction`` and are used by tools
not yet migrated (``_resolve_model``, ``AI_CHAT_TIMEOUT``) are imported lazily
inside the functions to avoid an import cycle at module load.
"""
import logging
from typing import Dict, Optional
logger = logging.getLogger(__name__)
_TEACHER_SYSTEM_PROMPT = (
"You are a senior AI mentor. A less capable model is stuck on a problem and asking for help. "
"Provide clear, actionable guidance:\n"
"1. Brief analysis of the problem\n"
"2. Recommended approach (step by step)\n"
"3. Key things to watch out for\n\n"
"Be concise and practical. No preamble."
)
async def chat_with_model(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Send a message to a specific model and return its response.
Content format:
Line 1: model_name (or model_name@endpoint_name)
Line 2+: the message to send
"""
from src.ai_interaction import _resolve_model, AI_CHAT_TIMEOUT
from src.llm_core import llm_call_async
lines = content.strip().split("\n", 1)
if not lines or not lines[0].strip():
return {"error": "First line must be the model name"}
model_spec = lines[0].strip()
message = lines[1].strip() if len(lines) > 1 else ""
if not message:
return {"error": "No message provided (line 2+ is the message)"}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
try:
response = await llm_call_async(
url, model,
[{"role": "user", "content": message}],
headers=headers,
timeout=AI_CHAT_TIMEOUT,
)
# Truncate very long responses
if len(response) > 10000:
response = response[:10000] + "\n... (truncated)"
return {"model": model, "response": response}
except Exception as e:
logger.error(f"chat_with_model failed: {e}")
return {"error": f"Failed to get response from {model_spec}: {e}"}
async def ask_teacher(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Ask a more capable model for help.
Content format:
Line 1: model_name (or 'auto')
Line 2+: the problem description
"""
from src.ai_interaction import _resolve_model, AI_CHAT_TIMEOUT
from src.llm_core import llm_call_async
from src.settings import get_setting
lines = content.strip().split("\n", 1)
model_spec = lines[0].strip() if lines else "auto"
problem = lines[1].strip() if len(lines) > 1 else ""
if not problem:
return {"error": "No problem description provided"}
if model_spec.lower() in ("auto", ""):
model_spec = get_setting("teacher_model", "")
if not model_spec:
return {"error": "No teacher model configured. Specify a model name or set teacher_model in settings."}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
try:
response = await llm_call_async(
url, model,
[
{"role": "system", "content": _TEACHER_SYSTEM_PROMPT},
{"role": "user", "content": f"Problem:\n{problem}"},
],
headers=headers,
timeout=AI_CHAT_TIMEOUT,
)
if len(response) > 8000:
response = response[:8000] + "\n... (truncated)"
return {"model": model, "response": response, "teacher": True}
except Exception as e:
logger.error(f"ask_teacher failed: {e}")
return {"error": f"Teacher call failed ({model_spec}): {e}"}
async def list_models(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""List all available models across configured endpoints.
Content = optional filter keyword.
"""
import json
import httpx
from src.database import SessionLocal, ModelEndpoint
from src.llm_core import _detect_provider, ANTHROPIC_MODELS
from src.auth_helpers import owner_filter
from src.endpoint_resolver import resolve_endpoint_runtime, build_headers, build_models_url
keyword = content.strip().lower() if content.strip() else None
db = SessionLocal()
try:
query = db.query(ModelEndpoint).filter(ModelEndpoint.is_enabled == True)
if owner:
query = owner_filter(query, ModelEndpoint, owner)
endpoints = query.all()
if not endpoints:
return {"results": "No enabled model endpoints configured."}
result_lines = []
total_models = 0
for ep in endpoints:
try:
base, api_key = resolve_endpoint_runtime(ep, owner=owner)
except Exception:
continue
provider = _detect_provider(base)
headers = build_headers(api_key, base)
model_ids = []
if provider == "anthropic":
model_ids = list(ANTHROPIC_MODELS)
else:
try:
models_url = build_models_url(base)
if models_url:
r = httpx.get(models_url, headers=headers, timeout=5)
r.raise_for_status()
data = r.json()
model_ids = [m.get("id") for m in (data.get("data") or []) if m.get("id")]
if not model_ids:
model_ids = [
m.get("name") or m.get("model")
for m in (data.get("models") or [])
if m.get("name") or m.get("model")
]
else:
model_ids = json.loads(ep.cached_models or "[]")
except Exception:
model_ids = ["(endpoint offline)"]
if keyword:
model_ids = [m for m in model_ids if keyword in m.lower() or keyword in (ep.name or "").lower()]
if model_ids:
result_lines.append(f"\n**{ep.name or base}** ({provider}):")
for mid in model_ids:
result_lines.append(f" - `{mid}`")
total_models += 1
if not result_lines:
return {"results": "No models found" + (f" matching '{keyword}'" if keyword else "") + "."}
header = f"Available models ({total_models} total):"
return {"results": header + "\n".join(result_lines)}
except Exception as e:
logger.error(f"list_models failed: {e}")
return {"error": str(e)}
finally:
db.close()
# ---------------------------------------------------------------------------
# Handler classes registered in TOOL_HANDLERS
# ---------------------------------------------------------------------------
class ChatWithModelTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await chat_with_model(content, ctx.get("session_id"), owner=ctx.get("owner"))
class AskTeacherTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await ask_teacher(content, ctx.get("session_id"), owner=ctx.get("owner"))
class ListModelsTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await list_models(content, ctx.get("session_id"), owner=ctx.get("owner"))
+464
View File
@@ -0,0 +1,464 @@
"""session_tools.py - agent tools for AI-to-AI session management.
Owns create_session, list_sessions, send_to_session and manage_session, moved
out of src.ai_interaction as part of the tool -> registry migration (#3629), and
their handler classes registered in TOOL_HANDLERS.
The session manager is a runtime-set singleton in src.ai_interaction, so each
function fetches it via get_session_manager() (imported here); _resolve_model and
AI_CHAT_TIMEOUT are reused from there too.
"""
import json
import logging
import uuid
from typing import Dict, Optional
from src.ai_interaction import get_session_manager, _resolve_model, AI_CHAT_TIMEOUT
logger = logging.getLogger(__name__)
async def create_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Create a new chat session.
Content format:
Line 1: session name
Line 2: model_name (or model_name@endpoint_name)
"""
_session_manager = get_session_manager()
if not _session_manager:
return {"error": "Session manager not available"}
lines = content.strip().split("\n")
if len(lines) < 2:
return {"error": "Need 2 lines: session name, then model spec"}
name = lines[0].strip()
model_spec = lines[1].strip()
if not name:
return {"error": "Session name cannot be empty"}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
sid = str(uuid.uuid4())[:8]
try:
_session_manager.create_session(
session_id=sid,
name=name,
endpoint_url=url,
model=model,
rag=False,
owner=owner,
)
# Store headers on session for future calls
sess = _session_manager.get_session(sid)
if sess and headers:
sess.headers = headers
try:
from src.event_bus import fire_event
fire_event("session_created", owner)
except Exception:
logger.debug("session_created event dispatch failed", exc_info=True)
return {"session_id": sid, "name": name, "model": model, "endpoint_url": url}
except Exception as e:
logger.error(f"create_session failed: {e}")
return {"error": f"Failed to create session: {e}"}
async def list_sessions(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""List sessions sorted by most-recently-active first.
Output includes a relative "last active" timestamp per row so the
agent can answer "open my last chat" without guessing from titles.
The most-recent session is always first in the list.
Content = optional filter keyword (matches session name).
"""
_session_manager = get_session_manager()
if not _session_manager:
return {"error": "Session manager not available"}
keyword = content.strip().lower() if content.strip() else None
try:
from core.database import SessionLocal, Session as DbSession
from datetime import datetime, timezone
# Pull every session's last_accessed from the DB so we can sort
# by recency. In-memory sessions hold name + model + msg_count;
# the DB row holds the timestamps.
db = SessionLocal()
try:
db_rows = {r.id: r for r in db.query(DbSession).all()}
finally:
db.close()
# SECURITY: scope to the caller's sessions. Passing None returned
# every user's sessions, which the agent tool then exposed via the
# "list my chats" reply.
sessions = _session_manager.get_sessions_for_user(owner)
rows = []
for sid, sess in sessions.items():
if keyword and keyword not in (sess.name or "").lower():
continue
db_row = db_rows.get(sid)
# Prefer last_accessed; fall back to updated_at, then created_at.
ts = None
if db_row:
ts = getattr(db_row, 'last_accessed', None) or getattr(db_row, 'updated_at', None) or getattr(db_row, 'created_at', None)
rows.append((ts, sid, sess))
# Sort by timestamp DESC; rows without a timestamp sink to the bottom.
rows.sort(key=lambda r: r[0] or datetime.min, reverse=True)
def _rel(ts):
if not ts:
return 'never'
now = datetime.utcnow()
try:
if ts.tzinfo is not None:
now = datetime.now(timezone.utc)
diff = (now - ts).total_seconds()
except Exception:
return 'unknown'
if diff < 60: return 'just now'
if diff < 3600: return f'{int(diff / 60)}m ago'
if diff < 86400: return f'{int(diff / 3600)}h ago'
if diff < 86400 * 7: return f'{int(diff / 86400)}d ago'
return ts.strftime('%Y-%m-%d')
lines = []
for i, (ts, sid, sess) in enumerate(rows):
if i >= 50:
lines.append(f"... and {len(rows) - 50} more (showing first 50)")
break
safe_name = (sess.name or "Untitled").replace("[", "\\[").replace("]", "\\]")
msg_count = getattr(sess, "message_count", 0) or 0
model = getattr(sess, "model", "unknown")
marker = " ← most recent" if i == 0 else ""
lines.append(f"- **[{safe_name}](#session-{sid})** (id: `{sid}`, model: {model}, {msg_count} msgs, last active {_rel(ts)}){marker}")
if not lines:
return {"results": "No sessions found" + (f" matching '{keyword}'" if keyword else "") + "."}
return {
"results": (
f"Found {len(rows)} session(s), sorted most-recent first:\n"
+ "\n".join(lines)
+ "\n\nAssistant: when replying to the user, preserve the chat-title markdown links exactly as shown, e.g. `[Chat](#session-id)`. Do not rewrite this as a plain, non-clickable table."
)
}
except Exception as e:
logger.error(f"list_sessions failed: {e}")
return {"error": str(e)}
async def send_to_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Send a message to an existing session and get a response.
Content format:
Line 1: session_id
Line 2+: message
"""
_session_manager = get_session_manager()
from src.llm_core import llm_call_async
from core.models import ChatMessage
if not _session_manager:
return {"error": "Session manager not available"}
lines = content.strip().split("\n", 1)
if len(lines) < 2:
return {"error": "Need 2 lines: session_id, then message"}
target_sid = lines[0].strip()
message = lines[1].strip()
sess = _session_manager.get_session(target_sid)
if not sess:
return {"error": f"Session '{target_sid}' not found"}
# Owner-scope: reject access to another user's session
if owner and getattr(sess, "owner", None) and sess.owner != owner:
return {"error": f"Session '{target_sid}' not found"}
if not message:
return {"error": "No message provided"}
try:
# Build context from session history
context = sess.get_context_messages()
context.append({"role": "user", "content": message})
response = await llm_call_async(
sess.endpoint_url, sess.model, context,
headers=sess.headers,
timeout=AI_CHAT_TIMEOUT,
)
# Save both messages to session
sess.add_message(ChatMessage("user", message))
sess.add_message(ChatMessage("assistant", response))
# Truncate for tool output
if len(response) > 10000:
response = response[:10000] + "\n... (truncated)"
return {
"session_id": target_sid,
"session_name": sess.name,
"response": response,
}
except Exception as e:
logger.error(f"send_to_session failed: {e}")
return {"error": f"Failed to send to session: {e}"}
async def manage_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Manage sessions: rename, archive, delete, important, truncate, fork.
Content format:
Line 1: action (rename|archive|unarchive|delete|important|unimportant|truncate|fork)
Line 2: target session_id (or "current" to use the active session)
Line 3+: action-specific params (e.g. new name for rename, keep_count for truncate)
"""
_session_manager = get_session_manager()
if not _session_manager:
return {"error": "Session manager not available"}
from src.database import SessionLocal, Session as DbSession
# Accept BOTH the structured JSON args the tool schema advertises
# ({action, session_id, value}) AND the legacy line-based format
# (line1=action, line2=session_id, line3=value). Native function-calling
# models send JSON; fenced-block callers send lines. Previously only the
# line format was parsed, so a model that followed the schema (JSON) got
# "Need at least 2 lines" / "Rename needs line 3" and couldn't drive it.
_raw = (content or "").strip()
action = ""
target_sid = ""
value = None # the action param: new name (rename) / keep_count (truncate, fork)
_list_filter = ""
_parsed = None
if _raw.startswith("{"):
try:
_parsed = json.loads(_raw)
except Exception:
_parsed = None
if isinstance(_parsed, dict):
action = str(_parsed.get("action") or "").strip().lower()
target_sid = str(_parsed.get("session_id") or _parsed.get("session") or _parsed.get("id") or "").strip()
_v = _parsed.get("value")
if _v is None:
_v = (_parsed.get("name") or _parsed.get("new_name")
or _parsed.get("title") or _parsed.get("keep_count"))
value = None if _v is None else str(_v).strip()
_list_filter = str(_parsed.get("filter") or "").strip()
else:
lines = _raw.split("\n")
if not lines or not lines[0].strip():
return {"error": "Missing action (rename|archive|delete|important|truncate|fork|list|switch)"}
action = lines[0].strip().lower()
target_sid = lines[1].strip() if len(lines) >= 2 else ""
value = lines[2].strip() if len(lines) >= 3 else None
_list_filter = "\n".join(lines[1:]).strip()
if not action:
return {"error": "Missing action (rename|archive|delete|important|truncate|fork|list|switch)"}
# `list` alias - dispatch to list_sessions so the agent's natural
# first guess (every other manage_* tool has a `list` action) works.
if action == "list":
return await list_sessions(_list_filter, session_id, owner=owner)
if not target_sid:
return {"error": "Need a session_id (or 'current' for the active chat)"}
# Allow "current" to refer to the active session
if target_sid.lower() == "current" and session_id:
target_sid = session_id
# `switch` / `open` / `select` / `view` - the agent reaches for
# these when the user asks to "open" or "switch to" a session.
# There's no server-side way to make the browser navigate, so we
# just return a clickable anchor link the user can click. The
# frontend's chat-history click delegate routes `#session-<id>`
# to selectSession(). The agent's reply naturally embeds this
# result so the user sees a single clickable line.
def _session_query(db):
query = db.query(DbSession).filter(DbSession.id == target_sid)
if owner is not None:
query = query.filter(DbSession.owner == owner)
return query
if action in ("switch", "open", "select", "view"):
db = SessionLocal()
try:
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
name = db_sess.name or target_sid
finally:
db.close()
return {
"action": action,
"session_id": target_sid,
"name": name,
"results": f"[{name}](#session-{target_sid}) - click to open.",
}
db = SessionLocal()
try:
if action == "rename":
if not value:
return {"error": "rename needs a new name (the `value` arg, or line 3 in the legacy format)"}
new_name = value
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.name = new_name
db.commit()
_session_manager.update_session_name(target_sid, new_name)
return {"action": "rename", "session_id": target_sid, "name": new_name,
"results": f"Session renamed to '{new_name}'"}
elif action == "archive":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.archived = True
db.commit()
return {"action": "archive", "session_id": target_sid,
"results": f"Session '{db_sess.name}' archived"}
elif action == "unarchive":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.archived = False
db.commit()
return {"action": "unarchive", "session_id": target_sid,
"results": f"Session '{db_sess.name}' unarchived"}
elif action == "delete":
if target_sid == session_id:
return {"error": "Cannot delete the current session while chatting in it. Delete other sessions first."}
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Refusing to delete an unknown chat id; use the exact id from list_sessions."}
if db_sess and db_sess.is_important:
return {"error": f"Session '{db_sess.name}' is starred/favorited. Unstar it first before deleting."}
try:
ok = _session_manager.delete_session(target_sid)
if not ok:
return {"error": f"Session '{target_sid}' was not deleted because it no longer exists."}
return {"action": "delete", "session_id": target_sid,
"results": f"Session '{db_sess.name or target_sid}' deleted"}
except Exception as e:
return {"error": f"Failed to delete session: {e}"}
elif action in ("important", "unimportant"):
is_important = action == "important"
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
# Prevent AI from unstarring sessions - only the user can do that manually
if not is_important and db_sess.is_important:
return {"error": f"Session '{db_sess.name}' is starred by the user. Only the user can unstar sessions manually."}
db_sess.is_important = is_important
db.commit()
status = "marked as important" if is_important else "unmarked as important"
return {"action": action, "session_id": target_sid,
"results": f"Session '{db_sess.name}' {status}"}
elif action == "truncate":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
keep_count = 10
if value:
try:
keep_count = int(value)
except ValueError:
pass
success = _session_manager.truncate_messages(target_sid, keep_count)
if success:
return {"action": "truncate", "session_id": target_sid,
"results": f"Session truncated to last {keep_count} messages"}
return {"error": f"Failed to truncate session '{target_sid}'"}
elif action == "fork":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
keep_count = 0 # 0 = all messages
if value:
try:
keep_count = int(value)
except ValueError:
pass
source = _session_manager.get_session(target_sid)
if not source:
return {"error": f"Session '{target_sid}' not found"}
new_sid = str(uuid.uuid4())[:8]
_session_manager.create_session(
session_id=new_sid,
name=f"Fork: {source.name}",
endpoint_url=source.endpoint_url,
model=source.model,
rag=False,
owner=owner,
)
# Copy messages
history = source.get_context_messages()
if keep_count > 0:
history = history[:keep_count]
from core.models import ChatMessage as InMemoryMsg
new_sess = _session_manager.get_session(new_sid)
for msg in history:
new_sess.add_message(InMemoryMsg(msg["role"], msg["content"]))
try:
from src.event_bus import fire_event
fire_event("session_created", owner)
except Exception:
logger.debug("session_created event dispatch failed", exc_info=True)
return {"action": "fork", "session_id": new_sid,
"source_session": target_sid, "messages_copied": len(history),
"results": f"Forked session '{source.name}' -> new session {new_sid} ({len(history)} messages)"}
else:
return {"error": f"Unknown action '{action}'. Use: list, switch, rename, archive, unarchive, delete, important, unimportant, truncate, fork"}
except Exception as e:
logger.error(f"manage_session failed: {e}")
return {"error": str(e)}
finally:
db.close()
# ---------------------------------------------------------------------------
# Handler classes registered in TOOL_HANDLERS
# ---------------------------------------------------------------------------
class CreateSessionTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await create_session(content, ctx.get("session_id"), owner=ctx.get("owner"))
class ListSessionsTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await list_sessions(content, ctx.get("session_id"), owner=ctx.get("owner"))
class SendToSessionTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await send_to_session(content, ctx.get("session_id"), owner=ctx.get("owner"))
class ManageSessionTool:
async def execute(self, content: str, ctx: dict) -> Dict:
return await manage_session(content, ctx.get("session_id"), owner=ctx.get("owner"))
+65 -13
View File
@@ -7,6 +7,7 @@ from src.constants import MAX_OUTPUT_CHARS
class WebSearchTool: class WebSearchTool:
async def execute(self, content: str, ctx: dict) -> dict: async def execute(self, content: str, ctx: dict) -> dict:
from src.search import comprehensive_web_search from src.search import comprehensive_web_search
progress_cb = ctx.get("progress_cb") if isinstance(ctx, dict) else None
raw = content.strip() raw = content.strip()
query = raw query = raw
time_filter = None time_filter = None
@@ -37,18 +38,39 @@ class WebSearchTool:
elif " news" in q_lc or q_lc.startswith("news ") or q_lc.endswith(" news"): elif " news" in q_lc or q_lc.startswith("news ") or q_lc.endswith(" news"):
time_filter = "week" time_filter = "week"
loop = asyncio.get_running_loop() loop = asyncio.get_running_loop()
text, sources = await asyncio.wait_for( if progress_cb:
loop.run_in_executor( await progress_cb({
None, "elapsed_s": 0,
lambda: comprehensive_web_search( "tail": f"Searching web for: {query[:160]}",
query, })
max_pages=max_pages, try:
time_filter=time_filter, text, sources = await asyncio.wait_for(
return_sources=True, loop.run_in_executor(
None,
lambda: comprehensive_web_search(
query,
max_pages=max_pages,
time_filter=time_filter,
return_sources=True,
),
), ),
), timeout=30,
timeout=30, )
) except asyncio.TimeoutError:
return {
"error": f"web_search timed out after 30s: {query[:200]}",
"exit_code": 1,
}
except Exception as e:
return {
"error": f"web_search failed: {type(e).__name__}: {str(e) or 'no details'}",
"exit_code": 1,
}
if progress_cb:
await progress_cb({
"elapsed_s": 30,
"tail": "Search completed; preparing sources.",
})
output = text[:MAX_OUTPUT_CHARS] if len(text) > MAX_OUTPUT_CHARS else text output = text[:MAX_OUTPUT_CHARS] if len(text) > MAX_OUTPUT_CHARS else text
if sources: if sources:
output += "\n\n<!-- SOURCES:" + json.dumps(sources) + " -->" output += "\n\n<!-- SOURCES:" + json.dumps(sources) + " -->"
@@ -57,13 +79,23 @@ class WebSearchTool:
class WebFetchTool: class WebFetchTool:
async def execute(self, content: str, ctx: dict) -> dict: async def execute(self, content: str, ctx: dict) -> dict:
from src.search.content import fetch_webpage_content from src.search.content import fetch_webpage_content
from src.constants import WEB_FETCH_HARD_MAX_BYTES
raw = content.strip() raw = content.strip()
url = "" url = ""
max_bytes = None
if raw.startswith("{"): if raw.startswith("{"):
try: try:
parsed = json.loads(raw) parsed = json.loads(raw)
if isinstance(parsed, dict): if isinstance(parsed, dict):
url = str(parsed.get("url") or "").strip() url = str(parsed.get("url") or "").strip()
# Download-budget override (#3812): "full": true raises the
# budget to the hard cap; an explicit max_bytes is clamped
# to the hard cap downstream. Default stays the soft cap.
if parsed.get("full") is True:
max_bytes = WEB_FETCH_HARD_MAX_BYTES
mb = parsed.get("max_bytes")
if isinstance(mb, int) and mb > 0:
max_bytes = mb
except json.JSONDecodeError: except json.JSONDecodeError:
url = "" url = ""
if not url: if not url:
@@ -78,7 +110,7 @@ class WebFetchTool:
loop = asyncio.get_running_loop() loop = asyncio.get_running_loop()
try: try:
result = await asyncio.wait_for( result = await asyncio.wait_for(
loop.run_in_executor(None, lambda: fetch_webpage_content(url, timeout=10)), loop.run_in_executor(None, lambda: fetch_webpage_content(url, timeout=10, max_bytes=max_bytes)),
timeout=30, timeout=30,
) )
except asyncio.TimeoutError: except asyncio.TimeoutError:
@@ -94,8 +126,28 @@ class WebFetchTool:
return {"error": f"web_fetch: {url}: {err}", "exit_code": 1} return {"error": f"web_fetch: {url}: {err}", "exit_code": 1}
return {"error": f"web_fetch: {url}: no readable text content (not HTML, or the page needs JS/login)", "exit_code": 1} return {"error": f"web_fetch: {url}: no readable text content (not HTML, or the page needs JS/login)", "exit_code": 1}
# Tell the model when the download budget cut the body short and how
# to get the rest, instead of silently presenting a partial page as
# the whole thing.
size_note = ""
if result.get("truncated"):
fetched = result.get("fetched_bytes") or 0
total = result.get("total_bytes")
total_txt = f" of {total:,} bytes" if total else ""
size_note = (
f"[partial content: download stopped at {fetched:,} bytes{total_txt}. "
f'Re-call with {{"url": "{url}", "full": true}} to fetch up to '
f"{WEB_FETCH_HARD_MAX_BYTES:,} bytes.]\n\n"
)
# The notice must lead the output so the MAX_OUTPUT_CHARS trim below can
# never drop it. The title is untrusted, uncapped page content, so a
# giant title ahead of the notice could push it out of range; keep the
# notice first and cap the title as a second guard.
if len(title) > 300:
title = title[:300] + "..."
header = (f"# {title}\n" if title else "") + f"Source: {url}\n\n" header = (f"# {title}\n" if title else "") + f"Source: {url}\n\n"
output = header + text output = size_note + header + text
if len(output) > MAX_OUTPUT_CHARS: if len(output) > MAX_OUTPUT_CHARS:
output = output[:MAX_OUTPUT_CHARS] + "\n\n[...truncated]" output = output[:MAX_OUTPUT_CHARS] + "\n\n[...truncated]"
return {"output": output, "exit_code": 0} return {"output": output, "exit_code": 0}
+21 -775
View File
@@ -1,8 +1,14 @@
""" """
ai_interaction.py ai_interaction.py
AI-to-AI interaction tools: chat_with_model, create_session, list_sessions, AI-to-AI interaction tools: pipeline and manage_memory, plus shared model
send_to_session, pipeline. resolution (_resolve_model), the session-manager singleton, and dispatch_ai_tool.
As part of the tool -> registry migration (#3629), chat_with_model, ask_teacher
and list_models moved to src/agent_tools/model_interaction_tools.py, and
create_session, list_sessions, send_to_session and manage_session moved to
src/agent_tools/session_tools.py. Those modules reuse get_session_manager /
_resolve_model / AI_CHAT_TIMEOUT from here.
These are agent tools the LLM writes fenced code blocks and they execute These are agent tools the LLM writes fenced code blocks and they execute
through the standard agent_tools.py pipeline. through the standard agent_tools.py pipeline.
@@ -159,440 +165,6 @@ def _resolve_model(spec: str, owner: Optional[str] = None) -> Tuple[str, str, Di
# Tool implementations # Tool implementations
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
async def do_chat_with_model(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Send a message to a specific model and return its response.
Content format:
Line 1: model_name (or model_name@endpoint_name)
Line 2+: the message to send
"""
from src.llm_core import llm_call_async
lines = content.strip().split("\n", 1)
if not lines or not lines[0].strip():
return {"error": "First line must be the model name"}
model_spec = lines[0].strip()
message = lines[1].strip() if len(lines) > 1 else ""
if not message:
return {"error": "No message provided (line 2+ is the message)"}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
try:
response = await llm_call_async(
url, model,
[{"role": "user", "content": message}],
headers=headers,
timeout=AI_CHAT_TIMEOUT,
)
# Truncate very long responses
if len(response) > 10000:
response = response[:10000] + "\n... (truncated)"
return {"model": model, "response": response}
except Exception as e:
logger.error(f"chat_with_model failed: {e}")
return {"error": f"Failed to get response from {model_spec}: {e}"}
_TEACHER_SYSTEM_PROMPT = (
"You are a senior AI mentor. A less capable model is stuck on a problem and asking for help. "
"Provide clear, actionable guidance:\n"
"1. Brief analysis of the problem\n"
"2. Recommended approach (step by step)\n"
"3. Key things to watch out for\n\n"
"Be concise and practical. No preamble."
)
async def do_ask_teacher(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Ask a more capable model for help.
Content format:
Line 1: model_name (or 'auto')
Line 2+: the problem description
"""
from src.llm_core import llm_call_async
from src.settings import get_setting
lines = content.strip().split("\n", 1)
model_spec = lines[0].strip() if lines else "auto"
problem = lines[1].strip() if len(lines) > 1 else ""
if not problem:
return {"error": "No problem description provided"}
if model_spec.lower() in ("auto", ""):
model_spec = get_setting("teacher_model", "")
if not model_spec:
return {"error": "No teacher model configured. Specify a model name or set teacher_model in settings."}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
try:
response = await llm_call_async(
url, model,
[
{"role": "system", "content": _TEACHER_SYSTEM_PROMPT},
{"role": "user", "content": f"Problem:\n{problem}"},
],
headers=headers,
timeout=AI_CHAT_TIMEOUT,
)
if len(response) > 8000:
response = response[:8000] + "\n... (truncated)"
return {"model": model, "response": response, "teacher": True}
except Exception as e:
logger.error(f"ask_teacher failed: {e}")
return {"error": f"Teacher call failed ({model_spec}): {e}"}
async def do_second_opinion(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Get a second opinion from another model, then have the original model
evaluate the feedback and produce a unified version.
Content format:
Line 1: model_name (or model_name@endpoint_name)
Line 2+ (optional): specific question or focus area
Flow:
1. Pull recent conversation context
2. Send to reviewer model get honest feedback
3. Send feedback back to the session's own model → evaluate & unify
4. Return both the review and the unified response
"""
from src.llm_core import llm_call_async
lines = content.strip().split("\n", 1)
if not lines or not lines[0].strip():
return {"error": "First line must be the model name"}
model_spec = lines[0].strip()
focus = lines[1].strip() if len(lines) > 1 else ""
try:
reviewer_url, reviewer_model, reviewer_headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
# Pull recent conversation context from current session
context_text = ""
sess = None
if session_id and _session_manager:
sess = _session_manager.get_session(session_id)
if sess:
messages = sess.get_context_messages()
recent = messages[-15:] if len(messages) > 15 else messages
parts = []
for m in recent:
role = m.get("role", "unknown").upper()
text = m.get("content", "")
if isinstance(text, list):
text = " ".join(
p.get("text", "") for p in text if isinstance(p, dict)
)
if text:
parts.append(f"[{role}]: {text[:2000]}")
context_text = "\n\n".join(parts)
if not context_text:
return {"error": "No conversation context found to review"}
# ── Step 1: Get the reviewer's feedback ──
reviewer_system = (
"You are giving a second opinion on a conversation between a user and an AI assistant. "
"Your job is to be genuinely helpful and honest — not a yes-man, but not a contrarian either.\n\n"
"Guidelines:\n"
"- If the plan/idea is solid, say so clearly. Don't manufacture problems that aren't there.\n"
"- If you spot a real flaw, blind spot, or simpler approach — call it out directly.\n"
"- Be practical. Don't over-engineer or over-analyze. Real-world tradeoffs matter.\n"
"- If there's a meaningfully better way to do something, suggest it concretely.\n"
"- Give credit where it's due — highlight what's working well.\n"
"- Keep it concise and actionable. No fluff.\n"
"- You're a second pair of eyes, not a professor grading a paper."
)
reviewer_message = f"Here's the conversation so far:\n\n{context_text}"
if focus:
reviewer_message += f"\n\n---\nSpecifically, I want your take on: {focus}"
else:
reviewer_message += "\n\n---\nGive me your honest second opinion on what's being discussed."
try:
review = await llm_call_async(
reviewer_url, reviewer_model,
[
{"role": "system", "content": reviewer_system},
{"role": "user", "content": reviewer_message},
],
headers=reviewer_headers,
timeout=AI_CHAT_TIMEOUT,
)
if len(review) > 8000:
review = review[:8000] + "\n... (truncated)"
except Exception as e:
logger.error(f"second_opinion reviewer call failed: {e}")
return {"error": f"Failed to get second opinion from {model_spec}: {e}"}
# ── Step 2: Send review back to session's own model for evaluation ──
unified = ""
original_model = "unknown"
if sess:
original_url = sess.endpoint_url
original_model = sess.model
original_headers = getattr(sess, "headers", None) or {}
unify_system = (
"Another AI model just reviewed the conversation you've been having with the user. "
"Read their feedback carefully, then respond with:\n\n"
"1. **What you agree with** — acknowledge valid points honestly.\n"
"2. **What you disagree with** — explain why, briefly.\n"
"3. **Unified version** — produce an updated/refined version of whatever was being discussed, "
"incorporating the feedback you found valid. Don't accept every note blindly — "
"use your judgment on what actually improves things vs what's unnecessary.\n\n"
"Be concise and practical. The user wants a better result, not a meta-discussion."
)
unify_message = (
f"Here's the conversation context:\n\n{context_text}\n\n"
f"---\n\n"
f"**Review from {reviewer_model}:**\n\n{review}\n\n"
f"---\n\n"
f"Evaluate this feedback and produce a unified improved version."
)
try:
unified = await llm_call_async(
original_url, original_model,
[
{"role": "system", "content": unify_system},
{"role": "user", "content": unify_message},
],
headers=original_headers,
timeout=AI_CHAT_TIMEOUT,
)
if len(unified) > 10000:
unified = unified[:10000] + "\n... (truncated)"
except Exception as e:
logger.error(f"second_opinion unify call failed: {e}")
unified = f"(Failed to get unified response: {e})"
# Build combined result
combined = (
f"## Second Opinion from {reviewer_model}\n\n{review}"
f"\n\n---\n\n"
f"## {original_model}'s Response\n\n{unified}"
)
return {
"model": reviewer_model,
"response": combined,
"instruction": "Present these results to the user exactly as they are. Do NOT call second_opinion again. The user can continue the conversation from here.",
}
async def do_create_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Create a new chat session.
Content format:
Line 1: session name
Line 2: model_name (or model_name@endpoint_name)
"""
if not _session_manager:
return {"error": "Session manager not available"}
lines = content.strip().split("\n")
if len(lines) < 2:
return {"error": "Need 2 lines: session name, then model spec"}
name = lines[0].strip()
model_spec = lines[1].strip()
if not name:
return {"error": "Session name cannot be empty"}
try:
url, model, headers = _resolve_model(model_spec, owner=owner)
except ValueError as e:
return {"error": str(e)}
sid = str(uuid.uuid4())[:8]
try:
_session_manager.create_session(
session_id=sid,
name=name,
endpoint_url=url,
model=model,
rag=False,
owner=owner,
)
# Store headers on session for future calls
sess = _session_manager.get_session(sid)
if sess and headers:
sess.headers = headers
try:
from src.event_bus import fire_event
fire_event("session_created", owner)
except Exception:
logger.debug("session_created event dispatch failed", exc_info=True)
return {"session_id": sid, "name": name, "model": model, "endpoint_url": url}
except Exception as e:
logger.error(f"create_session failed: {e}")
return {"error": f"Failed to create session: {e}"}
async def do_list_sessions(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""List sessions sorted by most-recently-active first.
Output includes a relative "last active" timestamp per row so the
agent can answer "open my last chat" without guessing from titles.
The most-recent session is always first in the list.
Content = optional filter keyword (matches session name).
"""
if not _session_manager:
return {"error": "Session manager not available"}
keyword = content.strip().lower() if content.strip() else None
try:
from core.database import SessionLocal, Session as DbSession
from datetime import datetime, timezone
# Pull every session's last_accessed from the DB so we can sort
# by recency. In-memory sessions hold name + model + msg_count;
# the DB row holds the timestamps.
db = SessionLocal()
try:
db_rows = {r.id: r for r in db.query(DbSession).all()}
finally:
db.close()
# SECURITY: scope to the caller's sessions. Passing None returned
# every user's sessions, which the agent tool then exposed via the
# "list my chats" reply.
sessions = _session_manager.get_sessions_for_user(owner)
rows = []
for sid, sess in sessions.items():
if keyword and keyword not in (sess.name or "").lower():
continue
db_row = db_rows.get(sid)
# Prefer last_accessed; fall back to updated_at, then created_at.
ts = None
if db_row:
ts = getattr(db_row, 'last_accessed', None) or getattr(db_row, 'updated_at', None) or getattr(db_row, 'created_at', None)
rows.append((ts, sid, sess))
# Sort by timestamp DESC; rows without a timestamp sink to the bottom.
rows.sort(key=lambda r: r[0] or datetime.min, reverse=True)
def _rel(ts):
if not ts:
return 'never'
now = datetime.utcnow()
try:
if ts.tzinfo is not None:
now = datetime.now(timezone.utc)
diff = (now - ts).total_seconds()
except Exception:
return 'unknown'
if diff < 60: return 'just now'
if diff < 3600: return f'{int(diff / 60)}m ago'
if diff < 86400: return f'{int(diff / 3600)}h ago'
if diff < 86400 * 7: return f'{int(diff / 86400)}d ago'
return ts.strftime('%Y-%m-%d')
lines = []
for i, (ts, sid, sess) in enumerate(rows):
if i >= 50:
lines.append(f"... and {len(rows) - 50} more (showing first 50)")
break
safe_name = (sess.name or "Untitled").replace("[", "\\[").replace("]", "\\]")
msg_count = getattr(sess, "message_count", 0) or 0
model = getattr(sess, "model", "unknown")
marker = " ← most recent" if i == 0 else ""
lines.append(f"- **[{safe_name}](#session-{sid})** (id: `{sid}`, model: {model}, {msg_count} msgs, last active {_rel(ts)}){marker}")
if not lines:
return {"results": "No sessions found" + (f" matching '{keyword}'" if keyword else "") + "."}
return {
"results": (
f"Found {len(rows)} session(s), sorted most-recent first:\n"
+ "\n".join(lines)
+ "\n\nAssistant: when replying to the user, preserve the chat-title markdown links exactly as shown, e.g. `[Chat](#session-id)`. Do not rewrite this as a plain, non-clickable table."
)
}
except Exception as e:
logger.error(f"list_sessions failed: {e}")
return {"error": str(e)}
async def do_send_to_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Send a message to an existing session and get a response.
Content format:
Line 1: session_id
Line 2+: message
"""
from src.llm_core import llm_call_async
from core.models import ChatMessage
if not _session_manager:
return {"error": "Session manager not available"}
lines = content.strip().split("\n", 1)
if len(lines) < 2:
return {"error": "Need 2 lines: session_id, then message"}
target_sid = lines[0].strip()
message = lines[1].strip()
sess = _session_manager.get_session(target_sid)
if not sess:
return {"error": f"Session '{target_sid}' not found"}
# Owner-scope: reject access to another user's session
if owner and getattr(sess, "owner", None) and sess.owner != owner:
return {"error": f"Session '{target_sid}' not found"}
if not message:
return {"error": "No message provided"}
try:
# Build context from session history
context = sess.get_context_messages()
context.append({"role": "user", "content": message})
response = await llm_call_async(
sess.endpoint_url, sess.model, context,
headers=sess.headers,
timeout=AI_CHAT_TIMEOUT,
)
# Save both messages to session
sess.add_message(ChatMessage("user", message))
sess.add_message(ChatMessage("assistant", response))
# Truncate for tool output
if len(response) > 10000:
response = response[:10000] + "\n... (truncated)"
return {
"session_id": target_sid,
"session_name": sess.name,
"response": response,
}
except Exception as e:
logger.error(f"send_to_session failed: {e}")
return {"error": f"Failed to send to session: {e}"}
async def stream_ai_tool(tool: str, content: str, session_id: Optional[str] = None, owner: Optional[str] = None): async def stream_ai_tool(tool: str, content: str, session_id: Optional[str] = None, owner: Optional[str] = None):
@@ -715,229 +287,6 @@ async def do_pipeline(content: str, session_id: Optional[str] = None, owner: Opt
# Session management tool # Session management tool
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
async def do_manage_session(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""Manage sessions: rename, archive, delete, important, truncate, fork.
Content format:
Line 1: action (rename|archive|unarchive|delete|important|unimportant|truncate|fork)
Line 2: target session_id (or "current" to use the active session)
Line 3+: action-specific params (e.g. new name for rename, keep_count for truncate)
"""
if not _session_manager:
return {"error": "Session manager not available"}
from src.database import SessionLocal, Session as DbSession
# Accept BOTH the structured JSON args the tool schema advertises
# ({action, session_id, value}) AND the legacy line-based format
# (line1=action, line2=session_id, line3=value). Native function-calling
# models send JSON; fenced-block callers send lines. Previously only the
# line format was parsed, so a model that followed the schema (JSON) got
# "Need at least 2 lines" / "Rename needs line 3" and couldn't drive it.
_raw = (content or "").strip()
action = ""
target_sid = ""
value = None # the action param: new name (rename) / keep_count (truncate, fork)
_list_filter = ""
_parsed = None
if _raw.startswith("{"):
try:
_parsed = json.loads(_raw)
except Exception:
_parsed = None
if isinstance(_parsed, dict):
action = str(_parsed.get("action") or "").strip().lower()
target_sid = str(_parsed.get("session_id") or _parsed.get("session") or _parsed.get("id") or "").strip()
_v = _parsed.get("value")
if _v is None:
_v = (_parsed.get("name") or _parsed.get("new_name")
or _parsed.get("title") or _parsed.get("keep_count"))
value = None if _v is None else str(_v).strip()
_list_filter = str(_parsed.get("filter") or "").strip()
else:
lines = _raw.split("\n")
if not lines or not lines[0].strip():
return {"error": "Missing action (rename|archive|delete|important|truncate|fork|list|switch)"}
action = lines[0].strip().lower()
target_sid = lines[1].strip() if len(lines) >= 2 else ""
value = lines[2].strip() if len(lines) >= 3 else None
_list_filter = "\n".join(lines[1:]).strip()
if not action:
return {"error": "Missing action (rename|archive|delete|important|truncate|fork|list|switch)"}
# `list` alias — dispatch to do_list_sessions so the agent's natural
# first guess (every other manage_* tool has a `list` action) works.
if action == "list":
return await do_list_sessions(_list_filter, session_id, owner=owner)
if not target_sid:
return {"error": "Need a session_id (or 'current' for the active chat)"}
# Allow "current" to refer to the active session
if target_sid.lower() == "current" and session_id:
target_sid = session_id
# `switch` / `open` / `select` / `view` — the agent reaches for
# these when the user asks to "open" or "switch to" a session.
# There's no server-side way to make the browser navigate, so we
# just return a clickable anchor link the user can click. The
# frontend's chat-history click delegate routes `#session-<id>`
# to selectSession(). The agent's reply naturally embeds this
# result so the user sees a single clickable line.
def _session_query(db):
query = db.query(DbSession).filter(DbSession.id == target_sid)
if owner is not None:
query = query.filter(DbSession.owner == owner)
return query
if action in ("switch", "open", "select", "view"):
db = SessionLocal()
try:
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
name = db_sess.name or target_sid
finally:
db.close()
return {
"action": action,
"session_id": target_sid,
"name": name,
"results": f"[{name}](#session-{target_sid}) — click to open.",
}
db = SessionLocal()
try:
if action == "rename":
if not value:
return {"error": "rename needs a new name (the `value` arg, or line 3 in the legacy format)"}
new_name = value
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.name = new_name
db.commit()
_session_manager.update_session_name(target_sid, new_name)
return {"action": "rename", "session_id": target_sid, "name": new_name,
"results": f"Session renamed to '{new_name}'"}
elif action == "archive":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.archived = True
db.commit()
return {"action": "archive", "session_id": target_sid,
"results": f"Session '{db_sess.name}' archived"}
elif action == "unarchive":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
db_sess.archived = False
db.commit()
return {"action": "unarchive", "session_id": target_sid,
"results": f"Session '{db_sess.name}' unarchived"}
elif action == "delete":
if target_sid == session_id:
return {"error": "Cannot delete the current session while chatting in it. Delete other sessions first."}
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Refusing to delete an unknown chat id; use the exact id from list_sessions."}
if db_sess and db_sess.is_important:
return {"error": f"Session '{db_sess.name}' is starred/favorited. Unstar it first before deleting."}
try:
ok = _session_manager.delete_session(target_sid)
if not ok:
return {"error": f"Session '{target_sid}' was not deleted because it no longer exists."}
return {"action": "delete", "session_id": target_sid,
"results": f"Session '{db_sess.name or target_sid}' deleted"}
except Exception as e:
return {"error": f"Failed to delete session: {e}"}
elif action in ("important", "unimportant"):
is_important = action == "important"
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
# Prevent AI from unstarring sessions — only the user can do that manually
if not is_important and db_sess.is_important:
return {"error": f"Session '{db_sess.name}' is starred by the user. Only the user can unstar sessions manually."}
db_sess.is_important = is_important
db.commit()
status = "marked as important" if is_important else "unmarked as important"
return {"action": action, "session_id": target_sid,
"results": f"Session '{db_sess.name}' {status}"}
elif action == "truncate":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
keep_count = 10
if value:
try:
keep_count = int(value)
except ValueError:
pass
success = _session_manager.truncate_messages(target_sid, keep_count)
if success:
return {"action": "truncate", "session_id": target_sid,
"results": f"Session truncated to last {keep_count} messages"}
return {"error": f"Failed to truncate session '{target_sid}'"}
elif action == "fork":
db_sess = _session_query(db).first()
if not db_sess:
return {"error": f"Session '{target_sid}' not found. Use list_sessions and pass the exact id it returned."}
keep_count = 0 # 0 = all messages
if value:
try:
keep_count = int(value)
except ValueError:
pass
source = _session_manager.get_session(target_sid)
if not source:
return {"error": f"Session '{target_sid}' not found"}
new_sid = str(uuid.uuid4())[:8]
_session_manager.create_session(
session_id=new_sid,
name=f"Fork: {source.name}",
endpoint_url=source.endpoint_url,
model=source.model,
rag=False,
owner=owner,
)
# Copy messages
history = source.get_context_messages()
if keep_count > 0:
history = history[:keep_count]
from core.models import ChatMessage as InMemoryMsg
new_sess = _session_manager.get_session(new_sid)
for msg in history:
new_sess.add_message(InMemoryMsg(msg["role"], msg["content"]))
try:
from src.event_bus import fire_event
fire_event("session_created", owner)
except Exception:
logger.debug("session_created event dispatch failed", exc_info=True)
return {"action": "fork", "session_id": new_sid,
"source_session": target_sid, "messages_copied": len(history),
"results": f"Forked session '{source.name}' -> new session {new_sid} ({len(history)} messages)"}
else:
return {"error": f"Unknown action '{action}'. Use: list, switch, rename, archive, unarchive, delete, important, unimportant, truncate, fork"}
except Exception as e:
logger.error(f"manage_session failed: {e}")
return {"error": str(e)}
finally:
db.close()
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
# Memory management tool # Memory management tool
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -1104,83 +453,6 @@ async def do_manage_memory(content: str, session_id: Optional[str] = None, owner
return {"error": f"Unknown action '{action}'. Use: list, add, edit, delete, search"} return {"error": f"Unknown action '{action}'. Use: list, add, edit, delete, search"}
# ---------------------------------------------------------------------------
# List models tool
# ---------------------------------------------------------------------------
async def do_list_models(content: str, session_id: Optional[str] = None, owner: Optional[str] = None) -> Dict:
"""List all available models across configured endpoints.
Content = optional filter keyword.
"""
import httpx
from src.database import SessionLocal, ModelEndpoint
from src.llm_core import _detect_provider, ANTHROPIC_MODELS
from src.auth_helpers import owner_filter
keyword = content.strip().lower() if content.strip() else None
db = SessionLocal()
try:
query = db.query(ModelEndpoint).filter(ModelEndpoint.is_enabled == True)
if owner:
query = owner_filter(query, ModelEndpoint, owner)
endpoints = query.all()
if not endpoints:
return {"results": "No enabled model endpoints configured."}
result_lines = []
total_models = 0
for ep in endpoints:
try:
base, api_key = resolve_endpoint_runtime(ep, owner=owner)
except Exception:
continue
provider = _detect_provider(base)
headers = build_headers(api_key, base)
model_ids = []
if provider == "anthropic":
model_ids = list(ANTHROPIC_MODELS)
else:
try:
models_url = build_models_url(base)
if models_url:
r = httpx.get(models_url, headers=headers, timeout=5)
r.raise_for_status()
data = r.json()
model_ids = [m.get("id") for m in (data.get("data") or []) if m.get("id")]
if not model_ids:
model_ids = [
m.get("name") or m.get("model")
for m in (data.get("models") or [])
if m.get("name") or m.get("model")
]
else:
model_ids = json.loads(ep.cached_models or "[]")
except Exception:
model_ids = ["(endpoint offline)"]
if keyword:
model_ids = [m for m in model_ids if keyword in m.lower() or keyword in (ep.name or "").lower()]
if model_ids:
result_lines.append(f"\n**{ep.name or base}** ({provider}):")
for mid in model_ids:
result_lines.append(f" - `{mid}`")
total_models += 1
if not result_lines:
return {"results": "No models found" + (f" matching '{keyword}'" if keyword else "") + "."}
header = f"Available models ({total_models} total):"
return {"results": header + "\n".join(result_lines)}
except Exception as e:
logger.error(f"list_models failed: {e}")
return {"error": str(e)}
finally:
db.close()
# --------------------------------------------------------------------------- # ---------------------------------------------------------------------------
@@ -1613,7 +885,9 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne
""" """
import base64 import base64
import httpx import httpx
import os
from pathlib import Path from pathlib import Path
from src.url_safety import check_outbound_url
lines = content.strip().split("\n") lines = content.strip().split("\n")
prompt = lines[0].strip() if lines else "" prompt = lines[0].strip() if lines else ""
@@ -1779,8 +1053,15 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne
elif img.get("url"): elif img.get("url"):
# Download external URL and save locally (DALL-E returns temp URLs) # Download external URL and save locally (DALL-E returns temp URLs)
result_url = img["url"]
ok, reason = check_outbound_url(
result_url,
block_private=os.getenv("IMAGE_BLOCK_PRIVATE_IPS", "false").lower() == "true",
)
if not ok:
return {"error": f"Image API returned unsafe image URL: {reason}"}
try: try:
dl_resp = httpx.get(img["url"], timeout=60) dl_resp = httpx.get(result_url, timeout=60)
if dl_resp.status_code == 200: if dl_resp.status_code == 200:
img_dir = Path(GENERATED_IMAGES_DIR) img_dir = Path(GENERATED_IMAGES_DIR)
img_dir.mkdir(parents=True, exist_ok=True) img_dir.mkdir(parents=True, exist_ok=True)
@@ -1790,10 +1071,10 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne
image_url = f"/api/generated-image/{filename}" image_url = f"/api/generated-image/{filename}"
image_id = _save_to_gallery(filename) image_id = _save_to_gallery(filename)
else: else:
image_url = img["url"] # fallback to external URL image_url = result_url # fallback to external URL
except Exception as _dl_e: except Exception as _dl_e:
logger.warning(f"Failed to download DALL-E image: {_dl_e}") logger.warning(f"Failed to download DALL-E image: {_dl_e}")
image_url = img["url"] # fallback to external URL image_url = result_url # fallback to external URL
else: else:
return {"error": "Image API returned unexpected format (no b64_json or url)"} return {"error": "Image API returned unexpected format (no b64_json or url)"}
@@ -1822,55 +1103,20 @@ async def dispatch_ai_tool(
) -> Tuple[str, Dict]: ) -> Tuple[str, Dict]:
"""Dispatch an AI interaction tool. Returns (description, result_dict).""" """Dispatch an AI interaction tool. Returns (description, result_dict)."""
if tool == "chat_with_model": if tool == "pipeline":
model_spec = content.split("\n")[0].strip()[:60]
desc = f"chat_with_model: {model_spec}"
result = await do_chat_with_model(content, session_id, owner=owner)
elif tool == "create_session":
name = content.split("\n")[0].strip()[:60]
desc = f"create_session: {name}"
result = await do_create_session(content, session_id, owner=owner)
elif tool == "list_sessions":
keyword = content.strip()[:40]
desc = f"list_sessions{': ' + keyword if keyword else ''}"
result = await do_list_sessions(content, session_id, owner=owner)
elif tool == "send_to_session":
sid = content.split("\n")[0].strip()[:20]
desc = f"send_to_session: {sid}"
result = await do_send_to_session(content, session_id, owner=owner)
elif tool == "pipeline":
desc = "pipeline: running steps" desc = "pipeline: running steps"
result = await do_pipeline(content, session_id, owner=owner) result = await do_pipeline(content, session_id, owner=owner)
elif tool == "manage_session":
action = content.split("\n")[0].strip()[:40]
desc = f"manage_session: {action}"
result = await do_manage_session(content, session_id, owner=owner)
elif tool == "manage_memory": elif tool == "manage_memory":
action = content.split("\n")[0].strip()[:40] action = content.split("\n")[0].strip()[:40]
desc = f"manage_memory: {action}" desc = f"manage_memory: {action}"
result = await do_manage_memory(content, session_id, owner=owner) result = await do_manage_memory(content, session_id, owner=owner)
elif tool == "list_models":
keyword = content.strip()[:40]
desc = f"list_models{': ' + keyword if keyword else ''}"
result = await do_list_models(content, session_id, owner=owner)
elif tool == "ui_control": elif tool == "ui_control":
action = content.split("\n")[0].strip()[:60] action = content.split("\n")[0].strip()[:60]
desc = f"ui_control: {action}" desc = f"ui_control: {action}"
result = await do_ui_control(content, session_id, owner=owner) result = await do_ui_control(content, session_id, owner=owner)
elif tool == "ask_teacher":
problem = content.split("\n", 1)[-1].strip()[:60]
desc = f"ask_teacher: {problem}"
result = await do_ask_teacher(content, session_id, owner=owner)
else: else:
desc = f"unknown ai tool: {tool}" desc = f"unknown ai tool: {tool}"
result = {"error": f"Unknown AI interaction tool: {tool}"} result = {"error": f"Unknown AI interaction tool: {tool}"}
+23 -1
View File
@@ -263,10 +263,32 @@ def list_for_session(session_id: str) -> List[Dict[str, Any]]:
return [r for r in refresh().values() if r.get("session_id") == session_id] return [r for r in refresh().values() if r.get("session_id") == session_id]
def kill(job_id: str) -> Optional[Dict[str, Any]]:
"""Terminate a running job's process tree and mark it killed. Returns the
updated record, or None if the id is unknown. Idempotent: a job that already
finished is returned unchanged. Sets followed_up so the monitor does not also
fire an auto-continue for a job the agent deliberately stopped."""
jobs = _load()
rec = jobs.get(job_id)
if rec is None:
return None
if rec.get("status") == "running":
_kill(rec.get("pid"))
rec["status"] = "failed"
rec["exit_code"] = -1
rec["ended_at"] = time.time()
rec["killed"] = True
rec["followed_up"] = True
_save(jobs)
return rec
def result_text(rec: Dict[str, Any]) -> str: def result_text(rec: Dict[str, Any]) -> str:
"""Human/agent-readable summary of a finished job, for the follow-up.""" """Human/agent-readable summary of a finished job, for the follow-up."""
out = _read_output(rec) out = _read_output(rec)
if rec.get("timed_out"): if rec.get("killed"):
head = "Background job was killed."
elif rec.get("timed_out"):
head = f"Background job timed out after {rec.get('max_runtime_s')}s." head = f"Background job timed out after {rec.get('max_runtime_s')}s."
elif rec.get("died"): elif rec.get("died"):
head = "Background job process died unexpectedly (no exit code)." head = "Background job process died unexpectedly (no exit code)."
+50 -48
View File
@@ -76,8 +76,7 @@ async def action_consolidate_memory(owner: str, **kwargs) -> Tuple[str, bool]:
import json import json
import re import re
from src.constants import DATA_DIR from src.constants import DATA_DIR
from src.endpoint_resolver import resolve_endpoint from src.llm_core import llm_call_async_with_fallback
from src.llm_core import llm_call_async
from src.memory import MemoryManager from src.memory import MemoryManager
manager = MemoryManager(DATA_DIR) manager = MemoryManager(DATA_DIR)
@@ -116,10 +115,9 @@ async def action_consolidate_memory(owner: str, **kwargs) -> Tuple[str, bool]:
if len(group_memories) < 2: if len(group_memories) < 2:
return False return False
url, model, headers = resolve_endpoint("utility", owner=group_owner or None) from src.task_endpoint import resolve_task_candidates
if not url or not model: candidates = resolve_task_candidates(owner=group_owner or None)
url, model, headers = resolve_endpoint("default", owner=group_owner or None) if not candidates:
if not url or not model:
return False return False
try: try:
@@ -147,13 +145,11 @@ async def action_consolidate_memory(owner: str, **kwargs) -> Tuple[str, bool]:
"\"drop\":[{\"id\":\"existing id\",\"reason\":\"short reason\"}]}\n\n" "\"drop\":[{\"id\":\"existing id\",\"reason\":\"short reason\"}]}\n\n"
f"MEMORIES:\n{json.dumps(items, ensure_ascii=False)}" f"MEMORIES:\n{json.dumps(items, ensure_ascii=False)}"
) )
raw = await llm_call_async( raw = await llm_call_async_with_fallback(
url=url, candidates,
model=model,
messages=[{"role": "user", "content": prompt}], messages=[{"role": "user", "content": prompt}],
temperature=0.0, temperature=0.0,
max_tokens=4096, max_tokens=4096,
headers=headers,
timeout=120, timeout=120,
) )
from src.text_helpers import strip_think from src.text_helpers import strip_think
@@ -604,8 +600,7 @@ async def action_classify_events(owner: str, **kwargs) -> Tuple[str, bool]:
try: try:
from datetime import timedelta from datetime import timedelta
from core.database import SessionLocal, CalendarEvent from core.database import SessionLocal, CalendarEvent
from src.endpoint_resolver import resolve_endpoint from src.llm_core import llm_call_async_with_fallback
from src.llm_core import llm_call_async
import re as _re, json as _json import re as _re, json as _json
db = SessionLocal() db = SessionLocal()
@@ -620,10 +615,9 @@ async def action_classify_events(owner: str, **kwargs) -> Tuple[str, bool]:
if not events: if not events:
return "No upcoming events to classify", True return "No upcoming events to classify", True
llm_url, llm_model, llm_headers = resolve_endpoint("utility", owner=owner) from src.task_endpoint import resolve_task_candidates
if not llm_url: llm_candidates = resolve_task_candidates(owner=owner)
llm_url, llm_model, llm_headers = resolve_endpoint("default", owner=owner) llm_available = bool(llm_candidates)
llm_available = bool(llm_url and llm_model)
# Pull user memories so the LLM has personal context (relationships, # Pull user memories so the LLM has personal context (relationships,
# job, hobbies). Helps it know e.g. "<name> is your spouse" so their # job, hobbies). Helps it know e.g. "<name> is your spouse" so their
@@ -699,11 +693,11 @@ async def action_classify_events(owner: str, **kwargs) -> Tuple[str, bool]:
f"EVENTS: {_json.dumps(items)}" f"EVENTS: {_json.dumps(items)}"
) )
try: try:
raw = await llm_call_async( raw = await llm_call_async_with_fallback(
url=llm_url, model=llm_model, llm_candidates,
messages=[{"role": "user", "content": prompt}], messages=[{"role": "user", "content": prompt}],
temperature=0.1, max_tokens=16384, temperature=0.1, max_tokens=16384,
headers=llm_headers, timeout=180, timeout=180,
) )
from src.text_helpers import strip_think as _st from src.text_helpers import strip_think as _st
raw = _st(raw or "", prose=False, prompt_echo=False) raw = _st(raw or "", prose=False, prompt_echo=False)
@@ -810,8 +804,7 @@ async def action_learn_sender_signatures(owner: str, **kwargs) -> Tuple[str, boo
import asyncio as _aio import asyncio as _aio
from datetime import datetime as _dt, timedelta as _td from datetime import datetime as _dt, timedelta as _td
from routes.email_helpers import _email_cache_owner_clause, _imap_connect, SCHEDULED_DB from routes.email_helpers import _email_cache_owner_clause, _imap_connect, SCHEDULED_DB
from src.endpoint_resolver import resolve_endpoint from src.llm_core import llm_call_async_with_fallback
from src.llm_core import llm_call_async
# 1. Pull recent UIDs + From headers cheaply (header-only fetch). # 1. Pull recent UIDs + From headers cheaply (header-only fetch).
def _pull_headers(): def _pull_headers():
@@ -891,11 +884,11 @@ async def action_learn_sender_signatures(owner: str, **kwargs) -> Tuple[str, boo
if not eligible: if not eligible:
return "All sender sigs already cached (or no eligible senders)", True return "All sender sigs already cached (or no eligible senders)", True
url, model, headers = resolve_endpoint("utility", owner=owner) from src.task_endpoint import resolve_task_candidates
if not url or not model: candidates = resolve_task_candidates(owner=owner)
url, model, headers = resolve_endpoint("default", owner=owner) if not candidates:
if not url or not model:
return "No LLM endpoint available", False return "No LLM endpoint available", False
model = candidates[0][1]
analyzed = 0 analyzed = 0
no_sig = 0 no_sig = 0
@@ -949,11 +942,11 @@ async def action_learn_sender_signatures(owner: str, **kwargs) -> Tuple[str, boo
) )
try: try:
raw = await llm_call_async( raw = await llm_call_async_with_fallback(
url=url, model=model, candidates,
messages=[{"role": "user", "content": prompt}], messages=[{"role": "user", "content": prompt}],
temperature=0.0, max_tokens=600, temperature=0.0, max_tokens=600,
headers=headers, timeout=60, timeout=60,
) )
from src.text_helpers import strip_think as _st from src.text_helpers import strip_think as _st
sig = _st(raw or "", prose=False, prompt_echo=False).strip() sig = _st(raw or "", prose=False, prompt_echo=False).strip()
@@ -1137,7 +1130,6 @@ async def action_test_skills(owner: str, **kwargs) -> Tuple[str, bool]:
from services.memory.skills import SkillsManager from services.memory.skills import SkillsManager
from src.constants import DATA_DIR from src.constants import DATA_DIR
from routes.skills_routes import _run_skill_test_once, _skill_test_task from routes.skills_routes import _run_skill_test_once, _skill_test_task
from src.endpoint_resolver import resolve_endpoint
# #3 SCOPE GUARD: refuse to run on a None/empty owner — otherwise # #3 SCOPE GUARD: refuse to run on a None/empty owner — otherwise
# `sm.load(owner=None)` returns every user's skills and we'd cross- # `sm.load(owner=None)` returns every user's skills and we'd cross-
@@ -1152,27 +1144,40 @@ async def action_test_skills(owner: str, **kwargs) -> Tuple[str, bool]:
if not names: if not names:
raise TaskNoop("no skills to test") raise TaskNoop("no skills to test")
url, model, headers = resolve_endpoint("default", owner=owner) from src.task_endpoint import resolve_task_candidates
if not url or not model: candidates = resolve_task_candidates(owner=owner)
if not candidates:
return "No Default/Utility model configured — set one in Settings.", False return "No Default/Utility model configured — set one in Settings.", False
# #2 NO SILENT MODEL SWAP: if the configured model isn't served by the # #2 NO SILENT MODEL SWAP: if the configured model isn't served by the
# endpoint, try a basename match — but fail loudly instead of grabbing # endpoint, try a basename match — but fail loudly instead of grabbing
# `avail[0]` which could be an embedding-only model and produce 36 # `avail[0]` which could be an embedding-only model and produce 36
# garbage transcripts → 36 'unknown' verdicts with no hint why. # garbage transcripts → 36 'unknown' verdicts with no hint why.
url, model, headers = candidates[0]
try: try:
from src.llm_core import list_model_ids from src.llm_core import list_model_ids
avail = list_model_ids(url, headers=headers) import os as _os
if avail and model not in avail:
import os as _os selected = None
base = _os.path.basename((model or "").rstrip("/")) mismatch_notes = []
m = next((a for a in avail if _os.path.basename(a.rstrip("/")) == base), None) for cand_url, cand_model, cand_headers in candidates:
if m: avail = list_model_ids(cand_url, headers=cand_headers)
model = m if not avail or cand_model in avail:
else: selected = (cand_url, cand_model, cand_headers)
return (f"Default model '{model}' not served by endpoint {url}. " break
f"Available: {', '.join(avail[:8])}{'' if len(avail) > 8 else ''}. " base = _os.path.basename((cand_model or "").rstrip("/"))
"Set a valid Default model in Settings."), False matched = next((a for a in avail if _os.path.basename(a.rstrip("/")) == base), None)
if matched:
selected = (cand_url, matched, cand_headers)
break
mismatch_notes.append(
f"{cand_model} not served by {cand_url}; available: "
f"{', '.join(avail[:8])}{'...' if len(avail) > 8 else ''}"
)
if selected:
url, model, headers = selected
elif mismatch_notes:
return "No configured task fallback model is served. " + " | ".join(mismatch_notes[:3]), False
except Exception as _e: except Exception as _e:
logger.warning(f"test_skills model resolve check failed (continuing): {_e}") logger.warning(f"test_skills model resolve check failed (continuing): {_e}")
@@ -1483,7 +1488,6 @@ async def action_check_email_urgency(owner: str, **kwargs) -> Tuple[str, bool]:
from pathlib import Path as _P from pathlib import Path as _P
from core.database import SessionLocal as _SL, EmailAccount as _EA from core.database import SessionLocal as _SL, EmailAccount as _EA
from routes.email_helpers import _imap_connect, _decode_header from routes.email_helpers import _imap_connect, _decode_header
from src.endpoint_resolver import resolve_endpoint, resolve_utility_fallback_candidates
from src.llm_core import llm_call_async_with_fallback from src.llm_core import llm_call_async_with_fallback
# Per-owner state file so multi-user runs don't clobber each other's # Per-owner state file so multi-user runs don't clobber each other's
@@ -1505,12 +1509,10 @@ async def action_check_email_urgency(owner: str, **kwargs) -> Tuple[str, bool]:
# ── 1. Resolve LLM candidates (utility primary + utility fallbacks; fall # ── 1. Resolve LLM candidates (utility primary + utility fallbacks; fall
# through to default chat as a last resort). # through to default chat as a last resort).
url, model, headers = resolve_endpoint("utility", owner=owner) from src.task_endpoint import resolve_task_candidates
if not url or not model: candidates = resolve_task_candidates(owner=owner)
url, model, headers = resolve_endpoint("default", owner=owner) if not candidates:
if not url or not model:
return "No LLM endpoint available", False return "No LLM endpoint available", False
candidates = [(url, model, headers)] + resolve_utility_fallback_candidates(owner=owner)
# ── 2. Enumerate enabled accounts. Match this task's owner AND fall # ── 2. Enumerate enabled accounts. Match this task's owner AND fall
# back to the legacy "unowned account whose imap_user / from_address # back to the legacy "unowned account whose imap_user / from_address
+3 -2
View File
@@ -14,6 +14,7 @@ import subprocess
import sys import sys
from core.platform_compat import IS_WINDOWS, which_tool from core.platform_compat import IS_WINDOWS, which_tool
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -81,7 +82,7 @@ _BUILTIN_NPX_SERVERS = {
"name": "Built-in: Browser", "name": "Built-in: Browser",
"command": "npx", "command": "npx",
"args": ["-y", "@playwright/mcp@latest", "--headless", "--caps", "vision"], "args": ["-y", "@playwright/mcp@latest", "--headless", "--caps", "vision"],
}, }
} }
# Global flag to disable MCP if there are compatibility issues # Global flag to disable MCP if there are compatibility issues
@@ -94,7 +95,7 @@ async def register_builtin_servers(mcp_manager):
logger.info("Built-in MCP servers disabled via ODYSSEUS_DISABLE_MCP") logger.info("Built-in MCP servers disabled via ODYSSEUS_DISABLE_MCP")
return return
base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) base_dir = get_app_root()
python = sys.executable python = sys.executable
async def _connect_python_server(server_id: str, script_path: str, name: str): async def _connect_python_server(server_id: str, script_path: str, name: str):
+3 -2
View File
@@ -5,6 +5,7 @@ from pydantic_settings import BaseSettings, SettingsConfigDict
from pydantic import Field, field_validator from pydantic import Field, field_validator
from src.constants import DATA_DIR as _DATA_DIR_CONST from src.constants import DATA_DIR as _DATA_DIR_CONST
from src.runtime_paths import get_app_root
# Cross-platform OS flag, exposed here so callers can `from src.config import # Cross-platform OS flag, exposed here so callers can `from src.config import
# IS_WINDOWS`. Defined locally (a trivial `os.name == "nt"`) rather than imported # IS_WINDOWS`. Defined locally (a trivial `os.name == "nt"`) rather than imported
@@ -19,7 +20,7 @@ IS_WINDOWS = os.name == "nt"
class DataConfig(BaseSettings): class DataConfig(BaseSettings):
"""Configuration for data storage and file handling.""" """Configuration for data storage and file handling."""
# Base directory # Base directory
base_dir: Path = Field(default=Path(__file__).parent.parent, description="Base directory for the application") base_dir: Path = Field(default=Path(get_app_root()), description="Base directory for the application")
# Data paths # Data paths
data_dir: Path = Field(default=Path(_DATA_DIR_CONST), description="Main data directory") data_dir: Path = Field(default=Path(_DATA_DIR_CONST), description="Main data directory")
@@ -138,7 +139,7 @@ class AppConfig(BaseSettings):
if isinstance(v, dict) and "base_dir" in v: if isinstance(v, dict) and "base_dir" in v:
base_dir = v["base_dir"] base_dir = v["base_dir"]
else: else:
base_dir = Path(__file__).parent.parent base_dir = Path(get_app_root())
# Convert string paths to Path objects relative to base_dir # Convert string paths to Path objects relative to base_dir
data_dir = Path(_DATA_DIR_CONST) data_dir = Path(_DATA_DIR_CONST)
+30 -4
View File
@@ -2,12 +2,14 @@
"""Application-wide constants and configuration values.""" """Application-wide constants and configuration values."""
import os import os
APP_VERSION = "1.0.0" from src.runtime_paths import get_app_root, get_default_data_dir
APP_VERSION = "1.0.1"
# Base paths # Base paths
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) + "/" BASE_DIR = os.path.join(get_app_root(), "")
STATIC_DIR = os.path.join(BASE_DIR, "static") STATIC_DIR = os.path.join(BASE_DIR, "static")
DATA_DIR = os.getenv("ODYSSEUS_DATA_DIR", os.path.join(BASE_DIR, "data")) DATA_DIR = os.getenv("ODYSSEUS_DATA_DIR", get_default_data_dir())
# Data file paths # Data file paths
# Single source of truth: every persisted file/dir lives under DATA_DIR, which # Single source of truth: every persisted file/dir lives under DATA_DIR, which
@@ -55,7 +57,13 @@ MEMORY_VECTORS_DIR = os.path.join(DATA_DIR, "memory_vectors")
# Paths with an intentional dedicated env override, defaulting under DATA_DIR. # Paths with an intentional dedicated env override, defaulting under DATA_DIR.
MAIL_ATTACHMENTS_DIR = os.getenv("ODYSSEUS_MAIL_ATTACHMENTS_DIR", os.path.join(DATA_DIR, "mail-attachments")) MAIL_ATTACHMENTS_DIR = os.getenv("ODYSSEUS_MAIL_ATTACHMENTS_DIR", os.path.join(DATA_DIR, "mail-attachments"))
FASTEMBED_CACHE_DIR = os.getenv("FASTEMBED_CACHE_PATH", os.path.join(DATA_DIR, "fastembed_cache")) # `or` (not os.getenv's default arg) so a PRESENT-but-EMPTY value falls back to
# the default. docker-compose.yml injects `FASTEMBED_CACHE_PATH=${FASTEMBED_CACHE_PATH:-}`,
# which sets the var to "" when the host hasn't defined it. os.getenv(name, default)
# only returns the default when the var is ABSENT, so the empty string would win →
# os.makedirs("") raises [Errno 2] No such file or directory: '' → FastEmbed fails to
# init and all vector features (RAG, semantic memory, tool index) silently degrade.
FASTEMBED_CACHE_DIR = os.getenv("FASTEMBED_CACHE_PATH") or os.path.join(DATA_DIR, "fastembed_cache")
# Agent tool output limits (single source of truth — imported by tool_execution.py, # Agent tool output limits (single source of truth — imported by tool_execution.py,
# tool_implementations.py, agent_tools.py, and any other module that needs them) # tool_implementations.py, agent_tools.py, and any other module that needs them)
@@ -63,11 +71,26 @@ MAX_OUTPUT_CHARS = 10_000 # cap for bash/python/web_search/web_fetch outpu
MAX_READ_CHARS = 20_000 # cap for read_file / document preview MAX_READ_CHARS = 20_000 # cap for read_file / document preview
MAX_DIFF_LINES = 400 # cap for edit_file unified-diff display MAX_DIFF_LINES = 400 # cap for edit_file unified-diff display
# web_fetch response-size policy (#3812). MAX_OUTPUT_CHARS above only trims
# what the agent SEES; these caps bound what the server downloads, parses,
# and writes to the content cache. The soft cap is the default download
# budget; the agent can raise it per call (full/max_bytes) but never past
# the hard cap, so a model can't decide to pull a multi-GB file.
WEB_FETCH_SOFT_MAX_BYTES = 2_000_000 # default download budget (2 MB)
WEB_FETCH_HARD_MAX_BYTES = 20_000_000 # absolute ceiling, even with override (20 MB)
# API Configuration # API Configuration
MAX_CONTEXT_MESSAGES = 90 MAX_CONTEXT_MESSAGES = 90
REQUEST_TIMEOUT = 20 REQUEST_TIMEOUT = 20
OPENAI_COMPAT_PATH = "/v1/chat/completions" OPENAI_COMPAT_PATH = "/v1/chat/completions"
# Outbound UA for web_fetch / web_search scraping; common desktop UA so pages serve normal HTML.
WEB_FETCH_USER_AGENT = os.environ.get(
"WEB_FETCH_USER_AGENT",
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
"(KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36",
)
# Environment variables with defaults # Environment variables with defaults
DEFAULT_HOST = os.getenv("LLM_HOST", "localhost") DEFAULT_HOST = os.getenv("LLM_HOST", "localhost")
LLM_HOSTS = [h.strip() for h in os.getenv("LLM_HOSTS", "").split(",") if h.strip()] LLM_HOSTS = [h.strip() for h in os.getenv("LLM_HOSTS", "").split(",") if h.strip()]
@@ -79,6 +102,9 @@ SEARXNG_INSTANCE = os.getenv("SEARXNG_INSTANCE", "http://localhost:8080")
CLEANUP_ENABLED = os.getenv("CLEANUP_ENABLED", "True").lower() == "true" CLEANUP_ENABLED = os.getenv("CLEANUP_ENABLED", "True").lower() == "true"
CLEANUP_INTERVAL_HOURS = int(os.getenv("CLEANUP_INTERVAL_HOURS", "24")) CLEANUP_INTERVAL_HOURS = int(os.getenv("CLEANUP_INTERVAL_HOURS", "24"))
# Auth policy
PASSWORD_MIN_LENGTH = 8
# Default parameters # Default parameters
DEFAULT_TEMPERATURE = 1.0 DEFAULT_TEMPERATURE = 1.0
DEFAULT_MAX_TOKENS = 0 DEFAULT_MAX_TOKENS = 0
+3 -2
View File
@@ -161,11 +161,13 @@ async def _tick() -> None:
# Re-read state once before writing so we capture any updates from # Re-read state once before writing so we capture any updates from
# concurrent UI syncs. # concurrent UI syncs.
stopped_any = False stopped_any = False
successfully_stopped_sids = set()
for sid, host, port in to_stop: for sid, host, port in to_stop:
ok = await _stop_serve(sid, host, port) ok = await _stop_serve(sid, host, port)
logger.info(f"cookbook_serve_lifecycle: stop {sid} (host={host or 'local'}): {'ok' if ok else 'failed'}") logger.info(f"cookbook_serve_lifecycle: stop {sid} (host={host or 'local'}): {'ok' if ok else 'failed'}")
if ok: if ok:
stopped_any = True stopped_any = True
successfully_stopped_sids.add(sid)
# Drop the auto-registered endpoint so the model picker and # Drop the auto-registered endpoint so the model picker and
# the chat router don't keep pointing at a dead server. # the chat router don't keep pointing at a dead server.
for t in tasks: for t in tasks:
@@ -188,12 +190,11 @@ async def _tick() -> None:
except Exception: except Exception:
fresh = state fresh = state
fresh_tasks = tasks fresh_tasks = tasks
stopped_sids = {sid for sid, _, _ in to_stop}
for ft in fresh_tasks: for ft in fresh_tasks:
if not isinstance(ft, dict): if not isinstance(ft, dict):
continue continue
ft_sid = ft.get("sessionId") or ft.get("id") ft_sid = ft.get("sessionId") or ft.get("id")
if ft_sid in stopped_sids: if ft_sid in successfully_stopped_sids:
ft["status"] = "stopped" ft["status"] = "stopped"
ft["_scheduledStopAtMs"] = None ft["_scheduledStopAtMs"] = None
ft["_lastStatusFlipAt"] = now_ms ft["_lastStatusFlipAt"] = now_ms
+2
View File
@@ -31,6 +31,8 @@ import numpy as np
import httpx import httpx
from typing import List, Optional from typing import List, Optional
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
_DEFAULT_MODEL = "all-minilm:l6-v2" _DEFAULT_MODEL = "all-minilm:l6-v2"
+42 -11
View File
@@ -161,6 +161,32 @@ def normalize_base(url: str) -> str:
return url return url
def _validated_endpoint_base(url: str) -> str:
"""Return a base URL that is safe for endpoint path appends."""
base = (url or "").strip().rstrip("/")
if "?" in base or "#" in base:
raise ValueError("Endpoint base URL must not include query or fragment")
return urlunparse(urlparse(base)._replace(query="", fragment="")).rstrip("/")
def _prepare_endpoint_base(base: str) -> str:
base = _validated_endpoint_base(normalize_base(base))
return _validated_endpoint_base(normalize_base(resolve_url(base)))
def _append_endpoint_path(base: str, suffix: str) -> str:
parsed = urlparse(base)
current = (parsed.path or "").rstrip("/")
extra = "/" + suffix.lstrip("/")
path = f"{current}{extra}" if current else extra
return urlunparse(parsed._replace(path=path, query="", fragment=""))
def _pathless_host(base: str, host: str) -> bool:
parsed = urlparse(base)
return (parsed.hostname or "").lower() == host and not (parsed.path or "").strip("/")
def _anthropic_api_root(base: str) -> str: def _anthropic_api_root(base: str) -> str:
"""Return Anthropic's API root, preserving /v1 for OpenAI-compatible APIs elsewhere.""" """Return Anthropic's API root, preserving /v1 for OpenAI-compatible APIs elsewhere."""
base = (base or "").strip().rstrip("/") base = (base or "").strip().rstrip("/")
@@ -171,15 +197,17 @@ def _anthropic_api_root(base: str) -> str:
def build_chat_url(base: str) -> str: def build_chat_url(base: str) -> str:
"""Return the correct chat endpoint URL for a given base.""" """Return the correct chat endpoint URL for a given base."""
base = resolve_url(base) base = _prepare_endpoint_base(base)
provider = _detect_provider(base) provider = _detect_provider(base)
if provider == "anthropic": if provider == "anthropic":
return _anthropic_api_root(base) + "/v1/messages" return _append_endpoint_path(_anthropic_api_root(base), "/v1/messages")
if provider == "ollama": if provider == "ollama":
return _ollama_api_root(base) + "/chat" return _append_endpoint_path(_ollama_api_root(base), "/chat")
if provider == "chatgpt-subscription": if provider == "chatgpt-subscription":
return base.rstrip("/") + "/responses" return _append_endpoint_path(base, "/responses")
return base + "/chat/completions" if _pathless_host(base, "api.openai.com"):
base = _append_endpoint_path(base, "/v1")
return _append_endpoint_path(base, "/chat/completions")
def build_models_url(base: str) -> Optional[str]: def build_models_url(base: str) -> Optional[str]:
@@ -193,12 +221,12 @@ def build_models_url(base: str) -> Optional[str]:
untouched (so custom prefixes like ``/openai`` or ``/api/openai/v1`` keep untouched (so custom prefixes like ``/openai`` or ``/api/openai/v1`` keep
their semantics). their semantics).
""" """
base = normalize_base(resolve_url(base)) base = _prepare_endpoint_base(base)
provider = _detect_provider(base) provider = _detect_provider(base)
if provider == "anthropic": if provider == "anthropic":
return _anthropic_api_root(base) + "/v1/models" return _append_endpoint_path(_anthropic_api_root(base), "/v1/models")
if provider == "ollama": if provider == "ollama":
return _ollama_api_root(base) + "/tags" return _append_endpoint_path(_ollama_api_root(base), "/tags")
if provider == "chatgpt-subscription": if provider == "chatgpt-subscription":
return None return None
# Generic OpenAI-compatible fallback: local model servers with no explicit # Generic OpenAI-compatible fallback: local model servers with no explicit
@@ -208,10 +236,10 @@ def build_models_url(base: str) -> Optional[str]:
parsed = urlparse(base) parsed = urlparse(base)
host = (parsed.hostname or "").lower() host = (parsed.hostname or "").lower()
is_local = host in {"localhost", "127.0.0.1", "::1", "host.docker.internal"} is_local = host in {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
uses_v1_models_by_default = is_local or host in {"api.deepseek.com"} uses_v1_models_by_default = is_local or host in {"api.deepseek.com", "api.openai.com"}
if not parsed.path and uses_v1_models_by_default: if not parsed.path and uses_v1_models_by_default:
base = base + "/v1" base = _append_endpoint_path(base, "/v1")
return base + "/models" return _append_endpoint_path(base, "/models")
def build_headers(api_key: Optional[str], base: str) -> Dict[str, str]: def build_headers(api_key: Optional[str], base: str) -> Dict[str, str]:
@@ -396,6 +424,9 @@ def resolve_utility_fallback_candidates(owner: Optional[str] = None) -> list:
settings = load_settings() settings = load_settings()
utility_ep = (get_user_setting("utility_endpoint_id", owner or "", settings.get("utility_endpoint_id", "")) or "").strip() utility_ep = (get_user_setting("utility_endpoint_id", owner or "", settings.get("utility_endpoint_id", "")) or "").strip()
if not utility_ep: if not utility_ep:
utility_chain = get_user_setting("utility_model_fallbacks", owner or "", settings.get("utility_model_fallbacks") or []) or []
if utility_chain:
return _resolve_fallback_candidates("utility_model_fallbacks", owner=owner)
return _resolve_fallback_candidates("default_model_fallbacks", owner=owner) return _resolve_fallback_candidates("default_model_fallbacks", owner=owner)
except Exception: except Exception:
pass pass
+35 -8
View File
@@ -4,6 +4,7 @@ import uuid
import logging import logging
import re import re
from typing import Dict, List, Optional, Any from typing import Dict, List, Optional, Any
from urllib.parse import urljoin, urlparse, urlunparse
import httpx import httpx
from fastapi import HTTPException from fastapi import HTTPException
@@ -202,6 +203,22 @@ def mask_integration_secret(integration: Dict[str, Any]) -> Dict[str, Any]:
return safe return safe
def _normalize_integration_base_url(base_url: Any) -> str:
if not isinstance(base_url, str) or not base_url.strip():
raise ValueError("Integration base URL is required")
cleaned = base_url.strip().rstrip("/")
if "?" in cleaned or "#" in cleaned:
raise ValueError("Integration base URL must not include query or fragment")
parsed = urlparse(cleaned)
if parsed.scheme.lower() not in ("http", "https") or not parsed.hostname:
raise ValueError("Integration base URL must be an HTTP(S) URL")
return urlunparse(parsed._replace(scheme=parsed.scheme.lower(), query="", fragment="")).rstrip("/")
def _join_integration_url(base_url: str, path: str) -> str:
return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
def load_integrations() -> List[Dict[str, Any]]: def load_integrations() -> List[Dict[str, Any]]:
"""Load all integrations from disk with secrets decrypted for runtime use.""" """Load all integrations from disk with secrets decrypted for runtime use."""
if not os.path.exists(DATA_FILE): if not os.path.exists(DATA_FILE):
@@ -261,8 +278,10 @@ def add_integration(data: Dict[str, Any]) -> Dict[str, Any]:
if not isinstance(integration.get("name"), str) or not integration["name"].strip(): if not isinstance(integration.get("name"), str) or not integration["name"].strip():
raise HTTPException(400, "Integration name is required") raise HTTPException(400, "Integration name is required")
if not isinstance(integration.get("base_url"), str) or not integration["base_url"].strip(): try:
raise HTTPException(400, "Integration base URL is required") integration["base_url"] = _normalize_integration_base_url(integration.get("base_url"))
except ValueError as exc:
raise HTTPException(400, str(exc)) from exc
integrations = load_integrations() integrations = load_integrations()
integrations.append(integration) integrations.append(integration)
@@ -272,10 +291,14 @@ def add_integration(data: Dict[str, Any]) -> Dict[str, Any]:
def update_integration(integration_id: str, data: Dict[str, Any]) -> Optional[Dict[str, Any]]: def update_integration(integration_id: str, data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
"""Update fields on an existing integration. Returns updated integration or None.""" """Update fields on an existing integration. Returns updated integration or None."""
data = dict(data)
if "name" in data and (not isinstance(data["name"], str) or not data["name"].strip()): if "name" in data and (not isinstance(data["name"], str) or not data["name"].strip()):
raise HTTPException(400, "Integration name is required") raise HTTPException(400, "Integration name is required")
if "base_url" in data and (not isinstance(data["base_url"], str) or not data["base_url"].strip()): if "base_url" in data:
raise HTTPException(400, "Integration base URL is required") try:
data["base_url"] = _normalize_integration_base_url(data["base_url"])
except ValueError as exc:
raise HTTPException(400, str(exc)) from exc
integrations = load_integrations() integrations = load_integrations()
for item in integrations: for item in integrations:
@@ -341,9 +364,10 @@ async def execute_api_call(
if not integration.get("enabled", True): if not integration.get("enabled", True):
return {"error": f"Integration '{integration.get('name')}' is disabled", "exit_code": 1} return {"error": f"Integration '{integration.get('name')}' is disabled", "exit_code": 1}
base_url = integration.get("base_url", "").rstrip("/") try:
if not base_url: base_url = _normalize_integration_base_url(integration.get("base_url", ""))
return {"error": "Integration has no base_url configured", "exit_code": 1} except ValueError as exc:
return {"error": str(exc), "exit_code": 1}
# Strip common API path suffixes users might accidentally include # Strip common API path suffixes users might accidentally include
# (e.g. "http://host/v1/" → "http://host"). The integration's preset # (e.g. "http://host/v1/" → "http://host"). The integration's preset
@@ -366,7 +390,10 @@ async def execute_api_call(
if re.search(r"^https?://", path) or "://" in path: if re.search(r"^https?://", path) or "://" in path:
return {"error": "Path must not contain a protocol scheme", "exit_code": 1} return {"error": "Path must not contain a protocol scheme", "exit_code": 1}
url = base_url + path if "#" in path:
return {"error": "Path must not contain a fragment", "exit_code": 1}
url = _join_integration_url(base_url, path)
method = method.upper() method = method.upper()
# Build headers # Build headers
+16 -6
View File
@@ -87,7 +87,7 @@ _host_health_lock = threading.Lock()
_model_activity: Dict[str, float] = {} _model_activity: Dict[str, float] = {}
_HARMONY_MARKER_RE = re.compile( _HARMONY_MARKER_RE = re.compile(
r"<\|channel\|>(analysis|final)" r"<\|channel\|>(analysis|commentary|final)"
r"|<\|start\|>(?:assistant|system|user|tool)?" r"|<\|start\|>(?:assistant|system|user|tool)?"
r"|<\|message\|>" r"|<\|message\|>"
r"|<\|end\|>" r"|<\|end\|>"
@@ -96,6 +96,7 @@ _HARMONY_MARKER_RE = re.compile(
) )
_HARMONY_MARKERS = ( _HARMONY_MARKERS = (
"<|channel|>analysis", "<|channel|>analysis",
"<|channel|>commentary",
"<|channel|>final", "<|channel|>final",
"<|start|>assistant", "<|start|>assistant",
"<|start|>system", "<|start|>system",
@@ -145,7 +146,10 @@ class _HarmonyStreamRouter:
out.append((text, False)) out.append((text, False))
return return
if self._in_message: if self._in_message:
out.append((text, self._channel == "analysis")) # analysis + commentary (tool-call preambles / function-arg bodies)
# are internal, not user-facing — route them to thinking so they
# don't leak into the visible answer; only `final` is visible.
out.append((text, self._channel in ("analysis", "commentary")))
def _handle_marker(self, match: re.Match[str]) -> None: def _handle_marker(self, match: re.Match[str]) -> None:
marker = match.group(0) marker = match.group(0)
@@ -283,7 +287,8 @@ def _is_ollama_native_url(url: str) -> bool:
"""Return True for native Ollama API URLs, including Ollama Cloud.""" """Return True for native Ollama API URLs, including Ollama Cloud."""
try: try:
parsed = urlparse(url or "") parsed = urlparse(url or "")
except Exception: except Exception as e:
logger.warning("Failed to parse URL for Ollama detection", exc_info=e)
return False return False
host = parsed.hostname or "" host = parsed.hostname or ""
path = (parsed.path or "").rstrip("/") path = (parsed.path or "").rstrip("/")
@@ -902,7 +907,10 @@ def _anthropic_rejects_temperature(model: str) -> bool:
return (int(match.group(1)), int(match.group(2))) >= (4, 7) return (int(match.group(1)), int(match.group(2))) >= (4, 7)
# Models that support structured thinking — may output </think> without opening tag # Models that support structured thinking — may output </think> without opening tag
_THINKING_MODEL_PATTERNS = ("qwen3", "qwq", "deepseek-r1", "deepseek-reasoner", "minimax", "m2-reap", "gemma") _THINKING_MODEL_PATTERNS = (
"qwen3", "qwq", "deepseek-r1", "deepseek-reasoner", "minimax",
"m2-reap", "gemma", "stepfun", "step-3", "step3",
)
def _supports_thinking(model: str) -> bool: def _supports_thinking(model: str) -> bool:
"""Check if model supports structured thinking output.""" """Check if model supports structured thinking output."""
@@ -1345,8 +1353,8 @@ def list_model_ids(
r = httpx.get(root + "/api/tags", timeout=timeout) r = httpx.get(root + "/api/tags", timeout=timeout)
r.raise_for_status() r.raise_for_status()
return [m.get("name") or m.get("model") for m in (r.json().get("models") or []) if m.get("name") or m.get("model")] return [m.get("name") or m.get("model") for m in (r.json().get("models") or []) if m.get("name") or m.get("model")]
except Exception: except Exception as e:
pass logger.warning("Failed to fetch model list from configured endpoint", exc_info=e)
return [] return []
def normalize_model_id( def normalize_model_id(
@@ -2130,6 +2138,8 @@ async def stream_llm(url: str, model: str, messages: List[Dict], temperature: fl
yield _stream_delta_event(reasoning, thinking=True) yield _stream_delta_event(reasoning, thinking=True)
content = delta.get("content") or "" content = delta.get("content") or ""
if content: if content:
content = re.sub(r"<mm:think(\s+[^>]*)?>", r"<think\1>", content, flags=re.IGNORECASE)
content = re.sub(r"</mm:think>", "</think>", content, flags=re.IGNORECASE)
stripped = content.lstrip() stripped = content.lstrip()
# gpt-oss harmony format (<|channel|>analysis/final): route via the harmony # gpt-oss harmony format (<|channel|>analysis/final): route via the harmony
# stream router. Sticky once the first marker appears — distinct from the # stream router. Sticky once the first marker appears — distinct from the
+3 -1
View File
@@ -11,6 +11,8 @@ import os
import re import re
from typing import Any, Dict, List, Optional, Set, Tuple from typing import Any, Dict, List, Optional, Set, Tuple
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
def _format_mcp_connection_error(name: str, command: str = "", args: Optional[List[str]] = None, error: Exception = None) -> str: def _format_mcp_connection_error(name: str, command: str = "", args: Optional[List[str]] = None, error: Exception = None) -> str:
@@ -508,7 +510,7 @@ class McpManager:
return False return False
script_rel, name = _BUILTIN_SERVERS[server_id] script_rel, name = _BUILTIN_SERVERS[server_id]
base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) base_dir = get_app_root()
script_path = os.path.join(base_dir, script_rel) script_path = os.path.join(base_dir, script_rel)
# Clean up old connection # Clean up old connection
+14 -5
View File
@@ -17,10 +17,11 @@ import httpx
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
_LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1", "host.docker.internal"} _LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1", "host.docker.internal"}
_PRIVATE_PREFIXES = ("10.", "172.16.", "172.17.", "172.18.", "172.19.", _PRIVATE_NETWORKS = (
"172.20.", "172.21.", "172.22.", "172.23.", "172.24.", ipaddress.ip_network("10.0.0.0/8"),
"172.25.", "172.26.", "172.27.", "172.28.", "172.29.", ipaddress.ip_network("172.16.0.0/12"),
"172.30.", "172.31.", "192.168.") ipaddress.ip_network("192.168.0.0/16"),
)
# Tailscale uses the CGNAT range 100.64.0.0/10, NOT all of 100.0.0.0/8. # Tailscale uses the CGNAT range 100.64.0.0/10, NOT all of 100.0.0.0/8.
# A bare "100." prefix would classify public addresses (e.g. AWS ranges # A bare "100." prefix would classify public addresses (e.g. AWS ranges
@@ -36,6 +37,14 @@ def _in_tailscale_range(host: str) -> bool:
return False return False
def _is_private_ip_literal(host: str) -> bool:
try:
ip = ipaddress.ip_address(host)
except ValueError:
return False
return any(ip in network for network in _PRIVATE_NETWORKS)
def _normalize_base_for_compare(url: str) -> str: def _normalize_base_for_compare(url: str) -> str:
url = (url or "").strip().rstrip("/") url = (url or "").strip().rstrip("/")
for suffix in ("/chat/completions", "/models", "/completions", "/v1/messages"): for suffix in ("/chat/completions", "/models", "/completions", "/v1/messages"):
@@ -87,7 +96,7 @@ def is_local_endpoint(url: str) -> bool:
return True return True
try: try:
host = urlparse(url).hostname or "" host = urlparse(url).hostname or ""
return host in _LOCAL_HOSTS or host.startswith(_PRIVATE_PREFIXES) or _in_tailscale_range(host) return host in _LOCAL_HOSTS or _is_private_ip_literal(host) or _in_tailscale_range(host)
except Exception: except Exception:
return False return False
+41
View File
@@ -322,6 +322,47 @@ class PersonalDocsManager:
else: else:
logger.info(f"Directory not in index: {directory}") logger.info(f"Directory not in index: {directory}")
def rename_directory(self, old_directory: str, new_directory: str, *, path_map: Dict[str, str] = None):
"""Rewrite tracked directory and excluded-file paths after an owner rename."""
old_directory = os.path.abspath(old_directory)
new_directory = os.path.abspath(new_directory)
path_map = {os.path.abspath(k): os.path.abspath(v) for k, v in (path_map or {}).items()}
def rewrite(path: str) -> str:
abs_path = os.path.abspath(path)
mapped = path_map.get(abs_path)
if mapped:
return mapped
if abs_path == old_directory:
return new_directory
if abs_path.startswith(old_directory + os.sep):
return new_directory + abs_path[len(old_directory):]
return abs_path
changed_dirs = False
rewritten_dirs = []
for directory in self.indexed_directories:
rewritten = rewrite(directory)
changed_dirs = changed_dirs or rewritten != os.path.abspath(directory)
if rewritten not in rewritten_dirs:
rewritten_dirs.append(rewritten)
if changed_dirs:
self.indexed_directories = rewritten_dirs
self.save_directories()
changed_excluded = False
rewritten_excluded = set()
for path in self.excluded_files:
rewritten = rewrite(path)
changed_excluded = changed_excluded or rewritten != os.path.abspath(path)
rewritten_excluded.add(rewritten)
if changed_excluded:
self.excluded_files = rewritten_excluded
self._save_excluded()
if changed_dirs or changed_excluded:
self.refresh_index()
def get_indexed_directories(self): def get_indexed_directories(self):
"""Get the list of all indexed directories.""" """Get the list of all indexed directories."""
return self.indexed_directories.copy() return self.indexed_directories.copy()
+1
View File
@@ -7,6 +7,7 @@ import time
from pathlib import Path from pathlib import Path
from src.constants import RAG_DIR from src.constants import RAG_DIR
from src.runtime_paths import get_app_root
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
+86
View File
@@ -50,6 +50,23 @@ def _generate_doc_id(text: str, owner: str = "") -> str:
return f"doc_{hashlib.sha256(key.encode('utf-8')).hexdigest()[:16]}" return f"doc_{hashlib.sha256(key.encode('utf-8')).hexdigest()[:16]}"
def _rewrite_owner_path(value: str, path_map: Dict[str, str], path_prefixes: List[tuple]) -> str:
if not isinstance(value, str) or not value:
return value
abs_value = os.path.abspath(value)
mapped = path_map.get(abs_value)
if mapped:
return mapped
for old_prefix, new_prefix in path_prefixes:
old_abs = os.path.abspath(old_prefix)
new_abs = os.path.abspath(new_prefix)
if abs_value == old_abs:
return new_abs
if abs_value.startswith(old_abs + os.sep):
return new_abs + abs_value[len(old_abs):]
return value
class VectorRAG: class VectorRAG:
"""RAG system using ChromaDB vector storage with hybrid search.""" """RAG system using ChromaDB vector storage with hybrid search."""
@@ -250,6 +267,75 @@ class VectorRAG:
"failed_count": len(docs) - len(valid), "failed_count": len(docs) - len(valid),
} }
def rename_owner(
self,
old_owner: str,
new_owner: str,
*,
path_map: Optional[Dict[str, str]] = None,
path_prefixes: Optional[List[tuple]] = None,
) -> Dict[str, Any]:
"""Rewrite existing RAG metadata after an auth username rename."""
if not self.healthy:
return {"success": False, "updated_count": 0, "message": "Collection not initialized"}
old_owner = (old_owner or "").strip().lower()
new_owner = (new_owner or "").strip().lower()
if not old_owner or not new_owner or old_owner == new_owner:
return {"success": True, "updated_count": 0, "message": "No owner rename needed"}
path_map = {os.path.abspath(k): os.path.abspath(v) for k, v in (path_map or {}).items()}
path_prefixes = path_prefixes or []
updated_ids = set()
failed_count = 0
for lane_name, collection in self._collections_for_delete():
try:
results = collection.get(
where={"owner": old_owner},
include=["metadatas"],
)
except Exception as e:
logger.warning("rename_owner metadata scan failed in %s lane: %s", lane_name, e)
failed_count += 1
continue
ids = results.get("ids") or []
metadatas = results.get("metadatas") or []
if not ids:
continue
new_metas = []
selected_ids = []
for doc_id, meta in zip(ids, metadatas):
if not isinstance(meta, dict):
continue
next_meta = dict(meta)
if str(next_meta.get("owner", "")).strip().lower() == old_owner:
next_meta["owner"] = new_owner
for key in ("source", "directory"):
next_meta[key] = _rewrite_owner_path(next_meta.get(key), path_map, path_prefixes)
selected_ids.append(doc_id)
new_metas.append(next_meta)
if not selected_ids:
continue
try:
collection.update(ids=selected_ids, metadatas=new_metas)
updated_ids.update(selected_ids)
except Exception as e:
logger.warning("rename_owner metadata update failed in %s lane: %s", lane_name, e)
failed_count += len(selected_ids)
success = failed_count == 0
return {
"success": success,
"updated_count": len(updated_ids),
"failed_count": failed_count,
"message": f"Updated {len(updated_ids)} RAG chunk(s)",
}
# ------------------------------------------------------------------ # ------------------------------------------------------------------
# Search — hybrid: vector similarity + keyword overlap # Search — hybrid: vector similarity + keyword overlap
# ------------------------------------------------------------------ # ------------------------------------------------------------------

Some files were not shown because too many files have changed in this diff Show More