* fix(security): escape backslashes in calendar bg-image CSS url()
The calendar event-background CSS escaped ' -> \' in a bg: image URL but
did not escape backslashes first. Inside a single-quoted url('...'), \ is the CSS
escape char, so a URL value ending in/containing a backslash escapes the
closing quote and breaks out of the string, injecting arbitrary CSS. The
bg:<url> value is per-event and CalDAV-syncable, hence untrusted (CodeQL
js/incomplete-sanitization).
Add a single canonical _cssUrlEscape() in calendar/utils.js that escapes
backslashes FIRST, then quotes, and route all four sinks through it:
calendar.js:416 / :1263 (the flagged #463/#464), the event-form preview
(:2931), and _calBgCss() in utils.js — the latter two share the identical
bug but were unflagged. Output is byte-identical to the old escaping for
legitimate URLs (which contain no backslashes); only malicious input differs.
Resolves CodeQL js/incomplete-sanitization #463, #464.
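The ordering rule can be illustrated with a minimal sketch. The real helper is JavaScript (_cssUrlEscape() in calendar/utils.js); the Python function below is a hypothetical stand-in that only demonstrates why backslashes must be escaped before quotes.

```python
def css_url_escape(url):
    """Escape a value for use inside a single-quoted CSS url('...').

    Hypothetical Python stand-in for the JavaScript helper; the name and
    signature are assumptions made for illustration.
    """
    # Backslashes first: otherwise the backslash added for a quote in the
    # next step would itself be doubled, or a trailing backslash in the
    # input would escape the closing quote.
    escaped = url.replace("\\", "\\\\")
    # Then single quotes, which would otherwise terminate the string.
    return escaped.replace("'", "\\'")


def old_escape(url):
    """The incomplete variant described above: quotes only."""
    return url.replace("'", "\\'")
```

With old_escape, an input ending in a single backslash is returned unchanged, so the closing quote of url('...') is consumed by the escape. With css_url_escape, the same input ends in two backslashes and the closing quote survives.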
* fix(security): route remaining calendar bg url() sinks through _cssUrlEscape
Review (vdmkenny) flagged that the centralization missed an injectable
sibling sink: the edit-form color-picker swatch (calendar.js:2856) built
`url('${url}')` from `existing.color` (a CalDAV-syncable, untrusted `bg:`
value) raw, then interpolated it into `style="background:..."` via innerHTML
- the same `'`/`\` breakout class as the sinks already fixed. The custom-dot
preview (:2953) was likewise raw (non-exploitable - a CSSOM `.style`
assignment of a URL the current user just picked - but it broke the invariant).
Route both through `_cssUrlEscape`, and normalize the two pre-escaped-variable
sites (_calItemBgStyle, _renderWeek) to the same inline form so all five
url() interpolations in calendar.js follow one rule. Add a whole-file
invariant test asserting every `url('${...}')` calls `_cssUrlEscape` - this
catches a future missed sink, the exact failure mode here. Behavior-identical
for legitimate URLs (no visual change).
* fix: document read fails with 403 when auth is disabled
Add _auth_disabled() bypass in _verify_doc_owner() and the
/api/documents/{session_id} route guard so documents remain accessible
in single-user / no-auth mode.
Minimal change: only adds the auth-disabled check alongside existing
403 raises — preserves existing formatting and line endings.
* refactor: hoist _auth_disabled import to module level
Address reviewer feedback on PR #4623 — no circular import exists
(src.auth_helpers only imports stdlib + fastapi), so the inline
imports are unnecessary. Moves the import to module top in both
document_helpers.py and document_routes.py.
* test: add regression tests for auth-disabled document access (PR #4623)
Remote Cookbook hwfit probes failed on Windows hosts because the PowerShell script was sent as nested -Command quoting through OpenSSH. Use -EncodedCommand for remote probes, auto-detect platform when omitted (including Darwin for Mac SSH hosts), and return a clearer error when SSH works but the probe fails.
Co-authored-by: Cursor <cursoragent@cursor.com>
Inline backtick spans were converted to <code> only at the end of
mdToHtml, after the bare-URL autolink and <a>/allowed-HTML passes. A URL
inside inline code is preceded by a space, so the autolink wrapped it in
an <a> tag and swapped it for an ___ALLOWED_HTML_ placeholder, corrupting
commands like `irm http://127.0.0.1:3000/x`.
Extract inline code into placeholders before the link passes, mirroring
the existing fenced-code-block handling, and restore them last so
placeholders carried inside restored <a> blocks resolve. Escape the code
at extraction time since it now bypasses the global escape pass.
The fallback regex in email_pollers.py that recovers a
[{"action": ...}, ...] JSON array from raw model output used lazy
[^[\]]*? runs inside a (?:,\s*\{...\}\s*)* repetition, which backtracks
exponentially (CodeQL py/redos) on inputs like [{"action"},{ + }},{{ * N.
It runs on the LLM reply to an email→calendar prompt embedding the
untrusted email body, so a crafted email can stall the background poller.
Extract the pattern to a module-level _CAL_ACTION_ARRAY_RE and rewrite the
object-content class from the lazy [^[\]]*? to a greedy brace-delimited
[^{}], which removes the quantifier ambiguity. The match is linear (a 500KB
adversarial input now resolves in <1ms) and equivalent on well-formed
arrays; it is also strictly more robust for values containing '[' or ']'
(the old class bailed on those and extracted nothing).
Resolves CodeQL py/redos #198.
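A minimal sketch of the brace-delimited approach, assuming flat (non-nested) objects. The pattern and the extract_action_array name are illustrative and are not the project's actual _CAL_ACTION_ARRAY_RE.

```python
import json
import re

# Each object body is matched by [^{}]*, which can only stop at a brace,
# so there is exactly one way to match every {...} group. This is an
# illustrative pattern, not the one shipped in email_pollers.py.
_ARRAY_RE = re.compile(r"\[\s*\{[^{}]*\}(?:\s*,\s*\{[^{}]*\})*\s*\]")


def extract_action_array(raw):
    """Return the first JSON array of flat objects found in raw, or None."""
    match = _ARRAY_RE.search(raw)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        # The text looked like an array but was not valid JSON.
        return None
```

Because the object body excludes only braces, values containing '[' or ']' still match. Nested objects are out of scope for this sketch.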
The DELETE /api/personal/file disk-delete containment check used the
shared PERSONAL_UPLOADS_DIR root, so one admin could delete another
user's personal upload by passing its path (uploads are partitioned per
owner under <root>/<owner>/). Confine the check to the caller's own
per-owner subdir via _personal_upload_dir_for_owner(owner). RAG removal
and listing exclusion are unchanged (they still serve non-upload indexed
sources). Adds a regression test for the cross-owner case.
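The per-owner containment check can be sketched as follows. owner_upload_dir is a hypothetical stand-in for _personal_upload_dir_for_owner, and may_delete is an assumed name; the real route may differ in detail.

```python
from pathlib import Path


def owner_upload_dir(root, owner):
    """Hypothetical stand-in for _personal_upload_dir_for_owner(owner)."""
    return (Path(root) / owner).resolve()


def may_delete(root, owner, target):
    """Allow deletion only for paths strictly inside the caller's own subdir."""
    base = owner_upload_dir(root, owner)
    # resolve() collapses '..' segments, so traversal out of the owner
    # directory is detected after normalization.
    resolved = Path(target).resolve()
    # Checking against the shared root here would admit other owners' files.
    return base in resolved.parents
```

This sketch assumes owner is already a validated single path segment.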
Moves create_session, list_sessions, send_to_session and manage_session out of
ai_interaction.py into src/agent_tools/session_tools.py (the do_ prefix
dropped) and registers them in TOOL_HANDLERS, so dispatch flows through the
registry instead of the dispatch_ai_tool elif in tool_execution.py. Same
pattern as the model-interaction move.
The bodies move verbatim; each fetches the runtime-set session manager via a
get_session_manager() shim, and reuses _resolve_model / AI_CHAT_TIMEOUT from
ai_interaction. manage_session's internal 'list' alias is repointed from the
old do_list_sessions to the moved list_sessions. stream_ai_tool (dead, no
callers) and do_pipeline stay put. dispatch_ai_tool loses its four now-unused
branches.
Tests: test_session_tools_registry covers registration, owner threading, the
manage_session->list_sessions delegation, graceful no-manager handling, and
registry dispatch. Verified end-to-end against a live SessionManager.
Detached bash jobs (#!bg) could be launched and auto-reported on completion,
but the agent had no way to act on a running one: no on-demand output read and
no kill (it blocked until the 1h max-runtime). bg_jobs had the pieces
(_read_output, list_for_session, internal _kill) but none was exposed.
Adds:
- bg_jobs.kill(job_id): tears down the process tree, marks the job killed, and
sets followed_up so the monitor does not also auto-continue a deliberate kill.
- manage_bg_jobs registry tool with actions list / output / kill, scoped to the
chat that launched the job (cross-session access reads as not found).
- Wiring: TOOL_HANDLERS/TAGS, function schema, RAG index + keyword hints, parser
name map, dispatch (threads session_id via _direct_fallback). Gated like bash
(NON_ADMIN_BLOCKED_TOOLS; plan-mode mutator).
- agent_loop: background-job intent regex maps to the files domain (and the tool
joins _DOMAIN_TOOL_MAP[files]) so short commands like 'kill that job' are not
dropped by the low-signal gate that skips tool retrieval.
- bg launch message tells the model to call manage_bg_jobs itself for check/stop
rather than printing raw tool syntax to the user.
Tests: tests/test_bg_job_tools.py (kill semantics, per-chat scoping, actions,
and the intent classifier).
* fix(tools): prune skipped dirs before descending in glob tool
GlobTool used pathlib.Path.rglob which descends into every directory
(including node_modules, .git, dist, etc.) and filters AFTER the walk.
On repos with large junk directories this causes the glob tool to hang
for minutes.
Replace rglob with os.walk that prunes _CODENAV_SKIP_DIRS before
descending — matching the approach GrepTool already uses. Also add a
fast path for literal patterns (no wildcards → direct path lookup).
Fixes #4493
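The pruning technique relies on os.walk's top-down mode, where editing dirnames in place stops the walk from descending. A minimal sketch, with SKIP_DIRS as an assumed stand-in for _CODENAV_SKIP_DIRS:

```python
import os

# Assumed subset; the real _CODENAV_SKIP_DIRS may contain other names.
SKIP_DIRS = {"node_modules", ".git", "dist"}


def walk_files(root):
    """Yield file paths relative to root, never entering skipped directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # In-place slice assignment is required: os.walk consults this same
        # list object when deciding which subdirectories to visit next.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            yield rel.replace(os.sep, "/")
```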
* fix(tools): use regex glob matching to fix * semantics and literal fallback
Replace fnmatch with _glob_to_regex so that * stays within a single
path segment (matching pathlib/rglob semantics) and **/ spans zero or
more directories. Literal patterns now fall through to os.walk when
the direct path lookup misses, so e.g. 'foo.py' still finds files at
any depth.
Add tests for:
- bare literal matching in subdirectories
- multi-segment single-star patterns (sub/*.txt)
- * not crossing / boundaries
- ** matching at arbitrary depth
Closes #4493
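The segment-aware semantics described above can be sketched like this. The glob_to_regex function below is an illustrative approximation and is not the project's _glob_to_regex.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a regex where * stays within one path segment."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            # Zero or more whole directories.
            parts.append("(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            # Trailing ** matches anything, including slashes.
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            # Single star never crosses a '/' boundary.
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def glob_match(pattern, rel_path):
    """Return True if the whole relative path matches the glob."""
    return glob_to_regex(pattern).fullmatch(rel_path) is not None
```

Character classes such as [abc] are treated as literals in this sketch.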
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
The Dependencies tab's llama.cpp docker recipe surfaced
`docker pull ghcr.io/ggerganov/llama.cpp:server-cuda`. The upstream
repo moved from github.com/ggerganov/llama.cpp to
github.com/ggml-org/llama.cpp and the old GHCR namespace no longer
publishes images, so copying the recipe failed with:
failed to resolve reference "ghcr.io/ggerganov/llama.cpp:server-cuda":
not found
Point the recipe at `ghcr.io/ggml-org/llama.cpp:server-cuda`, which is
already the namespace routes/cookbook_routes.py uses for the source
clone. Adds a regression test in the same shape as
test_cookbook_diagnosis_js.py asserting the new namespace and forbidding
the dead one.
No CSS/HTML/SVG/style changes — the file is a pure data module
(no DOM access) consumed by other renderers; only the displayed command
text changes.
Two background tasks scheduled on every chat completion in
routes/chat_helpers.py — the memory/skill extraction dispatch and the
session auto-namer — are created via bare asyncio.create_task(...).
asyncio only holds a weak reference to the outer task, so the GC can
collect it mid-execution and the work silently never runs.
Add a module-private _BG_TASKS set and a _spawn_bg() helper that mirrors
WebhookManager._spawn_tracked (the pattern #3964 / #4336 established for
the webhook emitters two lines apart in the same function). Route both
call sites through it so the lifecycle owner is explicit.
Adds an AST-level guard test that fails on any bare
asyncio.create_task(...) statement in routes/chat_helpers.py to prevent
a regression — same shape as test_webhook_emitters_use_manager.py from
#4336.
The same bare pattern exists in routes/email_routes.py and
routes/cookbook_routes.py; left out of this PR per CONTRIBUTING.md's
"one fix per PR" and tracked in #4443's "Additional Information" for a
follow-up.
The persistent login cookie's max_age hardcoded 60 * 60 * 24 * 7, an
independent copy of the session token lifetime that core/auth.py already
defines once as TOKEN_TTL (and reports to the frontend via /api/auth/policy
as session_days). If TOKEN_TTL changes, the cookie silently drifts: the
browser keeps a cookie for a token whose lifetime no longer matches.
Import TOKEN_TTL and use it for the cookie max_age so the session lifetime
has a single source of truth. No behaviour change at the current value.
Fixes #4471
The harmony stream router only recognized the analysis and final channels, so
gpt-oss's standard `commentary` channel (tool-call preambles / function-arg
bodies) was unhandled: the literal `<|channel|>commentary` marker, the
`to=functions.*` recipient, and the commentary body all leaked into the
visible answer. Add commentary to the marker regex + the suffix-hold table, and
route its body to thinking (only `final` is user-facing). Adds a regression
test (split-chunk + recipient + body), verified to fail without the fix.
_patch_prefs installs a fake routes.prefs_routes with a bare
sys.modules[...] = assignment that is never undone. The stub is an empty
ModuleType without _save_for_user, so a later test whose code path runs
`from routes.prefs_routes import _save_for_user` (e.g. test_backup_import_skills)
fails with ImportError under an unfavorable test order.
Install the stub with monkeypatch.setitem instead (the helper already takes
monkeypatch and uses it for DATA_DIR) so it is reverted at teardown.
Repro: pytest tests/test_skill_index_prompt_injection.py tests/test_backup_import_skills.py
(1 failed before, 5 passed after).
* fix(agent): index api_call so RAG tool selection can retrieve it
api_call exists in FUNCTION_TOOL_SCHEMAS and the agent's system prompt
advertises configured API integrations, but the tool had no entry in
BUILTIN_TOOL_DESCRIPTIONS. RAG tool selection embeds those descriptions and
retrieves the top-K per message, so a tool without one can never be selected:
the agent claims it can call Home Assistant/Miniflux/Gitea/etc. and then
never receives the api_call schema (unless the Personal Assistant
ASSISTANT_ALWAYS_AVAILABLE path applies).
Add a retrieval-rich description for api_call, plus an ast-based parity test
asserting every FUNCTION_TOOL_SCHEMAS tool has an index description so the
next added tool cannot silently drift the same way.
Fixes #3794
* fix(agent): route API-integration intent to api_call at selection time
Addresses review (RaresKeY) on #3923: indexing api_call in the ToolIndex
description was necessary but not sufficient — the #3794 repro ('Use the
api_call tool to call Home Assistant GET /api/states') matched no domain in
_classify_agent_request, classified as low-signal, so the agent loop skipped
retrieval entirely and the schema filter sent only ALWAYS_AVAILABLE
(manage_memory/ask_user/update_plan). api_call never reached the model.
- _classify_agent_request: detect API-integration intent (api_call,
integration(s), Home Assistant/Miniflux/Gitea/Linkding/Jellyfin) -> new
'integrations' domain, so the turn is no longer low-signal.
- _DOMAIN_TOOL_MAP['integrations'] = {api_call}: deterministically seeds
api_call into relevant tools after retrieval, independent of embeddings.
- _DOMAIN_RULES['integrations']: rule pack (required — _domain_rules_for_tools
indexes _DOMAIN_RULES[domain] directly).
- tool_index _KEYWORD_HINTS: parity hint for the retrieval / keyword-fallback
paths.
- Regression drives the real classifier -> domain-map -> FUNCTION_TOOL_SCHEMAS
filter chain and asserts api_call is advertised for the #3794 prompt.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(document): allow render-pdf to be framed and 503 cleanly on missing PyMuPDF
Fixes #2101.
Two related bugs in the PDF-form library preview flow:
1. SecurityHeadersMiddleware was sending X-Frame-Options: DENY and
frame-ancestors 'none' on /api/document/{doc_id}/render-pdf, but
static/js/documentLibrary.js embeds the response in an <iframe> for
the library card preview. The browser blocked the load with
ERR_BLOCKED_BY_RESPONSE, leaving the user with a blank panel.
Extend the existing is_tool_render exemption to also cover
/api/document/.../render-pdf. Per-document owner checks still run in
the route handler, so the exemption is scoped the same way as the
tool-render exemption it mirrors. /api/document/.../export-pdf is
left untouched — it's a download (Content-Disposition: attachment),
not an iframe embed.
2. routes/document_routes.py:render_pdf called fill_fields, which
raises RuntimeError via _require_fitz() when the optional PyMuPDF
dependency isn't installed. That RuntimeError bubbled out as a
generic 500 with a cryptic 'PDF render failed' detail.
Reuse the existing _load_pdf_viewer_fitz() helper to fail fast with
a 503 and a user-actionable install hint (mentions
requirements-optional.txt and AGPL-3.0), matching the convention
used by the other PDF endpoints.
Tests cover both fixes:
- middleware headers on /api/document/.../render-pdf (iframeable, but
X-Content-Type-Options and Referrer-Policy are still set)
- middleware headers on /api/document/.../export-pdf (must stay strict)
- middleware path matching precision (similar-but-different paths stay
strict)
- middleware headers on /api/tools/.../render (no regression)
- middleware headers on /api/chat (no regression)
- render-pdf returns 503 with install hint when PyMuPDF is missing
- 503 is raised before any file I/O (fail-fast ordering)
* chore: address maintainer feedback on PDF previews same-origin framing and comment trimming
* chore: make render-pdf regression tests order-independent
Moves chat_with_model, ask_teacher and list_models out of ai_interaction.py
into src/agent_tools/model_interaction_tools.py (the do_ prefix dropped) and
registers them in TOOL_HANDLERS, so dispatch flows through the registry instead
of the dispatch_ai_tool elif in tool_execution.py.
The implementations are relocated, not wrapped. ai_interaction.py keeps only
the shared helpers they reuse (_resolve_model, AI_CHAT_TIMEOUT), still used by
the not-yet-migrated session/pipeline tools. dispatch_ai_tool loses its three
now-unused branches.
Also removes the dead do_second_opinion: it was already off the live tool
surface (no tag/schema/parsing/dispatch; tool_index.py notes it was removed),
so the function and its stale frontend catalog entries (admin.js, assistant.js)
are deleted.
Tests: owner-scope test points at the new list_models location and drops the
moved tools from the dispatch_ai_tool parametrize; a new
test_model_interaction_registry covers registration, owner threading, and
registry dispatch.
* fix(security): allowlist manage_mcp 'add' to close the agent-path RCE
do_manage_mcp('add') passed model- and prompt-injection-controlled command,
args, and env straight to a stdio subprocess spawn with no validation, and it
persisted an enabled server row before connecting (so a payload also survived
to re-execute on restart). A string smuggled into a skill description, memory
entry, fetched page, or email body could register a server running arbitrary
code as the app UID, e.g. command='sh' args=['-c','...'].
Add _validate_mcp_command, applied on the agent path before any DB write or
spawn:
- Hard-deny interpreters, runtimes, package runners, shells, and exec-wrappers
(even if an operator lists one in ODYSSEUS_MCP_ALLOWED_COMMANDS).
- Require a bare basename (no path components, no shell metacharacters) that is
present in the operator allowlist (empty by default).
- Reject code-exec argv flags by prefix so glued forms are caught too
(-c/-e/-m/--eval/--exec/--print/--module/--command/--require), remote-URL
args, and env keys that inject code into the child (LD_PRELOAD, NODE_OPTIONS,
PYTHONPATH, DYLD_*, PATH, ...).
A rejected registration returns an error, writes no row, and makes no
connection. The trusted admin route is unchanged. Mirrors the policy intent of
_validate_serve_cmd but inverted for the model-reachable surface.
Supersedes #438; incorporates the bypass forms found in its review (interpreter
script paths, -m pip, glued -c/-e, --eval=, eval subcommands, package runners,
remote URLs) and adds integration coverage on the real do_manage_mcp path.
Closes #2891
* fix(security): deny versioned/alias runtimes in manage_mcp allowlist
Addresses RaresKeY's review on #4433. The hard-deny matched command names
exactly, so versioned or alias runtime forms (python3.11, node18, pip3,
ruby3.2, java, javac, bunx, tsx, ts-node, pypy3, ...) slipped past and, if an
operator allowlisted one, re-opened the prompt-injection-controlled MCP
registration path.
- Canonicalize a trailing version suffix before the deny check so versioned
forms collapse to the family (python3.11 -> python, node18 -> node, pip3 ->
pip); both the raw basename and the canonical form are denied.
- Broaden the denied-family set (java/javac/jshell/jbang/kotlin/dotnet/mono/
swift/osascript/tsx/ts-node/bunx/pypy/jruby/raku/luajit/wish/expect/iex).
Deny runs before the operator allowlist, so an alias cannot be allowlisted back
in. Canonicalization only feeds the deny check, so a legit name that ends in a
digit still reaches the normal allowlist check rather than being mis-denied.
Adds validator + integration regressions for versioned/alias runtimes asserting
no DB row and no connection, including the allowlisted-anyway case.
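The deny-before-allowlist ordering with version canonicalization can be sketched as below. The denied set is a small assumed subset, and validate_command is a hypothetical simplification of _validate_mcp_command that ignores args and env.

```python
import re

# Assumed subset of the denied families, for illustration only.
DENIED_FAMILIES = frozenset({"python", "node", "pip", "ruby", "sh", "bash"})

_BARE_NAME = re.compile(r"[A-Za-z0-9._+-]+")
_VERSION_SUFFIX = re.compile(r"[-_]?\d+(?:\.\d+)*$")


def validate_command(command, allowlist):
    """Return an error string, or None if the command may be registered."""
    # Bare basename only: no path separators or shell metacharacters.
    if not _BARE_NAME.fullmatch(command):
        return "command must be a bare name"
    raw = command.lower()
    # python3.11 -> python, node18 -> node, pip3 -> pip.
    canonical = _VERSION_SUFFIX.sub("", raw)
    # Deny runs first, so an operator allowlist entry cannot re-admit it.
    if raw in DENIED_FAMILIES or canonical in DENIED_FAMILIES:
        return "interpreters and runtimes are denied"
    # The canonical form only feeds the deny check; the allowlist sees the
    # name as given, so a legitimate name ending in a digit is not mis-denied.
    if command not in allowlist:
        return "command is not in the operator allowlist"
    return None
```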
* fix(hwfit): use CPU fallback for cpu_only speed estimates
* fix(hwfit): preserve ARM fallback for cpu_only estimates
---------
Co-authored-by: Cata <cata@bigjohn.local>
The scheduled-task runner built the agent's tool set from RAG retrieval plus
ASSISTANT_ALWAYS_AVAILABLE. Neither includes bash/python (nor the file tools),
and no keyword hint force-includes them, so a task only saw the shell when the
tool-embedding index happened to surface it. On hosts where that index is empty
or degraded (e.g. a fresh Docker deploy), retrieval returns nothing, the task
agent never receives bash/python, and it tells the user the shell is disabled
even for an admin owner.
Offer the shell/file group to task agents by default, mirroring the chat agent
where these are on unless a privilege or global setting turns them off. The
existing blocked_tools_for_owner() gate in stream_agent_loop still strips the
whole group for non-admin multi-user owners and only admits it for admins and
single-user (AUTH_ENABLED=false) deployments, so this changes what is offered,
not who is allowed. A crew that defines an explicit enabled_tools allowlist
still has its restriction honored.
Also merge the operator's global disabled_tools setting into the scheduler's
disabled set before composing relevant_tools and before entering the agent
loop, matching what chat already does. Without it, the global tool-disable
contract did not reach unattended scheduled tasks: an admin or AUTH_ENABLED=false
task could still see and call shell/file tools the operator had turned off
globally, since the prompt/schema/execution gates only enforce the disabled
tools passed in.
cmd_list filtered on the event START falling inside the window
(dtstart >= start AND dtstart < end). The canonical web route
(routes/calendar_routes.py) and the recurrence contract test use
OVERLAP semantics for non-recurring events: dtstart < end AND
dtend > start. So an event that began before the window but is still
ongoing inside it — e.g. a 09:00-17:00 conference listed at 14:00, or
any multi-day event spanning the window — was silently dropped by the
CLI even though the web UI shows it. Use overlap, matching the route.
dtend is NOT NULL in the schema, so no null-end regression.
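The difference between the two filters can be shown with a small sketch. Both function names are illustrative; the CLI and route express these conditions in SQL.

```python
from datetime import datetime


def starts_in_window(dtstart, start, end):
    """Old CLI filter: only the event start is tested against the window."""
    return start <= dtstart < end


def overlaps_window(dtstart, dtend, start, end):
    """Overlap filter: the event intersects the half-open window [start, end)."""
    return dtstart < end and dtend > start
```

A 09:00-17:00 event queried with a 14:00-15:00 window fails the start-based filter but passes the overlap filter. An event that ends exactly at the window start does not overlap.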
The non-native (prompted) tool-call path fed tool output back to the model as a plain "[Tool execution results]" user message, bypassing the untrusted_context_message wrapper that THREAT_MODEL.md requires for tool output. That path is what models without native tool-calling (many smaller local models) use, so prompt-injection inside a tool result (fetched page, file read, MCP/email output) could be read as instructions there.
Wrap it via untrusted_context_message("tool execution results", ...), the same hardening already applied to skills (#788) and escalation traces (#275). Also update _recent_context_for_retrieval, which used the old "[Tool execution results]" prefix as a sentinel to keep tool envelopes out of the retrieval query, to recognise the wrapped envelope via metadata.trusted.
The native path keeps returning tool-role messages (a user-role wrapper would break the native tool-call contract); it is covered by UNTRUSTED_CONTEXT_POLICY. Adds tests/test_tool_output_prompt_injection.py.
Fixes #1627.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Added PASSWORD_MIN_LENGTH and RESERVED_USERNAMES to src/constants.py as the
single source of truth. Previously PASSWORD_MIN_LENGTH was hardcoded as 8 in
four route handlers and all three JS validation paths; RESERVED_USERNAMES was
an inline frozenset duplicated in core/auth.py, routes/assistant_routes.py,
routes/research_routes.py, and src/task_scheduler.py.
Added GET /api/auth/policy (unauthenticated) so the frontend reads the real
values from the server instead of hardcoding them in JS.
Added missing empty-username guard to /setup and admin POST /users. Both
returned a misleading 500/409 on whitespace-only input. /signup already had the
check; this makes all three consistent.