Commit Graph

1562 Commits

Author SHA1 Message Date
Wei Hong 7475779b7c fix(chat): track chat hot-path background tasks for strong references (#4443) (#4444)
Two background tasks scheduled on every chat completion in
routes/chat_helpers.py — the memory/skill extraction dispatch and the
session auto-namer — are created via bare asyncio.create_task(...).
asyncio only holds a weak reference to the outer task, so the GC can
collect it mid-execution and the work silently never runs.

Add a module-private _BG_TASKS set and a _spawn_bg() helper that mirrors
WebhookManager._spawn_tracked (the pattern #3964 / #4336 established for
the webhook emitters two lines apart in the same function). Route both
call sites through it so the lifecycle owner is explicit.

Adds an AST-level guard test that fails on any bare
asyncio.create_task(...) statement in routes/chat_helpers.py to prevent
a regression — same shape as test_webhook_emitters_use_manager.py from
#4336.
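
One way such an AST-level guard could look, as a sketch only: it flags expression statements whose value is a direct asyncio.create_task(...) call, meaning the returned task is discarded. The function name find_bare_create_task is hypothetical and the real test may match a different shape.

```python
import ast


def find_bare_create_task(source: str) -> list:
    """Return line numbers of bare asyncio.create_task(...) statements."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A "bare" call is an expression statement whose value is the call,
        # so the task object is never bound to a name or passed on.
        if not isinstance(node, ast.Expr) or not isinstance(node.value, ast.Call):
            continue
        func = node.value.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "create_task"
            and isinstance(func.value, ast.Name)
            and func.value.id == "asyncio"
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

A guard test would read the target file, call this function, and assert the result is empty.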

The same bare pattern exists in routes/email_routes.py and
routes/cookbook_routes.py; left out of this PR per CONTRIBUTING.md's
"one fix per PR" and tracked in #4443's "Additional Information" for a
follow-up.
2026-06-18 21:26:11 +02:00
Christian Eriksson e7ffc69729 fix(cookbook): scope the "Kill vLLM" diagnosis to actual vLLM tracebacks (#4517)
The diagnosis panel offered a "Kill vLLM processes" (pkill -f vllm) recovery
for ANY Python traceback — including pip build failures and other tracebacks
that have nothing to do with vLLM. That advice is useless for a build failure
and harmful if an unrelated vLLM server happens to be running.

ERROR_PATTERNS in static/js/cookbook-diagnosis.js had one catch-all traceback
matcher that always attached the vLLM-kill fix. Split it into three (all
keeping the existing healthy-server suppression):
- pip build failure (Failed to build / metadata-generation-failed /
  subprocess-exited-with-error / Could not build wheels) -> "a dependency
  failed to build" message, no kill.
- vLLM-specific traceback (tail mentions vllm) -> keeps the kill, now scoped.
- any other traceback -> neutral "check the captured output" message, no kill.
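
The three-way split can be sketched as follows. This is a Python illustration of the matching order only; the actual matchers live in static/js/cookbook-diagnosis.js, and the function name, the return shape, the tail length, and the omission of the healthy-server suppression are all assumptions.

```python
import re
from typing import Optional

# Markers quoted in the commit message for pip build failures.
PIP_BUILD_RE = re.compile(
    r"Failed to build|metadata-generation-failed|"
    r"subprocess-exited-with-error|Could not build wheels"
)
TRACEBACK_RE = re.compile(r"Traceback \(most recent call last\)")


def diagnose(output: str, tail_lines: int = 20) -> Optional[dict]:
    """Classify captured output; only a vLLM traceback offers the kill fix."""
    # 1. pip build failure: dependency message, no kill.
    if PIP_BUILD_RE.search(output):
        return {"kind": "pip_build", "offer_kill": False}
    if TRACEBACK_RE.search(output):
        # 2. vLLM-specific traceback: the tail of the output mentions vllm.
        tail = "\n".join(output.splitlines()[-tail_lines:])
        if "vllm" in tail.lower():
            return {"kind": "vllm", "offer_kill": True}
        # 3. Any other traceback: neutral message, no kill.
        return {"kind": "other", "offer_kill": False}
    return None
```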

How to test:
- node --check static/js/cookbook-diagnosis.js
- Trigger a wheel-build failure (old package on a newer Python) or a non-vLLM
  traceback and open the diagnosis. Before: generic traceback message + "Kill
  vLLM processes" button. After: a build-failure / neutral message with no kill;
  only a real vLLM traceback still offers it.

Fixes #4516

Co-authored-by: Claude
2026-06-18 21:18:14 +02:00
Karl Jussila 396e26b4bf fix(auth): tie remember-me cookie lifetime to TOKEN_TTL (#4472)
The persistent login cookie's max_age hardcoded 60 * 60 * 24 * 7, an
independent copy of the session token lifetime that core/auth.py already
defines once as TOKEN_TTL (and reports to the frontend via /api/auth/policy
as session_days). If TOKEN_TTL changes, the cookie silently drifts: the
browser keeps a cookie for a token whose lifetime no longer matches.

Import TOKEN_TTL and use it for the cookie max_age so the session lifetime
has a single source of truth. No behaviour change at the current value.
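
A small sketch of the single-source-of-truth idea, assuming TOKEN_TTL is expressed in seconds. The local TOKEN_TTL stands in for the one imported from core/auth.py, and the cookie name and helper names are hypothetical.

```python
from http.cookies import SimpleCookie

# Stand-in for core/auth.py's TOKEN_TTL; assumed here to be seconds.
TOKEN_TTL = 60 * 60 * 24 * 7


def build_remember_cookie(token: str) -> SimpleCookie:
    """Build a persistent login cookie whose lifetime follows TOKEN_TTL."""
    cookie = SimpleCookie()
    cookie["session"] = token  # cookie name is illustrative
    # Derived from the shared constant instead of a second hardcoded copy.
    cookie["session"]["max-age"] = TOKEN_TTL
    cookie["session"]["httponly"] = True
    return cookie


def session_days() -> int:
    """The same constant, expressed in days for a policy endpoint."""
    return TOKEN_TTL // (60 * 60 * 24)
```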

Fixes #4471
2026-06-18 21:15:48 +02:00
nubs 0bfc7750a2 fix(llm): route gpt-oss harmony commentary channel without leaking markers/tool-args (#4523)
The harmony stream router only recognized the analysis and final channels, so
gpt-oss's standard `commentary` channel (tool-call preambles / function-arg
bodies) was unhandled: the literal `<|channel|>commentary` marker, the
`to=functions.*` recipient, and the commentary body all leaked into the
visible answer. Add commentary to the marker regex + the suffix-hold table, and
route its body to thinking (only `final` is user-facing). Adds a regression
test (split-chunk + recipient + body), verified to fail without the fix.
2026-06-18 21:12:25 +02:00
Rolly Calma 790ef81b06 fix: use aware UTC in health timestamp (#4503) 2026-06-18 20:58:25 +02:00
Victor 804691501f test: stop test_skill_index_prompt_injection leaking a stub prefs_routes (#4387)
_patch_prefs installs a fake routes.prefs_routes with a bare
sys.modules[...] = assignment that is never undone. The stub is an empty
ModuleType without _save_for_user, so a later test whose code path runs
`from routes.prefs_routes import _save_for_user` (e.g. test_backup_import_skills)
fails with ImportError under an unfavorable test order.

Install the stub with monkeypatch.setitem instead (the helper already takes
monkeypatch and uses it for DATA_DIR) so it is reverted at teardown.

Repro: pytest tests/test_skill_index_prompt_injection.py tests/test_backup_import_skills.py
(1 failed before, 5 passed after).
2026-06-18 20:54:15 +02:00
dependabot[bot] 8e6a2e89f8 chore(deps): bump actions/checkout in the actions group (#4559)
Bumps the actions group with 1 update: [actions/checkout](https://github.com/actions/checkout).


Updates `actions/checkout` from 6.0.3 to 7.0.0
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/df4cb1c069e1874edd31b4311f1884172cec0e10...9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 20:49:58 +02:00
dependabot[bot] dbcc7874bf chore(deps): bump the npm group with 2 updates (#4558)
Bumps the npm group with 2 updates: [@anthropic-ai/sdk](https://github.com/anthropics/anthropic-sdk-typescript) and [@antithesishq/bombadil](https://github.com/antithesishq/bombadil).


Updates `@anthropic-ai/sdk` from 0.104.1 to 0.105.0
- [Release notes](https://github.com/anthropics/anthropic-sdk-typescript/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-typescript/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-typescript/compare/sdk-v0.104.1...sdk-v0.105.0)

Updates `@antithesishq/bombadil` from 0.5.0 to 0.6.1
- [Release notes](https://github.com/antithesishq/bombadil/releases)
- [Changelog](https://github.com/antithesishq/bombadil/blob/main/CHANGELOG.md)
- [Commits](https://github.com/antithesishq/bombadil/compare/v0.5.0...v0.6.1)

---
updated-dependencies:
- dependency-name: "@anthropic-ai/sdk"
  dependency-version: 0.105.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: npm
- dependency-name: "@antithesishq/bombadil"
  dependency-version: 0.6.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: npm
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-18 20:42:49 +02:00
RaresKeY 16e660ad09 fix(hwfit): normalize CPU arch for fallback estimates (#4441) 2026-06-18 20:26:22 +02:00
Mazen Tamer Salah b51d83b16d fix(agent): index api_call so RAG tool selection can retrieve it (#3923)
* fix(agent): index api_call so RAG tool selection can retrieve it

api_call exists in FUNCTION_TOOL_SCHEMAS and the agent's system prompt
advertises configured API integrations, but the tool had no entry in
BUILTIN_TOOL_DESCRIPTIONS. RAG tool selection embeds those descriptions and
retrieves the top-K per message, so a tool without one can never be selected:
the agent claims it can call Home Assistant/Miniflux/Gitea/etc. and then
never receives the api_call schema (unless the Personal Assistant
ASSISTANT_ALWAYS_AVAILABLE path applies).

Add a retrieval-rich description for api_call, plus an ast-based parity test
asserting every FUNCTION_TOOL_SCHEMAS tool has an index description so the
next added tool cannot silently drift the same way.

Fixes #3794

* fix(agent): route API-integration intent to api_call at selection time

Addresses review (RaresKeY) on #3923: indexing api_call in the ToolIndex
description was necessary but not sufficient — the #3794 repro ('Use the
api_call tool to call Home Assistant GET /api/states') matched no domain in
_classify_agent_request, classified as low-signal, so the agent loop skipped
retrieval entirely and the schema filter sent only ALWAYS_AVAILABLE
(manage_memory/ask_user/update_plan). api_call never reached the model.

- _classify_agent_request: detect API-integration intent (api_call,
  integration(s), Home Assistant/Miniflux/Gitea/Linkding/Jellyfin) -> new
  'integrations' domain, so the turn is no longer low-signal.
- _DOMAIN_TOOL_MAP['integrations'] = {api_call}: deterministically seeds
  api_call into relevant tools after retrieval, independent of embeddings.
- _DOMAIN_RULES['integrations']: rule pack (required — _domain_rules_for_tools
  indexes _DOMAIN_RULES[domain] directly).
- tool_index _KEYWORD_HINTS: parity hint for the retrieval / keyword-fallback
  paths.
- Regression drives the real classifier -> domain-map -> FUNCTION_TOOL_SCHEMAS
  filter chain and asserts api_call is advertised for the #3794 prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 08:43:25 +00:00
Shreyas S Joshi f70db19cc6 fix(document): allow render-pdf to be framed and 503 cleanly on missing PyMuPDF (#2103)
* fix(document): allow render-pdf to be framed and 503 cleanly on missing PyMuPDF

Fixes #2101.

Two related bugs in the PDF-form library preview flow:

1. SecurityHeadersMiddleware was sending X-Frame-Options: DENY and
   frame-ancestors 'none' on /api/document/{doc_id}/render-pdf, but
   static/js/documentLibrary.js embeds the response in an <iframe> for
   the library card preview. The browser blocked the load with
   ERR_BLOCKED_BY_RESPONSE, leaving the user with a blank panel.

   Extend the existing is_tool_render exemption to also cover
   /api/document/.../render-pdf. Per-document owner checks still run in
   the route handler, so the exemption is scoped the same way as the
   tool-render exemption it mirrors. /api/document/.../export-pdf is
   left untouched — it's a download (Content-Disposition: attachment),
   not an iframe embed.

2. routes/document_routes.py:render_pdf called fill_fields, which
   raises RuntimeError via _require_fitz() when the optional PyMuPDF
   dependency isn't installed. That RuntimeError bubbled out as a
   generic 500 with a cryptic 'PDF render failed' detail.

   Reuse the existing _load_pdf_viewer_fitz() helper to fail fast with
   a 503 and a user-actionable install hint (mentions
   requirements-optional.txt and AGPL-3.0), matching the convention
   used by the other PDF endpoints.

Tests cover both fixes:
- middleware headers on /api/document/.../render-pdf (iframeable, but
  X-Content-Type-Options and Referrer-Policy are still set)
- middleware headers on /api/document/.../export-pdf (must stay strict)
- middleware path matching precision (similar-but-different paths stay
  strict)
- middleware headers on /api/tools/.../render (no regression)
- middleware headers on /api/chat (no regression)
- render-pdf returns 503 with install hint when PyMuPDF is missing
- 503 is raised before any file I/O (fail-fast ordering)

* chore: address maintainer feedback on PDF previews same-origin framing and comment trimming

* chore: make render-pdf regression tests order-independent
2026-06-18 06:25:26 +00:00
Kenny Van de Maele 56ba144875 refactor(tools): move model-interaction tools to the agent_tools registry (#4445)
Moves chat_with_model, ask_teacher and list_models out of ai_interaction.py
into src/agent_tools/model_interaction_tools.py (the do_ prefix dropped) and
registers them in TOOL_HANDLERS, so dispatch flows through the registry instead
of the dispatch_ai_tool elif in tool_execution.py.

The implementations are relocated, not wrapped. ai_interaction.py keeps only
the shared helpers they reuse (_resolve_model, AI_CHAT_TIMEOUT), still used by
the not-yet-migrated session/pipeline tools. dispatch_ai_tool loses its three
now-unused branches.

Also removes the dead do_second_opinion: it was already off the live tool
surface (no tag/schema/parsing/dispatch; tool_index.py notes it was removed),
so the function and its stale frontend catalog entries (admin.js, assistant.js)
are deleted.

Tests: owner-scope test points at the new list_models location and drops the
moved tools from the dispatch_ai_tool parametrize; a new
test_model_interaction_registry covers registration, owner threading, and
registry dispatch.
2026-06-18 05:56:37 +00:00
Matyas Gosztonyi 97a7f59fe7 fix(ui): share one z-order stack across Notes and modals (#3798)
* fix(notes): bring pane above active windows

* fix(notes): align tool window z-order handoff

---------

Co-authored-by: Matyas Fenyves <16389204+uhhgoat@users.noreply.github.com>
2026-06-17 12:15:48 +02:00
Afonso Coutinho 24ace44888 fix: canvasCoords crashes on empty touch list (mobile race) (#2045) 2026-06-17 10:25:39 +02:00
Kenny Van de Maele 93569b141b fix(security): allowlist manage_mcp 'add' to close the agent-path RCE (#4433)
* fix(security): allowlist manage_mcp 'add' to close the agent-path RCE

do_manage_mcp('add') passed model- and prompt-injection-controlled command,
args, and env straight to a stdio subprocess spawn with no validation, and it
persisted an enabled server row before connecting (so a payload also survived
to re-execute on restart). A string smuggled into a skill description, memory
entry, fetched page, or email body could register a server running arbitrary
code as the app UID, e.g. command='sh' args=['-c','...'].

Add _validate_mcp_command, applied on the agent path before any DB write or
spawn:
- Hard-deny interpreters, runtimes, package runners, shells, and exec-wrappers
  (even if an operator lists one in ODYSSEUS_MCP_ALLOWED_COMMANDS).
- Require a bare basename (no path components, no shell metacharacters) that is
  present in the operator allowlist (empty by default).
- Reject code-exec argv flags by prefix so glued forms are caught too
  (-c/-e/-m/--eval/--exec/--print/--module/--command/--require), remote-URL
  args, and env keys that inject code into the child (LD_PRELOAD, NODE_OPTIONS,
  PYTHONPATH, DYLD_*, PATH, ...).

A rejected registration returns an error, writes no row, and makes no
connection. The trusted admin route is unchanged. Mirrors the policy intent of
_validate_serve_cmd but inverted for the model-reachable surface.

Supersedes #438; incorporates the bypass forms found in its review (interpreter
script paths, -m pip, glued -c/-e, --eval=, eval subcommands, package runners,
remote URLs) and adds integration coverage on the real do_manage_mcp path.

Closes #2891

* fix(security): deny versioned/alias runtimes in manage_mcp allowlist

Addresses RaresKeY's review on #4433. The hard-deny matched command names
exactly, so versioned or alias runtime forms (python3.11, node18, pip3,
ruby3.2, java, javac, bunx, tsx, ts-node, pypy3, ...) slipped past and, if an
operator allowlisted one, re-opened the prompt-injection-controlled MCP
registration path.

- Canonicalize a trailing version suffix before the deny check so versioned
  forms collapse to the family (python3.11 -> python, node18 -> node, pip3 ->
  pip); both the raw basename and the canonical form are denied.
- Broaden the denied-family set (java/javac/jshell/jbang/kotlin/dotnet/mono/
  swift/osascript/tsx/ts-node/bunx/pypy/jruby/raku/luajit/wish/expect/iex).

Deny runs before the operator allowlist, so an alias cannot be allowlisted back
in. Canonicalization only feeds the deny check, so a legit name that ends in a
digit still reaches the normal allowlist check rather than being mis-denied.
Adds validator + integration regressions for versioned/alias runtimes asserting
no DB row and no connection, including the allowlisted-anyway case.
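
The deny-before-allowlist ordering with version canonicalization might be sketched like this. The deny set is heavily abbreviated and the function names are hypothetical; the real _validate_mcp_command also checks arguments, environment keys, and shell metacharacters, which are omitted here.

```python
import re

# Abbreviated, illustrative deny set; the set in the commit is broader.
DENIED_FAMILIES = {"python", "node", "pip", "ruby", "java", "sh", "bash"}

# Trailing version suffix such as "3.11" or "18".
_VERSION_SUFFIX = re.compile(r"[\d.]+$")


def canonical_family(basename: str) -> str:
    """Collapse versioned forms: python3.11 -> python, node18 -> node."""
    lowered = basename.lower()
    stripped = _VERSION_SUFFIX.sub("", lowered).rstrip("-_")
    return stripped or lowered


def is_allowed(command: str, allowlist: set) -> bool:
    # Bare basename only: reject anything with a path component.
    if "/" in command or "\\" in command:
        return False
    # Deny runs first, on both the raw and the canonical form, so an
    # allowlist entry cannot re-admit a runtime alias.
    name = command.lower()
    if name in DENIED_FAMILIES or canonical_family(name) in DENIED_FAMILIES:
        return False
    # The allowlist check uses the raw name, so a legitimate name that
    # ends in a digit is not mis-denied by canonicalization.
    return command in allowlist
```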
2026-06-16 14:34:53 +00:00
Catalin Iliescu 9a00401507 fix(hwfit): use CPU fallback for cpu_only speed estimates (#4397)
* fix(hwfit): use CPU fallback for cpu_only speed estimates

* fix(hwfit): preserve ARM fallback for cpu_only estimates

---------

Co-authored-by: Cata <cata@bigjohn.local>
2026-06-16 14:18:31 +00:00
Aura Rays Lab 76562ae31d Change host from 0.0.0.0 to 127.0.0.1 in CONTRIBUTING.md (#4422)
Updated the host address in the run command for clarity.
2026-06-16 13:40:47 +00:00
Christian Eriksson 497f455da6 fix(cookbook): open() no longer crashes when a task has a diagnosis (#4417)
_showDiagnosis referenced an undefined `body` (left over from the refactor
that moved the diagnosis text into the toolbar), throwing a ReferenceError
whenever a failed task rendered fix buttons. Because open() wraps its render
in try/finally with no catch, the throw escaped before the modal was
un-hidden, so the whole Cookbook silently failed to open.

- cookbook-diagnosis.js: append the fixes row to `diag` (the in-scope
  container) instead of the removed `body` element.
- cookbook.js: guard the render passes in open() so one broken task card
  can't leave the entire panel stuck hidden.

Fixes #4406
2026-06-16 13:35:51 +00:00
Ashvin dd20c2bc75 fix(tasks): offer shell/file tools to scheduled task agents by default (#4398)
The scheduled-task runner built the agent's tool set from RAG retrieval plus
ASSISTANT_ALWAYS_AVAILABLE. Neither includes bash/python (nor the file tools),
and no keyword hint force-includes them, so a task only saw the shell when the
tool-embedding index happened to surface it. On hosts where that index is empty
or degraded (e.g. a fresh Docker deploy), retrieval returns nothing and the task
agent never receives bash/python — telling the user the shell is disabled even
for an admin owner.

Offer the shell/file group to task agents by default, mirroring the chat agent
where these are on unless a privilege or global setting turns them off. The
existing blocked_tools_for_owner() gate in stream_agent_loop still strips the
whole group for non-admin multi-user owners and only admits it for admins and
single-user (AUTH_ENABLED=false) deployments, so this changes what is offered,
not who is allowed. A crew that defines an explicit enabled_tools allowlist
still has its restriction honored.

Also merge the operator's global disabled_tools setting into the scheduler's
disabled set before composing relevant_tools and before entering the agent
loop, matching what chat already does. Without it, the global tool-disable
contract did not reach unattended scheduled tasks: an admin or AUTH_ENABLED=false
task could still see and call shell/file tools the operator had turned off
globally, since the prompt/schema/execution gates only enforce the disabled
tools passed in.
2026-06-16 13:27:30 +00:00
Afonso Coutinho a36b423a4e Fix odysseus-calendar list dropping in-progress / multi-day events (#2065)
cmd_list filtered on the event START falling inside the window
(dtstart >= start AND dtstart < end). The canonical web route
(routes/calendar_routes.py) and the recurrence contract test use
OVERLAP semantics for non-recurring events: dtstart < end AND
dtend > start. So an event that began before the window but is still
ongoing inside it — e.g. a 09:00-17:00 conference listed at 14:00, or
any multi-day event spanning the window — was silently dropped by the
CLI even though the web UI shows it. Use overlap, matching the route.
dtend is NOT NULL in the schema, so no null-end regression.
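
For illustration, the two predicates side by side, using the conference example above. The function names and the one-hour window are assumptions; the real filter is applied in cmd_list's query.

```python
from datetime import datetime


def starts_in_window(dtstart, window_start, window_end) -> bool:
    """Old CLI behaviour: only events that begin inside the window."""
    return window_start <= dtstart < window_end


def overlaps(dtstart, dtend, window_start, window_end) -> bool:
    """Overlap semantics: any event that is ongoing during the window."""
    return dtstart < window_end and dtend > window_start


# A 09:00-17:00 conference, listed for a window starting at 14:00.
conf_start = datetime(2026, 6, 16, 9, 0)
conf_end = datetime(2026, 6, 16, 17, 0)
win_start = datetime(2026, 6, 16, 14, 0)
win_end = datetime(2026, 6, 16, 15, 0)
```

With these values starts_in_window(...) is False while overlaps(...) is True, which is the dropped-event case this commit fixes.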
2026-06-16 14:04:56 +02:00
Rudy Wolf 4e477741e7 harden(agent-loop): wrap non-native tool results as untrusted data (#1629)
The non-native (prompted) tool-call path fed tool output back to the model as a plain "[Tool execution results]" user message, bypassing the untrusted_context_message wrapper that THREAT_MODEL.md requires for tool output. That path is what models without native tool-calling (many smaller local models) use, so prompt-injection inside a tool result (fetched page, file read, MCP/email output) could be read as instructions there.

Wrap it via untrusted_context_message("tool execution results", ...), the same hardening already applied to skills (#788) and escalation traces (#275). Also update _recent_context_for_retrieval, which used the old "[Tool execution results]" prefix as a sentinel to keep tool envelopes out of the retrieval query, to recognise the wrapped envelope via metadata.trusted.

The native path keeps returning tool-role messages (a user-role wrapper would break the native tool-call contract); it is covered by UNTRUSTED_CONTEXT_POLICY. Adds tests/test_tool_output_prompt_injection.py.

Fixes #1627.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 13:35:07 +02:00
Kenny Van de Maele a2261c38c1 refactor(auth): centralize the internal-tool pseudo-username into a constant (#4333)
The in-process tool loopback stamps current_user = "internal-tool" and
require_admin grants admin to that sentinel; it is also a reserved username.
That security-sensitive string was hand-typed in ~7 places (stamp, admin gate,
RESERVED_USERNAMES, and standalone admin-equivalent checks in note/research/
shell/task routes), where a typo silently breaks an auth gate.

Add INTERNAL_TOOL_USER in core/middleware.py next to INTERNAL_TOOL_TOKEN/
INTERNAL_TOOL_HEADER and use it at every such site. A typo is now an
ImportError, not a silent mismatch. auth.py importing middleware is acyclic
(middleware imports no app modules). Behaviour is unchanged.

The multi-sentinel sets bundling internal-tool with api/demo/system
(assistant_routes, task_scheduler, research_routes) are a separate reserved-set
dedup, left for a follow-up.

Closes #4332
2026-06-16 13:13:00 +02:00
Alexandre Teixeira bf56010aad test: split provider classification tests (#4392) 2026-06-16 09:54:07 +00:00
Karl Jussila ee72d71872 fix(auth): centralize password and username validation constants (#4120)
Added PASSWORD_MIN_LENGTH and RESERVED_USERNAMES to src/constants.py as the
single source of truth. Previously PASSWORD_MIN_LENGTH was hardcoded as 8 in
four route handlers and all three JS validation paths; RESERVED_USERNAMES was
an inline frozenset duplicated in core/auth.py, routes/assistant_routes.py,
routes/research_routes.py, and src/task_scheduler.py.

Added GET /api/auth/policy (unauthenticated) so the frontend reads the real
values from the server instead of hardcoding them in JS.

Added missing empty-username guard to /setup and admin POST /users. Both
returned a misleading 500/409 on whitespace-only input. /signup already had the
check; this makes all three consistent.
2026-06-16 09:52:15 +02:00
RaresKeY 2b519bf355 fix(routes): normalize session owner fallback helpers (#4313)
* fix(memory): normalize import session fallback

* fix(chat): use token owner for compaction scope

* fix(background): honor session endpoint fallback
2026-06-16 06:07:42 +01:00
Kfir Sadeh d795d9a923 feat(launcher): add portable windows launcher (#976)
* feat(windows): add standalone portable executable, splash screen, and system tray

* test: fix test_get_wsl_windows_user_profile_falls_back_to_users_dir on Windows

* Refactor launcher: isolate desktop logic into launcher.py, clean app.py/requirements, update build scripts, and add tests

* chore: clean launcher test whitespace

---------

Co-authored-by: Alexandre Teixeira <alexandremagteixeira@gmail.com>
2026-06-16 04:58:16 +01:00
Tal.Yuan 648db61b45 docs(architecture): add Phase 0 runtime inventory document (#4148)
* docs(architecture): add Phase 0 runtime inventory document

Per #4082 requirements, this no-code planning document maps:
- Largest runtime modules (Python + frontend)
- Import dependency graph and cross-layer violations
- Route ownership grouped by feature domain
- Tool registry boundaries and split candidates
- Risk-ranked candidate slices with recommended first 3 PRs
- Safety guardrails and validation commands for follow-up work

Closes #4082

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(architecture): correct inventory metrics per review feedback

Address @alteixeira20 review on #4148 (CHANGES_REQUESTED):

- src/ flat .py: ~60 -> 91; routes/: 52 -> 54
- core/database.py importers: 49 -> 94; src/agent_loop.py: -> 21
- src/ -> routes/ import lines: ~20 -> 38
- src/ subdirs: 3 -> 2 (agent_tools/, search/); drop non-existent agent/
- move main.py and src/agent/ out of current-structure into new
  section 10 'Future Direction (NOT current state)'
- route grouping: frame as one domain per PR, not a broad
  reorganization (helper imports / registration / test path risk)

* docs(architecture): round-2 fixes — move to specs/, correct counts, frame as candidate

Per @alteixeira20 + @RaresKeY review on #4148:

- Move docs/architecture-runtime-inventory.md -> specs/ (docs/ is
  GitHub Pages public content, per @RaresKeY)
- src/ -> routes/ import lines: 38 -> 30 (direct grep of import lines
  referencing routes/, matching reviewer's count)
- self-caught count drift: tests 552 -> 544; routes->src 349 -> 351;
  src->core 49 -> 99
- frame section 6 (rankings/package shapes/split order/route grouping)
  and section 10 (future direction) as candidate proposals pending
  maintainer agreement, not a committed plan (per @RaresKeY)

* docs(architecture): round-3 reviewer fixes — fix tool categorization, counts, appendix

Self-review as reviewer found:
- §5.2 tool categories were wrong: listed filesystem/shell/email-sending
  tools that are NOT in tool_implementations.py (they live in src/agent_tools/).
  Rewrote to the actual 33 do_* functions grouped by domain
  (system/cookbook/calendar/notes/search/research/contacts/vault/image)
- §2.1 builtin_actions.py: 0 -> 2 classes, ~26 -> ~24 functions
- §5.1: '33+' -> '33' (exact count)
- Appendix A: 'Complete File Listing' -> 'File Listing'; src noted as
  '61 of 91 shown' (was claiming complete but listed 61)
- Last updated date refreshed

* docs(architecture): round-4 — verify remaining counts, soften §6.3 framing

- task_scheduler ~6 -> 5 funcs; tool_index ~580 -> 542 lines (verified vs dev)
- §6.3 'Recommended First 3 Slices' -> 'Candidate' (ownership unsettled, per review)
- verified §4 route-domain line counts, §2.2 frontend counts, mcp_servers=4
- full test suite: 3267 passed, 1 skipped, 0 failed

* docs(architecture): refresh Phase 0 inventory metrics + document counting method

Refresh every count against current dev (b58af42) per review on #4148:
- src/ flat .py: 91 -> 95; tests/test_*.py: 544 -> 583
- core.database importers: 94 -> 102; src.agent_loop importers: 21 -> 22
- src/ -> routes/ lines: 30 -> 31; routes/ -> src/: 351 -> 374; src/ -> core/: 99 -> 106
- Last updated: dev@9d7a3d6 -> dev@b58af42

Add a "How the metrics are computed" note under section 3.4 with the exact
grep/find command for each count, so the numbers are reproducible and future
dev drift is a one-command recheck instead of another review round (per the
request to note the counting method).

Documentation-only; no code changes.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(architecture): refresh remaining counts + add snapshot basis note

Reviewer self-audit of the previous refresh caught more stale counts after
the rebase onto dev@b58af42:
- tool_implementations importers: 18 -> 17 (§3.2, §6.2, Appendix B)
- core/database classes: 27 -> 28 (§2.1, §6.2)
- mcp_servers .py files: 4 -> 5 (§1.1)
- routes/ -> core/ import lines: 124 -> 126 (§3.4)

Line counts in §2.1/§2.2 also drifted over the rebased range but are left
as-is and covered by a new "Snapshot basis" note in the header: line counts
are a snapshot that drifts as dev moves (recompute with wc -l), while the
importer/file/import-line counts are the authoritative ones refreshed here.
This keeps the inventory honest about live metric vs structural snapshot, so
dev drift no longer triggers a review round.

Documentation-only; no code changes.

Co-Authored-By: Claude <noreply@anthropic.com>

* docs(architecture): fix missed tool_implementations importer count in §6.3

Follow-up to the previous refresh: §6.3 Slice 1 still read "18 importers"
after the 18->17 update elsewhere. Correct to 17 for consistency. Doc-only.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: yuandonghao <yuandonghao@cohl.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:57:24 +01:00
RaresKeY 260ce8ba59 fix(email): enforce MCP owner boundaries (#4335)
* fix(email): enforce MCP owner boundaries

* fix(email): fail closed for unowned MCP fallback
2026-06-16 04:31:24 +01:00
RaresKeY 2f9ae43a58 test(email): cover sender signature owner cache writes (#4278) 2026-06-16 04:21:11 +01:00
RaresKeY 293bbfabf4 test(hwfit): cover SSH target validation regressions (#4279) 2026-06-16 04:18:21 +01:00
Alexandre Teixeira 0086399656 test: add fire_and_forget to API chat webhook stub (#4383) 2026-06-16 03:15:14 +00:00
RaresKeY 9d2989f386 test(auth): cover reserved username sentinel gate (#4276) 2026-06-16 04:09:58 +01:00
RaresKeY b5edbd3df7 fix(devops): harden docker config defaults (#4349) 2026-06-16 04:03:43 +01:00
RaresKeY 33fe7276be fix(endpoints): normalize URL handling (#4338) 2026-06-16 03:59:18 +01:00
RaresKeY a031a94a2e fix(cookbook): harden remote serve host handling (#4345) 2026-06-16 03:46:32 +01:00
RaresKeY 4d10c16d02 fix(auth): clean up rename and null-owner ownership (#4340) 2026-06-16 03:33:02 +01:00
RaresKeY 745c10e0d7 fix(gallery): confine gallery image path resolution (#4352) 2026-06-16 03:28:09 +01:00
Alexandre Teixeira 6b7a4c1e70 test: add oversized test split plan (#3987)
* test: add oversized test split plan

* test: refresh oversized split plan
2026-06-16 02:28:03 +00:00
RaresKeY 422f23fb12 fix(mcp): scope memory server by owner (#4315) 2026-06-16 03:18:17 +01:00
TheDragonTail 0f966d6b9f fix(embeddings): fall back to default cache dir when FASTEMBED_CACHE_PATH is empty (#3434)
docker-compose.yml injects FASTEMBED_CACHE_PATH=${FASTEMBED_CACHE_PATH:-},
which sets the variable to an empty string when the host has not defined it.
FASTEMBED_CACHE_DIR used os.getenv("FASTEMBED_CACHE_PATH", default), and
os.getenv only returns the default when the variable is ABSENT -- so the empty
value won and FASTEMBED_CACHE_DIR became "". os.makedirs("") then raised
[Errno 2] No such file or directory: '', FastEmbed failed to initialise, and
every vector feature (RAG, semantic memory, tool index) silently degraded on
the default Docker stack.

Treat an empty value like an absent one via `os.getenv(...) or default`.
Add a regression test covering the empty, unset, and explicit cases.
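
The difference between the two lookups can be shown in a few lines. DEFAULT_CACHE_DIR and the function names are placeholders, not the project's actual default or API.

```python
import os

# Placeholder default; the project's real default path is not shown here.
DEFAULT_CACHE_DIR = os.path.join(os.path.expanduser("~"), ".cache", "fastembed")


def cache_dir_before() -> str:
    # The default is used only when the variable is absent, so an empty
    # string injected by docker-compose is returned as-is.
    return os.getenv("FASTEMBED_CACHE_PATH", DEFAULT_CACHE_DIR)


def cache_dir_after() -> str:
    # An empty string is falsy, so "or" falls back to the default for
    # both the empty and the unset case.
    return os.getenv("FASTEMBED_CACHE_PATH") or DEFAULT_CACHE_DIR
```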

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 03:11:48 +01:00
Afonso Coutinho 7b09491557 fix: check-in calendar digest leaks every user's events (missing owner scope) (#1925)
* fix: check-in calendar digest leaks every user's events (no owner scope)

* Seed dtend on calendar events in digest test so the NOT NULL column is satisfied
2026-06-16 02:42:41 +01:00
Kenny Van de Maele fafaf089c5 refactor(search): centralize the web-scraping User-Agent into one constant (#4325)
The outbound UA for web_fetch / web_search was inlined in four places with
two different values and nothing keeping them current: content.py pinned a
mid-2021 Chrome 91 build, and providers.py sent a bare Mozilla/5.0 in three
spots. Some sites serve a degraded or blocked page to a UA that old.

Add WEB_FETCH_USER_AGENT to src/constants.py (env-overridable, matching the
existing Copilot/Kimi UA-constant pattern) and import it in content.py and
providers.py. Default to a current, common desktop UA so pages return their
normal HTML: the market-leading desktop OS (Windows; NT 10.0 covers Windows
10 and 11) and browser (Chrome) on a current stable build. The version is now
bumped in one place.

Service-specific self-identifying agents (Copilot, Kimi, webhooks, cookbook)
are intentionally left separate. Adds a regression test pinning the constant
shape and the env override, plus a guard against a new inline Mozilla literal
in the search sources.

Closes #4324
2026-06-16 01:33:47 +00:00
RaresKeY b58af4267b fix(companion): require chat scope for model inventory (#4319) 2026-06-16 01:15:05 +02:00
AkioKoneko 8ff76f083c fix(cookbook): avoid launching Ollama during Windows cache scan (#4368) 2026-06-16 01:00:40 +02:00
Wei Hong 2196869c86 fix(webhooks): route public emitters through fire_and_forget (#3964) (#4336)
The three public webhook emitters in chat_helpers and webhook_routes
schedule deliveries via asyncio.create_task(webhook_manager.fire(...)),
which bypasses WebhookManager._bg_tasks. asyncio only holds a weak
reference to the outer task, so the GC can collect it mid-delivery and
the webhook is silently dropped.

Route all three through webhook_manager.fire_and_forget() so the task
is tracked by _spawn_tracked() and the manager owns the full lifecycle.

Adds an AST-level guard test that scans routes/ for direct
asyncio.create_task wrapping webhook_manager.fire(...) to prevent
regressions.
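
A rough sketch of the tracked-task pattern described above, assuming a module-level set and a hypothetical spawn_tracked() helper; the real WebhookManager._spawn_tracked() may differ in detail.

```python
import asyncio

# Strong references to in-flight tasks. Without them, only the event loop's
# weak reference keeps a fire-and-forget task alive.
_BG_TASKS = set()


def spawn_tracked(coro):
    """Schedule coro and hold a strong reference until it finishes."""
    task = asyncio.create_task(coro)
    _BG_TASKS.add(task)
    # Drop the reference once the task is done so the set does not grow.
    task.add_done_callback(_BG_TASKS.discard)
    return task


async def _deliver(results, payload):
    # Stand-in for a webhook delivery; yields once to the event loop.
    await asyncio.sleep(0)
    results.append(payload)


async def demo():
    results = []
    task = spawn_tracked(_deliver(results, "event"))
    tracked_while_running = task in _BG_TASKS
    await task
    await asyncio.sleep(0)  # give the done callback a chance to run
    return results, tracked_while_running, len(_BG_TASKS)
```

Calling asyncio.run(demo()) runs the delivery to completion and leaves the set empty afterwards.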
2026-06-16 00:41:45 +02:00
holden093 dd2e23c9af fix(agent): report phone numbers from resolve_contact when a matched contact has no email (#4327)
When a CardDAV contact matched the search query but had no email
address (only phone numbers), the tool silently dropped it and
returned 'No contacts found'.  Fall back to the contact's phone
number(s) so the caller still receives usable information.

Refs: #4178 (the contacts-domain classifier fix that made the model
actually call resolve_contact for contacts queries, surfacing this
pre-existing gap)
2026-06-16 00:03:33 +02:00
Fahim facc50cb0f fix(api): attribute bearer-token actions to the token owner on owner-scoped routes (#4054)
* fix(api): attribute bearer-token actions to the token owner on owner-scoped routes

Owner-scoped chat, session, and upload routes called
get_current_user(), which resolves a bearer ody_ API token to the
sandboxed "api" pseudo-user. A paired API-token client (companion, CLI,
IDE extension) therefore saw and created a separate "api"-owned silo
instead of the owner's data.

effective_user() already exists for exactly this: it attributes a token's
actions to request.state.api_token_owner, is identical to
get_current_user() for cookie sessions, and falls back safely when a
token has no owner. session_routes.py was already migrated; this
completes the migration for the remaining owner-scoped routes:

- chat_helpers.py: chat-privilege enforcement, message attribution, prefs/context
- chat_routes.py: orphaned-endpoint owner, session-auth owner, message search
- upload_routes.py: upload owner attribution + access checks

The /api/models swap is intentionally omitted: #4292 already migrated it
to effective_user (plus the chat-scope gate and ownerless-token 403), so
this PR keeps dev's version of routes/model_routes.py unchanged.

chat_routes.py keeps importing get_current_user for the workspace owner
gate; session_routes.py drops the now-unused import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: target effective_user in auth monkeypatches and owner-scope assertion

The owner-scoped routes now call effective_user() instead of
get_current_user(), so the tests that stubbed get_current_user (or
asserted on it) follow suit:

- test_chat_helpers.py, test_review_regressions.py,
  test_kv_cache_invalidation_2927.py: monkeypatch effective_user
- test_session_endpoint_owner_scope.py: assert the owner-scope guard uses
  effective_user(request)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:56:22 +02:00
Kenny Van de Maele 074a1e6eff fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955)
* fix(search): add download budgets to web_fetch with truncation notice and hard ceiling

MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content
buffered and cached the entire response body first, so a large or hostile
URL could pull arbitrarily many bytes into memory and the content cache.

The fetch is now a capped streaming GET (SSRF redirect guard unchanged):
a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB), a per-call
override via full/max_bytes on the web_fetch tool, and a hard ceiling
(WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed.
When Content-Length already declares a body over the ceiling the fetch
is refused before any body bytes are buffered. Truncated results carry
truncated/fetched_bytes/total_bytes, the tool output leads with a
partial-content notice telling the model how to re-fetch with full=true,
and the tool schema documents the flag. A truncated PDF is reported as
a budget error since a cut PDF is unparseable. The effective cap is part
of the content-cache key so a truncated fetch is never served to a
full-budget request.
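
The budget logic can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the function names are hypothetical, "MB" is taken as 1024 * 1024 bytes, full=True is assumed to mean "use the hard ceiling", and the body is modelled as a plain iterable of byte chunks instead of a streaming HTTP response.

```python
SOFT_MAX_BYTES = 2 * 1024 * 1024    # assumed value of WEB_FETCH_SOFT_MAX_BYTES
HARD_MAX_BYTES = 20 * 1024 * 1024   # assumed value of WEB_FETCH_HARD_MAX_BYTES


def effective_cap(max_bytes=None, full=False):
    """Resolve the per-call budget; an override can never exceed the ceiling."""
    if full:
        return HARD_MAX_BYTES
    if max_bytes is None:
        return SOFT_MAX_BYTES
    return max(1, min(max_bytes, HARD_MAX_BYTES))


def read_capped(chunks, cap, declared_length=None):
    """Buffer at most cap bytes from an iterable of byte chunks."""
    # Refuse up front when the declared size is already over the ceiling.
    if declared_length is not None and declared_length > HARD_MAX_BYTES:
        raise ValueError("declared body exceeds the hard ceiling")
    buf = bytearray()
    truncated = False
    for chunk in chunks:
        remaining = cap - len(buf)
        if len(chunk) > remaining:
            # Keep only what still fits, then stop reading.
            buf.extend(chunk[:remaining])
            truncated = True
            break
        buf.extend(chunk)
    return {
        "body": bytes(buf),
        "truncated": truncated,
        "fetched_bytes": len(buf),
        "total_bytes": declared_length,
    }
```

Including the resolved cap in the cache key, as the commit describes, keeps a truncated body from being served to a later call with a larger budget.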

Existing tests that faked httpx.get or the old _get_public_url signature
are adapted to the streaming interface; behavior pins are unchanged.

Fixes #3812

* fix(search): close compressed-body cap bypass and protect the partial notice

Addresses RaresKeY's review on #3955:

- Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the
  wire bytes (and Content-Length) can be a fraction of the decoded body, so a
  tiny compressed response could pass the hard-cap preflight and then expand
  past the ceiling in a single decoded chunk before the streamed cap could
  slice it. Identity makes Content-Length the true body size and keeps each
  streamed chunk bounded by the network read, so the hard ceiling actually
  bounds memory.
- Lead web_fetch output with the partial-content notice and cap the page
  title. The notice is the user-facing contract for partial fetches, but the
  title is untrusted, uncapped page content; placed ahead of the notice a giant
  title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads
  and the title is capped as a second guard.

Adds regressions: the fetch advertises identity encoding, and a truncated
result with an oversized title still surfaces the partial notice.

* fix(search): reject compressed responses that ignore the identity request

Requesting Accept-Encoding: identity is not enough on its own: a server can
ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would
decode that, so a tiny compressed body could balloon into one decoded chunk
far past the hard cap before the streamed loop slices it (and Content-Length,
the compressed wire length, makes the preflight and size metadata unreliable).

Refuse a non-identity Content-Encoding before reading the body. Adds a
regression where the server ignores the identity request and returns gzip;
the fetch is refused before any body is decoded.
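
A hedged sketch of that preflight check, assuming response headers arrive as a plain mapping; the function name is hypothetical and the real check sits inside the streaming fetch.

```python
def refuse_if_encoded(headers):
    """Raise before any body is read if the server applied a content coding."""
    # Header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    encoding = lowered.get("content-encoding", "").strip().lower()
    if encoding not in ("", "identity"):
        raise ValueError("refusing encoded response: " + encoding)
```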
2026-06-15 17:38:09 +00:00
Kenny Van de Maele 2fab378c6a refactor(search): import REQUEST_TIMEOUT from constants in providers.py (#4331)
providers.py redefined REQUEST_TIMEOUT = 20 locally, shadowing the same
value in src/constants.py and risking drift if the constant is bumped.
Import it from src.constants and drop the local copy; same value, one
source of truth.

Closes #4329
2026-06-15 17:22:08 +00:00
Michael 5bafc30622 fix(api): normalize non-object JSON bodies to empty dict in token PATCH (#3976)
* fix(api): normalize non-object JSON bodies to empty dict in token PATCH

Valid non-dict JSON (e.g. [] or null) reaches payload.get(...) and
raises AttributeError. Normalize to {} so the route returns a controlled
response instead of an unhandled 500.
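
The normalization amounts to a type check on the parsed body. A minimal sketch with a hypothetical helper name; the actual route reads the body through its web framework, not json.loads.

```python
import json


def normalize_payload(raw):
    """Parse a JSON body and fall back to {} for any non-object value."""
    parsed = json.loads(raw)
    # [] and null are valid JSON but have no .get(), so treat them as empty.
    return parsed if isinstance(parsed, dict) else {}
```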

Fixes #3966

* test(api): add regression tests for PATCH with non-object JSON bodies

Covers array body ([]), null body, and normal object body as requested
in alteixeira20's review of #3976.

---------

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-15 18:05:15 +01:00