Commit Graph

1097 Commits

Author SHA1 Message Date
Vykos 06d28e23ac Scope document session links by owner (#3005) 2026-06-07 12:47:20 +02:00
Vykos 7b4e6c4c1b Enforce task chain owner scope (#3006) 2026-06-07 12:43:43 +02:00
Vykos 3cff06781e Scope model helper endpoint resolution (#3007) 2026-06-07 12:40:23 +02:00
Vykos ff4508d396 Scope vision model resolution by owner (#3009) 2026-06-07 12:39:02 +02:00
ooovenenoso c11ce66e0e docs: note dev branch status in README (#3196) 2026-06-07 12:16:14 +02:00
Lucas Daniel 34bd8f0491 fix(email): guarantee IMAP conn.logout() on all exception paths (#1530)
Three IMAP connection leaks were recently fixed via try/finally
(#1325, #1330, #1423). This commit applies the same pattern to the
remaining callsites that still used inline logout-only cleanup.

routes/email_helpers.py:
- _fetch_sender_thread_context: conn was uninitialized when the outer
  try/except returned early on connect failure, causing the finally
  block to crash on conn.close()/conn.logout(). Merged the two
  separate try blocks into one and added conn=None guard.
- _pre_retrieve_context: ctx_conn.logout() was inside the loop body
  with no finally, so any exception in the folder/search loop leaked
  the socket. Moved cleanup into a finally block with ctx_conn=None
  guard.

mcp_servers/email_server.py:
- _list_emails: multiple inline conn.logout() calls on early-return
  paths; exception between them leaked the socket. Wrapped in
  try/finally.
- _read_email: same pattern — four separate logout() calls replaced
  by a single finally block.
- _reply_to_email: logout() called before the error check, so an
  exception in conn.select() leaked the socket. Wrapped in
  try/finally.
- _download_attachment: same pattern as _reply_to_email.

Also adds tests/test_imap_leak_fixes.py with 9 regression tests (one
per function/failure-mode) that monkeypatch _imap_connect and assert
conn.logout() is called exactly once even when IMAP operations raise.
2026-06-07 05:09:28 +01:00
Joeseph Grey f78539ba15 fix(caldav): disable redirects on the sync/write-back DAVClient (SSRF) (#2663)
validate_caldav_url resolves and vets the initial host, but caldav's
niquests session follows 3xx redirects by default, so a validated public
URL can be redirected at request time to loopback/link-local/private
space, re-opening the SSRF the host check closes. The existing redirect
guard only covered the settings test-connection path.

Add a shared _build_dav_client helper that pins the session to zero
redirects (any 3xx then raises instead of silently following an
attacker-chosen Location), and route both the pull (_sync_blocking) and
write-back (_writeback_blocking) paths through it. Mirrors the
follow_redirects=False already used on the test-connection path.

Tests exercise the real DAVClient request path (a 302 toward an internal
host is refused, the sink is never contacted; the PROPFIND is asserted to
reach the public server first so the check can't pass vacuously), confirm
the helper disables redirects on the installed client, guard against a
raw DAVClient creeping back in, cover mixed public/internal DNS results
in both orderings, and add the resolves-to-no-usable-records fail-closed
branch.
2026-06-07 05:05:24 +01:00
Giuseppe 95c2dca4b5 fix(security): add HSTS and Permissions-Policy to SecurityHeadersMiddleware (#3081)
* fix(security): add HSTS and Permissions-Policy headers to SecurityHeadersMiddleware

Strict-Transport-Security is sent only when the connection is HTTPS
(detected via request.url.scheme or X-Forwarded-Proto: https), so
plain-HTTP dev deployments behind a reverse proxy are unaffected.

Permissions-Policy disables camera, microphone, and geolocation APIs
unconditionally — Odysseus does not use them, and this prevents a
successful XSS from requesting browser-native sensor access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(security): scope Permissions-Policy microphone directive to same-origin

Reviewers on PR #3081 (alteixeira20, NubsCarson) flagged that
microphone=() blocks mic access for same-origin (self) too, breaking
Odysseus's own voice/STT flow (getUserMedia({audio: true}) in
static/js/voiceRecorder.js). Scope it to microphone=(self) so
third-party origins stay locked out while the app's own UI keeps mic
access; camera and geolocation remain fully disabled as unused.

Adds focused middleware tests covering HSTS scoping (HTTPS direct,
X-Forwarded-Proto, absent on plain HTTP) and the Permissions-Policy
same-origin microphone contract.

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 04:58:33 +01:00
Karandeep Bhardwaj 3940297655 fix(webhooks): redact IPv6 addresses in sanitized error messages (#3038)
* fix(webhooks): redact IPv6 addresses in sanitized error messages

sanitize_error() only stripped IPv4 literals, so a failed webhook
delivery to an internal IPv6 host (::1, fe80::/fc00:: ...) leaked the
address into Webhook.last_error, which is surfaced in the UI. The module
already treats internal IPv6 as sensitive (see _PRIVATE_NETWORKS and
src/url_safety.py); the scrubber just didn't keep up.

Add an IPv6 redaction pass covering bracketed, full 8-group, and
::-compressed forms. The pattern is scoped to leave clock times
("12:34:56"), MAC addresses, and C++ "::" tokens untouched, and the
::-branch uses a lookahead over a flat character class so there is no
nested quantifier to backtrack on (no ReDoS on long colon/hex runs).

Adds tests/test_webhook_sanitize_error_ipv6.py.

* webhook: validate IPv6 candidates with ipaddress, not a regex grammar

Per review on #3038: instead of hand-rolling the IPv6 grammar in a regex
(brittle, and easy to over-match colon-heavy text), use a loose regex to
find candidate tokens and let ipaddress.ip_address() decide. Only tokens
it parses as IPv6 are redacted, so the false-positive guards (clock times,
MACs, "std::vector") now come from the stdlib instead of a custom pattern.

This also covers cases the old pattern missed -- zone ids (fe80::1%eth0)
and IPv4-mapped addresses -- and no longer partially mangles invalid
colon strings (a 9-group token is preserved whole rather than losing its
first 8 groups). The bracketed branch is a single greedy class with no
X*:X* backtracking; verified ~1ms on 40k-char adversarial input.

Extends the test file with zone-id, IPv4-mapped, and invalid-token cases.

* webhook: redact bracketed/scoped/IPv4-mapped IPv6 as one unit

Review on #3038 found a few IP forms left partially redacted or malformed
by sanitize_error():

  [fe80::1%eth0]:8080        -> [[redacted]]:8080
  [::ffff:192.168.0.1]:8080  -> [[redacted][redacted]]:8080
  ::ffff:192.168.0.1         -> [redacted][redacted]

Two causes: the bracketed branch's character class dropped zone ids, so
scoped addresses fell through to the bare branch and left the brackets and
port behind; and the IPv4 pass ran first, stripping the embedded v4 of an
IPv4-mapped address so the v6 pass then redacted the "::ffff:" remnant
separately.

Fix:
- run the IP-candidate pass before the IPv4 pass, so IPv4-mapped forms are
  matched and redacted whole
- match the full bracketed authority ([...] + optional %zone + :port) as a
  single token, and redact a v4-or-v6 literal inside [ ] as one [redacted]
- extend the bare branch with a bounded (exactly-3) dotted-quad tail for
  IPv4-mapped forms; exactly-3 so it can't swallow a partial suffix and
  accidentally preserve an otherwise-valid address

Each form now collapses to a single [redacted]; the candidate finder stays
linear (~1.3ms on 40k-char adversarial input). Adds regression tests for
the three reported forms and keeps the timestamp/MAC/std::vector coverage.
2026-06-07 04:55:33 +01:00
Nicholai a3cb15d0a1 fix(agent): enforce guide-only tool policy (#3088) 2026-06-06 18:48:24 -06:00
@aaronjmars 108ee1e32b fix(security): close DNS-rebinding hole on diffusion_server (wildcard CORS + missing Host check) (#347)
* fix(security): close DNS-rebinding hole on diffusion_server

scripts/diffusion_server.py used to ship `allow_origins=["*"]` with the
default `--host=127.0.0.1` bind. Combined, that left the OpenAI-compatible
image API reachable from any browser tab via DNS-rebinding: an attacker page
resolves its own domain to 127.0.0.1 mid-fetch, the browser forwards the
request to the loopback server, the server processes it (no Host check), and
the wildcard CORS reply lets the attacker page read the result + drive the
GPU. CWE-346 + CWE-942 + CWE-352 (DNS-rebinding bridge).

Fix:
  - Drop the wildcard CORS at module load (default-deny).
  - Install `TrustedHostMiddleware` with a loopback allowlist so DNS-rebound
    requests are rejected by the middleware before any route runs.
  - Add additive `--allowed-host` / `--allowed-origin` CLI flags so operators
    who need browser access on a specific origin can opt in explicitly without
    re-introducing the wildcard.

Tests: tests/test_diffusion_server_security.py (9 cases) pin the allowlist
helpers, the default-deny CORS behavior, and the live middleware paths via
Starlette's TestClient.

Detected by Aeon + semgrep + manual review.
Severity: medium.
CWE-346 / CWE-942 / CWE-352.

* test(diffusion-server): drive ASGI app via httpx, not TestClient portal

The TrustedHost/CORS integration tests used `with TestClient(app) as
client:`, whose context-manager form spins up an anyio blocking portal to
run the app lifespan. Under the repo's pytest setup (anyio plugin active, a
stray asyncio_mode option, no pytest-asyncio) that portal deadlocks —
`test_trusted_host_middleware_rejects_attacker_host` hung indefinitely in
review before emitting any assertion output.

Replace the TestClient usage with a tiny _asgi_get() helper that drives the
ASGI app over httpx.ASGITransport on a fresh event loop (asyncio.run). No
portal, no lifespan, no dependency on the host project's async test plugins.
Host is taken from the request URL so TrustedHostMiddleware sees the exact
hostname under test; Origin goes through headers. Assertions are unchanged.

Focused test now passes in 0.12s; full file 9 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: aeonframework <aeonframework@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 23:34:39 +01:00
muhamed hamed b03d934ec6 fix: restore backup import after skills migration (#2980) 2026-06-06 21:46:32 +01:00
Lucas Daniel eb840459f5 ci: skip pytest smoke on documentation-only changes (#2768)
* ci: skip pytest smoke on documentation-only changes

Adding paths-ignore for **.md and docs/** so that PRs that touch only
markdown files do not trigger the full pytest suite. Runner minutes are
spent only when Python or config files change.

Closes #2646.

* ci: detect docs-only changes inside the job instead of paths-ignore

Previously paths-ignore on the pull_request trigger caused the entire
workflow to be skipped, which can leave required checks pending and block
merging. Instead, keep the workflow always-triggered and detect docs-only
changes inside python-tests with a git diff step; if every changed file
is a .md or docs/ path, the step reports success without running pytest.

The syntax jobs (python-syntax, node-syntax) are cheap enough to always run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 16:00:46 +01:00
Mohammed Riaz 6ccd4500d7 fix(chat): show requested and actual reply models
Show requested and actual reply models in chat labels when fallback or provider routing changes the responding model.
2026-06-06 04:30:16 -06:00
Merajul Arefin 2e37d72155 fix(chat): stop code-block button flicker during streaming (#3023)
Render streamed markdown incrementally (freeze finalized blocks,
re-render only the growing tail) instead of re-rendering the whole
message every token, which recreated every <pre> and dropped CSS :hover.
2026-06-06 04:08:54 -06:00
Ocean Bennett fb9c7cf3da fix(calendar): accept list event range aliases 2026-06-06 03:47:18 -06:00
Nicholai 33edc40eae fix: route misfenced web lookups to web tools
Fixes #3067
2026-06-06 03:46:31 -06:00
Giuseppe e87a1ad8d2 fix(deep-research): wrap fetched webpage content in untrusted-context sandbox
The goal-based extractor passed raw fetched webpage content straight
into the LLM prompt via string substitution, bypassing the
prompt-injection hardening layer in src/prompt_security.py.

Split EXTRACTOR_PROMPT into EXTRACTOR_SYSTEM (task instructions +
goal, trusted) and a second message built with untrusted_context_message()
(raw page content, sandboxed with <<<UNTRUSTED_SOURCE_DATA>>> guards).
This aligns the extractor with every other external-content injection
site in the codebase (agent_loop, chat_processor, chat_routes).

Fixes #3044

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 03:37:10 -06:00
Giuseppe 893cb8254f fix(sessions): retry resumeStream in poll loop when chatModule loads late
sessions.js executes before chat.js in ES module order, so
window.chatModule is not yet set when _checkServerStream runs on page
load. The resumeStream guard evaluates false and the spinner fallback
kicks in; that fallback only polls stream_status and never retries the
live-resume path, leaving the user with a dead spinner for the entire
duration of the detached agent run.

Fix: add a one-shot retry in the polling loop. On the first tick where
window.chatModule.resumeStream is available, attempt to attach. If it
succeeds, clear the interval and remove the spinner — live SSE streaming
takes over. If the run has already finished (404), the loop continues to
poll status and calls selectSession on completion.

Fixes #3048

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 03:36:30 -06:00
Maruf Hasan 870ae2823f fix: lower minimized-dock z-index so modals stack above it 2026-06-06 03:35:48 -06:00
Nicholai 86abcb75d0 fix: split Chroma embedding lanes (#3046) 2026-06-06 03:17:19 -06:00
Nicholai 463713c2c6 feat(search): unify session transcript search (#2877) 2026-06-05 18:08:31 -06:00
Mateus Oliveira c2017fa089 Phase 1: consolidate tool output constants into src/constants.py (#2989)
MAX_OUTPUT_CHARS, MAX_READ_CHARS, and MAX_DIFF_LINES are now
defined once in src/constants.py and imported by the three files
that previously duplicated them (tool_execution.py,
tool_implementations.py, agent_tools.py). agent_tools.py re-exports
them for backward compatibility.

Co-authored-by: mcnoliveira <mcnoliveira@gmail.com>
2026-06-05 23:05:02 +02:00
michaelxer 53fd856ea8 fix: raise imaplib line limit for large mailboxes (#2895)
Python's imaplib._MAXLINE defaults to 1 MB. Mailboxes with tens of
thousands of messages exceed this on UID SEARCH ALL, crashing with
'got more than 1000000 bytes'.

Set _MAXLINE to 50 MB after opening the connection so large mailboxes
work without error.

Fixes #2883

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-05 22:59:35 +02:00
Fijar Lazuardy 66599b02a2 allow user who disable auth to use chat (#2548)
* allow user who disable auth to use chat

* only check non user on verify session owner

* fix import source

* rollback 401 to 403 for unauthorized error due to unit test

* change unauthenticated http code error to 401 and fix unit tests
2026-06-05 22:54:19 +02:00
n2b12 fb3e89b011 VRAM detection under native Windows install (#1610)
* Convert to different style of comment to make it easier to work with, fix formatting inside Powershell script.

* Grab VRAM amount from driver's registry keys.

* Fixed regression on NVIDIA GPUs
2026-06-05 22:49:47 +02:00
Logan Davis f72e1bd412 feat(reminders): add generic webhook as a fourth reminder channel (#2952)
Replaces any Discord-specific reminder channel with a generic outbound
webhook channel. Users pick any saved Integration as the target and
supply a JSON payload template with {{title}} and {{message}}
placeholders — values are JSON-escaped before substitution. Works with
Discord, Slack, Teams, ntfy (JSON mode), or any service that accepts a
POST with a JSON body.

- `src/settings.py` — reminder_webhook_integration_id +
  reminder_webhook_payload_template defaults
- `routes/note_routes.py` — webhook delivery block; Integration lookup,
  template rendering, auth wiring; built-in preset defaults so
  discord_webhook works out of the box without a configured template;
  settings_override kwarg avoids test-button race condition
- `routes/auth_routes.py` — discord_webhook preset test handler
- `src/integrations.py` — discord_webhook preset with description +
  example templates; hides auth/key fields in the Integration form
- `src/builtin_actions.py` — webhook_sent delivery check
- `src/tool_implementations.py` — webhook aliases + enum updated
- `static/index.html` — Webhook channel option; Integration picker +
  payload template textarea
- `static/js/settings.js` — Integration list, populateWebhookIntegrations,
  syncChannelRows, hints, load/save, auto-fill preset templates,
  test-button override payload, hide auth/key for URL-auth presets

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 22:47:57 +02:00
ooovenenoso 2bdf43b74d feat(cookbook): add Gemma4 thinking chat template (#2955)
* feat(cookbook): add Gemma4 thinking chat template

* fix(cookbook): place Gemma4 thinking token in system turn
2026-06-05 22:43:31 +02:00
horribleCodes c8b4cd24e0 fix: Add WSL paths to hardware detection fallback (#2933)
This change extends both the `PATH` variable and the list of absolute paths used to locate the `nvidia-smi` package to include `/usr/lib/wsl/lib`.
This path is a candidate for the default location of nvidia-smi for WSL machines (tested on WSL Ubuntu 22.04.5).
2026-06-05 21:34:41 +02:00
Paweł Drużyński f4aa661502 fix ambiguous naming, remove redundant json imports, fix _MCP_ARG_PARSERS type annotations (#2874) 2026-06-05 21:30:22 +02:00
Ocean Bennett 5911b8c0dc fix(models): allow same endpoint URL with different keys (#2758)
* fix(models): allow same endpoint URL with different keys

* fix(models): show endpoint key fingerprints
2026-06-05 21:12:14 +02:00
nubs 08e543d1ff fix(tool-parsing): don't ship unconvertible <invoke> fence content to the code executor (#2926) 2026-06-05 21:08:54 +02:00
nubs 47a47bf71d fix(llm): guard against null arguments in streaming tool-call accumulator (#2923) 2026-06-05 20:57:36 +02:00
michaelxer 71dda5b106 fix: respect user round count in deep research (#2896)
The STOP_PROMPT did not include the target round count, so the LLM
could decide to stop after 2-3 rounds even when the user requested 8.
Additionally, min_rounds was capped at 3 regardless of max_rounds.

- Add max_rounds to STOP_PROMPT so the LLM knows the target
- Change min_rounds from min(3, max_rounds) to max(2, max_rounds - 2)

Fixes #2863

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-05 20:49:42 +02:00
Logan Davis ad82ee1c83 feat(calendar): support multiple CalDAV accounts (#2942)
* feat(calendar): support multiple CalDAV accounts

Replaces the single CalDAV credential slot with a named account list so
users can sync both a personal and work calendar simultaneously.

- Add `account_id` column to `CalendarCal` + startup migration
- `_load_caldav_accounts()` in caldav_sync.py reads `caldav_accounts`
  list from prefs, auto-migrating the legacy single `caldav` key on
  first use (no user action required)
- `sync_caldav()` iterates all accounts and aggregates counts/errors
- `writeback_event()` resolves credentials via `CalendarCal.account_id`,
  falling back to the first account for legacy rows
- New REST endpoints: GET/POST/PUT/DELETE `/api/calendar/config/accounts`
- Legacy GET/POST `/api/calendar/config` preserved for backward compat
- Settings UI: one card per account with Label, URL, Username, Password
  fields; Test button works for both unsaved (inline creds) and saved
  (by account_id) accounts; delete removes only that account
- Update test_caldav_url_hardening.py mock to include `_save_for_user`
  and updated `_sync_blocking` signature

* fix(calendar): restore #2765 PK scoping and #2819 writeback URL validation

Two regressions introduced by the multi-account refactor:

1. PK collision (#2765): _stable_cal_id was back to hashing only the URL,
   so two users — or one user with two accounts on the same server — would
   collide on the primary key. Restore owner+account_id in the hash key
   (format: "{owner}\n{account_id}\n{url}") and thread both values through
   _sync_blocking → _writeback_blocking → push_event → find_remote_calendar
   so the hash round-trips correctly on write-back.

2. URL validation dropped (#2819): _load_caldav_accounts imported
   _save_for_user at function scope, causing an ImportError on test mocks
   that only provide _load_for_user, which prevented writeback_event from
   reaching the validate_caldav_url call. Move the import inside the
   migration branch and wrap in try/except (best-effort save; next call
   re-migrates from the still-present legacy key).

Update fake_writeback_blocking in test_caldav_writeback.py to accept the
new owner/account_id optional params.
2026-06-05 20:32:50 +02:00
ghreprimand 545e692565 fix(auth): distinguish empty model allowlists (#2938)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-05 20:27:10 +02:00
nubs fa9f62b44c fix(compactor): shrink oversized tool_calls arguments so trim_for_context can fit a tool-only turn (#2949) 2026-06-05 20:23:38 +02:00
Giulio Zelante b448119919 feat(skills): import SKILL.md bundles from public GitHub URLs (#2576)
* feat(skills): import SKILL.md bundles from public GitHub URLs

Supports GitHub tree/blob/raw links and skills.sh pages that resolve to GitHub.
Installs SKILL.md plus sibling text assets under data/skills/imported/.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): admin-gate URL import and validate redirect hosts

- require_admin on POST /api/skills/import-from-url (matches other skill admin routes)
- reject cross-host redirects after httpx follow_redirects
- test for redirect host validation

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): match Brain Add panel import/submit button styles

- Skill URL Import: theme-io-btn + download icon (same as memory Import)
- Add Skill submit: confirm-btn confirm-btn-primary

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): allow api.github.com during directory import

Real imports hit the GitHub contents API after redirects; whitelist
api.github.com and add regression tests. Shrink Import button with flex:none.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): align skill Import button with URL input row

Match memory-add-input height (28px) in memory-add-row and center the
download icon with flexbox instead of vertical-align hacks.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): cancel modal-body margin on skill Import button

The skill Import button sits in .memory-add-row beside an input; the
global .modal-body button { margin-top: 6px } rule only affected buttons,
pushing Import down and misaligning the download icon. Reset margin-top
and match Memory Import SVG markup at 28px row height.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(skills): surface GitHub API errors on URL import

Pass through GitHub response messages (especially 403 rate limits) as
SkillImportError instead of a generic download failure.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-05 19:48:23 +02:00
Enes Öz 977daf0643 Improve edge-docked window behavior (#2779)
* Make edge-docked windows resizable

Add draggable resize seams for left and right docked windows.

Keep the main chat area from getting too narrow and remember each window's dock width.

* Show emoji shortcodes as icons by default

Keep text-only emoji mode opt-in so model output like 😊 goes through the normal emoji renderer.

* Fix dock resize seams and left dock layout

Hide the resize seam when another floating modal is open, and keep the left-docked window from covering the chat area.

* Keep narrow modal tabs usable

* Fix split layout with both edge docks

* Fix left snap after right dock

* Enable left edge snap for all windows

* Tighten dock resize handle observers

* Use edge docking for settings window
2026-06-05 17:07:08 +02:00
Kenny Van de Maele 8ce945d338 feat: Add plan mode to the chat agent (#638)
* feat: Add plan mode to the chat agent

Adds a plan mode: the agent investigates read-only, proposes a checklist, and
waits for approval before changing anything. On approval it runs with full
tools and checks items off as it goes. Enforcement reuses the existing
disabled_tools gate.

Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the
plan toggle from the chat input.

- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the
  plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list
  checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

Three improvements so a long plan survives a weak model and stays in reach:

1. Re-reference the plan (out-of-context fix). On the execution turn the frontend
   sends the approved checklist back (`approved_plan`); the backend pins it as a
   top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so
   the agent can always re-read the plan instead of losing the thread on a long
   run. New `build_active_plan_note()` (unit-tested).

2. Re-open / dock the plan anytime. The plan checklist is stored per-session
   (localStorage). When a plan exists, the plan-mode button opens a small menu
   ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan
   window — so it can stay docked while the agent works. The window live-refreshes
   as the plan changes.

3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps
   `- [x]` after finishing them, or to revise steps when the user asks. Marker
   tool (no I/O) → `plan_update` SSE event → the stored plan + docked window
   update live. The ACTIVE PLAN note instructs the agent to use it.

Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb),
src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse
`approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools
/ tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle
`plan_update`, capture restated checklists), static/app.js (plan-button menu),
static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

Shell can mutate (write files, hit the network) and can't be constrained to
read-only at the tool layer, so plan mode no longer relies on a prompt to keep
it well-behaved — bash/python are removed from the read-only allowlist and added
to the fail-closed block set. Discovery is covered by the dedicated read-only
tools (read_file, grep, glob, ls) instead.

Rewrites the plan-mode directive to state shell is disabled and lists the
available read-only tools positively. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

Clarifies that entries like "summar" are intentional stems matched via
startswith (covers summarise/summarize/summary), not typos. Addresses review
feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the
comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an
allowlist, so it returns the inverse (all known tool names minus the allowlist).
The static mutator set is a backstop for the schema-derived name list, which
misses XML-only tools and can fail to import. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

The model is already shown its available (read-only) tools by _assemble_prompt,
which removes every disabled tool. Enumerating them again in the directive only
duplicated that list and would drift as tools change. Point at the tools listed
below instead. Addresses review feedback on #638.
2026-06-05 16:32:25 +02:00
nubs 2e207fc315 fix(notes): track + remove the select-mode Esc keydown listener so it doesn't leak per open (#2792) 2026-06-05 16:25:05 +02:00
Greg Stevenson 01f1278811 fix: Settings now correctly displays CalDAV integrations when more than one isconfigured (#2901)
* fix(calendar): expose source in calendar list and add per-calendar delete

- GET /api/calendar/calendars now includes source field so the frontend
  can distinguish CalDAV collections from local calendars
- Add DELETE /api/calendar/calendars/{cal_id} to remove a specific
  calendar and its events by owner-scoped ID

* fix(settings): show all CalDAV calendars in integrations list

Previously one card was shown for the CalDAV server connection regardless
of how many calendar collections had been synced. The Calendars page showed
them all; Settings did not.

- Fetch /api/calendar/calendars alongside existing requests and render
  one card per source=caldav collection, falling back to the single
  server-level card if nothing has synced yet
- Delete now targets the specific calendar by ID rather than clearing
  the whole server config
- Confirm dialog shows the calendar name so the user can verify before
  removing
2026-06-05 16:11:08 +02:00
ooovenenoso 4bfe0c690a fix(calendar): cap RRULE expansion (#2902) 2026-06-05 16:05:14 +02:00
ooovenenoso c9d0c6db18 fix: quote IMAP mailbox arguments (#2170)
* fix: quote IMAP mailbox arguments

* fix: quote MCP move destinations

---------

Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-05 16:00:20 +02:00
nubs 6973c5427c fix(model-context): count tool_calls in estimate_tokens so compaction sees real size (#2751) 2026-06-05 15:56:54 +02:00
nubs 8354948a1c fix(llm): route harmony thinking streams (#2449) 2026-06-05 15:22:08 +02:00
L1 8159733c6c fix(caldav): pull Google Calendar events from the events collection, not the /user principal (#2531)
* fix(caldav): pull Google Calendar events from the events collection, not the /user principal

Google serves its CalDAV principal at .../caldav/v2/<id>/user but events live
under .../caldav/v2/<id>/events. The caldav library's principal->home-set
discovery does not reliably enumerate calendars from Google's /user endpoint,
so _sync_blocking fell into its 'treat the URL as a single calendar' fallback
and ran every calendar-query REPORT against the principal URL. /user holds no
VEVENTs, so the REPORT returned a clean but empty 200 for every date range:
auth succeeded, the calendar stayed empty (Apple Calendar works because iCloud
exposes standard discovery at the pasted URL).

Add _google_caldav_events_url() to map a recognised Google principal URL to its
events collection, and route both discovery-less fallbacks through
_open_url_as_calendar() so Google syncs hit /events while other servers' URLs
are used unchanged.

Fixes #2507

* fix(caldav): also map Google's legacy www.google.com/calendar/dav principal URL

Some Google accounts authenticate against the older CalDAV endpoint
(https://www.google.com/calendar/dav/<id>/user) rather than the newer
apidata.googleusercontent.com/caldav/v2 form (reported on #2507). Both have the
same principal-vs-events split, so map the legacy /user URL to its /events
collection as well. The legacy branch is gated on the /calendar/dav/ path so an
unrelated www.google.com URL ending in /user is left untouched.
2026-06-05 15:18:16 +02:00
Ernest Hysa 7367325819 fix(caldav): include owner in calendar ID hash to prevent PK collision (#2765)
_stable_cal_id hashed only the remote URL, producing the same calendar
ID for all users syncing the same CalDAV endpoint. The second user would
get an IntegrityError on the primary key. Now includes owner in the
hash so each user gets a distinct calendar row.
2026-06-05 15:12:54 +02:00
Ernest Hysa 3738df3b93 fix(tasks): validate then_task_id belongs to same owner on create/update (#2764)
then_task_id was stored without checking the target task's owner. A user
could chain their task to execute any other user's task on success via the
scheduler's _run_chained path. Now verifies the target task exists and
belongs to the requesting user before storing.
2026-06-05 15:12:47 +02:00
Ernest Hysa f5c9095222 fix(document): add 404 guard to version list/get endpoints (#2762)
list_versions and get_version used a soft 'if doc:' guard that skipped
ownership verification when the Document row was missing (e.g. after
hard delete). Orphaned DocumentVersion rows would be returned to any
caller without auth. Now raises 404 when the parent document is gone,
matching the pattern already used in restore_version.
2026-06-05 15:12:40 +02:00