Commit Graph

335 Commits

Author SHA1 Message Date
Mazen Tamer Salah d58202d10e fix(presets): persist presets atomically to avoid corruption on crash (#2169)
PresetManager.save() used a plain open("w") + json.dump, which truncates
presets.json before writing the new content. A crash, power loss, or
serialization error mid-write leaves the file truncated/empty and every
saved preset is lost.

Route the write through core.atomic_io.atomic_write_json (tmp file +
os.replace), matching how the rest of the codebase persists JSON state.
The helper is imported lazily so this module stays free of the heavy core
package import graph at module load time.

Adds tests/test_preset_atomic_save.py covering the source contract, a
failed-write leaving the existing file intact, and a round trip.
2026-06-08 19:16:37 +02:00
Mazen Tamer Salah 1209f258d7 fix(caldav): skip the prune when any object fails to parse (#3454)
* fix(caldav): don't prune the whole window when no objects could be parsed

The post-sync prune deletes local origin=="caldav" rows in the window whose UID
the server didn't just return. With an empty seen_uids it falls back to
`uid.isnot(None)` — a match-all delete. That's right when the calendar is
genuinely empty, but when the server returns objects and every one fails to
parse (malformed iCal / an icalendar error), seen_uids is empty only because
nothing could be read, so the match-all branch silently deletes every local
event in the 90-day-back/365-day-forward window.

Track whether any object failed to parse and gate the prune with a small pure
helper `_should_prune_window(seen_uids, parse_failed)`: prune when something was
read, or when the calendar is genuinely empty (no objects, no parse errors), but
never when objects came back unreadable.

Adds tests/test_caldav_prune_parse_failure.py for the three cases.

* fix(caldav): skip the prune on any parse failure, not just total

Review follow-up (#3454): _should_prune_window returned True whenever seen_uids
was non-empty, so a partial parse failure (say 48 of 50 objects parse) still
pruned the 2 unreadable-but-still-upstream events, because their UIDs were absent
from seen_uids. Any parse failure makes seen_uids an incomplete view of the
server, so pruning against it is unsafe whether the failure is total or partial.

Skip the prune on any parse failure (return not parse_failed); only prune on a
clean read (a genuinely empty window is still safe to prune). Tradeoff: one
permanently-unparseable event pauses deletion mirroring until it is fixed, which
is the safe direction (false-keep beats false-delete).

Replace the now-incorrect "partial failure still prunes" assertion with a
partial-failure regression: one object parses, one fails, so the prune is
skipped and the unparsed event's local copy is not deleted.

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-08 18:59:14 +02:00
Mazen Tamer Salah d71284194b fix(memory): only delete memories the model explicitly drops in tidy (#3455)
* fix(memory): only delete memories the model explicitly drops in tidy

The AI memory-tidy path computed deletions as the complement of the model's
`keep` list (`if mid not in keep_ids: continue`). When the model returned a
valid response that simply omitted some existing ids — a common LLM lapse — every
omitted memory was silently deleted, even though it was neither a duplicate nor
listed in `drop`.

Honor the explicit `drop` set instead: delete only ids the model dropped (minus
any it saw only truncated), and preserve everything else, still applying cleaned
text/category from `keep`.

Adds tests/test_consolidate_memory_explicit_drops.py: a memory the model omits
from both keep and drop survives; an explicitly dropped one is removed.

* refactor(memory): remove now-dead keep_ids from tidy

After deletion switched to drop_ids and text/category rewrites to cleaned_by_id,
keep_ids was written but never read. Remove the init, the .add(mid) in the keep
loop, and the truncated .update() (its truncated-protection is already covered by
`drop_ids -= truncated_ids`). Pure deletion, no behavior change; tests stay green.

Addresses review feedback on #3455.

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-08 18:54:45 +02:00
stocky789 1e0d9b92af feat: add ChatGPT Subscription provider (#2876)
* feat: Add ChatGPT Subscription support and related features

- Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI.
- Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status.
- Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance.
- Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints.
- Added new slash commands for managing skills, including listing, searching, and invoking skills.
- Implemented caching for skill catalog to optimize performance.
- Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing.
- Refactored existing code to accommodate new features and improve maintainability.

* refactor: share provider device-flow setup

- reuse one device-flow backend for Copilot and ChatGPT Subscription
- add one frontend device-flow helper for Settings and /setup
- put GitHub Copilot back into Add Models, now as a dropdown option
- make provider selection just select; clicking Add starts sign-in
- stop ChatGPT Subscription setup from opening auth tabs automatically
- make /setup copilot and /setup chatgpt-subscription work from chat
- show ChatGPT Subscription in the /setup suggestions
- show the real error message when setup fails
- add focused tests for the shared flow and setup UI

* feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX

Backend:
- Resolve runtime bearer for provider-auth endpoints at probe time via a
  shared _resolve_probe_key() that delegates to resolve_endpoint_runtime,
  applied across all probe/refresh call sites.
- Skip live completion probes and health pings for discovery-only providers
  (centralized behind _is_discovery_only_provider) — the Codex/Responses API
  has no such endpoints, so status is derived from cached models.
- Never persist the short lived ChatGPT bearer to the plaintext sessions
  table; proactively clear any stale bearer left by an earlier code path.
- Revoke orphaned ProviderAuthSession credentials when the last endpoint
  backing them is deleted (_delete_orphaned_provider_auth), surfaced via
  cleared_provider_auth in the delete response.

Frontend (admin.js):
- Auto-start the device-auth flow on provider selection so the authorization
  panel (code + Authorize) shows immediately instead of behind a "Sign in" click.
- Remove the redundant top button for device auth providers, move retry
  into the panel via an inline "Try again".
- Drop the self-evident hint text and add an execCommand clipboard fallback so
  Copy works in non-secure (HTTP/LAN) contexts.

* fix: harden chatgpt subscription provider

* chore: remove PR media from branch

* Fix chatgpt subscription recovery and token handling

---------

Co-authored-by: 5p00kyy <admin@5p00ky.dev>
2026-06-08 10:19:18 +02:00
Mike ac94885c84 refactor(constants): single source of truth for data dir (#3368)
* refactor(constants): single source of truth for data dir + merge core/src constants

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(contributing): use named src.constants for data paths, drop core/constants references

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 09:58:52 +02:00
Kenny Van de Maele 505d8bae5a fix(cookbook): locate cookbook_state.json via DATA_DIR, not hardcoded /app/data (#3332)
Three call sites hardcoded Path("/app/data/cookbook_state.json"), which only
exists in Docker; on a native run the real path is <repo>/data, so the state
file looked missing and cookbook serve-state was silently ignored. Two others
used os.environ.get("DATA_DIR", "data") (a relative fallback, since DATA_DIR is
never set as an env var). Route all five through core.constants.DATA_DIR so the
path is consistent and absolute on both Docker and native.

Part of #3331.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 00:13:47 +01:00
nubs 1a0e1c5d69 fix(documents): restore PDF library metadata and preview (#2483)
PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf.

Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs.

The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none.

Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker.

Fixes #2468
2026-06-07 23:23:27 +02:00
Kenny Van de Maele 76c1f42ab0 fix: route all agent loopback calls through internal_api_base() helper (#3322)
#2753 made the agent loopback base port-configurable but only for
_COOKBOOK_BASE in tool_implementations. Several other in-process loopback
calls still hardcoded http://localhost:7000 and broke off port 7000:
cookbook_serve_lifecycle (model-endpoints x2, shell/exec), builtin_actions
(model/serve), task_routes (calendar x3), and the gallery/email calls in
tool_implementations.

Extract the resolution (ODYSSEUS_INTERNAL_BASE / APP_PORT / 7000 fallback,
127.0.0.1 to avoid IPv6 ambiguity) into core.constants.internal_api_base()
and route every call site through it. Rename the now-misnamed _COOKBOOK_BASE
to _INTERNAL_BASE since it serves gallery/email/calendar/serve too. Adds a
test for the resolver plus a regression guard against reintroducing the
literal.

Part of #2752.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 22:22:09 +01:00
Gunnar Arias d85c5e335e fix(security): harden untrusted_context_message against delimiter spoofing (#3086)
* fix(security): harden untrusted_context_message against delimiter spoofing

Root cause: untrusted_context_message() did not sanitise content before
interpolating it into the <<<UNTRUSTED_SOURCE_DATA>>> / <<<END_UNTRUSTED_SOURCE_DATA>>>
delimited sandbox block. Malicious content embedding the literal delimiter
strings could prematurely close the sandbox and inject instructions that
the LLM treats as trusted.

Fix: add _escape_guard_markers() helper that replaces the guard marker
strings with structurally inert tokens (<<<_UNTRUSTED_DATA>>> and
<<<_END_UNTRUSTED_DATA>>>) before the content is wrapped. The function is
applied in untrusted_context_message() after casting content to str.

The existing ~13 call sites (chat_processor.py, agent_loop.py,
deep_research.py, chat_helpers.py, chat_routes.py) are unaffected because
they pass content through without inspecting the output delimiters.

Regression tests added in tests/test_prompt_security.py covering:
- _escape_guard_markers unit tests (open, close, both, benign passthrough)
- untrusted_context_message integration tests (delimiter spoofing
  neutralisation, type coercion, None handling, metadata preservation)

Resolves #3056

* fix(security): sanitize label for newlines and guard markers

Addresses reviewer feedback on PR #3086:
- Normalize label: strip CR/LF to prevent pre-guard line injection
- Escape guard marker literals in label via _escape_guard_markers()
- Add regression tests for label-based newline injection, GUARD_OPEN
  and GUARD_CLOSE in label, and exactly-one-structural-guard assertion

* fix(security): move Source label inside GUARD_OPEN block

The reviewer correctly identified that even after sanitizing the label,
any user-derived label text (e.g. `f"web page: {url}"`) still appeared
before GUARD_OPEN in the trusted framing zone, where the LLM treats it
as trusted instructions.

Fix: move the 'Source: {label}' line to inside the guarded block so
only the hardcoded UNTRUSTED_CONTEXT_HEADER sits before GUARD_OPEN.
The raw label is still kept in metadata["source"] for traceability.
_sanitize_label() and _escape_guard_markers() are kept for defence-in-
depth on the label stored inside the block.

Update test_label_newline_injection_is_blocked to assert no label-
derived instruction text appears before GUARD_OPEN (pre-guard zone is
now empty of any user-derived content).
2026-06-07 22:15:50 +01:00
Syed Ali Jaseem f939cb65ce refactor(tests): replace local function copies in test_endpoint_resolver with real imports (#3359)
* refactor(tests): replace local function copies in test_endpoint_resolver with real imports

The test file carried 9 verbatim copies of src/endpoint_resolver.py functions
to avoid import-pollution concerns, but these copies are a drift hazard — PR #3343
had to update both in parallel.  Replace them with direct imports so future changes
to endpoint_resolver are automatically exercised by the test suite.

Also fixes _ollama_api_root in endpoint_resolver.py: the bare-URL Ollama case
(e.g. http://nas:11434 with empty path) was already handled correctly in the test
copy but was missing from the real function, which would return /chat instead of
/api/chat for native Ollama endpoints without an explicit /api prefix.

Closes #3351

* refactor: import _ollama_api_root from llm_core instead of duplicating it

endpoint_resolver already imports _detect_provider and _host_match from
llm_core. Add _ollama_api_root to that import and remove the local copy,
collapsing two implementations to one source of truth.

llm_core's version is a superset (also strips /api/chat|tags|generate paths),
and since normalize_base already removes those suffixes upstream the result
is identical for every input used here.
2026-06-07 22:47:57 +02:00
nubs 865e61450e fix(upload): configure chat attachment size limit (#2439) 2026-06-07 22:42:24 +02:00
adabarbulescu a8859bb25c fix(llm): Properly detect remote Ollama bare URLs as native endpoints (fixes #3252) (#3343) 2026-06-07 21:19:19 +02:00
RaresKeY 3a91c11ff8 fix: block app_api access to Cookbook host controls (#3231) 2026-06-07 19:20:11 +02:00
PewDiePie c9198baa2e fix: make agent loopback base port env-configurable (#2752) (#2753)
_COOKBOOK_BASE was hardcoded to http://localhost:7000 with no env-var
override anywhere in the codebase. Tools that do an internal HTTP
loopback (app_api, trigger_research, cookbook state read/write) silently
fail with "All connection attempts failed" whenever the running uvicorn
isn't on port 7000 — which is most non-default deployments and any
side-by-side multi-instance setup. The misleading "Task triggered"
message from manage_tasks during a research request hides that the
underlying research never starts.

Resolution order, lowest to highest priority:
  1. Fallback http://127.0.0.1:7000 (preserves legacy default).
  2. APP_PORT — derive http://127.0.0.1:$APP_PORT (matches docker-compose
     which already reads APP_PORT).
  3. ODYSSEUS_INTERNAL_BASE — explicit override (e.g. behind a TLS proxy
     where loopback isn't 127.0.0.1).

127.0.0.1 instead of "localhost" avoids IPv6/DNS ambiguity for a
strictly-local call.

No API or schema change. Defaults preserved: existing setups on port
7000 are unaffected.

Caught by #2752.

Co-authored-by: pewdiepie-archdaemon <pewdiepie-archdaemon@users.noreply.github.com>
2026-06-07 18:47:47 +02:00
Sebastian Andres El Khoury Seoane 8d9d4ec9c6 feat(platform): Add support for APFEL as part of the dependencies and models for the Cookbook. (#2657)
* feat(platform): add support for Apple Silicon detection in platform compatibility

test(tests): enhance shell_routes tests for Apple Silicon compatibility

* fix issues with missing import

* fix: correct package name in package-lock.json and enhance package installation commands in shell_routes.py and cookbook.js

* feat: add Apfel startup and health checks on macOS

- bootstrap Apfel via Homebrew on arm64 macOS
- start `apfel --serve --port 11435` detached for Odysseus
- verify readiness via `/health`
- clean up the Apfel process on exit or Ctrl+C

* fix: duplicate variable declaration post-merge conflict
- Should fix `node` CI issues.

* fix: issues with the update status of the APFEL dependency.
- fixed by changing the main conditional that determines the update.

* Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file.

* Fix: whitespace issues with the model_routes file

* Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file. Final

* Fix: Fixed updates using PIP for APFEL instead of custom cmd
2026-06-07 17:28:02 +02:00
Muhammad Ikhwan Fathulloh 2a6921a455 Fix logical bugs in event bus and bulk session deletion (#3139) 2026-06-07 17:08:50 +02:00
Rudra Sarker c5ac89f01f fix: preserve partial deep research findings on non-timeout errors (#2189)
* fix: preserve partial deep research findings on non-timeout errors

* fix: preserve partial deep research findings on non-timeout errors
2026-06-07 16:53:14 +02:00
Wes Huber b9a96bca1a fix(research): avoid double split() call and potential IndexError (#2229)
cat.split()[0] was called in the condition and again in the body,
wasting a second split. More importantly, if cat were ever
whitespace-only, split() returns [] and [0] raises IndexError.
Assign to a local variable and guard with a truthiness check.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-07 16:46:21 +02:00
Wes Huber 706ea6a7b7 fix: TOCTOU race in personal file delete + IndexError on whitespace cmd (#2228)
1. routes/personal_routes.py: os.path.exists() then os.remove() is a
   classic TOCTOU race — another request or cleanup can delete the
   file between the check and the remove, raising FileNotFoundError.
   Replace with try/except FileNotFoundError.

2. src/tool_implementations.py: cmd.split()[0] crashes with IndexError
   when cmd is a non-empty whitespace-only string (split() returns []).
   Guard with (cmd.split() or [''])[0].

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-07 16:44:26 +02:00
M57 12cb39cbd9 feat: add OpenCode Zen and Go as provider options (#26)
- Add OpenCode Zen (https://opencode.ai/zen/v1) and Go (https://opencode.ai/zen/go/v1)
- Add provider detection via _host_match() in llm_core.py
- Add curated model list entries in model_routes.py
- Add webhook provider URLs
- Add provider icon (providers.js) and dropdown options (index.html)
- Add auto-detection patterns and setup URLs (slashCommands.js)
- Whitelist opencode.ai in URL validation (admin.js)
- Rebased on main to fix merge conflicts with _HOST_TO_CURATED refactor

Co-authored-by: M57 <hy4ri@users.noreply.github.com>
2026-06-07 16:43:00 +02:00
max-freddyfire 43c16fc7e4 fix(context_compactor): return original messages when compaction summary fails (#2174)
On summary LLM call failure, maybe_compact was returning system_msgs+recent
(dropping the older half) with was_compacted=False, misleading the caller into
thinking the list was unchanged. Return the original messages list unchanged so
no history is lost; the next trim_for_context call handles length if needed.

Fixes #2160

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 16:40:16 +02:00
YotamPeled adbcb3763f fix(agent): don't abort legitimate tool batches as runaway loops (#3183)
The loop-breaker's runaway backstop counted per-tool-type call totals and
tripped whenever any tool was used >=15 times — treating 15+ DISTINCT calls
to one tool as a stuck loop. A real batch (e.g. "add these 18 birthdays to my
calendar" emits 18 distinct manage_calendar create_event calls in one round)
got flagged "calling manage_calendar over and over", the calls were discarded
(next round tools_sent=0), and 0 events were created.

Count IDENTICAL repeated call signatures instead (same tool AND args), via a
small, unit-testable _detect_runaway_call() helper. Genuine batches pass; a
model truly stuck repeating one call still trips the backstop. Adds a
regression test.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:16:17 +02:00
danielroytel 5d3e3c7053 feat(tasks): assign folder='Tasks' at creation + backfill migration (#2834)
* feat: assign folder='Tasks' to task sessions at creation

Task sessions (LLM, action, research) now set folder='Tasks' on their
DbSession row, matching the pattern used by the Assistant folder. This
enables sidebar lens filtering without changing existing session
behaviour.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add backfill script for task session folders

One-shot script to set folder='Tasks' on existing [Task]/[Research]
sessions that predate the folder assignment in task_scheduler.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: replace standalone backfill script with automatic migration

Convert scripts/backfill_task_folders.py into _migrate_backfill_task_folders()
in core/database.py, called from init_db(). The migration is idempotent (only
touches rows where folder IS NULL/empty) and runs automatically on upgrade,
so operators no longer need a manual step to tag pre-existing task sessions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 15:33:17 +02:00
RaresKeY a3784da172 fix: block app_api access to shell routes (#3225) 2026-06-07 15:19:08 +02:00
Vykos 83b0ab7cd3 Scope auxiliary LLM endpoints by owner (#2996)
* fix(auth): scope auxiliary llm endpoints by owner

* fix(auth): scope auxiliary llm fallbacks by owner
2026-06-07 14:47:44 +02:00
Vykos 2149f0fb67 fix(rag): forward owner through manager wrapper (#2991) 2026-06-07 12:56:57 +02:00
Vykos 000932a6d9 fix(auth): gate api tokens from user routes (#2992) 2026-06-07 12:55:01 +02:00
Vykos 299538ea4e Harden note reminder dispatch ownership (#2999) 2026-06-07 12:52:27 +02:00
Vykos f2a79aaf5c Tighten manage notes owner checks (#3002) 2026-06-07 12:50:10 +02:00
Vykos 7b4e6c4c1b Enforce task chain owner scope (#3006) 2026-06-07 12:43:43 +02:00
Vykos ff4508d396 Scope vision model resolution by owner (#3009) 2026-06-07 12:39:02 +02:00
Joeseph Grey f78539ba15 fix(caldav): disable redirects on the sync/write-back DAVClient (SSRF) (#2663)
validate_caldav_url resolves and vets the initial host, but caldav's
niquests session follows 3xx redirects by default, so a validated public
URL can be redirected at request time to loopback/link-local/private
space, re-opening the SSRF the host check closes. The existing redirect
guard only covered the settings test-connection path.

Add a shared _build_dav_client helper that pins the session to zero
redirects (any 3xx then raises instead of silently following an
attacker-chosen Location), and route both the pull (_sync_blocking) and
write-back (_writeback_blocking) paths through it. Mirrors the
follow_redirects=False already used on the test-connection path.

Tests exercise the real DAVClient request path (a 302 toward an internal
host is refused, the sink is never contacted; the PROPFIND is asserted to
reach the public server first so the check can't pass vacuously), confirm
the helper disables redirects on the installed client, guard against a
raw DAVClient creeping back in, cover mixed public/internal DNS results
in both orderings, and add the resolves-to-no-usable-records fail-closed
branch.
2026-06-07 05:05:24 +01:00
Karandeep Bhardwaj 3940297655 fix(webhooks): redact IPv6 addresses in sanitized error messages (#3038)
* fix(webhooks): redact IPv6 addresses in sanitized error messages

sanitize_error() only stripped IPv4 literals, so a failed webhook
delivery to an internal IPv6 host (::1, fe80::/fc00:: ...) leaked the
address into Webhook.last_error, which is surfaced in the UI. The module
already treats internal IPv6 as sensitive (see _PRIVATE_NETWORKS and
src/url_safety.py); the scrubber just didn't keep up.

Add an IPv6 redaction pass covering bracketed, full 8-group, and
::-compressed forms. The pattern is scoped to leave clock times
("12:34:56"), MAC addresses, and C++ "::" tokens untouched, and the
::-branch uses a lookahead over a flat character class so there is no
nested quantifier to backtrack on (no ReDoS on long colon/hex runs).

Adds tests/test_webhook_sanitize_error_ipv6.py.

* webhook: validate IPv6 candidates with ipaddress, not a regex grammar

Per review on #3038: instead of hand-rolling the IPv6 grammar in a regex
(brittle, and easy to over-match colon-heavy text), use a loose regex to
find candidate tokens and let ipaddress.ip_address() decide. Only tokens
it parses as IPv6 are redacted, so the false-positive guards (clock times,
MACs, "std::vector") now come from the stdlib instead of a custom pattern.

This also covers cases the old pattern missed -- zone ids (fe80::1%eth0)
and IPv4-mapped addresses -- and no longer partially mangles invalid
colon strings (a 9-group token is preserved whole rather than losing its
first 8 groups). The bracketed branch is a single greedy class with no
X*:X* backtracking; verified ~1ms on 40k-char adversarial input.

Extends the test file with zone-id, IPv4-mapped, and invalid-token cases.

* webhook: redact bracketed/scoped/IPv4-mapped IPv6 as one unit

Review on #3038 found a few IP forms left partially redacted or malformed
by sanitize_error():

  [fe80::1%eth0]:8080        -> [[redacted]]:8080
  [::ffff:192.168.0.1]:8080  -> [[redacted][redacted]]:8080
  ::ffff:192.168.0.1         -> [redacted][redacted]

Two causes: the bracketed branch's character class dropped zone ids, so
scoped addresses fell through to the bare branch and left the brackets and
port behind; and the IPv4 pass ran first, stripping the embedded v4 of an
IPv4-mapped address so the v6 pass then redacted the "::ffff:" remnant
separately.

Fix:
- run the IP-candidate pass before the IPv4 pass, so IPv4-mapped forms are
  matched and redacted whole
- match the full bracketed authority ([...] + optional %zone + :port) as a
  single token, and redact a v4-or-v6 literal inside [ ] as one [redacted]
- extend the bare branch with a bounded (exactly-3) dotted-quad tail for
  IPv4-mapped forms; exactly-3 so it can't swallow a partial suffix and
  accidentally preserve an otherwise-valid address

Each form now collapses to a single [redacted]; the candidate finder stays
linear (~1.3ms on 40k-char adversarial input). Adds regression tests for
the three reported forms and keeps the timestamp/MAC/std::vector coverage.
2026-06-07 04:55:33 +01:00
Nicholai a3cb15d0a1 fix(agent): enforce guide-only tool policy (#3088) 2026-06-06 18:48:24 -06:00
Mohammed Riaz 6ccd4500d7 fix(chat): show requested and actual reply models
Show requested and actual reply models in chat labels when fallback or provider routing changes the responding model.
2026-06-06 04:30:16 -06:00
Ocean Bennett fb9c7cf3da fix(calendar): accept list event range aliases 2026-06-06 03:47:18 -06:00
Nicholai 33edc40eae fix: route misfenced web lookups to web tools
Fixes #3067
2026-06-06 03:46:31 -06:00
Giuseppe e87a1ad8d2 fix(deep-research): wrap fetched webpage content in untrusted-context sandbox
The goal-based extractor passed raw fetched webpage content straight
into the LLM prompt via string substitution, bypassing the
prompt-injection hardening layer in src/prompt_security.py.

Split EXTRACTOR_PROMPT into EXTRACTOR_SYSTEM (task instructions +
goal, trusted) and a second message built with untrusted_context_message()
(raw page content, sandboxed with <<<UNTRUSTED_SOURCE_DATA>>> guards).
This aligns the extractor with every other external-content injection
site in the codebase (agent_loop, chat_processor, chat_routes).

Fixes #3044

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 03:37:10 -06:00
Nicholai 86abcb75d0 fix: split Chroma embedding lanes (#3046) 2026-06-06 03:17:19 -06:00
Nicholai 463713c2c6 feat(search): unify session transcript search (#2877) 2026-06-05 18:08:31 -06:00
Mateus Oliveira c2017fa089 Phase 1: consolidate tool output constants into src/constants.py (#2989)
MAX_OUTPUT_CHARS, MAX_READ_CHARS, and MAX_DIFF_LINES are now
defined once in src/constants.py and imported by the three files
that previously duplicated them (tool_execution.py,
tool_implementations.py, agent_tools.py). agent_tools.py re-exports
them for backward compatibility.

Co-authored-by: mcnoliveira <mcnoliveira@gmail.com>
2026-06-05 23:05:02 +02:00
Fijar Lazuardy 66599b02a2 allow user who disable auth to use chat (#2548)
* allow user who disable auth to use chat

* only check non user on verify session owner

* fix import source

* rollback 401 to 403 for unauthorized error due to unit test

* change unauthenticated http code error to 401 and fix unit tests
2026-06-05 22:54:19 +02:00
Logan Davis f72e1bd412 feat(reminders): add generic webhook as a fourth reminder channel (#2952)
Replaces any Discord-specific reminder channel with a generic outbound
webhook channel. Users pick any saved Integration as the target and
supply a JSON payload template with {{title}} and {{message}}
placeholders — values are JSON-escaped before substitution. Works with
Discord, Slack, Teams, ntfy (JSON mode), or any service that accepts a
POST with a JSON body.

- `src/settings.py` — reminder_webhook_integration_id +
  reminder_webhook_payload_template defaults
- `routes/note_routes.py` — webhook delivery block; Integration lookup,
  template rendering, auth wiring; built-in preset defaults so
  discord_webhook works out of the box without a configured template;
  settings_override kwarg avoids test-button race condition
- `routes/auth_routes.py` — discord_webhook preset test handler
- `src/integrations.py` — discord_webhook preset with description +
  example templates; hides auth/key fields in the Integration form
- `src/builtin_actions.py` — webhook_sent delivery check
- `src/tool_implementations.py` — webhook aliases + enum updated
- `static/index.html` — Webhook channel option; Integration picker +
  payload template textarea
- `static/js/settings.js` — Integration list, populateWebhookIntegrations,
  syncChannelRows, hints, load/save, auto-fill preset templates,
  test-button override payload, hide auth/key for URL-auth presets

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 22:47:57 +02:00
Paweł Drużyński f4aa661502 fix ambiguous naming, remove redundant json imports, fix _MCP_ARG_PARSERS type annotations (#2874) 2026-06-05 21:30:22 +02:00
nubs 08e543d1ff fix(tool-parsing): don't ship unconvertible <invoke> fence content to the code executor (#2926) 2026-06-05 21:08:54 +02:00
nubs 47a47bf71d fix(llm): guard against null arguments in streaming tool-call accumulator (#2923) 2026-06-05 20:57:36 +02:00
michaelxer 71dda5b106 fix: respect user round count in deep research (#2896)
The STOP_PROMPT did not include the target round count, so the LLM
could decide to stop after 2-3 rounds even when the user requested 8.
Additionally, min_rounds was capped at 3 regardless of max_rounds.

- Add max_rounds to STOP_PROMPT so the LLM knows the target
- Change min_rounds from min(3, max_rounds) to max(2, max_rounds - 2)

Fixes #2863

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
2026-06-05 20:49:42 +02:00
Logan Davis ad82ee1c83 feat(calendar): support multiple CalDAV accounts (#2942)
* feat(calendar): support multiple CalDAV accounts

Replaces the single CalDAV credential slot with a named account list so
users can sync both a personal and work calendar simultaneously.

- Add `account_id` column to `CalendarCal` + startup migration
- `_load_caldav_accounts()` in caldav_sync.py reads `caldav_accounts`
  list from prefs, auto-migrating the legacy single `caldav` key on
  first use (no user action required)
- `sync_caldav()` iterates all accounts and aggregates counts/errors
- `writeback_event()` resolves credentials via `CalendarCal.account_id`,
  falling back to the first account for legacy rows
- New REST endpoints: GET/POST/PUT/DELETE `/api/calendar/config/accounts`
- Legacy GET/POST `/api/calendar/config` preserved for backward compat
- Settings UI: one card per account with Label, URL, Username, Password
  fields; Test button works for both unsaved (inline creds) and saved
  (by account_id) accounts; delete removes only that account
- Update test_caldav_url_hardening.py mock to include `_save_for_user`
  and updated `_sync_blocking` signature

* fix(calendar): restore #2765 PK scoping and #2819 writeback URL validation

Two regressions introduced by the multi-account refactor:

1. PK collision (#2765): _stable_cal_id was back to hashing only the URL,
   so two users — or one user with two accounts on the same server — would
   collide on the primary key. Restore owner+account_id in the hash key
   (format: "{owner}\n{account_id}\n{url}") and thread both values through
   _sync_blocking → _writeback_blocking → push_event → find_remote_calendar
   so the hash round-trips correctly on write-back.

2. URL validation dropped (#2819): _load_caldav_accounts imported
   _save_for_user at function scope, causing an ImportError on test mocks
   that only provide _load_for_user, which prevented writeback_event from
   reaching the validate_caldav_url call. Move the import inside the
   migration branch and wrap in try/except (best-effort save; next call
   re-migrates from the still-present legacy key).

Update fake_writeback_blocking in test_caldav_writeback.py to accept the
new owner/account_id optional params.
2026-06-05 20:32:50 +02:00
nubs fa9f62b44c fix(compactor): shrink oversized tool_calls arguments so trim_for_context can fit a tool-only turn (#2949) 2026-06-05 20:23:38 +02:00
Kenny Van de Maele 8ce945d338 feat: Add plan mode to the chat agent (#638)
* feat: Add plan mode to the chat agent

Adds a plan mode: the agent investigates read-only, proposes a checklist, and
waits for approval before changing anything. On approval it runs with full
tools and checks items off as it goes. Enforcement reuses the existing
disabled_tools gate.

Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the
plan toggle from the chat input.

- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the
  plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list
  checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

Three improvements so a long plan survives a weak model and stays in reach:

1. Re-reference the plan (out-of-context fix). On the execution turn the frontend
   sends the approved checklist back (`approved_plan`); the backend pins it as a
   top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so
   the agent can always re-read the plan instead of losing the thread on a long
   run. New `build_active_plan_note()` (unit-tested).

2. Re-open / dock the plan anytime. The plan checklist is stored per-session
   (localStorage). When a plan exists, the plan-mode button opens a small menu
   ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan
   window — so it can stay docked while the agent works. The window live-refreshes
   as the plan changes.

3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps
   `- [x]` after finishing them, or to revise steps when the user asks. Marker
   tool (no I/O) → `plan_update` SSE event → the stored plan + docked window
   update live. The ACTIVE PLAN note instructs the agent to use it.

Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb),
src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse
`approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools
/ tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle
`plan_update`, capture restated checklists), static/app.js (plan-button menu),
static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

Shell can mutate (write files, hit the network) and can't be constrained to
read-only at the tool layer, so plan mode no longer relies on a prompt to keep
it well-behaved — bash/python are removed from the read-only allowlist and added
to the fail-closed block set. Discovery is covered by the dedicated read-only
tools (read_file, grep, glob, ls) instead.

Rewrites the plan-mode directive to state shell is disabled and lists the
available read-only tools positively. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

Clarifies that entries like "summar" are intentional stems matched via
startswith (covers summarise/summarize/summary), not typos. Addresses review
feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the
comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an
allowlist, so it returns the inverse (all known tool names minus the allowlist).
The static mutator set is a backstop for the schema-derived name list, which
misses XML-only tools and can fail to import. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

The model is already shown its available (read-only) tools by _assemble_prompt,
which removes every disabled tool. Enumerating them again in the directive only
duplicated that list and would drift as tools change. Point at the tools listed
below instead. Addresses review feedback on #638.
2026-06-05 16:32:25 +02:00