Commit Graph

430 Commits

Author SHA1 Message Date
RaresKeY 33fe7276be fix(endpoints): normalize URL handling (#4338) 2026-06-16 03:59:18 +01:00
RaresKeY 4d10c16d02 fix(auth): clean up rename and null-owner ownership (#4340) 2026-06-16 03:33:02 +01:00
TheDragonTail 0f966d6b9f fix(embeddings): fall back to default cache dir when FASTEMBED_CACHE_PATH is empty (#3434)
docker-compose.yml injects FASTEMBED_CACHE_PATH=${FASTEMBED_CACHE_PATH:-},
which sets the variable to an empty string when the host has not defined it.
FASTEMBED_CACHE_DIR used os.getenv("FASTEMBED_CACHE_PATH", default), and
os.getenv only returns the default when the variable is ABSENT -- so the empty
value won and FASTEMBED_CACHE_DIR became "". os.makedirs("") then raised
[Errno 2] No such file or directory: '', FastEmbed failed to initialise, and
every vector feature (RAG, semantic memory, tool index) silently degraded on
the default Docker stack.

Treat an empty value like an absent one via `os.getenv(...) or default`.
Add a regression test covering the empty, unset, and explicit cases.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 03:11:48 +01:00
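
The difference between an absent and an empty variable described in 0f966d6b9f can be sketched as follows. This is a minimal illustration, not the project's code: `resolve_cache_dir` and `DEFAULT_CACHE_DIR` are hypothetical names, and the default path is a placeholder.

```python
import os

# Hypothetical default; the project's real default is not shown in the commit.
DEFAULT_CACHE_DIR = os.path.join(os.path.expanduser("~"), ".cache", "fastembed")


def resolve_cache_dir(env=None, default=DEFAULT_CACHE_DIR):
    """Return the cache dir, treating an empty variable like an absent one."""
    env = os.environ if env is None else env
    # env.get(name, default) only falls back when the key is missing, so an
    # empty string would win. `or default` covers the empty case as well.
    return env.get("FASTEMBED_CACHE_PATH") or default
```

Passing the environment as a mapping keeps the empty, unset, and explicit cases easy to exercise without mutating `os.environ`.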
Afonso Coutinho 7b09491557 fix: check-in calendar digest leaks every user's events (missing owner scope) (#1925)
* fix: check-in calendar digest leaks every user's events (no owner scope)

* Seed dtend on calendar events in digest test so the NOT NULL column is satisfied
2026-06-16 02:42:41 +01:00
Kenny Van de Maele fafaf089c5 refactor(search): centralize the web-scraping User-Agent into one constant (#4325)
The outbound UA for web_fetch / web_search was inlined in four places with
two different values and nothing keeping them current: content.py pinned a
mid-2021 Chrome 91 build, and providers.py sent a bare Mozilla/5.0 in three
spots. Some sites serve a degraded or blocked page to a UA that old.

Add WEB_FETCH_USER_AGENT to src/constants.py (env-overridable, matching the
existing Copilot/Kimi UA-constant pattern) and import it in content.py and
providers.py. Default to a current, common desktop UA so pages return their
normal HTML: the market-leading desktop OS (Windows; NT 10.0 covers Windows
10 and 11) and browser (Chrome) on a current stable build. The version is now
bumped in one place.

Service-specific self-identifying agents (Copilot, Kimi, webhooks, cookbook)
are intentionally left separate. Adds a regression pinning the constant shape,
the env override, and a guard against a new inline Mozilla literal in the
search sources.

Closes #4324
2026-06-16 01:33:47 +00:00
holden093 dd2e23c9af fix(agent): report phone numbers from resolve_contact when a matched contact has no email (#4327)
When a CardDAV contact matched the search query but had no email
address (only phone numbers), the tool silently dropped it and
returned 'No contacts found'. Fall back to the contact's phone
number(s) so the caller still receives usable information.

Refs: #4178 (the contacts-domain classifier fix that made the model
actually call resolve_contact for contacts queries, surfacing this
pre-existing gap)
2026-06-16 00:03:33 +02:00
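
A rough sketch of the fallback in dd2e23c9af, assuming a contact is reduced to a name plus lists of emails and phone numbers. `describe_contact` and its output wording are hypothetical; the real tool's output format is not shown in the commit.

```python
def describe_contact(name, emails, phones):
    """Format one matched contact; a match without email is still reported."""
    if emails:
        return f"{name}: {', '.join(emails)}"
    if phones:
        # Fallback: the match is kept and its phone number(s) are surfaced.
        return f"{name}: {', '.join(phones)} (no email on file)"
    return f"{name}: no email or phone on file"
```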
Kenny Van de Maele 074a1e6eff fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955)
* fix(search): add download budgets to web_fetch with truncation notice and hard ceiling

MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content
buffered and cached the entire response body first, so a large or hostile
URL could pull arbitrarily many bytes into memory and the content cache.

The fetch is now a capped streaming GET (SSRF redirect guard unchanged):
a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB), a per-call
override via full/max_bytes on the web_fetch tool, and a hard ceiling
(WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed.
When Content-Length already declares a body over the ceiling the fetch
is refused before any body bytes are buffered. Truncated results carry
truncated/fetched_bytes/total_bytes, the tool output leads with a
partial-content notice telling the model how to re-fetch with full=true,
and the tool schema documents the flag. A truncated PDF is reported as
a budget error since a cut PDF is unparseable. The effective cap is part
of the content-cache key so a truncated fetch is never served to a
full-budget request.

Existing tests that faked httpx.get or the old _get_public_url signature
are adapted to the streaming interface; behavior pins are unchanged.

Fixes #3812

* fix(search): close compressed-body cap bypass and protect the partial notice

Addresses RaresKeY's review on #3955:

- Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the
  wire bytes (and Content-Length) can be a fraction of the decoded body, so a
  tiny compressed response could pass the hard-cap preflight and then expand
  past the ceiling in a single decoded chunk before the streamed cap could
  slice it. Identity makes Content-Length the true body size and keeps each
  streamed chunk bounded by the network read, so the hard ceiling actually
  bounds memory.
- Lead web_fetch output with the partial-content notice and cap the page
  title. The notice is the user-facing contract for partial fetches, but the
  title is untrusted, uncapped page content; placed ahead of the notice a giant
  title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads
  and the title is capped as a second guard.

Adds regressions: the fetch advertises identity encoding, and a truncated
result with an oversized title still surfaces the partial notice.

* fix(search): reject compressed responses that ignore the identity request

Requesting Accept-Encoding: identity is not enough on its own: a server can
ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would
decode that, so a tiny compressed body could balloon into one decoded chunk
far past the hard cap before the streamed loop slices it (and Content-Length,
the compressed wire length, makes the preflight and size metadata unreliable).

Refuse a non-identity Content-Encoding before reading the body. Adds a
regression where the server ignores the identity request and returns gzip;
the fetch is refused before any body is decoded.
2026-06-15 17:38:09 +00:00
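
The budget logic from 074a1e6eff can be sketched without any HTTP client, as three pure functions over headers and byte chunks. All names here are hypothetical. The byte values assume binary megabytes, `full=True` is assumed to mean "use the hard ceiling", and header names are assumed to be lower-cased already.

```python
SOFT_MAX_BYTES = 2 * 1024 * 1024    # stands in for WEB_FETCH_SOFT_MAX_BYTES
HARD_MAX_BYTES = 20 * 1024 * 1024   # stands in for WEB_FETCH_HARD_MAX_BYTES


class FetchRefused(Exception):
    """Raised before any body bytes are read."""


def effective_cap(max_bytes=None, full=False):
    """Resolve the per-call budget; the override never exceeds the ceiling."""
    if full:
        return HARD_MAX_BYTES
    if max_bytes is None:
        return SOFT_MAX_BYTES
    return max(1, min(max_bytes, HARD_MAX_BYTES))


def preflight(headers):
    """Refuse on header evidence alone, before the body is touched."""
    encoding = headers.get("content-encoding", "identity").strip().lower()
    if encoding not in ("", "identity"):
        # A compressed body could expand past the cap in one decoded chunk.
        raise FetchRefused(f"unexpected Content-Encoding: {encoding}")
    declared = headers.get("content-length")
    if declared is not None and declared.isdigit() and int(declared) > HARD_MAX_BYTES:
        raise FetchRefused("declared body exceeds the hard ceiling")


def read_capped(chunks, cap):
    """Consume byte chunks up to `cap` bytes; return (body, truncated)."""
    buf = bytearray()
    for chunk in chunks:
        room = cap - len(buf)
        if len(chunk) > room:
            buf += chunk[:room]
            return bytes(buf), True
        buf += chunk
    return bytes(buf), False
```

Since the cap is part of the result, a caller could also fold `effective_cap(...)` into its cache key so a truncated body is not served to a full-budget request.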
Lucas Daniel f4e8990635 chore: add warnings to silent except Exception blocks (#3212)
* log(app): add warnings to silent except Exception blocks

- Internal tool auth header failure now logs a warning instead of
  silently passing, making auth bypass easier to spot in logs.
- Token last_used_at update failure now logs at DEBUG (fire-and-forget,
  non-critical, but useful when debugging token tracking issues).
- Image ownership verification failure now logs a warning so unexpected
  access-check errors surface instead of silently allowing the request.

* log(chat_routes): add warnings to silent except Exception blocks

- clear_orphaned_session_endpoint: log before rollback so failures
  appear in traces when users see stale/deleted model options.
- _endpoint_has_model (JSON parse): log malformed cached_models instead
  of silently treating endpoint as valid.
- _has_any_visible_model (JSON parse): log malformed cached_models
  instead of silently returning empty list.
- timezone header parse: log failure so time-zone-related tool bugs
  (wrong scheduled times, calendar events) are traceable.
- attachments JSON parse: log failure so silently-dropped attachments
  are visible in server logs.

* log(email_routes): add warnings to silent except Exception blocks

- Email alias resolution failure now logs a warning instead of silently
  returning an empty list, making broken account configs diagnosable.

* log(document_routes): add warnings to silent except Exception blocks

- Export ZIP request body parse failure now logs a warning so empty
  exports caused by malformed requests are diagnosable.
- clear_active_document failure on detach now logs a warning to help
  trace doc re-injection bugs like #1160.

* log(agent_loop): add warnings to silent except Exception blocks

- builtin tool overrides load failure now logs a warning so misconfigured
  settings don't silently fall back to defaults without a trace.
- Timezone context injection failure now logs a warning to help debug
  incorrect scheduled times in agent-created tasks.
- PDF form-backed document detection failure now logs a warning so
  broken form-doc UI is traceable to the root cause.

* log(llm_core): add warnings to silent except Exception blocks

- Malformed URL in _is_ollama_native_url now logs a warning so bad
  endpoint configs are traceable instead of silently returning False.
- Model list fetch failure now logs a warning with the endpoint URL so
  endpoints that silently vanish from the model picker are diagnosable.

* log: pass exception via exc_info instead of string interpolation

* fix(logging): avoid logging raw URLs in llm_core error paths

Drop the raw url/base_chat_url from the Ollama-detection and
model-list-fetch warning logs added by this sweep, since these values
can contain private hostnames, internal IPs, credentials, or other
deployment details.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 17:49:27 +01:00
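
The pattern applied across f4e8990635 (log instead of silently passing, and hand the exception to the logger via `exc_info` rather than interpolating it into the message) might look like this. The logger name and `parse_cached_models` are hypothetical stand-ins for the real call sites.

```python
import json
import logging

logger = logging.getLogger("sketch.chat_routes")  # hypothetical logger name


def parse_cached_models(raw):
    """Parse a cached model list, logging a warning instead of hiding errors."""
    try:
        return json.loads(raw)
    except Exception:
        # exc_info=True attaches the active exception and traceback to the
        # log record, so the message needs no interpolated exception text.
        logger.warning("Malformed cached_models; treating as empty", exc_info=True)
        return []
```

Keeping the message a fixed string also makes it harder for raw input, such as a URL with embedded credentials, to leak into the log line itself.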
Kfir Sadeh fc3a5e555e feat(paths): abstract runtime path logic for frozen distribution packages (#969)
* feat(core): abstract runtime path logic for frozen distribution packages

* Address review feedback: revert browser MCP check, persistent data dir default when frozen, and add path tests
2026-06-15 17:44:10 +01:00
Ashvin 7fd937fa57 fix(calendar): parse "mins"/"hrs" reminder offsets in manage_calendar (#4266)
_reminder_minutes matched the offset with (?:m|min|minute|minutes)\b and
(?:h|hr|hour|hours)\b. The trailing \b makes the common plural
abbreviations "mins"/"hrs" fail to match (after "min" the "s" is a word
char, so no boundary), so reminder_minutes "5 mins" or "2 hrs" returned
None and the event was created with no reminder, silently.

Widen the two unit regexes and the matching reminder_only description
regex to a strict superset that also accepts mins/hrs. The sibling
duration parser already accepts these forms (it has no \b), so this only
brings the reminder parser in line.
2026-06-15 17:37:28 +02:00
Catalin Iliescu c41caac438 fix(cookbook): only persist successfully stopped scheduled serves (#4267)
Co-authored-by: Cata <cata@bigjohn.local>
2026-06-15 17:30:18 +02:00
RaresKeY f66a23d19d fix(ai): validate generated image result URLs (#4289) 2026-06-15 16:40:49 +02:00
pewdiepie-archdaemon 1cc9a003fd Fix failing post-merge tests 2026-06-15 22:49:06 +09:00
pewdiepie-archdaemon 6d507f8128 Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615
# Conflicts:
#	src/tool_implementations.py
#	static/js/research/panel.js
2026-06-15 21:20:15 +09:00
pewdiepie-archdaemon 2cbd55b8bd Open email context for agent, email search across All Mail, cookbook serve polish
- Agent: pass the open email reader (uid/folder/account/from/subject/body
  preview) on every chat submit so 'reply to this' / 'write email saying
  hi' route to ui_control open_email_reply with the right UID instead of
  inventing a new .md draft. Code-level enforcement (chat_routes strips
  create_document + send_email when active_email is set); cross-session
  active_doc_id is now trusted instead of being silently dropped.
  set_active_email/clear_active_email tool-layer helpers in
  tool_implementations.

- ui_control open_email_reply: optional body argument so the agent can
  open-and-write in one call; envelope now forwards uid/folder/account/
  body/panel through tool_output. Tool description sharpened and the
  parser rejects empty bodies on reply/reply-all (forces the agent to
  write rather than open an empty draft).

- Email library: search now runs against [Gmail]/All Mail when the
  current folder is INBOX (archived emails surface). Whirlpool spinner
  + 'Searching…' placeholder while in flight. Each search result is
  stamped with its source folder so clicks open the right email instead
  of whatever shares its UID in INBOX. Search no longer re-applies the
  same text pill locally (which only checks subject/from/snippet, never
  body) so body-only matches don't get dropped after IMAP returns them.
  Initial inbox load bumped 100→500.

- Email favorites: 'Favorite (pin to top)' / 'Unfavorite' in both the
  card menu and the open-reader more menu, backed by a new
  /api/email/flag/{uid}?on=true|false endpoint. Flagged emails always
  bubble to the top of the grid regardless of active sort.

- AI reply in doc editor: never overwrites existing draft text or the
  quoted history. AI suggestion is prepended; AI-generated 'On …
  wrote:' re-quotes are stripped so the original quote isn't visually
  edited.

- Cookbook serve: pre-launch GPU driver / has_gpu / install / version-
  floor checks (vllm minimax_m2 needs 0.10.0+, deepseek_r1 needs 0.7.0
  etc.) before the launch chain starts. Detect 'another model already
  running on this host' and offer Stop & launch (with graceful then
  force tmux kill helpers, port release wait). Per-vendor deep-link
  buttons (vLLM recipe / SGLang cookbook) with hardware hash. Backend
  picker is now a custom dropdown with accent-coloured logos for vLLM,
  SGLang, llama.cpp, Ollama, Diffusers; same glyphs added next to
  package names in Dependencies. Runtime-readiness note moved inside
  the panel (green when ready, red when missing) with an × dismiss.
  Esc collapses the expanded card; expanded card scrolls when it
  overflows; Trust Remote / Auto Tool / Reasoning Parser / Enforce
  Eager / Prefix Caching / Expert Parallel / Speculative / MoE Env on
  one row (Reasoning Parser auto-detected per model family).
  Dtype→Row 1, GPUs→Row 2 (rightmost). Removed redundant GPU 'auto'
  input — command builders read from the GPU button strip. Default
  cookbook open is Download tab.

- Cookbook hwfit: 'Model (latest)' / 'Model (oldest)' header sorts by
  release_date; release dates can be backfilled with the new
  scripts/backfill_model_release_dates.py and recipe metadata pulled
  with scripts/import_from_vllm_recipes.py against the upstream
  vllm-project/recipes catalog (vllm_recipe + min_vllm_version stamped
  on entries).

- Calendar: Quick add hint cycles a random Odysseus-themed example per
  open (wooden horse Friday, crew muster 10am daily, council on
  Ithaca, …). Typing a time like '11pm' in the event title updates
  the hero clock live.

- Doc editor: email-mode Reply button (sparkle icon, accent) opens the
  same Fast/Full + context popover the email reader uses; Ctrl+Alt+M
  toggles markdown preview.

- Memories panel: custom sort picker with per-option icons, default
  'Latest', visible Enabled/Disabled toggle text matching the section
  description style.
2026-06-15 20:47:51 +09:00
andrewemer cd02ac7ef6 fix(agent): skill-prescribed tools never reach the model's schema list (#4008)
* Agent: make skill-prescribed tools actually callable

The skill index and matched-skill procedures are injected into the
prompt, but tool selection never followed: manage_skills wasn't in the
RAG-selected schema list (so the model substituted manage_memory), and
a matched skill could prescribe tools (grep, read_file) the model had
no schema for. Now:

- manage_skills rides along whenever the owner has any skills indexed
- a Jaccard-matched skill's requires_toolsets join the selection
- viewing a skill mid-turn via manage_skills unlocks its
  requires_toolsets for subsequent rounds
- admin-intent turns send _ADMIN_TOOLS schemas, matching the prompt
  text _build_base_prompt already advertises
- index_for(active_toolsets=None) no longer hides requires_toolsets
  skills from callers that don't know the active set

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* Agent: validate skill requires_toolsets against known tools, not TOOL_SECTIONS

grep/glob/ls ship as function schemas without a prompt-prose section,
so gating on TOOL_SECTIONS silently dropped them from a skill's
requires_toolsets.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-15 20:32:43 +09:00
cirim e7abb7559d fix(research): keep Discuss chats grounded on their report (#4006)
* fix(research): preserve Discuss spin-off primer during context trimming

trim_for_context() kept only system_msgs[:1] as essential and dropped the
rest under budget pressure. A research "Discuss" spin-off seeds the report
as a system message that sits after the preface system messages, so it
landed in extra_system and was the first thing evicted once the chat grew
— the conversation then lost its grounding and drifted off task.

Treat any system message carrying research_spinoff_from metadata as
essential, alongside the leading system prompt, so the seeded report
survives trimming. maybe_compact already retains all system messages.

Tests: tests/test_context_compactor.py::TestResearchPrimerPreserved

* fix(research): ground Discuss spin-off chats on the seeded report

build_chat_context injected global memory (pinned + hybrid-retrieved) and
personal-doc RAG every turn, keyed off the user-level memory_enabled pref
and a request-scoped use_rag flag — never the session. A research spin-off,
whose primer declares the report the sole knowledge base, thus had
unrelated keyword-matched facts pulled in ("wrong data") competing with the
report; its rag=False flag was also ignored (use_rag defaulted on).

Add _session_is_research_spinoff(sess) (detects the primer research_spinoff_from
metadata; handles ChatMessage and dict forms) and, for such sessions,
disable memory injection and force RAG off.

Tests: tests/test_chat_helpers.py spin-off detection cases

---------

Co-authored-by: Dan (cirim) <claude@cirim.org>
2026-06-15 20:31:57 +09:00
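
Both halves of e7abb7559d hinge on spotting the primer by its `research_spinoff_from` metadata. A minimal sketch, assuming messages are either dicts or objects exposing `role` and `metadata`; the helper names below are hypothetical and not the project's own.

```python
def _field(msg, name):
    """Read a field from a message given as a dict or as an object."""
    if isinstance(msg, dict):
        return msg.get(name)
    return getattr(msg, name, None)


def is_research_primer(msg):
    """True for a system message carrying research_spinoff_from metadata."""
    metadata = _field(msg, "metadata") or {}
    return _field(msg, "role") == "system" and "research_spinoff_from" in metadata


def split_system_messages(messages):
    """Return (essential, extra): the leading system prompt and any primer
    are essential; the remaining system messages may be trimmed first."""
    system = [m for m in messages if _field(m, "role") == "system"]
    essential, extra = [], []
    for index, msg in enumerate(system):
        if index == 0 or is_research_primer(msg):
            essential.append(msg)
        else:
            extra.append(msg)
    return essential, extra


def session_is_research_spinoff(messages):
    """True when any message in the session is a research primer."""
    return any(is_research_primer(m) for m in messages)
```

A caller could then disable memory injection and force RAG off whenever `session_is_research_spinoff(...)` is true.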
Josh Patra f5d3e5098a fix(llm): omit temperature for Kimi K2.5 and K2.6 (#3960) 2026-06-15 20:29:22 +09:00
Josh Patra 4ee5ed4dce fix(memory): return complete memory lists (#3885) 2026-06-15 20:28:25 +09:00
Achilleas90 ffc0f1dccc Harden CalDAV write-back with retries (#1193)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-15 15:59:31 +09:00
KYDNO 955455b797 fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549)
* fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions

Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(kimi): omit temperature for kimi-for-coding API calls

Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-15 15:56:54 +09:00
Abhishek Kumbhar a172522d87 fix(integrations): prevent blank API integrations (#3840)
* fix(integrations): validate unified API form fields

* fix(integrations): validate API integration fields server-side
2026-06-15 15:40:36 +09:00
Vishnu d6a3c9a0fe fix(utility): use utility model for background tasks (auto-title, memory audit) instead of chat model (#4027) 2026-06-15 15:33:19 +09:00
Dividesbyzer0 33c26bab88 fix(agent): parse raw json web search calls (#4088) 2026-06-15 15:19:38 +09:00
cyq e52d078ea1 fix(agent): detect Polish web lookup intent (#4091) 2026-06-15 15:19:03 +09:00
nsgds 7ae6133d7f fix(agent): don't let a materialized default budget defeat context-window scaling (#4122)
* fix(agent): don't let a materialized default budget defeat context scaling

#1230 scales agent_input_token_budget to the model's context window unless
the user explicitly set a budget, detected via is_setting_overridden(). But
the settings-save path materializes every DEFAULT_SETTINGS key into
settings.json (load_settings merges defaults; handlers persist the merged
dict), so the persisted default 6000 reads as "overridden" and the budget
code takes the min(6000, ctx) branch — silently re-capping long-context
models at 6000 for anyone who has ever saved a setting. This reintroduces
the exact regression #1170/#1230 set out to fix.

Add is_setting_customized() (saved value != default) and gate the scaling
on it instead of mere presence. A persisted default is not a user choice.

is_setting_overridden has exactly one consumer (this budget path), so the
change is contained. Tests cover the materialized-default regression, a
deliberately-chosen budget still being honoured, and the absent-key case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent): rework context-budget fix per review (#4122)

Address RaresKeY's review:

P2 (explicitness): is_setting_customized treated a saved value equal to the
default as "not explicit", which ALSO blocked a user from deliberately pinning
the default budget. Reframe the default value itself as the AUTO sentinel —
agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context
window", any other value is an explicit cap. A materialized default still reads
as auto (fixing the original regression), and any non-default value the user
chooses is now honoured. Drop the now-unused is_setting_customized helper.

P2 (fallback context): auto-scaling trusted get_context_length() even when it
returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known
window), over-allocating on self-hosted/proxy setups. Add get_context_length_known()
(also returns whether the window was actually discovered); the budget block
passes 0 when unknown so auto-scaling stays conservative instead of inflating to
an unproven window.

hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that
contract and answered the reviewer's question rather than silently reversing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(agent): lock the materialized-default budget regression (review on #4121)

Per WGlynn's review on the issue: add an end-to-end regression that saves an
UNRELATED setting (which makes the settings-save path materialize the budget
default into settings.json) and asserts the budget still auto-scales rather than
re-reading as an explicit 6000 cap — locking the exact reopening shut.

To make the test bite the production decision (not just re-derive it), extract
`budget_is_explicit()` into src/context_budget.py and use it from the agent loop.
It keys off value-vs-default (the default is the auto sentinel), NOT settings
presence — which is the whole point, since the save path materializes defaults.

Note: after this PR's rework, is_setting_overridden has ZERO production callers,
so the merged-dict materialization smell can't reach any setting through a
presence check today (WGlynn's durability concern).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent): bind the budget context window to its own provenance (review #4122)

RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop
kept only the `known` flag from get_context_length_known() and budgeted off the
passed-in `context_length`, which can come from a *different* lookup. Two failures:
- local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT
  fallback while the fresh probe proves the real (smaller) served context — we'd
  scale off the stale value;
- callers that don't pass context_length (scheduled tasks, teacher escalation,
  skill test runs, bg_monitor) were capped at 6000 even when a long window is
  discoverable.

Extract budget_context_for_model() which returns the freshly-probed window when
known else 0, binding the flag to the value it proves; the agent loop uses it.
Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(agent): fix stale budget comments + tighten to the contract (review #4122)

- settings.py: an explicit budget is clamped to the window only — hard_max is
  auto-only (#1190); drop the incorrect "and to hard_max".
- is_setting_overridden docstring: drop the stale "adaptive budgets" example;
  point value-sensitive callers at context_budget.budget_is_explicit.
- Tighten the budget-block comments to the contract (default = auto sentinel,
  non-default = explicit cap, hard_max = auto-only ceiling).

Comment/docstring-only; no behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(agent): correct budget issue citations (#1190 → merged #1230/#1273)

The context-budget contract (auto-sentinel, explicit budgets honoured,
hard_max auto-only) merged via #1230; #1190 was the earlier, closed,
superseded PR. Re-point the contract comments at #1230 (the live source,
already cited for the auto-sentinel two lines up in settings.py).

The configurable hard_max setting (`agent_input_token_hard_max`) was a
reviewer requirement first raised on #1190, omitted from the merged #1230,
and actually added in #1273 — credit #1273 for it and correct the test
comment's history (it previously implied this PR completed the requirement).

Comment/docstring-only; no behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 15:17:28 +09:00
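
The contract that 7ae6133d7f settles on (default value = auto sentinel, any other value = explicit cap clamped to the window only, hard_max = auto-only ceiling, unknown window = stay conservative) can be sketched as below. `budget_is_explicit` is named in the commit, but its body here is a guess; `effective_budget` is hypothetical, and the auto branch assumes the simplest possible scaling policy, which may not match the real one.

```python
DEFAULT_BUDGET = 6000  # the persisted default doubles as the "auto" sentinel


def budget_is_explicit(saved_budget, default=DEFAULT_BUDGET):
    """Key off value-vs-default, not presence in settings.json."""
    return saved_budget != default


def effective_budget(saved_budget, known_context, hard_max):
    """Pick the input-token budget.

    known_context is 0 when the context window was not actually discovered.
    """
    if budget_is_explicit(saved_budget):
        # Explicit cap: clamped to the window only, never to hard_max.
        if known_context > 0:
            return min(saved_budget, known_context)
        return saved_budget
    if known_context <= 0:
        # Unproven window: do not inflate beyond the default.
        return DEFAULT_BUDGET
    # Assumed scaling policy: use the window, bounded by the auto-only ceiling.
    return min(known_context, hard_max)
```

Because the decision depends on the value rather than on key presence, a settings file that merely materialized the default still reads as auto.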
Dividesbyzer0 589fcd314a fix(image): patch realesrgan torchvision compatibility (#4110) 2026-06-15 15:16:41 +09:00
Max Hsu 039431f5ea fix(mcp): detect npx cache entries before probing (#4034) 2026-06-15 15:14:48 +09:00
Dividesbyzer0 7f571c8f7e fix(agent): keep gpt-oss on text tool mode
Treat gpt-oss local OpenAI-compatible models as text/fenced-tool models unless the endpoint explicitly declares native tool support.
2026-06-15 15:11:52 +09:00
cirim 056d1fb960 fix(llm): make connect timeout configurable
Use a configurable LLM_CONNECT_TIMEOUT for call and stream connect budgets instead of the previous hard-coded 3s default.
2026-06-15 15:11:38 +09:00
Muhammed Midlaj 4b0a977988 fix(models): probe /v1/models for path-less LM Studio endpoints
Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.
2026-06-15 15:09:50 +09:00
Boudbois2271 54690997ec fix(calendar): treat same-day list_events range as full day
Expand zero-width or inverted list_events windows to one day so start=end single-day queries return that day's events.
2026-06-15 15:09:19 +09:00
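
One way to read the window expansion in 54690997ec, assuming the window is anchored at `start` and that "one day" means 24 hours from it. `normalize_window` is a hypothetical helper, not the project's function.

```python
from datetime import datetime, timedelta


def normalize_window(start, end):
    """Expand a zero-width or inverted window to one day from `start`."""
    if end <= start:
        return start, start + timedelta(days=1)
    return start, end
```

With this, a `start == end` query for 2026-06-15 covers that whole day instead of an empty interval.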
Wes Huber be046dd29a fix(cookbook): preserve state during lifecycle tick
Log malformed cookbook state and re-read fresh state before writing scheduled-stop mutations so concurrent UI changes are preserved.
2026-06-15 15:07:03 +09:00
holden093 4c41834dc7 fix(youtube): consolidate duplicate handler
Make src.youtube_handler a compatibility wrapper around services.youtube.youtube_handler so transcript state, URL parsing, and timeout behavior no longer diverge.
2026-06-15 15:03:41 +09:00
holden093 96052c5e8a fix(agent): add contacts domain to tool classifier
Add a contacts domain rule pack and deterministic contact intent detection so contact prompts surface resolve_contact/manage_contact tools.
2026-06-15 15:03:19 +09:00
adabarbulescu afc81bdd7b fix: drop thinking deltas from background agent loops
Skip thinking-only deltas when accumulating background, scheduled-task, and teacher captured reply text.
2026-06-15 15:03:09 +09:00
Dividesbyzer0 a07fe35936 fix(agent): honor explicit web search requests
Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.
2026-06-15 15:02:10 +09:00
RaresKeY a7766d0b7f fix(agent): honor auth-disabled tool access after setup
Check explicit auth-disabled mode before configured-admin ownership checks so single-user mode keeps full agent tool access after setup.
2026-06-15 15:01:48 +09:00
Tom 2857723e47 fix(security): restrict API-key encryption key file to 0o600
Lock the API key encryption key file to owner-only permissions on creation and when reading existing keys, with regression coverage for permissions and encryption roundtrip.
2026-06-15 15:00:11 +09:00
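
A POSIX-oriented sketch of what 2857723e47 describes: create the key file owner-only, and tighten an existing file when reading it. `load_or_create_key` and `mode_of` are hypothetical names; on Windows `os.chmod` only toggles the read-only flag, so the permission check below would not hold there.

```python
import os
import stat


def mode_of(path):
    """Permission bits of `path`, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


def load_or_create_key(path, new_key):
    """Return the key bytes at `path`, creating the file owner-only if missing."""
    try:
        # O_EXCL makes creation atomic; 0o600 applies from the first moment.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        os.chmod(path, 0o600)  # tighten a pre-existing, possibly looser file
        with open(path, "rb") as handle:
            return handle.read()
    with os.fdopen(fd, "wb") as handle:
        handle.write(new_key)
    os.chmod(path, 0o600)  # explicit, in case the process umask interfered
    return new_key
```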
Michael a633611823 fix(agent): let retrieval run for non-English low-signal queries
Allow non-workspace low-signal prompts to fall through to tool retrieval so non-English requests are not limited to always-available tools.
2026-06-15 14:58:56 +09:00
muhamed hamed 3b3c0d6254 fix: detect HuggingFace token when downloading cookbook models (#3459)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:53:16 +01:00
Mazen Tamer Salah f5c1eb4b9d fix(settings): degrade load_features to defaults on PermissionError
load_settings() already catches PermissionError, but load_features() caught only
FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable
data/features.json (e.g. root-owned after a deploy) therefore raised instead of
falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything
that reads feature flags. Add PermissionError to the except tuple to match
load_settings().

Adds tests/test_load_features_permission_error.py.

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:20:10 +01:00
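
The fix in f5c1eb4b9d amounts to one more entry in the except tuple. A minimal sketch, assuming `load_features` reads a JSON object and overlays it on the defaults; the `DEFAULT_FEATURES` contents and the `opener` parameter are illustrative additions, the latter only there so the unreadable-file case can be simulated.

```python
import json

DEFAULT_FEATURES = {"example_flag": False}  # hypothetical contents


def load_features(path, opener=open):
    """Load feature flags, degrading to defaults when the file is unusable."""
    try:
        with opener(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (FileNotFoundError, PermissionError, json.JSONDecodeError, ValueError):
        # PermissionError covers an existing-but-unreadable file,
        # e.g. one left root-owned after a deploy.
        return dict(DEFAULT_FEATURES)
    if not isinstance(data, dict):
        return dict(DEFAULT_FEATURES)
    return {**DEFAULT_FEATURES, **data}
```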
Marius Popa 2a4bba2b9e fix(api-keys): preserve encrypted keys when saving providers (#1920)
* fix(api-keys): preserve encrypted keys when saving providers

* test(api-keys): cover malformed raw key entries

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 18:23:54 +01:00
Kenny Van de Maele 620fdd0859 feat(agent): confine agent file/shell tools to a selectable workspace (#3665)
* feat(agent): workspace confinement via context-local binding + get_workspace tool

Bind the per-turn workspace once in execute_tool_block; the shared path
resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd
helper (agent_cwd) read it, so file tools + bash/python are confined centrally
and a new tool that uses the shared helpers cannot accidentally bypass it.

Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory
modal (reusing existing modal/button CSS), the /workspace slash command, and a
get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic
(realpath/normcase/commonpath) and docker-safe (container paths, no host
assumptions). Reopens #2023.
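
The OS-agnostic check named above (realpath/normcase/commonpath) might look roughly like this. `is_within_workspace` is a hypothetical helper; the project's real resolvers (`_resolve_tool_path` / `_resolve_search_root`) are not shown in the commit and likely do more, such as applying the sensitive-path deny list.

```python
import os


def is_within_workspace(path, workspace):
    """True when `path`, resolved against `workspace`, stays inside it."""
    root = os.path.normcase(os.path.realpath(workspace))
    # os.path.join discards `root` when `path` is absolute, so absolute
    # paths are checked as given. realpath resolves ".." and symlinks.
    target = os.path.normcase(os.path.realpath(os.path.join(root, path)))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Comparing with `commonpath` rather than a string prefix avoids treating `/work/project-old` as being inside `/work/project`.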

* ux(workspace): clarify workspace is not a sandbox

Picker modal note + pill tooltip + get_workspace tool/output wording now state
plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder,
but bash/python only start there (cwd) and are not sandboxed. Modal note reuses
the existing .muted class.

* fix(agent): treat an active workspace as file-work intent

A vague low-signal message (e.g. "look at the local project") matches no
domain keywords, so tool retrieval is skipped and only always-available tools
are offered — leaving the agent with no file access even though a workspace is
set. When a workspace is active, include the file/code tools (incl.
get_workspace) on low-signal turns so the agent can act on the folder.

Also requires the tool index (ChromaDB) to be reachable for normal retrieval;
that is an environment dependency, not part of this change.

* ux(workspace): hide pill + overflow entry in chat mode

Workspace only scopes the agent's file/shell tools, so the pill and the
overflow 'Workspace' entry are agent-only now — hidden in chat mode like the
bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is
called from the agent/chat setMode handler.

* prompt(tools): steer bash/python to defer to the dedicated file tools

bash/python schema descriptions (what native-tool-calling models read) were
bare and gave no steer, so models would do file ops via the shell (e.g. writing
SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python
in the schema + tool-index + prompt section to prefer read_file/write_file/
edit_file/grep/glob/ls and only be used for what those do not cover.

* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)

Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc.
by name, so the guidance does not go stale if those tools are renamed.

* style(workspace): drop em-dashes from added code comments/strings

* ux(workspace): terser non-sandbox note in picker (no tool-name list)

* ux(workspace): mirror terse non-sandbox wording in pill tooltip

* chore: untrack local venv symlink (run-only, not part of the feature)

* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)

* fix(agent): low-signal + workspace surfaces only read-only file tools

Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message
in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but
not write_file/edit_file/bash/python -- those wait for a request that actually
calls for them (RAG retrieval still adds them on a real ask).

* feat(workspace): cap browse listing at 500 dirs with a truncated hint

Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local
_MAX_BROWSE_DIRS so a directory with thousands of children does not dump every
row into the picker; the response carries a truncated flag and the modal tells
the user to type a path to jump in.

* chore: untrack local venv symlink (run-only artifact)

* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time

The in-workspace resolver deny-lists sensitive paths inside the workspace,
but the empty-path search root is the workspace itself, so a workspace of
~/.ssh could be listed via ls with no path. vet_workspace() (public, in
tool_execution next to the resolvers) rejects non-directories and sensitive
roots before the path is ever bound; chat_routes uses it instead of its
inline isdir check.

* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active

Review findings from #3665:

P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes
every absolute path 'inside' the workspace and collapses confinement into
host-wide file access. A root is its own dirname, so reject when
dirname(resolved) == resolved; the browse response now carries a selectable
flag and the picker disables 'Use this folder' on unselectable dirs.

P3: /workspace set stored any string client-side and the chat route silently
dropped rejected values, so the pill could claim a confinement that was not
in effect. New admin-gated /api/workspace/vet validates manual paths before
they persist (canonical path returned), and when a posted workspace is
rejected at send time the stream emits workspace_rejected so the client
clears the stored value and toasts instead of continuing silently.
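
The root test from P2 can be sketched as below. `is_filesystem_root` is a hypothetical name; the real check lives inside vet_workspace, whose body is not shown here.

```python
import os


def is_filesystem_root(path: str) -> bool:
    """Return True when path resolves to a filesystem root.

    Hypothetical helper illustrating the dirname(resolved) == resolved rule.
    """
    resolved = os.path.realpath(path)
    # A root is its own parent, so dirname() returns it unchanged.
    return os.path.dirname(resolved) == resolved
```

Any ordinary directory has a parent that differs from itself, so only roots are rejected by this rule.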

* fix(workspace): check caller privilege before vetting the posted workspace

Review finding: /api/chat_stream called vet_workspace() on the posted value
for every caller and emitted workspace_rejected on failure, so a non-admin
who can chat but cannot use file/shell tools could distinguish existing
directories from missing/file/sensitive/root paths by whether the event
appeared. The resolution now lives in _resolve_request_workspace, which
drops the submitted value uniformly for non-admin callers, with no vetting
and no event, before the path ever touches the filesystem. Admin and
single-user behavior is unchanged. Test pins that valid and invalid paths
are indistinguishable for a non-admin and that vet_workspace is never
invoked for them.
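
A rough sketch of the ordering this fix describes: privilege is checked before the posted path is vetted. The signature, the tuple return shape, and the `vet` callback are assumptions for illustration, not the actual _resolve_request_workspace.

```python
def resolve_request_workspace(posted, is_admin, vet):
    """Return (workspace, rejected) for a chat request.

    Hypothetical shape: `vet` returns a canonical path, or None if the
    path is refused. `rejected` signals that a rejection event is due.
    """
    if not is_admin:
        # Drop uniformly: no vetting, no event, no filesystem access.
        return None, False
    if not posted:
        return None, False
    canonical = vet(posted)
    if canonical is None:
        # Only privileged callers learn that the path was refused.
        return None, True
    return canonical, False
```

Because the non-admin branch returns before `vet` runs, valid and invalid paths produce the same result for those callers.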
2026-06-11 18:17:54 +02:00
Michael 95c54ac3cb fix: use _truncate for tool output display limits in agent_loop (#3831)
Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate
helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit
truncation indicator when content is cut.
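
For illustration only, a truncation helper of this kind might look like the sketch below. The limit value and the wording of the indicator are assumptions; the real _truncate in tool_utils is not shown in this log.

```python
MAX_OUTPUT_CHARS = 4000  # assumed value, the real constant may differ


def truncate(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Cut text to `limit` characters and say so when anything was dropped."""
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    # Unlike a bare text[:limit] slice, the reader can see the cut happened.
    return f"{text[:limit]}\n... [truncated {omitted} chars]"
```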

Scoped down from the original PR: only agent/tool-output display
behavior, no integrations.py changes.

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 17:05:13 +01:00
Kenny Van de Maele 263d41c58a fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers (#3945)
* fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers

_apply_local_cache_affinity adds session_id + cache_prompt for llama.cpp
KV-cache slot affinity (#2927), gated on _is_self_hosted_openai_compatible,
which treated any unknown OpenAI-compatible host as self-hosted. Strict
cloud providers added as custom endpoints (Mistral at api.mistral.ai)
reject unknown body fields, so every request failed with 422
extra_forbidden. Self-hosted now also requires the endpoint to resolve as
local via model_context.is_local_endpoint: loopback/private/tailscale
host, or endpoint kind explicitly configured as "local" (the escape hatch
for tunneled self-hosted servers). is_local_endpoint is promoted to a
public name since llm_core now shares it.

Fixes #3793

* test(llm): sweep cloud OpenAI-compatible hosts in affinity gating

Parametrized cases adapted from #3839 (credit: Shabablinchikow): deepseek,
x.ai, together, fireworks, and the Gemini OpenAI-compat endpoint must all
stay free of the llama.cpp extras, not just the Mistral host from #3793.

* fix(llm): narrow the Tailscale range to 100.64.0.0/10 in is_local_endpoint

Review finding on #3945: _PRIVATE_PREFIXES carried a bare "100." prefix,
treating all of 100.0.0.0/8 as local while Tailscale only uses the CGNAT
block 100.64.0.0/10. Public 100.x hosts (e.g. AWS ranges outside the
block) were classified local and still received the llama.cpp extras
this PR exists to keep away from strict providers. Match the narrowed
classification routes/model_routes.py already uses, with boundary tests
just below, inside, and just above the range.
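
The difference between the bare "100." prefix and the CGNAT block can be sketched with the standard ipaddress module. `in_cgnat_block` is a hypothetical name, and is_local_endpoint may implement the match differently.

```python
import ipaddress

# 100.64.0.0/10 spans 100.64.0.0 through 100.127.255.255.
CGNAT_BLOCK = ipaddress.ip_network("100.64.0.0/10")


def in_cgnat_block(host: str) -> bool:
    """Return True if host is an IP literal inside 100.64.0.0/10."""
    try:
        return ipaddress.ip_address(host) in CGNAT_BLOCK
    except ValueError:
        # Not an IP literal (for example a hostname).
        return False
```

The boundary cases mirror the tests mentioned above: 100.63.255.255 is just below the block, 100.100.1.1 is inside it, and 100.128.0.0 is just above.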
2026-06-11 17:51:03 +02:00
Mazen Tamer Salah f941db29d3 fix(search): batch FTS hit lookups into one query (N+1) (#3909)
_search_fts ran the FTS MATCH query, then looked up each hit's full row with its
own db.query(...).filter(id == message_id).first() inside a loop, so a search
returning N hits issued N extra SELECTs. Fetch all hit rows in a single IN(...)
query via _fetch_messages_by_id and reassemble results in hit (relevance) order.

Adds tests/test_session_search_batch_fetch.py asserting a single batched query
(and no query for empty input). Existing session-search tests stay green.
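
A minimal sketch of the batched lookup, using the standard sqlite3 module in place of the project's db.query(...) layer. The table and column names are assumptions, and this is not the actual _fetch_messages_by_id.

```python
import sqlite3


def fetch_messages_by_id(conn: sqlite3.Connection, ids):
    """Fetch all rows for `ids` in one query, returned in the order of `ids`."""
    if not ids:
        # Empty input issues no query at all.
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, body FROM messages WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    # IN (...) gives no ordering guarantee, so restore hit (relevance) order.
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids if i in by_id]
```

Ids with no matching row are skipped rather than raising, which is one possible choice for stale FTS hits.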
2026-06-11 16:31:54 +02:00
RaresKeY c500bcb47d fix(uploads): migrate upload ownership on rename (#3617) 2026-06-11 16:01:04 +02:00
Mazen Tamer Salah f7a3605b16 fix(webhooks): keep references to in-flight delivery tasks (#3859)
fire() and fire_and_forget() scheduled delivery with bare create_task()/
loop.create_task() and kept no reference. asyncio holds only a weak reference to
a task, so the GC could collect a delivery (or the fire() coroutine itself)
before it completed, silently dropping the webhook.

Track in-flight tasks in a set on the manager via a _spawn_tracked() helper that
holds a strong reference for the task's lifetime and discards it on completion
(add_done_callback), and route both schedule sites through it.

Adds tests/test_webhook_task_refs.py.
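
The tracking pattern described above, sketched in isolation. `TaskTracker` and `spawn` are hypothetical names standing in for the manager and its _spawn_tracked() helper.

```python
import asyncio


class TaskTracker:
    """Hold strong references to in-flight tasks until they finish."""

    def __init__(self):
        self._tasks = set()

    def spawn(self, coro):
        # Must be called from inside a running event loop.
        task = asyncio.get_running_loop().create_task(coro)
        # The set keeps the task alive; the loop itself only holds a weak ref.
        self._tasks.add(task)
        # Drop the reference once the task completes, fails, or is cancelled.
        task.add_done_callback(self._tasks.discard)
        return task
```

Using set.discard as the callback works because done callbacks receive the task as their single argument.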
2026-06-11 15:53:52 +02:00
George Lawton 4f48cfa9ae fix: omit temperature for Opus 4.7+ on native Anthropic path (#3117)
Anthropic removed the sampling parameters (temperature, top_p, top_k)
starting with Claude Opus 4.7 — sending temperature at all, even 0.0,
returns HTTP 400. _build_anthropic_payload sent it unconditionally, so
every native-Anthropic request to Opus 4.7/4.8 failed: the research probe
(ResearchHandler._probe_endpoint, temperature=0) aborted runs before they
started, and all DeepResearcher._llm calls 400'd.

Add _anthropic_rejects_temperature (version-gates opus-N-M >= (4,7)) and
omit temperature in the Anthropic builder for those models. Older Claude
models (Opus 4.6 and below, Sonnet/Haiku) keep temperature and the
existing [0,1] clamp.

The version gate is hardened against real-world model id shapes:
- a word-boundary anchor so a substring like `octopus-4-8` is not read
  as Opus and stripped of temperature;
- a 1-2 digit minor cap so a dated id such as `claude-opus-4-20250514`
  (Opus 4.0, listed in ANTHROPIC_MODELS) parses as major-only and keeps
  temperature, while dated 4.7+ snapshots still match;
- a non-string guard so a non-string model can't raise AttributeError
  (the previous builder never called .lower() on it).
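
One way the three guards above could be expressed, as a sketch only. The regex and the name `rejects_temperature` are assumptions; the actual _anthropic_rejects_temperature may be written differently.

```python
import re

# "opus-" must not follow a letter or digit (so "octopus-" is skipped), and
# major and minor are capped at 1-2 digits so an 8-digit date is never
# parsed as a version component.
_OPUS_VERSION = re.compile(
    r"(?<![a-z0-9])opus-(\d{1,2})(?!\d)(?:-(\d{1,2})(?!\d))?"
)


def rejects_temperature(model) -> bool:
    """Return True for Opus ids whose (major, minor) is at least (4, 7)."""
    if not isinstance(model, str):
        # Non-string guard: never call .lower() on something else.
        return False
    match = _OPUS_VERSION.search(model.lower())
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return (major, minor) >= (4, 7)
```

With this pattern `claude-opus-4-20250514` parses as (4, 0) and keeps temperature, while an id shaped like opus-4-7 followed by a date suffix still matches as 4.7.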

Adds regression tests covering 4.7/4.8 omission, older/dated/legacy
retention, the substring overmatch, and non-string inputs.

Fixes #3065

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 16:27:40 +03:00