Commit Graph

758 Commits

Author SHA1 Message Date
Dividesbyzer0 33c26bab88 fix(agent): parse raw json web search calls (#4088) 2026-06-15 15:19:38 +09:00
cyq e52d078ea1 fix(agent): detect Polish web lookup intent (#4091) 2026-06-15 15:19:03 +09:00
nsgds 7ae6133d7f fix(agent): don't let a materialized default budget defeat context-window scaling (#4122)
* fix(agent): don't let a materialized default budget defeat context scaling

#1230 scales agent_input_token_budget to the model's context window unless
the user explicitly set a budget, detected via is_setting_overridden(). But
the settings-save path materializes every DEFAULT_SETTINGS key into
settings.json (load_settings merges defaults; handlers persist the merged
dict), so the persisted default 6000 reads as "overridden" and the budget
code takes the min(6000, ctx) branch — silently re-capping long-context
models at 6000 for anyone who has ever saved a setting. This reintroduces
the exact regression #1170/#1230 set out to fix.

Add is_setting_customized() (saved value != default) and gate the scaling
on it instead of mere presence. A persisted default is not a user choice.

is_setting_overridden has exactly one consumer (this budget path), so the
change is contained. Tests cover the materialized-default regression, a
deliberately-chosen budget still being honoured, and the absent-key case.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent): rework context-budget fix per review (#4122)

Address RaresKeY's review:

P2 (explicitness): is_setting_customized treated a saved value equal to the
default as "not explicit", which ALSO blocked a user from deliberately pinning
the default budget. Reframe the default value itself as the AUTO sentinel —
agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context
window", any other value is an explicit cap. A materialized default still reads
as auto (fixing the original regression), and any non-default value the user
chooses is now honoured. Drop the now-unused is_setting_customized helper.

P2 (fallback context): auto-scaling trusted get_context_length() even when it
returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known
window), over-allocating on self-hosted/proxy setups. Add get_context_length_known()
(also returns whether the window was actually discovered); the budget block
passes 0 when unknown so auto-scaling stays conservative instead of inflating to
an unproven window.

hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that
contract and answered the reviewer's question rather than silently reversing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(agent): lock the materialized-default budget regression (review on #4121)

Per WGlynn's review on the issue: add an end-to-end regression that saves an
UNRELATED setting (which makes the settings-save path materialize the budget
default into settings.json) and asserts the budget still auto-scales rather than
re-reading as an explicit 6000 cap — locking the exact reopening shut.

To make the test bite the production decision (not just re-derive it), extract
`budget_is_explicit()` into src/context_budget.py and use it from the agent loop.
It keys off value-vs-default (the default is the auto sentinel), NOT settings
presence — which is the whole point, since the save path materializes defaults.

Note: after this PR's rework, is_setting_overridden has ZERO production callers,
so the merged-dict materialization smell can't reach any setting through a
presence check today (WGlynn's durability concern).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent): bind the budget context window to its own provenance (review #4122)

RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop
kept only the `known` flag from get_context_length_known() and budgeted off the
passed-in `context_length`, which can come from a *different* lookup. Two failures:
- local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT
  fallback while the fresh probe proves the real (smaller) served context — we'd
  scale off the stale value;
- callers that don't pass context_length (scheduled tasks, teacher escalation,
  skill test runs, bg_monitor) were capped at 6000 even when a long window is
  discoverable.

Extract budget_context_for_model() which returns the freshly-probed window when
known else 0, binding the flag to the value it proves; the agent loop uses it.
Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(agent): fix stale budget comments + tighten to the contract (review #4122)

- settings.py: an explicit budget is clamped to the window only — hard_max is
  auto-only (#1190); drop the incorrect "and to hard_max".
- is_setting_overridden docstring: drop the stale "adaptive budgets" example;
  point value-sensitive callers at context_budget.budget_is_explicit.
- Tighten the budget-block comments to the contract (default = auto sentinel,
  non-default = explicit cap, hard_max = auto-only ceiling).

Comment/docstring-only; no behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(agent): correct budget issue citations (#1190 → merged #1230/#1273)

The context-budget contract (auto-sentinel, explicit budgets honoured,
hard_max auto-only) merged via #1230#1190 was the earlier, closed,
superseded PR. Re-point the contract comments at #1230 (the live source,
already cited for the auto-sentinel two lines up in settings.py).

The configurable hard_max setting (`agent_input_token_hard_max`) was a
reviewer requirement first raised on #1190, omitted from the merged #1230,
and actually added in #1273 — credit #1273 for it and correct the test
comment's history (it previously implied this PR completed the requirement).

Comment/docstring-only; no behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 15:17:28 +09:00
Dividesbyzer0 589fcd314a fix(image): patch realesrgan torchvision compatibility (#4110) 2026-06-15 15:16:41 +09:00
cyq 5e0cdb6cbb fix(mcp): share oauth redirect URI (#4087) 2026-06-15 15:15:53 +09:00
Max Hsu 039431f5ea fix(mcp): detect npx cache entries before probing (#4034) 2026-06-15 15:14:48 +09:00
cyq aac589ee49 fix(cookbook): diagnose sglang native deps (#4112) 2026-06-15 15:14:37 +09:00
Dividesbyzer0 8cff1f87ee fix(cookbook): stop local Windows process trees
Track the inner Bash runner PID for local Windows Cookbook tasks and stop the full child process tree during cleanup.
2026-06-15 15:12:48 +09:00
Dividesbyzer0 ec4f91afdd fix(cookbook): normalize llama-cpp-python cache types
Map llama-cpp-python --type_k/--type_v cache names to integer enum values after serve-command validation while preserving native llama-server flags.
2026-06-15 15:12:18 +09:00
Dividesbyzer0 7f571c8f7e fix(agent): keep gpt-oss on text tool mode
Treat gpt-oss local OpenAI-compatible models as text/fenced-tool models unless the endpoint explicitly declares native tool support.
2026-06-15 15:11:52 +09:00
cirim 056d1fb960 fix(llm): make connect timeout configurable
Use a configurable LLM_CONNECT_TIMEOUT for call and stream connect budgets instead of the previous hard-coded 3s default.
2026-06-15 15:11:38 +09:00
Muhammed Midlaj 4b0a977988 fix(models): probe /v1/models for path-less LM Studio endpoints
Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.
2026-06-15 15:09:50 +09:00
Dividesbyzer0 ece6cebc03 fix(cookbook): create bin dir before llama-server link
Ensure ~/bin exists before the llama.cpp accelerated build script creates the llama-server link.
2026-06-15 15:03:55 +09:00
holden093 4c41834dc7 fix(youtube): consolidate duplicate handler
Make src.youtube_handler a compatibility wrapper around services.youtube.youtube_handler so transcript state, URL parsing, and timeout behavior no longer diverge.
2026-06-15 15:03:41 +09:00
holden093 96052c5e8a fix(agent): add contacts domain to tool classifier
Add a contacts domain rule pack and deterministic contact intent detection so contact prompts surface resolve_contact/manage_contact tools.
2026-06-15 15:03:19 +09:00
adabarbulescu afc81bdd7b fix: drop thinking deltas from background agent loops
Skip thinking-only deltas when accumulating background, scheduled-task, and teacher captured reply text.
2026-06-15 15:03:09 +09:00
osmanakkawi 71ccd59b54 fix(chat): make resend message non-destructive
Keep normal resend from truncating session history while preserving replace-from-here behavior for regenerate flows.
2026-06-15 15:02:48 +09:00
Ashvin b20cea347a fix(hwfit): serve profiles for sub-8192 context models
Allow serve-profile generation for models whose trained context window is below 8192 while preserving the 8K shrink floor for larger models.
2026-06-15 15:02:22 +09:00
Dividesbyzer0 a07fe35936 fix(agent): honor explicit web search requests
Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.
2026-06-15 15:02:10 +09:00
RaresKeY a7766d0b7f fix(agent): honor auth-disabled tool access after setup
Check explicit auth-disabled mode before configured-admin ownership checks so single-user mode keeps full agent tool access after setup.
2026-06-15 15:01:48 +09:00
nopoz 6824fbb729 fix(gallery): validate upstream result image URLs
Validate image URLs returned by upstream diffusion/OpenAI responses before server-side fetches to prevent SSRF through result image retrieval.
2026-06-15 15:01:28 +09:00
nopoz f14ea6d67d fix(codex): validate stored SSH host and port
Validate cookbook task remoteHost and sshPort values before building SSH shell commands in the Codex bridge.
2026-06-15 15:01:03 +09:00
Tom 59efa8a44b fix(personal): confine remove_directory_from_rag to PERSONAL_DIR
Resolve remove_directory_from_rag paths through the same PERSONAL_DIR confinement helper used by add_directory_to_rag before removal sinks are reached.
2026-06-15 15:00:35 +09:00
Tom 2857723e47 fix(security): restrict API-key encryption key file to 0o600
Lock the API key encryption key file to owner-only permissions on creation and when reading existing keys, with regression coverage for permissions and encryption roundtrip.
2026-06-15 15:00:11 +09:00
Catalin Iliescu 7dedc51d9f fix(tests): isolate webhook task reference imports
Isolate src.database/src.webhook_manager imports in test_webhook_task_refs so collection does not leak stubbed modules into later tests.
2026-06-15 14:57:47 +09:00
catalini82 9d7a3d66c0 test(memory): cover owner isolation for memory search
Co-authored-by: Cata <cata@bigjohn.local>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 22:21:30 +01:00
Rolly Calma 20cf94f53d fix(platform): read proc version with utf-8
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:58:22 +01:00
muhamed hamed 3b3c0d6254 fix: detect HuggingFace token when downloading cookbook models (#3459)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:53:16 +01:00
Mazen Tamer Salah f5c1eb4b9d fix(settings): degrade load_features to defaults on PermissionError
load_settings() already catches PermissionError, but load_features() caught only
FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable
data/features.json (e.g. root-owned after a deploy) therefore raised instead of
falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything
that reads feature flags. Add PermissionError to the except tuple to match
load_settings().

Adds tests/test_load_features_permission_error.py.

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:20:10 +01:00
Michael c0cc0f954c fix: read allow_bash/allow_web_search from JSON body (#3229) (#3281)
* fix: read allow_bash/allow_web_search from JSON body (#3229)

API callers using Content-Type: application/json had bash and web
tools silently disabled because allow_bash / allow_web_search were
only read from FormData (which is empty for JSON requests).

Changes:
- Fall back to JSON body for allow_bash and allow_web_search values
- Only add bash/web_search to disabled_tools when explicitly set to a
  falsy value; when unset (None), defer to per-user privilege checks
- Admins with can_use_bash=True now get bash enabled by default

Fixes #3229

* fix: always send explicit allow_bash/allow_web_search from frontend

The backend 'is not None' guard (from prior commit) is correct for API
callers, but the frontend only sent allow_bash=true when the toggle was
ON — omission meant 'unspecified' which the backend treated as 'don't
disable'. Now the frontend always sends an explicit true/false value:

- allow_bash: sent on every request (checked ? 'true' : 'false')
- allow_web_search: explicit 'false' when toggle is off in agent mode

With explicit frontend values, the 'is not None' guard is safe:
- explicit true → tool enabled
- explicit false → tool disabled
- None (API caller omission) → defer to per-user privilege

---------

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 19:14:41 +01:00
Marius Popa 2a4bba2b9e fix(api-keys): preserve encrypted keys when saving providers (#1920)
* fix(api-keys): preserve encrypted keys when saving providers

* test(api-keys): cover malformed raw key entries

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 18:23:54 +01:00
Alexandre Teixeira a79c0bd369 test: move area_cli tests into cli directory (#3842)
* test: move area_cli tests into cli directory

* test: include research CLI status in cli test move
2026-06-11 17:01:14 +00:00
Carles Siles 3e65326c3f fix: expand cookbook error output tail from 12 to 50 lines (#1538)
* fix: expand cookbook error output tail from 12 to 50 lines

When a task reaches status 'error', the status endpoint was returning
only the last 12 lines of the subprocess log. The existing context-menu
'Copy last 50 lines' action was therefore copying the same 12 lines,
making it useless for diagnosing failures that produce long stack traces
or build output.

- Set _tail_lines = 50 when status == 'error', keep 12 for running tasks
- Initialise exit_code = None before the status-classification block so
  it is always defined in the result dict (was only set inside the
  is_alive branch, potential NameError in the dead-session path)
- Include exit_code in the task-status response dict
- JS poller captures exit_code from live data into local task state

The frontend output panel and 'Copy last 50 lines' now show the actual
error context without any UI changes.

* refactor: extract output-tail logic to testable helper + behavioral tests

Addresses review feedback on #1538: the previous tests were source-level
string guards. Extract the tail-slicing into a dependency-free helper
(routes/cookbook_output.error_aware_output_tail) and replace the guards
with behavioral tests that exercise the actual logic:

- error status with a 200-line snapshot -> exactly the last 50 lines
- running/ready/completed/stopped/unknown -> last 12 lines
- short snapshot -> all lines, no padding
- empty snapshot -> empty string
- error tail is a strict superset (suffix-compatible) of the non-error tail

The helper has no FastAPI/SQLAlchemy imports so it unit-tests without
standing up the app.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 17:55:33 +01:00
Alexandre Teixeira 01fbee021b docs(tests): inventory first low-risk test directory split (#3764)
Add a documentation-only test layout inventory for the first low-risk split of the flat tests directory.

Records the current 28-file area_cli set, including tests/test_research_cli_status.py, and documents validation/non-goals for the future mechanical move.

Closes #3712
Part of #2523
2026-06-11 19:24:06 +03:00
Kenny Van de Maele 620fdd0859 feat(agent): confine agent file/shell tools to a selectable workspace (#3665)
* feat(agent): workspace confinement via context-local binding + get_workspace tool

Bind the per-turn workspace once in execute_tool_block; the shared path
resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd
helper (agent_cwd) read it, so file tools + bash/python are confined centrally
and a new tool that uses the shared helpers cannot accidentally bypass it.

Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory
modal (reusing existing modal/button CSS), the /workspace slash command, and a
get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic
(realpath/normcase/commonpath) and docker-safe (container paths, no host
assumptions). Reopens #2023.

* ux(workspace): clarify workspace is not a sandbox

Picker modal note + pill tooltip + get_workspace tool/output wording now state
plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder,
but bash/python only start there (cwd) and are not sandboxed. Modal note reuses
the existing .muted class.

* fix(agent): treat an active workspace as file-work intent

A vague low-signal message (e.g. "look at the local project") matches no
domain keywords, so tool retrieval is skipped and only always-available tools
are offered — leaving the agent with no file access even though a workspace is
set. When a workspace is active, include the file/code tools (incl.
get_workspace) on low-signal turns so the agent can act on the folder.

Also requires the tool index (ChromaDB) to be reachable for normal retrieval;
that is an environment dependency, not part of this change.

* ux(workspace): hide pill + overflow entry in chat mode

Workspace only scopes the agent's file/shell tools, so the pill and the
overflow 'Workspace' entry are agent-only now — hidden in chat mode like the
bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is
called from the agent/chat setMode handler.

* prompt(tools): steer bash/python to defer to the dedicated file tools

bash/python schema descriptions (what native-tool-calling models read) were
bare and gave no steer, so models would do file ops via the shell (e.g. writing
SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python
in the schema + tool-index + prompt section to prefer read_file/write_file/
edit_file/grep/glob/ls and only be used for what those do not cover.

* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)

Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc.
by name, so the guidance does not go stale if those tools are renamed.

* style(workspace): drop em-dashes from added code comments/strings

* ux(workspace): terser non-sandbox note in picker (no tool-name list)

* ux(workspace): mirror terse non-sandbox wording in pill tooltip

* chore: untrack local venv symlink (run-only, not part of the feature)

* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)

* fix(agent): low-signal + workspace surfaces only read-only file tools

Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message
in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but
not write_file/edit_file/bash/python -- those wait for a request that actually
calls for them (RAG retrieval still adds them on a real ask).

* feat(workspace): cap browse listing at 500 dirs with a truncated hint

Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local
_MAX_BROWSE_DIRS so a directory with thousands of children does not dump every
row into the picker; the response carries a truncated flag and the modal tells
the user to type a path to jump in.

* chore: untrack local venv symlink (run-only artifact)

* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time

The in-workspace resolver deny-lists sensitive paths inside the workspace,
but the empty-path search root is the workspace itself, so a workspace of
~/.ssh could be listed via ls with no path. vet_workspace() (public, in
tool_execution next to the resolvers) rejects non-directories and sensitive
roots before the path is ever bound; chat_routes uses it instead of its
inline isdir check.

* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active

Review findings from #3665:

P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes
every absolute path 'inside' the workspace and collapses confinement into
host-wide file access. A root is its own dirname, so reject when
dirname(resolved) == resolved; the browse response now carries a selectable
flag and the picker disables 'Use this folder' on unselectable dirs.

P3: /workspace set stored any string client-side and the chat route silently
dropped rejected values, so the pill could claim a confinement that was not
in effect. New admin-gated /api/workspace/vet validates manual paths before
they persist (canonical path returned), and when a posted workspace is
rejected at send time the stream emits workspace_rejected so the client
clears the stored value and toasts instead of continuing silently.

* fix(workspace): check caller privilege before vetting the posted workspace

Review finding: /api/chat_stream called vet_workspace() on the posted value
for every caller and emitted workspace_rejected on failure, so a non-admin
who can chat but cannot use file/shell tools could distinguish existing
directories from missing/file/sensitive/root paths by whether the event
appeared. The resolution now lives in _resolve_request_workspace, which
drops the submitted value uniformly for non-admin callers, with no vetting
and no event, before the path ever touches the filesystem. Admin and
single-user behavior is unchanged. Test pins that valid and invalid paths
are indistinguishable for a non-admin and that vet_workspace is never
invoked for them.
2026-06-11 18:17:54 +02:00
Michael 95c54ac3cb fix: use _truncate for tool output display limits in agent_loop (#3831)
Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate
helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit
truncation indicator when content is cut.

Scoped down from the original PR: only agent/tool-output display
behavior, no integrations.py changes.

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 17:05:13 +01:00
Kenny Van de Maele 263d41c58a fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers (#3945)
* fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers

_apply_local_cache_affinity adds session_id + cache_prompt for llama.cpp
KV-cache slot affinity (#2927), gated on _is_self_hosted_openai_compatible,
which treated any unknown OpenAI-compatible host as self-hosted. Strict
cloud providers added as custom endpoints (Mistral at api.mistral.ai)
reject unknown body fields, so every request failed with 422
extra_forbidden. Self-hosted now also requires the endpoint to resolve as
local via model_context.is_local_endpoint: loopback/private/tailscale
host, or endpoint kind explicitly configured as "local" (the escape hatch
for tunneled self-hosted servers). is_local_endpoint is promoted to a
public name since llm_core now shares it.

Fixes #3793

* test(llm): sweep cloud OpenAI-compatible hosts in affinity gating

Parametrized cases adapted from #3839 (credit: Shabablinchikow): deepseek,
x.ai, together, fireworks, and the Gemini OpenAI-compat endpoint must all
stay free of the llama.cpp extras, not just the Mistral host from #3793.

* fix(llm): narrow the Tailscale range to 100.64.0.0/10 in is_local_endpoint

Review finding on #3945: _PRIVATE_PREFIXES carried a bare "100." prefix,
treating all of 100.0.0.0/8 as local while Tailscale only uses the CGNAT
block 100.64.0.0/10. Public 100.x hosts (e.g. AWS ranges outside the
block) were classified local and still received the llama.cpp extras
this PR exists to keep away from strict providers. Match the narrowed
classification routes/model_routes.py already uses, with boundary tests
just below, inside, and just above the range.
2026-06-11 17:51:03 +02:00
Mazen Tamer Salah f941db29d3 fix(search): batch FTS hit lookups into one query (N+1) (#3909)
_search_fts ran the FTS MATCH query, then looked up each hit's full row with its
own db.query(...).filter(id == message_id).first() inside a loop, so a search
returning N hits issued N extra SELECTs. Fetch all hit rows in a single IN(...)
query via _fetch_messages_by_id and reassemble results in hit (relevance) order.

Adds tests/test_session_search_batch_fetch.py asserting a single batched query
(and no query for empty input). Existing session-search tests stay green.
2026-06-11 16:31:54 +02:00
Kenny Van de Maele bfac1d55d6 fix(search): read plain-text, Markdown, and JSON URLs in fetch_webpage_content (#3809)
raw.githubusercontent.com serves Markdown as text/plain, JSON APIs and raw
config files serve application/json, and a lot of code and tool documentation
lives in .md/.txt. fetch_webpage_content only handled PDF and HTML, so a
non-HTML body produced empty content and web_fetch reported 'no readable text
content'. Add a branch that returns the body verbatim for non-HTML text/*,
JSON (application/json and +json), and a .md/.txt/.text/.json URL-suffix
fallback for mislabeled octet-stream. HTML and PDF handling unchanged.

Fixes #3808
2026-06-11 14:24:53 +00:00
AkioKoneko 4fa4d0100a fix(email): keep FETCH attributes Gmail sends after the header literal (all Gmail mail showed as unread) (#3785)
* fix(email): keep FETCH attributes Gmail sends after the header literal

imaplib returns a UID FETCH response as an interleaved list of
(meta, literal) tuples plus bare bytes elements. Which attributes land
where is server-specific: Dovecot sends FLAGS before the RFC822.HEADER
literal (inside the tuple meta), Gmail sends them after it, as a bare
` FLAGS (\Seen))` element. The email list grouping loop and the search
loop only inspected tuples, so on Gmail every message lost its FLAGS and
the whole mailbox rendered as unread/unflagged, with mark-read appearing
to have no effect.

Extract the grouping into _group_uid_fetch_records(), fold bare bytes
parts into the current message meta there, and reuse it in both the
batched list fetch and the per-UID search fetch. Covered by unit tests
with captured Gmail-shaped and Dovecot-shaped responses.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(email): use raw byte literals for IMAP backslash escapes

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 16:12:39 +02:00
RaresKeY c500bcb47d fix(uploads): migrate upload ownership on rename (#3617) 2026-06-11 16:01:04 +02:00
Mazen Tamer Salah f7a3605b16 fix(webhooks): keep references to in-flight delivery tasks (#3859)
fire() and fire_and_forget() scheduled delivery with bare create_task()/
loop.create_task() and kept no reference. asyncio holds only a weak reference to
a task, so the GC could collect a delivery (or the fire() coroutine itself)
before it completed, silently dropping the webhook.

Track in-flight tasks in a set on the manager via a _spawn_tracked() helper that
holds a strong reference for the task's lifetime and discards it on completion
(add_done_callback), and route both schedule sites through it.

Adds tests/test_webhook_task_refs.py.
2026-06-11 15:53:52 +02:00
cyq 65d9603c8c fix(memory): validate session owner on manual add (#3807) 2026-06-11 15:44:10 +02:00
Ashvin a7b03398b6 fix(tokens): owner check on update and delete routes (#3899)
PATCH and DELETE /api/tokens/{id} both called require_admin but never
checked that the token belonged to the requesting admin. Any admin could
rename, re-scope, or delete another admin's token by ID.

create_token already stamps owner on every token — update and delete
just never read it. Fixed by comparing token.owner against
get_current_user(request) after the 404 guard, same pattern the rest of
the auth routes use. Check is skipped when current_user is falsy
(AUTH_ENABLED=false / single-user mode).

Fixes #3898
2026-06-11 15:34:44 +02:00
George Lawton 4f48cfa9ae fix: omit temperature for Opus 4.7+ on native Anthropic path (#3117)
Anthropic removed the sampling parameters (temperature, top_p, top_k)
starting with Claude Opus 4.7 — sending temperature at all, even 0.0,
returns HTTP 400. _build_anthropic_payload sent it unconditionally, so
every native-Anthropic request to Opus 4.7/4.8 failed: the research probe
(ResearchHandler._probe_endpoint, temperature=0) aborted runs before they
started, and all DeepResearcher._llm calls 400'd.

Add _anthropic_rejects_temperature (version-gates opus-N-M >= (4,7)) and
omit temperature in the Anthropic builder for those models. Older Claude
models (Opus 4.6 and below, Sonnet/Haiku) keep temperature and the
existing [0,1] clamp.

The version gate is hardened against real-world model id shapes:
- a word-boundary anchor so a substring like `octopus-4-8` is not read
  as Opus and stripped of temperature;
- a 1-2 digit minor cap so a dated id such as `claude-opus-4-20250514`
  (Opus 4.0, listed in ANTHROPIC_MODELS) parses as major-only and keeps
  temperature, while dated 4.7+ snapshots still match;
- a non-string guard so a non-string model can't raise AttributeError
  (the previous builder never called .lower() on it).

Adds regression tests covering 4.7/4.8 omission, older/dated/legacy
retention, the substring overmatch, and non-string inputs.

Fixes #3065

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 16:27:40 +03:00
Afonso Coutinho af61b2d4e6 test(research): cover complete status CLI alias
Adds focused regression coverage for the research CLI complete-to-done status alias.
2026-06-11 15:49:12 +03:00
Rohithmatham12 9f47c5ff87 fix: quote kernels repair package spec 2026-06-11 14:56:35 +03:00
Nacho Mata 73823c878e fix(windows): detect per-user Git for Windows bash under %LocalAppData%\Programs\Git (#3738)
find_bash() rejected the WindowsApps WSL stub and then probed only %LocalAppData%\Git, so per-user Git for Windows installs (winget / Inno Setup {userpf}) under %LocalAppData%\Programs\Git were never found and the Cookbook reported "needs Git Bash" despite Git being installed.

Add the Programs\Git subfolder to the LocalAppData fallback root.
2026-06-11 13:41:12 +02:00
RaresKeY 50fedff2f2 fix(email): scope learned sender signatures by owner (#3724) 2026-06-11 13:26:59 +02:00
Max Hsu 66c25cbc2f fix(models): reassign default endpoint when current default is disabled (#3649)
Adding a new endpoint only auto-set the global default chat endpoint when
none was configured (`if not settings.get("default_endpoint_id")`). When the
existing default pointed at an endpoint the user had since disabled, it was
never reassigned, so features that read the raw `default_endpoint_id` setting
(notably Memory → Tidy) failed with "No default model configured — set one in
Settings" even though an enabled endpoint existed.

Reassign the default when the configured endpoint is missing/disabled, via a
new pure `_default_endpoint_needs_assignment` helper. Adds unit coverage for
the helper plus route-level regression tests for the disabled/enabled cases.

Fixes #3586

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:17:31 +02:00