Commit Graph

360 Commits

Author SHA1 Message Date
Vishnu d6a3c9a0fe fix(utility): use utility model for background tasks (auto-title, memory audit) instead of chat model (#4027) 2026-06-15 15:33:19 +09:00
Ashvin 23837f4571 fix(cookbook): report dead finished downloads as completed instead of stopped (#4025)
When a download's tmux pane is gone, the status endpoint trusted only the
HF-cache probe to tell completed from stopped. The probe derives its cache
root from its own environment, but the download runner exports
HF_HOME=<local_dir> (the #2722 fix), so custom-dir downloads land in
<local_dir>/hub where the probe never looks - and ollama pulls don't touch
the HF cache at all. Finished downloads were reported as stopped forever,
and tasks already persisted as completed were demoted back to stopped on
the next poll. This is the backend half of #3897, deliberately left out of
the frontend fix in #4000.

- honor the conclusive runner markers first: DOWNLOAD_OK -> completed
  (keeping the "Fetching 0 files" error guard), DOWNLOAD_FAILED -> error
- pass the task's local_dir through to the cache probes so they check the
  cache the download actually wrote to, keeping the env-var fallback for
  default-cache downloads
- move the probe scripts and marker classification into
  routes/cookbook_output.py (dependency-free) with behavioral tests

Fixes #4017
2026-06-15 15:26:55 +09:00
Dividesbyzer0 b28aa1f2c4 fix(cookbook): allow local Windows Diffusers serving (#4077) 2026-06-15 15:21:01 +09:00
Dividesbyzer0 589fcd314a fix(image): patch realesrgan torchvision compatibility (#4110) 2026-06-15 15:16:41 +09:00
cyq 5e0cdb6cbb fix(mcp): share oauth redirect URI (#4087) 2026-06-15 15:15:53 +09:00
cyq aac589ee49 fix(cookbook): diagnose sglang native deps (#4112) 2026-06-15 15:14:37 +09:00
Dividesbyzer0 8cff1f87ee fix(cookbook): stop local Windows process trees
Track the inner Bash runner PID for local Windows Cookbook tasks and stop the full child process tree during cleanup.
2026-06-15 15:12:48 +09:00
Dividesbyzer0 ec4f91afdd fix(cookbook): normalize llama-cpp-python cache types
Map llama-cpp-python --type_k/--type_v cache names to integer enum values after serve-command validation while preserving native llama-server flags.
2026-06-15 15:12:18 +09:00
Muhammed Midlaj 4b0a977988 fix(models): probe /v1/models for path-less LM Studio endpoints
Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.
2026-06-15 15:09:50 +09:00
Dividesbyzer0 ece6cebc03 fix(cookbook): create bin dir before llama-server link
Ensure ~/bin exists before the llama.cpp accelerated build script creates the llama-server link.
2026-06-15 15:03:55 +09:00
Dividesbyzer0 a07fe35936 fix(agent): honor explicit web search requests
Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.
2026-06-15 15:02:10 +09:00
nopoz 6824fbb729 fix(gallery): validate upstream result image URLs
Validate image URLs returned by upstream diffusion/OpenAI responses before server-side fetches to prevent SSRF through result image retrieval.
2026-06-15 15:01:28 +09:00
nopoz f14ea6d67d fix(codex): validate stored SSH host and port
Validate cookbook task remoteHost and sshPort values before building SSH shell commands in the Codex bridge.
2026-06-15 15:01:03 +09:00
Tom 59efa8a44b fix(personal): confine remove_directory_from_rag to PERSONAL_DIR
Resolve remove_directory_from_rag paths through the same PERSONAL_DIR confinement helper used by add_directory_to_rag before removal sinks are reached.
2026-06-15 15:00:35 +09:00
els-hub 21ff44e9e8 perf(email): run blocking IMAP routes in threadpool
Fixes #4232

Convert email search and archive handlers from async def to sync def so FastAPI runs their blocking IMAP I/O in the threadpool instead of the event loop.
2026-06-15 14:54:13 +09:00
muhamed hamed 3b3c0d6254 fix: detect HuggingFace token when downloading cookbook models (#3459)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:53:16 +01:00
Michael c0cc0f954c fix: read allow_bash/allow_web_search from JSON body (#3229) (#3281)
* fix: read allow_bash/allow_web_search from JSON body (#3229)

API callers using Content-Type: application/json had bash and web
tools silently disabled because allow_bash / allow_web_search were
only read from FormData (which is empty for JSON requests).

Changes:
- Fall back to JSON body for allow_bash and allow_web_search values
- Only add bash/web_search to disabled_tools when explicitly set to a
  falsy value; when unset (None), defer to per-user privilege checks
- Admins with can_use_bash=True now get bash enabled by default

Fixes #3229

* fix: always send explicit allow_bash/allow_web_search from frontend

The backend 'is not None' guard (from prior commit) is correct for API
callers, but the frontend only sent allow_bash=true when the toggle was
ON — omission meant 'unspecified' which the backend treated as 'don't
disable'. Now the frontend always sends an explicit true/false value:

- allow_bash: sent on every request (checked ? 'true' : 'false')
- allow_web_search: explicit 'false' when toggle is off in agent mode

With explicit frontend values, the 'is not None' guard is safe:
- explicit true → tool enabled
- explicit false → tool disabled
- None (API caller omission) → defer to per-user privilege

---------

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 19:14:41 +01:00
Carles Siles 3e65326c3f fix: expand cookbook error output tail from 12 to 50 lines (#1538)
* fix: expand cookbook error output tail from 12 to 50 lines

When a task reaches status 'error', the status endpoint was returning
only the last 12 lines of the subprocess log. The existing context-menu
'Copy last 50 lines' action was therefore copying the same 12 lines,
making it useless for diagnosing failures that produce long stack traces
or build output.

- Set _tail_lines = 50 when status == 'error', keep 12 for running tasks
- Initialise exit_code = None before the status-classification block so
  it is always defined in the result dict (was only set inside the
  is_alive branch, potential NameError in the dead-session path)
- Include exit_code in the task-status response dict
- JS poller captures exit_code from live data into local task state

The frontend output panel and 'Copy last 50 lines' now show the actual
error context without any UI changes.

* refactor: extract output-tail logic to testable helper + behavioral tests

Addresses review feedback on #1538: the previous tests were source-level
string guards. Extract the tail-slicing into a dependency-free helper
(routes/cookbook_output.error_aware_output_tail) and replace the guards
with behavioral tests that exercise the actual logic:

- error status with a 200-line snapshot -> exactly the last 50 lines
- running/ready/completed/stopped/unknown -> last 12 lines
- short snapshot -> all lines, no padding
- empty snapshot -> empty string
- error tail is a strict superset (suffix-compatible) of the non-error tail

The helper has no FastAPI/SQLAlchemy imports so it unit-tests without
standing up the app.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 17:55:33 +01:00
Kenny Van de Maele 620fdd0859 feat(agent): confine agent file/shell tools to a selectable workspace (#3665)
* feat(agent): workspace confinement via context-local binding + get_workspace tool

Bind the per-turn workspace once in execute_tool_block; the shared path
resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd
helper (agent_cwd) read it, so file tools + bash/python are confined centrally
and a new tool that uses the shared helpers cannot accidentally bypass it.

Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory
modal (reusing existing modal/button CSS), the /workspace slash command, and a
get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic
(realpath/normcase/commonpath) and docker-safe (container paths, no host
assumptions). Reopens #2023.

* ux(workspace): clarify workspace is not a sandbox

Picker modal note + pill tooltip + get_workspace tool/output wording now state
plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder,
but bash/python only start there (cwd) and are not sandboxed. Modal note reuses
the existing .muted class.

* fix(agent): treat an active workspace as file-work intent

A vague low-signal message (e.g. "look at the local project") matches no
domain keywords, so tool retrieval is skipped and only always-available tools
are offered — leaving the agent with no file access even though a workspace is
set. When a workspace is active, include the file/code tools (incl.
get_workspace) on low-signal turns so the agent can act on the folder.

Also requires the tool index (ChromaDB) to be reachable for normal retrieval;
that is an environment dependency, not part of this change.

* ux(workspace): hide pill + overflow entry in chat mode

Workspace only scopes the agent's file/shell tools, so the pill and the
overflow 'Workspace' entry are agent-only now — hidden in chat mode like the
bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is
called from the agent/chat setMode handler.

* prompt(tools): steer bash/python to defer to the dedicated file tools

bash/python schema descriptions (what native-tool-calling models read) were
bare and gave no steer, so models would do file ops via the shell (e.g. writing
SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python
in the schema + tool-index + prompt section to prefer read_file/write_file/
edit_file/grep/glob/ls and only be used for what those do not cover.

* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)

Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc.
by name, so the guidance does not go stale if those tools are renamed.

* style(workspace): drop em-dashes from added code comments/strings

* ux(workspace): terser non-sandbox note in picker (no tool-name list)

* ux(workspace): mirror terse non-sandbox wording in pill tooltip

* chore: untrack local venv symlink (run-only, not part of the feature)

* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)

* fix(agent): low-signal + workspace surfaces only read-only file tools

Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message
in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but
not write_file/edit_file/bash/python -- those wait for a request that actually
calls for them (RAG retrieval still adds them on a real ask).

* feat(workspace): cap browse listing at 500 dirs with a truncated hint

Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local
_MAX_BROWSE_DIRS so a directory with thousands of children does not dump every
row into the picker; the response carries a truncated flag and the modal tells
the user to type a path to jump in.

* chore: untrack local venv symlink (run-only artifact)

* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time

The in-workspace resolver deny-lists sensitive paths inside the workspace,
but the empty-path search root is the workspace itself, so a workspace of
~/.ssh could be listed via ls with no path. vet_workspace() (public, in
tool_execution next to the resolvers) rejects non-directories and sensitive
roots before the path is ever bound; chat_routes uses it instead of its
inline isdir check.

* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active

Review findings from #3665:

P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes
every absolute path 'inside' the workspace and collapses confinement into
host-wide file access. A root is its own dirname, so reject when
dirname(resolved) == resolved; the browse response now carries a selectable
flag and the picker disables 'Use this folder' on unselectable dirs.

P3: /workspace set stored any string client-side and the chat route silently
dropped rejected values, so the pill could claim a confinement that was not
in effect. New admin-gated /api/workspace/vet validates manual paths before
they persist (canonical path returned), and when a posted workspace is
rejected at send time the stream emits workspace_rejected so the client
clears the stored value and toasts instead of continuing silently.

* fix(workspace): check caller privilege before vetting the posted workspace

Review finding: /api/chat_stream called vet_workspace() on the posted value
for every caller and emitted workspace_rejected on failure, so a non-admin
who can chat but cannot use file/shell tools could distinguish existing
directories from missing/file/sensitive/root paths by whether the event
appeared. The resolution now lives in _resolve_request_workspace, which
drops the submitted value uniformly for non-admin callers, with no vetting
and no event, before the path ever touches the filesystem. Admin and
single-user behavior is unchanged. Test pins that valid and invalid paths
are indistinguishable for a non-admin and that vet_workspace is never
invoked for them.
2026-06-11 18:17:54 +02:00
AkioKoneko 4fa4d0100a fix(email): keep FETCH attributes Gmail sends after the header literal (all Gmail mail showed as unread) (#3785)
* fix(email): keep FETCH attributes Gmail sends after the header literal

imaplib returns a UID FETCH response as an interleaved list of
(meta, literal) tuples plus bare bytes elements. Which attributes land
where is server-specific: Dovecot sends FLAGS before the RFC822.HEADER
literal (inside the tuple meta), Gmail sends them after it, as a bare
` FLAGS (\Seen))` element. The email list grouping loop and the search
loop only inspected tuples, so on Gmail every message lost its FLAGS and
the whole mailbox rendered as unread/unflagged, with mark-read appearing
to have no effect.

Extract the grouping into _group_uid_fetch_records(), fold bare bytes
parts into the current message meta there, and reuse it in both the
batched list fetch and the per-UID search fetch. Covered by unit tests
with captured Gmail-shaped and Dovecot-shaped responses.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(email): use raw byte literals for IMAP backslash escapes

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 16:12:39 +02:00
RaresKeY c500bcb47d fix(uploads): migrate upload ownership on rename (#3617) 2026-06-11 16:01:04 +02:00
cyq 65d9603c8c fix(memory): validate session owner on manual add (#3807) 2026-06-11 15:44:10 +02:00
Ashvin a7b03398b6 fix(tokens): owner check on update and delete routes (#3899)
PATCH and DELETE /api/tokens/{id} both called require_admin but never
checked that the token belonged to the requesting admin. Any admin could
rename, re-scope, or delete another admin's token by ID.

create_token already stamps owner on every token — update and delete
just never read it. Fixed by comparing token.owner against
get_current_user(request) after the 404 guard, same pattern the rest of
the auth routes use. Check is skipped when current_user is falsy
(AUTH_ENABLED=false / single-user mode).

Fixes #3898
2026-06-11 15:34:44 +02:00
RaresKeY 50fedff2f2 fix(email): scope learned sender signatures by owner (#3724) 2026-06-11 13:26:59 +02:00
Max Hsu 66c25cbc2f fix(models): reassign default endpoint when current default is disabled (#3649)
Adding a new endpoint only auto-set the global default chat endpoint when
none was configured (`if not settings.get("default_endpoint_id")`). When the
existing default pointed at an endpoint the user had since disabled, it was
never reassigned, so features that read the raw `default_endpoint_id` setting
(notably Memory → Tidy) failed with "No default model configured — set one in
Settings" even though an enabled endpoint existed.

Reassign the default when the configured endpoint is missing/disabled, via a
new pure `_default_endpoint_needs_assignment` helper. Adds unit coverage for
the helper plus route-level regression tests for the disabled/enabled cases.

Fixes #3586

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:17:31 +02:00
RaresKeY d5603ee575 fix(research): migrate active task owners on rename (#3618) 2026-06-11 01:17:02 +02:00
Mazen Tamer Salah 9c00da6d1c fix(hwfit): tolerate non-numeric gpu_count in /api/hwfit/models (#3639)
* fix(hwfit): tolerate non-numeric gpu_count in /api/hwfit/models

The route did `n = int(gpu_count)` with no guard, so a non-numeric query param
like `?gpu_count=abc` raised ValueError and returned HTTP 500. Parse it
defensively (mirroring the gpu_group guard a few lines above): a malformed value
is ignored, exactly like omitting the param, and valid values still apply.

Adds tests/test_hwfit_gpu_count_nonnumeric.py: a non-numeric gpu_count returns a
ranking instead of raising, and a numeric value is still accepted.

* test(hwfit): cover non-numeric manual_gpu_count too

Follow-up to the gpu_count guard: add a regression test for the sibling
manual_gpu_count query param (the hardware simulator in _apply_manual_hardware),
which dev already guards by defaulting to 1 on a non-numeric value. This pins
that behaviour so the endpoint's count parsing is fully covered and cannot
regress to a 500.
2026-06-11 01:01:58 +02:00
RaresKeY d1a5a7d680 fix(hwfit): validate remote SSH detection targets (#3718) 2026-06-11 00:43:49 +02:00
Mazen Tamer Salah 96975f8dd9 fix(contacts): tolerate non-string body in /api/contacts/import (#3638)
import_vcf built `text = data.get("vcf") or data.get("text") or ""`, so a
non-string JSON value (a number, list, etc.) stayed in place and the following
`text.strip()` raised AttributeError, returning HTTP 500. Coerce vcf/text/csv
with str() so non-string input degrades to the existing structured "no data"
response, matching the file's convention elsewhere.

Adds tests/test_contacts_import_nonstring.py covering non-string vcf, non-string
csv, and an empty body.
2026-06-10 17:50:22 +02:00
RaresKeY 800d391234 fix(auth): roll back rename on owner migration failure (#3616) 2026-06-10 17:28:27 +02:00
Ashvin 9c8df89973 fix(auth): case-insensitive skill owner match on rename (#3614)
SKILL.md files written with mixed-case owner (e.g. 'owner: Alice') were
skipped because the regex had no IGNORECASE flag. _usage.json keys like
'Alice::skill-name' were missed by the startswith prefix check for the
same reason.

Both comparisons now match the same way the deep_research and memory
blocks do — case-insensitively against old_username.

Fixes #3611
2026-06-10 17:20:36 +02:00
Ashvin 6f73c8afaa fix(sessions): use owner_filter for list_sessions queries when auth disabled (#3622)
Direct DbSession.owner == user becomes WHERE owner IS NULL when user is None
(auth disabled), hiding all sessions that carry an explicit owner. Same flaw
on the Document and GalleryImage sub-queries (active-doc and gallery badges).
Replace all three with owner_filter(), which is a no-op when user is falsy.

Fixes #3620
2026-06-10 17:07:07 +02:00
RaresKeY cd3fb4e96b fix(auth): fail closed when deleting user tokens fails (#3733) 2026-06-10 16:24:27 +02:00
Yeoh Ing Ji 3e49658204 refactor(tools): extract document tools to handle registry (#3666)
* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* refactor: update import paths for document tools

* fix(tests): correct source path for document ID test
2026-06-10 10:41:52 +02:00
Lucas Daniel 55ff22c6d5 fix(chat): stabilize system prompt, sequence memory extraction, and send stable session id to preserve KV cache (#3360)
* fix(chat): stabilize system prompt, sequence memory extraction, send stable session id to preserve KV cache

Fixes #2927. As diagnosed in the issue, three things in Odysseus's request
pattern actively destroyed local backends' (llama.cpp / LM Studio) KV-cache
continuity, forcing a full prompt re-evaluation (15-30s+) on every turn:

1. Dynamic content folded into the system prompt every turn. Both the chat
   preface (ChatProcessor.build_context_preface) and the agent system prompt
   (_build_system_prompt) injected current_datetime_prompt() — text that
   changes every minute — directly into system-role messages, which llm_core
   then concatenates into the single system message sent as the cached
   prefix. Any byte difference there invalidates the entire cache. Moved this
   to a new current_datetime_context_message() helper that returns a
   standalone user-role message, inserted near the end of the array (right
   before the latest user turn) instead of mixed into the system prompt. The
   static system prefix (preset prompt + safety policy + agent base prompt)
   now stays byte-identical across turns of the same session.

2. Memory/skill extraction side-requests competed with the main completion.
   run_post_response_tasks fired extract_and_store / maybe_extract_skill via
   asyncio.create_task — fire-and-forget coroutines that could overlap the
   next turn's main request and steal llama.cpp's limited processing slots,
   evicting the cached checkpoint. They're now queued through a new
   _run_extraction_jobs_sequentially helper that waits for the session's
   stream to go idle and runs the jobs strictly one at a time.

3. No stable session identifier was sent to local backends, so llama.cpp
   assigned a new processing slot via LRU every turn ("session_id=<empty>
   server-selected (LCP/LRU)"), losing slot affinity. Added
   _apply_local_cache_affinity() in llm_core, which sets session_id and
   cache_prompt: true on outgoing payloads — gated to self-hosted
   OpenAI-compatible endpoints only (never api.openai.com or other cloud
   providers, which reject unrecognized request fields with a 400). Threaded
   session_id through stream_llm / llm_call_async / stream_agent_loop from
   the existing Odysseus session id.

Tests in tests/test_kv_cache_invalidation_2927.py exercise the real payload-
assembly and scheduling code paths: byte-identical system prefix across two
turns of the same session (with a regression check that genuinely changed
instructions DO still change it), the dynamic time block landing as a
user-role message, extraction jobs waiting for the stream to go idle and
running sequentially, and the outgoing payload carrying a stable session_id
(same across turns of one session, different across sessions) only for
self-hosted endpoints. Updated tests/test_user_time.py for the new message
placement.

* fix(tests): accept owner= kwarg in normalize_model_id monkeypatch

The upstream normalize_model_id signature now takes an owner= keyword
argument, and chat_helpers.py passes owner=getattr(sess, "owner", None)
at the call site. Update the test stub lambda to **kwargs so it handles
the new argument without breaking, and update chat_helpers.py to forward
the owner parameter consistently.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 22:46:54 +01:00
TimHoogervorst 8878443426 fix(calanders): Removed/merged duplicate calender delete endpoints (#3682)
* merged two delete_calander functions performing the same thing

* added proper 404 raise when nothing is found

* removed 404 HTTPException and jus reverted it back to raise
2026-06-09 22:35:55 +02:00
arnodecorte 38dc9a0a41 Allow cookbook scopes for API tokens (#3090)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 21:03:40 +01:00
RaresKeY 5d33393a28 fix(gallery): fail closed for null-user owner scope (#3613) 2026-06-09 20:20:21 +02:00
Ashvin 60d25e0e26 fix(cookbook): use COOKBOOK_STATE_FILE constant for state path (#3623)
The module derived its state file path as Path(os.environ.get("DATA_DIR", "data"))
/ "cookbook_state.json". The correct env var is ODYSSEUS_DATA_DIR, which is
already read by src/constants.py and exported as COOKBOOK_STATE_FILE. When
ODYSSEUS_DATA_DIR is set (Docker, custom installs), the old code read the wrong
env var and silently wrote state to data/cookbook_state.json relative to CWD
while every other file resolved under the custom data directory.

Fixes #3621
2026-06-09 17:39:06 +02:00
Sheikh Rahat Mahmud 9180847c0e feat(diagnostics): add consolidated service health endpoint for degraded-state reporting (#964)
* Add consolidated service health endpoint for degraded-state reporting

ROADMAP (High Priority) asks for "Better degraded-state reporting for
ChromaDB, SearXNG, email, ntfy, and provider probes." Until now there was no
single readout of which subsystems are actually working: /api/health is only a
liveness ping and each subsystem's signal lives in a different module, so a
misconfigured self-host install gives no consolidated picture.

This adds an admin-only GET /api/diagnostics/services endpoint backed by a new
src/service_health.py aggregator. Each subsystem reports a uniform
{name, status, detail, meta} where status is ok | degraded | down | disabled,
and the response rolls up an overall verdict (worst non-disabled status).

Probes are deliberately non-intrusive and safe to poll:
- ChromaDB: reads the .healthy flags on the RAG and memory vector stores.
- SearXNG: GET /healthz (2xx), falling back to the instance root (<500). No
  search query is run.
- ntfy: GET the server's built-in /v1/health. No test notification is sent.
- email: short IMAP connect+logout per configured account (no credentials in
  meta).
- providers: probe each enabled ModelEndpoint's model list (no api_key in meta).

Probe functions take their inputs as parameters and isolate the network call to
injectable callables, so they unit-test without touching the network (same
pattern as the merged provider-endpoint tests). Network probes run concurrently
off the event loop via asyncio.to_thread with bounded per-probe timeouts.

memory_vector is now passed into setup_diagnostics_routes (new optional param,
backward-compatible) so ChromaDB's vector-memory store can be reported too.

Tests: tests/test_service_health.py — 29 tests covering every status mapping
per subsystem, the overall rollup, and that no secrets leak into meta.

Verification:
  python -m pytest tests/test_service_health.py -q          # 29 passed
  python -m py_compile src/service_health.py routes/diagnostics_routes.py app.py
  python -m pytest tests/test_endpoint_resolver.py tests/test_provider_endpoints.py -q

Backend + tests only; an Admin/Settings UI badge that renders this endpoint is
a natural follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(diagnostics): bound service-health wall-clock and redact secrets

Addresses review on #964.

Blocker 1 — genuinely bounded wall-clock:
- providers_health and email_health now fan out per-item probes across a
  bounded thread pool (_bounded_map) with a hard total budget (_FANOUT_BUDGET),
  instead of probing endpoints/accounts sequentially. Stragglers are reported
  as a controlled `timeout` and never block; the pool is shut down with
  wait=False so the response returns on time regardless of endpoint/account
  count.
- The IMAP connect path now honors the service-health budget: _imap_connect
  gained a pass-through `timeout` param and the probe calls it with
  _PROBE_TIMEOUT instead of the default 15s.
- collect_service_health runs the four network subsystems concurrently, each
  under a per-subsystem deadline (_SUBSYSTEM_DEADLINE), with an overall
  wait_for ceiling (_AGGREGATE_DEADLINE) as a backstop.

Blocker 2 — no secret/raw-error leakage in the response:
- _safe_url strips userinfo, query, and fragment from every URL surfaced in
  meta (searxng instance, ntfy base, provider name fallback), keeping only
  scheme/host/port/path.
- _classify_error maps every probe failure to a controlled category token
  (timeout, connection_refused, dns_error, tls_error, network_error,
  http_error, auth_or_protocol_error, …) — raw str(exception), which can embed
  credentialed URLs or server text, is never returned.

Tests (tests/test_service_health.py, +tests/test_diagnostics_service_route.py):
- URL userinfo/query redaction for searxng/ntfy/providers.
- secret-bearing exception strings map to categories and don't leak.
- multiple slow providers/accounts stay bounded (single + 25-endpoint cases).
- subsystems run concurrently; aggregate deadline yields a controlled result.
- route-level unauthenticated (401) / non-admin (403) / admin (200) coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(diagnostics): isolate route tests so they don't leak module globals

The new route tests replaced src.service_health.collect_service_health and
routes.diagnostics_routes.require_admin via direct assignment, which persisted
for the rest of the pytest session. In CI's full alphabetical run that fake
collector (returning services=[]) leaked into the later collect_service_health
tests and failed them. Switch to monkeypatch.setattr so both are restored after
each test. No production code change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 16:00:24 +01:00
Maruf Hasan c3fcaf15b7 feat(providers): add NVIDIA AI provider endpoint support (#3456)
* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com)

* feat: add NVIDIA option to provider settings dropdown and aliases

* test: add NVIDIA provider detection and endpoint tests

* Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering

- nvidia.com -> 'nvidia' curated key for proper provider routing
- _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed
- _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip,
  kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse,
  -embedqa, -nemoretriever

* Expand non-chat model filtering for NVIDIA embedding/guard/video models

Add _NON_CHAT_PREFIXES: embed, recurrent
Add _NON_CHAT_CONTAINS: topic-control, guard, calibration,
  ai-synthetic-video, cosmos-reason2

Catches remaining unfiltered non-chat models from NVIDIA catalog:
embedding (llama-nemotron-embed, embed-qa), guard (llama-guard,
nemoguard-topic-control), calibration (ising-calibration),
video (ai-synthetic-video-detector, cosmos-reason2),
recurrent (recurrentgemma-2b)

* Filter non-chat models in _probe_endpoint via _is_chat_model()

Previously _is_chat_model() was only used in the per-model probe
and _first_chat_model(), so non-chat models still appeared in the
model picker even though they were filtered in those specific paths.
Applying the filter at _probe_endpoint() return ensures non-chat
models (embeddings, safety guards, reward, calibration, video
detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never
enter cached_models and never appear in the picker.

* Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models

Prefix checks (mid.startswith) miss models with org prefixes like
baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc.
Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught
regardless of the org prefix.

Adds: embed, bge, recurrent, starcoder, gemma-2b

* fix(model-routes): drop collision-prone substrings from global non-chat filter

The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES
and _NON_CHAT_CONTAINS tuples. These are intended to filter out
embedding, retrieval, safety, and vision models from NVIDIA's catalog
that are not chat-completions-capable. However, four of the added
substrings collide with legitimate chat models served by other providers:

  - gemma-2b  matches google/gemma-2b-it (instruct chat model)
  - starcoder matches bigcode/starcoder2-15b (code completion model)
  - recurrent matches google/recurrentgemma-2b (language model)
  - guard     matches meta-llama/Llama-Guard-3-8B (safety classifier)

Removing these four from the global tuples keeps the NVIDIA-specific
filtering intact (safety, embedding, retrieval, and vision models are
still caught by other tokens such as content-safety, -safety, -reward,
embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while
preventing false negatives for instruct/code models on other providers.

Tests added for gemma-2b-it, google/gemma-2b-it, and
bigcode/starcoder2-15b-instruct asserting they are recognized as chat
models.

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS

Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS
entries redundant since the prefix check runs first.

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* style: fix indentation of groq and xai test cases in test_provider_endpoints.py

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-09 11:06:12 +02:00
Ashvin 2fdb4813db fix(auth): sync file-backed and in-memory owner caches on user rename (#3397)
The DB owner-rename loop in rename_user patched every SQL column named
owner, but three non-SQL stores were left behind:

1. session_manager.sessions -- in-memory Session objects carry s.owner
   set at server-boot time. get_sessions_for_user() does an exact
   s.owner == username check, so the renamed user chat sidebar goes empty
   until a server restart.

2. data/deep_research/*.json -- each completed research report is a
   standalone JSON file with an owner field. research_routes filters
   by d.get(owner) == user, making every report invisible to the
   renamed user.

3. data/memory.json -- a flat JSON array; each entry carries an owner
   field. memory_manager.load(owner=user) filters on it, so all memories
   vanish from the memory panel.

Fix: after the SQL loop, patch all three:
- iterate sm.sessions and update owner in-place (exposed via app.state)
- walk data/deep_research/*.json and rewrite owner with atomic_write_json
- update matching entries in memory.json with atomic_write_json

All three use the same case-insensitive lower() comparison the SQL loop
already uses. Each step is independently wrapped so a single failure
does not abort the others or the rename itself.

Fixes #3362
2026-06-09 10:19:45 +02:00
Kenny Van de Maele 0aba00f4cf refactor(tools): remove dead workspace-confinement plumbing (#3590)
Commit e6b1009 removed the workspace feature's entry point (deleted
routes/workspace_routes.py + static/js/workspace.js and dropped the
workspace-param parsing in chat_routes), but left the downstream backend
plumbing dangling: chat_routes passed a hardcoded workspace=None into
stream_agent_loop, which forwarded it to execute_tool_block, so the
workspace value was permanently None and every workspace-gated branch
was unreachable.

Remove the now-dead code (no behavior change, since workspace was always
None):
- src/tool_execution.py: drop _resolve_tool_path_in_workspace and the
  workspace params/branches on execute_tool_block, _direct_fallback,
  _call_mcp_tool, _do_edit_file, and _resolve_search_root; restore the
  bash/python/bg cwd to _AGENT_WORKDIR.
- src/agent_loop.py: drop the workspace param on stream_agent_loop, the
  dead 'ACTIVE WORKSPACE' system-prompt block, and the workspace forward.
- routes/chat_routes.py: drop the hardcoded workspace=None arg and var.
- tests: delete test_workspace_confine.py (tested the removed feature) and
  the workspace assertion in test_tool_policy.py.

Full suite: 2903 passed, 1 skipped.
2026-06-09 08:30:50 +02:00
Afonso Coutinho fbed9027b0 fix: backup import dropping a user's skill on cross-tenant title/id collision (#2057)
* Fix backup import dropping a user's skill on cross-tenant title/id collision

The skills block of import_data deduped incoming skills against
skills_manager.load_all(), which returns EVERY tenant's skills. So when
a user imports their own backup, any skill whose id or title collides
with another user's skill was silently skipped — the importing user
lost their own data. This is the same cross-tenant bug already fixed
for the memories block just above (#1743); the skills block was left
with the old pattern. Filter the dedup sets to the importing user's own
skills (owner == user); the full store is still saved back, preserving
other users' skills.

* Restore sys.modules after stubbing so backup test does not break collection of later src.* test modules

* Patch backup_routes auth helpers via monkeypatch instead of sys.modules stubs so the test is import-order robust

* Give FakeSkillsManager an add_skill method matching the disk-backed skills API
2026-06-09 08:04:22 +02:00
Disorder AA d9141c6e56 fix(cookbook): allow spaces and non-ASCII characters in model directory paths (#3473)
* fix(cookbook): allow spaces in model directory paths

Allow POSIX external-drive paths and Windows drive paths with spaces while keeping shell metacharacters rejected.

* fix(cookbook): also allow non-ASCII (Unicode) characters in model dir paths

The ASCII-only allowlist that rejected spaces also rejected Cyrillic,
accented Latin and CJK folder names (e.g. /Volumes/Модели,
D:\AI Models\Модели) with 400 Invalid local_dir. Switch the path
character class from [A-Za-z0-9._ -] to [\w. -] (\w is Unicode-aware on
Python 3 str patterns) so localized folder names validate, while shell
metacharacters (; & | ` $ quotes newlines) stay rejected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(cookbook): reject local_dir path segments starting with '-'

The local_dir allowlist includes '-', so a directory like /models/-rf
(or D:\models\-rf) could be parsed as a CLI flag by hf/etc. (option
injection) — and quoting does not stop a value from being read as an
option. Guard against it inside the validator so the safety stays fully
self-contained there rather than depending on consumers' quoting.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 07:58:38 +02:00
pewdiepie-archdaemon d397b3db2f Restore dropped regression fixes 2026-06-09 10:31:43 +09:00
pewdiepie-archdaemon 37c573d865 Fix model endpoint route test regressions 2026-06-09 10:16:38 +09:00
pewdiepie-archdaemon e6b1009b89 Remove non-merge-ready workspace and terminal agent hooks 2026-06-09 09:48:59 +09:00
pewdiepie-archdaemon fa8c93ec0a Cookbook UI: Ollama browser, advanced serve fold, API tokens form, diagnosis toolbar, polish
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.

Highlights:
- Ollama as a first-class backend in the Cookbook:
  * Download input accepts ollama-style names (name:tag) → backend=ollama
  * /api/cookbook/ollama/library (cached scrape of ollama.com + curated
    fallback so classic models like qwen2.5 stay reachable)
  * "Browse Ollama library" toggle below Download with size chips
  * Engine=Ollama in hwfit toolbar merges the Ollama library into the
    main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
    columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
  loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
  checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
  pulled up 8px to align with the row's checkboxes, GGUF File dropdown
  exposed for Ollama backend, GPU re-render on Edit serve restore,
  _forceBackend flag so saved serveState wins over backend detection,
  cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
  Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
  Clear-offline buttons in Added Models toolbar, offline-pill removed
  (opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
  reachable (vLLM returns 404 on bare /v1, fully working endpoints
  were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
  serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
  Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
  serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
  ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
  auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
2026-06-09 09:46:19 +09:00
pewdiepie-archdaemon 2a2a93d845 Remove plan mode from merge-ready UI 2026-06-09 09:40:20 +09:00