Commit Graph

1521 Commits

Author SHA1 Message Date
pewdiepie-archdaemon 89efd7d44b Email list: collapse to a single row — [sender] [subject] [date]
Subject was on its own line below sender/date. Move it inline so each
email occupies one row: sender capped at 35% width (ellipsis), subject
takes the remaining space (ellipsis), date pins to the right. Tighter
list density at the cost of dropping the spare line for snippet text
(none was being rendered anyway).
2026-06-10 22:45:19 +09:00
pewdiepie-archdaemon 41980df6f1 Sidebar/Chats manage button: parent-hover reveal + clearer 'manage' text
Two related fixes for the Chats section header:
- The 'manage' label only slid out when the button itself was hovered.
  Add section-header-flex:hover to the reveal rule so hovering the sort
  icon (or anywhere in the section header) also opens the label.
- Parent-hover opacity bumped 0.45 → 0.85 so the 'manage' text reads
  much more clearly when revealed. Direct hover on the button still
  pushes to full opacity 1.
2026-06-10 22:44:50 +09:00
pewdiepie-archdaemon baa4449a03 Sidebar/Chats manage button: drop hover background tint
The shared .section-header-btn:hover rule paints a tinted background
across all section header buttons. On the Chats manage button this
showed as a box behind the sliding 'manage' label, which the user
didn't want. Override background to transparent for that one button.
2026-06-10 22:42:06 +09:00
pewdiepie-archdaemon 1ee51be420 Sessions: Esc + outside-click also close the Move-to-folder submenu
The session-dropdown Esc handler only closed .session-dropdown-menu,
leaving the .session-folder-submenu (Move to folder → folder list)
orphaned on screen. Same gap on the click-away path. Extend both
selectors to cover the submenu so a single Esc / outside-click
dismisses the whole stack.
2026-06-10 22:39:23 +09:00
pewdiepie-archdaemon 94931ba59f Chats sidebar: 'manage' label sits in flex flow so its area is clickable
Email's 'new' label is absolutely positioned to the LEFT of the '+'
icon, which works there because the '+' is the visible/clickable
anchor. The chats manage button has no visible glyph at rest, so the
label was rendered outside the button's bounding box — hovering
'manage' lost the :hover state and clicking it missed.

Override list-item-plus-label inside chats-manage-btn:
  position: static (in flex flow) + max-width:0 / max-width:80px
expand-on-hover so the button's clickable rect grows alongside the
text. Hover stays sticky; click hits.
2026-06-10 22:39:05 +09:00
pewdiepie-archdaemon 49ecd806a2 Chats sidebar: 'manage' slides in from the side like email's 'new'
The list-item-plus-label slide-in needs a visible anchor element so
the button takes up consistent width and the absolutely-positioned
label can fly in to the left of it. Email uses the '+' SVG as that
anchor; here we use an empty 13x13 spacer span instead — same
footprint, no glyph. Result: empty button at rest (still visible per
the chats-manage-btn fade rules), 'manage' slides in from the left
on direct hover.
2026-06-10 22:37:06 +09:00
pewdiepie-archdaemon 1eaa5c2a81 Sessions sort: nudge auto-sort icon + 'Group' text 4px left (10→6 left padding) 2026-06-10 22:35:21 +09:00
pewdiepie-archdaemon e107c5876e Chats sidebar: drop library SVG from manage button — text-only 'manage'
Removed the book/library SVG and list-item-plus-btn/-label classes.
The button is now a plain text button styled like email's 'new' label
(9.5px, 0.02em letter-spacing), reusing the existing chats-manage-btn
opacity hover-reveal rules so it still fades until you hover the
section.
2026-06-10 22:35:06 +09:00
SurprisedDuck e115b0155c fix(security): don't grant tool access in the pre-setup window (#3506)
* fix(security): don't grant tool access in the pre-setup window

owner_is_admin_or_single_user() returned True whenever auth was not
configured, which conflated two very different states:

  - intentional single-user mode (operator set AUTH_ENABLED=false), and
  - the pre-setup window (auth enabled, but no admin created yet).

In the second state, blocked_tools_for_owner() returned an empty set, so
server-execution tools (bash/python) and other admin-only tools were
ungated. The auth middleware already 401s /api/ requests pre-setup, but a
caller that bypasses it (trusted loopback / internal-tool path) could reach
those tools before setup completed.

Treat "not configured" as admin only when auth is intentionally disabled
(AUTH_ENABLED=false), mirroring the AUTH_ENABLED parsing in app.py and
core.middleware. Single-user mode is preserved; the pre-setup window is now
non-admin as defense-in-depth.

Adds regression tests for both states.

Fixes #3201

Supported by Claude Opus 4.8

* refactor(security): reuse _auth_disabled() instead of a duplicate helper

Addresses review on #3506: src/auth_helpers.py already has _auth_disabled()
with the identical AUTH_ENABLED parse. Drop the duplicate
_auth_intentionally_disabled() and call the existing helper via a lazy import
inside owner_is_admin_or_single_user (mirroring the lazy core.auth import) to
avoid any import cycle. Removes the now-unused `import os`. Behaviour and the
two regression tests are unchanged.

Supported by Claude Opus 4.8

---------

Co-authored-by: SurprisedDuck <288741682+SurprisedDuck@users.noreply.github.com>
2026-06-10 14:37:26 +02:00
broken💎shaders 59fc6604be Merge branch 'dev' into fix/no-scroll-snapping 2026-06-10 19:58:30 +08:00
ooovenenoso 725d174243 fix(research): track analyzed URLs separately (#3125)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-10 12:08:22 +01:00
Yeoh Ing Ji 3e49658204 refactor(tools): extract document tools to handle registry (#3666)
* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* refactor: update import paths for document tools

* fix(tests): correct source path for document ID test
2026-06-10 10:41:52 +02:00
pewdiepie-archdaemon 4f7061fd61 Settings overhaul + UI polish pass
Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:

Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
  tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
  pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
  with a cross-link to AI Defaults. Provider hints use real clickable
  anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
  carries only Endpoint+Model with a cross-link to Search; Vision /
  Default / Utility fallbacks unified under one numbered-row design
  matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
  Permissions toggle that expands the scope-edit panel, in-field
  copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
  under the button (drop-up on tight viewports); each integration
  form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
  the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
  row at the bottom that loops every category.

AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
  server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
  persona's system prompt; empty/unknown falls back to warm-neutral.

Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
  so dismissing a popup no longer closes the whole Settings modal.

Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
  search/integrations/reminders for card h2 icons + the Added Models
  sub-section icons.

Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
  of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
  blocks; shorter subtitle wording.

Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
  every integration form.
- Provider logos render inline next to the search fallback selects
  and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
  button.
- Image generation default flipped to off.
2026-06-10 15:15:13 +09:00
Alexandre Teixeira fc8e6366dd test: mark first slow tests from duration evidence (#3711) 2026-06-10 01:07:38 +02:00
Lucas Daniel 55ff22c6d5 fix(chat): stabilize system prompt, sequence memory extraction, and send stable session id to preserve KV cache (#3360)
* fix(chat): stabilize system prompt, sequence memory extraction, send stable session id to preserve KV cache

Fixes #2927. As diagnosed in the issue, three things in Odysseus's request
pattern actively destroyed local backends' (llama.cpp / LM Studio) KV-cache
continuity, forcing a full prompt re-evaluation (15-30s+) on every turn:

1. Dynamic content folded into the system prompt every turn. Both the chat
   preface (ChatProcessor.build_context_preface) and the agent system prompt
   (_build_system_prompt) injected current_datetime_prompt() — text that
   changes every minute — directly into system-role messages, which llm_core
   then concatenates into the single system message sent as the cached
   prefix. Any byte difference there invalidates the entire cache. Moved this
   to a new current_datetime_context_message() helper that returns a
   standalone user-role message, inserted near the end of the array (right
   before the latest user turn) instead of mixed into the system prompt. The
   static system prefix (preset prompt + safety policy + agent base prompt)
   now stays byte-identical across turns of the same session.

2. Memory/skill extraction side-requests competed with the main completion.
   run_post_response_tasks fired extract_and_store / maybe_extract_skill via
   asyncio.create_task — fire-and-forget coroutines that could overlap the
   next turn's main request and steal llama.cpp's limited processing slots,
   evicting the cached checkpoint. They're now queued through a new
   _run_extraction_jobs_sequentially helper that waits for the session's
   stream to go idle and runs the jobs strictly one at a time.

3. No stable session identifier was sent to local backends, so llama.cpp
   assigned a new processing slot via LRU every turn ("session_id=<empty>
   server-selected (LCP/LRU)"), losing slot affinity. Added
   _apply_local_cache_affinity() in llm_core, which sets session_id and
   cache_prompt: true on outgoing payloads — gated to self-hosted
   OpenAI-compatible endpoints only (never api.openai.com or other cloud
   providers, which reject unrecognized request fields with a 400). Threaded
   session_id through stream_llm / llm_call_async / stream_agent_loop from
   the existing Odysseus session id.

Tests in tests/test_kv_cache_invalidation_2927.py exercise the real payload-
assembly and scheduling code paths: byte-identical system prefix across two
turns of the same session (with a regression check that genuinely changed
instructions DO still change it), the dynamic time block landing as a
user-role message, extraction jobs waiting for the stream to go idle and
running sequentially, and the outgoing payload carrying a stable session_id
(same across turns of one session, different across sessions) only for
self-hosted endpoints. Updated tests/test_user_time.py for the new message
placement.

* fix(tests): accept owner= kwarg in normalize_model_id monkeypatch

The upstream normalize_model_id signature now takes an owner= keyword
argument, and chat_helpers.py passes owner=getattr(sess, "owner", None)
at the call site. Update the test stub lambda to **kwargs so it handles
the new argument without breaking, and update chat_helpers.py to forward
the owner parameter consistently.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 22:46:54 +01:00
Lucas Daniel d273085744 fix(integrations): truncate api_call JSON lists with sentinel instead of mid-string cut (#3540)
* fix(integrations): truncate api_call JSON lists with sentinel instead of mid-string cut

* fix(integrations): avoid mutating response dict in-place on truncation

* fix(integrations): truncate dict responses and bound list sentinel overhead

- Dict path now walks keys in insertion order, adding them one at a time
  while checking that the accumulated dict + _truncated marker fits within
  the 12 000-char limit. Previously the marker was appended without removing
  any content, so large dicts were not actually truncated.
- List path now subtracts the sentinel's serialised size (+ element-separator
  padding) from the budget before binary-searching, so the final array
  including the sentinel stays at or under the limit.
- Add regression tests: large-dict actually-truncated, small-dict pass-through,
  and list-with-sentinel respects the size bound.

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 22:34:08 +01:00
Kenny Van de Maele 8753daf357 chore: backport main-only changes to dev AGPL relicense + Cookbook serve fix (#3704)
* Change project license to AGPL-3.0-or-later

* Fix Cookbook serve server selection

---------

Co-authored-by: pewdiepie-archdaemon <pewdiepie-archdaemon@users.noreply.github.com>
2026-06-09 23:20:34 +02:00
Michael 2e6fff2212 fix: preserve reasoning_content in sanitized messages for Moonshot/Kimi (#3152)
Providers like Moonshot (Kimi K2.5/K2.6) require the reasoning_content
field to be present on assistant tool-call messages in multi-turn
conversations.  The sanitizer's allow-list was missing this field,
causing HTTP 400: 'thinking is enabled but reasoning_content is missing
in assistant tool call message at index N'.

Add reasoning_content to the allowed field set in
_sanitize_llm_messages and cover with regression tests.

Fixes #3118

Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 21:44:38 +01:00
TimHoogervorst 8878443426 fix(calanders): Removed/merged duplicate calender delete endpoints (#3682)
* merged two delete_calander functions performing the same thing

* added proper 404 raise when nothing is found

* removed 404 HTTPException and jus reverted it back to raise
2026-06-09 22:35:55 +02:00
Alexandre Teixeira a22c0fa85e test: pilot core database stub helper (#3685) 2026-06-09 22:23:33 +02:00
TimHoogervorst b1af29c7bc fix(chat): add aria-label and title attributes to dismiss button for accessibility (#3693) 2026-06-09 22:15:40 +02:00
OdWar420 2fae3b5f64 perf(http): gzip-compress text responses (#3690)
The frontend's text assets shipped uncompressed on every cold load. Add
Starlette's GZipMiddleware. Measured on the current assets:

- style.css   1,127 KB -> 238 KB  (-79%)
- index.html    202 KB ->  35 KB  (-83%)
- chat.js       238 KB ->  60 KB  (-75%)

minimum_size=1024 skips tiny bodies; Starlette excludes `text/event-stream` by
default, so the SSE streams (chat, shell, research, model-probe — all served with
media_type="text/event-stream") are never compressed or buffered. Composes
cleanly with the existing security-header middleware. No behavioural change.

Built by OdWar -- with Claude thinking alongside.
2026-06-09 22:12:24 +02:00
arnodecorte 38dc9a0a41 Allow cookbook scopes for API tokens (#3090)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 21:03:40 +01:00
Rohith Matam fbd8ee9033 fix: fall back for npx cache subprocess check (#3560)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 20:41:23 +01:00
Kenny Van de Maele de80b065f2 fix(macos): start ChromaDB in start-macos.sh so tool calling works (#3664)
* fix(macos): start ChromaDB from start-macos.sh so tool calling works

start-macos.sh never started ChromaDB, so the tool index failed to initialize
and tool/MCP injection silently degraded on native macOS installs (no Docker).
Start a local chroma from the venv before launching, mirroring the existing
Apfel background+trap pattern: idempotent (skips if 8100 is already serving),
honors CHROMADB_HOST/CHROMADB_PORT (skips when remote), logs to a file, persists
to data/chroma, and is killed in the exit trap.

Fixes #3297

* fix(macos): bind/probe ChromaDB on IPv4 loopback to match app resolution

Binding to the literal localhost can land on IPv6 ::1 while the app connects to
localhost->127.0.0.1, leaving them unable to reach each other. Pin bind + probe
to 127.0.0.1 (0.0.0.0 still honored).

* style(macos): trim chromadb comments (present-tense, no issue refs)
2026-06-09 19:37:18 +01:00
Rares Tudor 016157019c fix(tools): use _INTERNAL_BASE in serve-session endpoint registration (#3675)
#3322 renamed the loopback base to _INTERNAL_BASE, but a later Cookbook
commit reintroduced one call site using the old _COOKBOOK_BASE name,
raising NameError whenever the agent registers a model endpoint for a
running serve session.

Fixes #3669
2026-06-09 20:31:29 +02:00
RaresKeY 5d33393a28 fix(gallery): fail closed for null-user owner scope (#3613) 2026-06-09 20:20:21 +02:00
Alexandre Teixeira cdfda4bd16 test: add fast lane and duration visibility (#3659) 2026-06-09 20:11:47 +02:00
Sid 9e74a327f8 fix(llm): remove max_output_tokens from ChatGPT Subscription payload (#3656)
ChatGPT's Codex API rejects any request that includes max_output_tokens,
returning HTTP 400 "Unsupported parameter: max_output_tokens". This caused
Deep Research to always fail during the endpoint probe when a ChatGPT
Subscription model was selected.

Remove the conditional that set payload["max_output_tokens"] in
_build_chatgpt_responses_payload(). The parameter is simply not sent.

Also update the two affected tests:
- Rename test_chatgpt_subscription_payload_uses_max_output_tokens →
  test_chatgpt_subscription_payload_omits_max_output_tokens
- Rename test_chatgpt_subscription_payload_omits_empty_max_output_tokens →
  test_chatgpt_subscription_payload_omits_max_output_tokens_when_zero
- Assert max_output_tokens is absent rather than present

Fixes #3650
2026-06-09 17:42:12 +02:00
Ashvin 60d25e0e26 fix(cookbook): use COOKBOOK_STATE_FILE constant for state path (#3623)
The module derived its state file path as Path(os.environ.get("DATA_DIR", "data"))
/ "cookbook_state.json". The correct env var is ODYSSEUS_DATA_DIR, which is
already read by src/constants.py and exported as COOKBOOK_STATE_FILE. When
ODYSSEUS_DATA_DIR is set (Docker, custom installs), the old code read the wrong
env var and silently wrote state to data/cookbook_state.json relative to CWD
while every other file resolved under the custom data directory.

Fixes #3621
2026-06-09 17:39:06 +02:00
RosenTomov c46d37d876 test(tool_execution): stop two tests leaking src.tool_execution into the suite (#2686)
* Make in-venv pip-fallback test independent of the runner's environment

test_pip_install_fallback_chain_propagates_failure_in_venv simulated the in-venv case by probing the real interpreter (sys.prefix != sys.base_prefix). That assumes the test runner is itself inside a venv. CI runs pytest with no venv, so venv_check reported not-in-venv, the negated guard flipped, the --user branch fired, and the assertion failed. Make venv_check exit 0 directly to simulate the in-venv condition deterministically, mirroring the outside-venv companion test.

* Stop agent-tool import shims from leaking into the admin-gate test

test_function_call_non_object_args and test_unknown_tool_calls stub heavy DB/auth deps at import time to load the real agent-tool stack, but they popped src.tool_execution and left core.auth stubbed without restoring. Popping and re-importing src.tool_execution rebinds the src package's tool_execution attribute, so test_edit_file's later 'import src.tool_execution as te' resolved to a different module object than the one execute_tool_block lives in. The monkeypatch on _owner_is_admin then missed, the non-admin edit_file gate never fired, and the edit went through (exit_code 0). Stop touching src.tool_execution and restore the heavy stubs after import. Verified the full suite is green on Linux (Python 3.11, matching CI).

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 16:35:10 +01:00
Alexandre Teixeira d4ab09e8e1 test: add focused test selection runner (#3556) 2026-06-09 17:03:47 +02:00
Sheikh Rahat Mahmud 9180847c0e feat(diagnostics): add consolidated service health endpoint for degraded-state reporting (#964)
* Add consolidated service health endpoint for degraded-state reporting

ROADMAP (High Priority) asks for "Better degraded-state reporting for
ChromaDB, SearXNG, email, ntfy, and provider probes." Until now there was no
single readout of which subsystems are actually working: /api/health is only a
liveness ping and each subsystem's signal lives in a different module, so a
misconfigured self-host install gives no consolidated picture.

This adds an admin-only GET /api/diagnostics/services endpoint backed by a new
src/service_health.py aggregator. Each subsystem reports a uniform
{name, status, detail, meta} where status is ok | degraded | down | disabled,
and the response rolls up an overall verdict (worst non-disabled status).

Probes are deliberately non-intrusive and safe to poll:
- ChromaDB: reads the .healthy flags on the RAG and memory vector stores.
- SearXNG: GET /healthz (2xx), falling back to the instance root (<500). No
  search query is run.
- ntfy: GET the server's built-in /v1/health. No test notification is sent.
- email: short IMAP connect+logout per configured account (no credentials in
  meta).
- providers: probe each enabled ModelEndpoint's model list (no api_key in meta).

Probe functions take their inputs as parameters and isolate the network call to
injectable callables, so they unit-test without touching the network (same
pattern as the merged provider-endpoint tests). Network probes run concurrently
off the event loop via asyncio.to_thread with bounded per-probe timeouts.

memory_vector is now passed into setup_diagnostics_routes (new optional param,
backward-compatible) so ChromaDB's vector-memory store can be reported too.

Tests: tests/test_service_health.py — 29 tests covering every status mapping
per subsystem, the overall rollup, and that no secrets leak into meta.

Verification:
  python -m pytest tests/test_service_health.py -q          # 29 passed
  python -m py_compile src/service_health.py routes/diagnostics_routes.py app.py
  python -m pytest tests/test_endpoint_resolver.py tests/test_provider_endpoints.py -q

Backend + tests only; an Admin/Settings UI badge that renders this endpoint is
a natural follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(diagnostics): bound service-health wall-clock and redact secrets

Addresses review on #964.

Blocker 1 — genuinely bounded wall-clock:
- providers_health and email_health now fan out per-item probes across a
  bounded thread pool (_bounded_map) with a hard total budget (_FANOUT_BUDGET),
  instead of probing endpoints/accounts sequentially. Stragglers are reported
  as a controlled `timeout` and never block; the pool is shut down with
  wait=False so the response returns on time regardless of endpoint/account
  count.
- The IMAP connect path now honors the service-health budget: _imap_connect
  gained a pass-through `timeout` param and the probe calls it with
  _PROBE_TIMEOUT instead of the default 15s.
- collect_service_health runs the four network subsystems concurrently, each
  under a per-subsystem deadline (_SUBSYSTEM_DEADLINE), with an overall
  wait_for ceiling (_AGGREGATE_DEADLINE) as a backstop.

Blocker 2 — no secret/raw-error leakage in the response:
- _safe_url strips userinfo, query, and fragment from every URL surfaced in
  meta (searxng instance, ntfy base, provider name fallback), keeping only
  scheme/host/port/path.
- _classify_error maps every probe failure to a controlled category token
  (timeout, connection_refused, dns_error, tls_error, network_error,
  http_error, auth_or_protocol_error, …) — raw str(exception), which can embed
  credentialed URLs or server text, is never returned.

Tests (tests/test_service_health.py, +tests/test_diagnostics_service_route.py):
- URL userinfo/query redaction for searxng/ntfy/providers.
- secret-bearing exception strings map to categories and don't leak.
- multiple slow providers/accounts stay bounded (single + 25-endpoint cases).
- subsystems run concurrently; aggregate deadline yields a controlled result.
- route-level unauthenticated (401) / non-admin (403) / admin (200) coverage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(diagnostics): isolate route tests so they don't leak module globals

The new route tests replaced src.service_health.collect_service_health and
routes.diagnostics_routes.require_admin via direct assignment, which persisted
for the rest of the pytest session. In CI's full alphabetical run that fake
collector (returning services=[]) leaked into the later collect_service_health
tests and failed them. Switch to monkeypatch.setattr so both are restored after
each test. No production code change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 16:00:24 +01:00
Maanas c1674fc2aa refactor(tools): migrate execution logic to src/agent_tools/ package with handler registry (#3435)
* refactor(tools): implement strict cohesive class coordinator pattern per #2917

* test: update edit_file tests to use EditFileTool class

* fix(tools): restore tool_policy param and security backstop in coordinator

* refactor(tools): migrate domain tools to agent_tools package per #2917

* test: update test imports for new agent_tools package

* fix: resolve circular import between tool_execution and agent_tools

* fix: remove leftover git conflict markers

* fix(tools): resolve pytest failure and document _apply method

* fix(tools): clean up whitespace and remove dead _tool_python helper

---------

Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 14:35:36 +01:00
Joshua Valderrama 35b4dd2824 fix: session context drifting — messages leaking between chats (#135) (#267)
* docs: add implementation plan for fixing chat context drifting (#135)

* fix: make Session.history immutable + fix {}.history crash

- Session.history now exposes a COPY of the internal _history list
- add_message() replaces history with a fresh copy each time
- get_context_messages() derives from _history directly
- replace_messages() updates both _history and history
- truncate_messages() updates both _history and history
- _persist_message() line 207: fixed {}.history fallback crash
- Added 11 tests for session isolation and edge cases

Addresses #135 root cause #1: shared mutable references

* fix: task scheduler uses SessionManager methods instead of overwriting sessions

- Added ensure_task_session() to SessionManager (checks cache first)
- Task scheduler now uses ensure_task_session() instead of direct dict assignment
- Task scheduler now uses SessionManager.add_message() for message persistence
- Removed direct sess_obj.history.append() that was silently losing data

Addresses #135 root causes #2 and #3

* fix: add age guard to cleanup_empty_sessions — don't delete sessions <1h old

Prevents the cleanup task from deleting sessions that were just created
and haven't received any messages yet (message_count == 0).

Addresses #135 root cause #5

* test: comprehensive session isolation tests (10/10 passing)

* refactor: consolidate _session_manager into singleton pattern

- Added set_session_manager_instance / get_session_manager_instance to core/models
- kept backward-compat aliases (set_session_manager, get_session_manager)
- session_manager.py re-exports the singleton functions
- ai_interaction.set_session_manager now syncs with the core singleton
- context_compactor uses get_session_manager_instance() instead of getattr hack
- app.py initializes the singleton once

Addresses #135 root cause #4: fragile global wiring

* test: add concurrent session isolation integration tests

Verifies:
- Concurrent add_message to different sessions doesn't cross-contaminate
- Rapid parallel writes maintain isolation
- Read-write concurrent access is safe

All 3 async tests pass, proving the immutable history fix works under concurrency

* fix: pre-import core.models in conftest to prevent test pollution

test_agent_loop.py stubs sys.modules['core.models'] = MagicMock() at
module level during collection. Any test collected after it imports
Session as a MagicMock. Pre-importing core.models in conftest.py
before test_agent_loop.py's module-level code runs prevents this.

* fix: make .history authoritative mutable list, address PR review

Per review feedback: keep .history as the authoritative mutable list so
existing code doing .history.pop(), .history = [...], etc. still works.
Fix the cross-contamination bug by ensuring __post_init__() gives each
Session its OWN unique history list (never shared).

Changes:
- core/models.py: .history IS the authoritative list. _history aliases it.
  Each Session gets its own list in __post_init__.
- core/session_manager.py: add_message() delegates to Session.add_message()
  instead of appending directly — no double-append, single source of truth.
- tests/test_session_manager.py: updated test to reflect that .history
  references see new messages (same list, not a snapshot).
- docs/plans/2026-06-01-fix-chat-context-drifting.md: removed (not for
  shipping — useful design context but too much process/doc to ship).

All 272 tests pass (3 pre-existing failures unrelated).

* Fix session manager message persistence

* Fix session history alias regressions

* Fix session history aliasing and task delivery
2026-06-09 14:12:52 +01:00
Maruf Hasan c3fcaf15b7 feat(providers): add NVIDIA AI provider endpoint support (#3456)
* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com)

* feat: add NVIDIA option to provider settings dropdown and aliases

* test: add NVIDIA provider detection and endpoint tests

* Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering

- nvidia.com -> 'nvidia' curated key for proper provider routing
- _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed
- _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip,
  kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse,
  -embedqa, -nemoretriever

* Expand non-chat model filtering for NVIDIA embedding/guard/video models

Add _NON_CHAT_PREFIXES: embed, recurrent
Add _NON_CHAT_CONTAINS: topic-control, guard, calibration,
  ai-synthetic-video, cosmos-reason2

Catches remaining unfiltered non-chat models from NVIDIA catalog:
embedding (llama-nemotron-embed, embed-qa), guard (llama-guard,
nemoguard-topic-control), calibration (ising-calibration),
video (ai-synthetic-video-detector, cosmos-reason2),
recurrent (recurrentgemma-2b)

* Filter non-chat models in _probe_endpoint via _is_chat_model()

Previously _is_chat_model() was only used in the per-model probe
and _first_chat_model(), so non-chat models still appeared in the
model picker even though they were filtered in those specific paths.
Applying the filter at _probe_endpoint() return ensures non-chat
models (embeddings, safety guards, reward, calibration, video
detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never
enter cached_models and never appear in the picker.

* Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models

Prefix checks (mid.startswith) miss models with org prefixes like
baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc.
Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught
regardless of the org prefix.

Adds: embed, bge, recurrent, starcoder, gemma-2b

* fix(model-routes): drop collision-prone substrings from global non-chat filter

The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES
and _NON_CHAT_CONTAINS tuples. These are intended to filter out
embedding, retrieval, safety, and vision models from NVIDIA's catalog
that are not chat-completions-capable. However, four of the added
substrings collide with legitimate chat models served by other providers:

  - gemma-2b  matches google/gemma-2b-it (instruct chat model)
  - starcoder matches bigcode/starcoder2-15b (code completion model)
  - recurrent matches google/recurrentgemma-2b (language model)
  - guard     matches meta-llama/Llama-Guard-3-8B (safety classifier)

Removing these four from the global tuples keeps the NVIDIA-specific
filtering intact (safety, embedding, retrieval, and vision models are
still caught by other tokens such as content-safety, -safety, -reward,
embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while
preventing false negatives for instruct/code models on other providers.

Tests added for gemma-2b-it, google/gemma-2b-it, and
bigcode/starcoder2-15b-instruct asserting they are recognized as chat
models.

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS

Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS
entries redundant since the prefix check runs first.

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* style: fix indentation of groq and xai test cases in test_provider_endpoints.py

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-09 11:06:12 +02:00
Mazen Tamer Salah 3c4ec8828b fix(embeddings): survive numpy embeddings when restoring a reset lane (#3410)
When a lane reset fails to rewrite the recreated collection, the recovery path
re-adds the preserved rows. It read the embeddings with
`preserved.get("embeddings") or []` and gated the loop with
`if ids and docs and old_embeddings:`. chromadb returns embeddings as a numpy
ndarray, whose truth value is ambiguous, so both expressions raise ValueError
inside the except block — the restore is abandoned and every preserved row is
lost (the collection was already deleted), exactly when the code is trying to
avoid data loss.

Use an explicit `is None` check and `len(...)`, and convert ndarray batches to
lists before re-adding.

Adds tests/test_embedding_lane_ndarray_restore.py (preserved embeddings come
back as np.ndarray); existing test_embedding_lanes.py still passes.
2026-06-09 10:40:17 +02:00
Ashvin 2fdb4813db fix(auth): sync file-backed and in-memory owner caches on user rename (#3397)
The DB owner-rename loop in rename_user patched every SQL column named
owner, but three non-SQL stores were left behind:

1. session_manager.sessions -- in-memory Session objects carry s.owner
   set at server-boot time. get_sessions_for_user() does an exact
   s.owner == username check, so the renamed user chat sidebar goes empty
   until a server restart.

2. data/deep_research/*.json -- each completed research report is a
   standalone JSON file with an owner field. research_routes filters
   by d.get(owner) == user, making every report invisible to the
   renamed user.

3. data/memory.json -- a flat JSON array; each entry carries an owner
   field. memory_manager.load(owner=user) filters on it, so all memories
   vanish from the memory panel.

Fix: after the SQL loop, patch all three:
- iterate sm.sessions and update owner in-place (exposed via app.state)
- walk data/deep_research/*.json and rewrite owner with atomic_write_json
- update matching entries in memory.json with atomic_write_json

All three use the same case-insensitive lower() comparison the SQL loop
already uses. Each step is independently wrapped so a single failure
does not abort the others or the rename itself.

Fixes #3362
2026-06-09 10:19:45 +02:00
nubs f1cda91683 fix(agent): scope skill index to owner (#2404)
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-09 09:51:29 +02:00
Kenny Van de Maele 0aba00f4cf refactor(tools): remove dead workspace-confinement plumbing (#3590)
Commit e6b1009 removed the workspace feature's entry point (deleted
routes/workspace_routes.py + static/js/workspace.js and dropped the
workspace-param parsing in chat_routes), but left the downstream backend
plumbing dangling: chat_routes passed a hardcoded workspace=None into
stream_agent_loop, which forwarded it to execute_tool_block, so the
workspace value was permanently None and every workspace-gated branch
was unreachable.

Remove the now-dead code (no behavior change, since workspace was always
None):
- src/tool_execution.py: drop _resolve_tool_path_in_workspace and the
  workspace params/branches on execute_tool_block, _direct_fallback,
  _call_mcp_tool, _do_edit_file, and _resolve_search_root; restore the
  bash/python/bg cwd to _AGENT_WORKDIR.
- src/agent_loop.py: drop the workspace param on stream_agent_loop, the
  dead 'ACTIVE WORKSPACE' system-prompt block, and the workspace forward.
- routes/chat_routes.py: drop the hardcoded workspace=None arg and var.
- tests: delete test_workspace_confine.py (tested the removed feature) and
  the workspace assertion in test_tool_policy.py.

Full suite: 2903 passed, 1 skipped.
2026-06-09 08:30:50 +02:00
pewdiepie-archdaemon 7690860ab1 Settings/Add Models: bump Local Type select width 57→62px 2026-06-09 15:12:57 +09:00
pewdiepie-archdaemon b6366e9da5 Settings/Add Models: fuse Local Type select + URL input into one bordered group 2026-06-09 15:12:12 +09:00
pewdiepie-archdaemon 64122269e9 Settings/Add Models: shrink Local Type select by 15px (72→57) 2026-06-09 15:11:07 +09:00
pewdiepie-archdaemon 1bdd515941 Settings/Add Models: drop 'Type:' label, keep the LLM/Image select 2026-06-09 15:10:48 +09:00
pewdiepie-archdaemon 8ac0ae72dc Settings/Add Models: Local card — Type and Add inline with URL field
Lift the LLM/Image Type select to the left of the URL input and the Add
button to its right, so the primary action (URL + Add) sits on one row.
Scan / Ollama / Key / Test stay on the action row below.
2026-06-09 15:09:28 +09:00
Afonso Coutinho fbed9027b0 fix: backup import dropping a user's skill on cross-tenant title/id collision (#2057)
* Fix backup import dropping a user's skill on cross-tenant title/id collision

The skills block of import_data deduped incoming skills against
skills_manager.load_all(), which returns EVERY tenant's skills. So when
a user imports their own backup, any skill whose id or title collides
with another user's skill was silently skipped — the importing user
lost their own data. This is the same cross-tenant bug already fixed
for the memories block just above (#1743); the skills block was left
with the old pattern. Filter the dedup sets to the importing user's own
skills (owner == user); the full store is still saved back, preserving
other users' skills.

* Restore sys.modules after stubbing so backup test does not break collection of later src.* test modules

* Patch backup_routes auth helpers via monkeypatch instead of sys.modules stubs so the test is import-order robust

* Give FakeSkillsManager an add_skill method matching the disk-backed skills API
2026-06-09 08:04:22 +02:00
Disorder AA d9141c6e56 fix(cookbook): allow spaces and non-ASCII characters in model directory paths (#3473)
* fix(cookbook): allow spaces in model directory paths

Allow POSIX external-drive paths and Windows drive paths with spaces while keeping shell metacharacters rejected.

* fix(cookbook): also allow non-ASCII (Unicode) characters in model dir paths

The ASCII-only allowlist that rejected spaces also rejected Cyrillic,
accented Latin and CJK folder names (e.g. /Volumes/Модели,
D:\AI Models\Модели) with 400 Invalid local_dir. Switch the path
character class from [A-Za-z0-9._ -] to [\w. -] (\w is Unicode-aware on
Python 3 str patterns) so localized folder names validate, while shell
metacharacters (; & | ` $ quotes newlines) stay rejected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(cookbook): reject local_dir path segments starting with '-'

The local_dir allowlist includes '-', so a directory like /models/-rf
(or D:\models\-rf) could be parsed as a CLI flag by hf/etc. (option
injection) — and quoting does not stop a value from being read as an
option. Guard against it inside the validator so the safety stays fully
self-contained there rather than depending on consumers' quoting.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 07:58:38 +02:00
pewdiepie-archdaemon b2458f9891 Settings/Add Models: split Local and API into separate cards, always show API key
Drop the in-card Local/API tab strip — each is now its own admin card with
a normal h2 heading (Local on top, API below). The API key input is
always visible (no more click-to-reveal toggle), matching how cloud
providers actually work. Local keeps the optional key reveal since
local servers usually don't need one.

Dead code removed: wireModelsTabs IIFE and the adm-epApiKeyBtn toggle wire.
2026-06-09 14:57:42 +09:00
pewdiepie-archdaemon 2252776a97 Settings: promote Added Models to its own sidebar menu
Move the Added Models endpoint lists out of the Add Models card into a
dedicated sidebar tab between Add Models and AI Defaults. The card now
focuses purely on adding (Local / API tabs), while the new panel owns
the existing endpoints + Probe and Clear-offline controls.

admin.js: defensive fallback so a stale 'added' value in localStorage
falls back to 'local' instead of leaving both panes hidden.
2026-06-09 14:52:48 +09:00
pewdiepie-archdaemon c9fecd53dc Settings: third 'Added Models' tab in Add Models card
Move the Added Local + Added API lists out of the per-type tabs into
a dedicated third tab. Each Add tab is now just the form; the new tab
collects both lists together with Local / API subheadings.

Card layout:
  Add Models  [Probe] [Clear offline]
    [Local]  [API]  [Added Models]

Tab content:
  Local         → Add Local form
  API           → Add API form
  Added Models  → Local list + API list (subheadings)

All endpoint list/form IDs preserved. Tab switcher JS is generic so
the new 'added' tab works without code changes.
2026-06-09 14:47:21 +09:00