* fix(email): keep FETCH attributes Gmail sends after the header literal
imaplib returns a UID FETCH response as an interleaved list of
(meta, literal) tuples plus bare bytes elements. Which attributes land
where is server-specific: Dovecot sends FLAGS before the RFC822.HEADER
literal (inside the tuple meta), Gmail sends them after it, as a bare
` FLAGS (\Seen))` element. The email list grouping loop and the search
loop only inspected tuples, so on Gmail every message lost its FLAGS and
the whole mailbox rendered as unread/unflagged, with mark-read appearing
to have no effect.
Extract the grouping into _group_uid_fetch_records(), fold bare bytes
parts into the current message meta there, and reuse it in both the
batched list fetch and the per-UID search fetch. Covered by unit tests
with captured Gmail-shaped and Dovecot-shaped responses.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test(email): use raw byte literals for IMAP backslash escapes
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
PATCH and DELETE /api/tokens/{id} both called require_admin but never
checked that the token belonged to the requesting admin. Any admin could
rename, re-scope, or delete another admin's token by ID.
create_token already stamps owner on every token — update and delete
just never read it. Fixed by comparing token.owner against
get_current_user(request) after the 404 guard, same pattern the rest of
the auth routes use. Check is skipped when current_user is falsy
(AUTH_ENABLED=false / single-user mode).
Fixes#3898
Adding a new endpoint only auto-set the global default chat endpoint when
none was configured (`if not settings.get("default_endpoint_id")`). When the
existing default pointed at an endpoint the user had since disabled, it was
never reassigned, so features that read the raw `default_endpoint_id` setting
(notably Memory → Tidy) failed with "No default model configured — set one in
Settings" even though an enabled endpoint existed.
Reassign the default when the configured endpoint is missing/disabled, via a
new pure `_default_endpoint_needs_assignment` helper. Adds unit coverage for
the helper plus route-level regression tests for the disabled/enabled cases.
Fixes#3586
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Fast/Full popover now has a kebab (three-dot) button alongside the
two preset choices. Clicking it expands a textarea below with a
'Draft with note' send button. The textarea is for the user to tell
the AI how to reply ('confirm Tuesday at 2', 'decline politely', 'say
we'll need an extra week') instead of accepting a generic draft.
Plumbing:
- emailLibrary.js: kebab button + note panel inside .email-ai-reply-choice
menu. Submitting calls _runAiReplyFromButton with mode='ai-reply-full'
and a noteHint string.
- _runAiReplyFromButton signature gains noteHint; passes it through
state._onEmailClick as opts.noteHint.
- emailInbox.js consumer: forwards opts.noteHint into _openEmail's new
5th arg, which puts it in the /api/email/ai-reply POST body as
user_hint.
- routes/email_routes.py /ai-reply: reads user_hint, appends a
'User's instructions for THIS reply' section to the user message
(priority over default tone/length). Also skips the per-message
AI-reply cache when a hint is set — the cached generic draft would
silently override the instructions otherwise.
The contacts manager in Settings was stuck at name+email inline only —
no address field, no phone input on add, no search to find anything in
a list of 100+ contacts.
UI:
- Add form gets phone and address inputs alongside name/email. The
email-required gate becomes name-OR-email so address/phone-only
entries are creatable.
- Edit form gets an address input, threaded into the PUT body.
- Search input above the list filters client-side by name / emails /
phones / address (debounced 80ms). Count badge shows N/M when a
filter is active.
Backend:
- /api/contacts/{uid} PUT now accepts address and routes it through
_update_contact (which already supports it after the previous
commit). Validation loosened: name OR email OR address.
- /api/contacts/add POST now accepts phone + address. Phone goes
through an immediate _update_contact since _create_contact's
signature only takes name+email+address.
Closes the gap that pushed the agent into manage_memory when the user
pasted an address and said 'save this for X'. manage_contact now
accepts an optional address arg end-to-end:
- routes/contacts_routes.py:
- _normalize_contact carries an 'address' field
- _build_vcard emits ADR:;;<address>;;;; (street component of the
RFC-6350 7-part ADR), only when address is non-empty
- _parse_vcards reads ADR, joins non-empty components with ', '
- _create_contact and _update_contact thread address through;
update preserves existing address when caller passes empty
- src/tool_implementations.py do_manage_contact:
- add accepts address; require at least name+address or email
(was: email required) so address-only contacts are addable
- update accepts address; require name OR emails OR address
- src/tool_schemas.py: schema gets a single 'address' string field
- src/tool_index.py + src/agent_loop.py: descriptions get one
'address' arg mention and a 'use this for save-X-for-person /
address pastes / phone-with-name' steering line. Net: a few
bytes added, not a paragraph.
Also: removed a stray name from the schema's manage_contact example
strings ('save Jonathan's email…') — no real names in the codebase.
Closes the auto-send hole that let earlier models invent signatures
(e.g. signing 'David' for a user named Felix) and SMTP them to real
recipients before the user could review.
New setting: agent_email_confirm (default True).
When on, the MCP send_email and reply_to_email tools no longer SMTP
directly — they write the composed email to scheduled_emails with a new
status 'agent_draft' (far-future send_at so the scheduled-send poller
ignores them) and return a {pending: true, pending_id, to, subject,
body, message: ...} payload. The model surfaces that to the user.
Backend endpoints to approve / cancel:
- GET /api/email/pending → list staged drafts for the owner
- POST /api/email/pending/{id}/approve → flip status to 'pending' +
backdate send_at so the
existing scheduled-send
poller delivers immediately
- DELETE /api/email/pending/{id} → status = 'cancelled'
UI:
- Settings / AI Defaults gets a new 'Email Safety' card with the
toggle, default on.
- Tool descriptions for send_email and reply_to_email now include the
pending behavior + an explicit 'DO NOT invent a signature, do not
type a person's name' guardrail.
Pass 2 (next): inline chat card with Send / Discard buttons so the user
doesn't have to type a confirmation reply. Today's prompt + the listing
endpoint give the model a clean path to surface drafts.
* fix(hwfit): tolerate non-numeric gpu_count in /api/hwfit/models
The route did `n = int(gpu_count)` with no guard, so a non-numeric query param
like `?gpu_count=abc` raised ValueError and returned HTTP 500. Parse it
defensively (mirroring the gpu_group guard a few lines above): a malformed value
is ignored, exactly like omitting the param, and valid values still apply.
Adds tests/test_hwfit_gpu_count_nonnumeric.py: a non-numeric gpu_count returns a
ranking instead of raising, and a numeric value is still accepted.
* test(hwfit): cover non-numeric manual_gpu_count too
Follow-up to the gpu_count guard: add a regression test for the sibling
manual_gpu_count query param (the hardware simulator in _apply_manual_hardware),
which dev already guards by defaulting to 1 on a non-numeric value. This pins
that behaviour so the endpoint's count parsing is fully covered and cannot
regress to a 500.
import_vcf built `text = data.get("vcf") or data.get("text") or ""`, so a
non-string JSON value (a number, list, etc.) stayed in place and the following
`text.strip()` raised AttributeError, returning HTTP 500. Coerce vcf/text/csv
with str() so non-string input degrades to the existing structured "no data"
response, matching the file's convention elsewhere.
Adds tests/test_contacts_import_nonstring.py covering non-string vcf, non-string
csv, and an empty body.
SKILL.md files written with mixed-case owner (e.g. 'owner: Alice') were
skipped because the regex had no IGNORECASE flag. _usage.json keys like
'Alice::skill-name' were missed by the startswith prefix check for the
same reason.
Both comparisons now match the same way the deep_research and memory
blocks do — case-insensitively against old_username.
Fixes#3611
Direct DbSession.owner == user becomes WHERE owner IS NULL when user is None
(auth disabled), hiding all sessions that carry an explicit owner. Same flaw
on the Document and GalleryImage sub-queries (active-doc and gallery badges).
Replace all three with owner_filter(), which is a no-op when user is falsy.
Fixes#3620
Wire the existing built-in PERSONAS catalog through to scheduled tasks
the same way I wired it to reminder synthesis. Repurposes the
dormant scheduled_tasks.character_id column.
UI (static/js/tasks.js)
- New 'Persona' select in the LLM / Research task form, with the five
built-in characters (socrates/razor/nietzsche/spark/odysseus) plus a
default 'no persona' option. Pre-populates from existing.character_id
on edit. Non-llm/research types explicitly clear it on save.
API (routes/task_routes.py)
- TaskCreate + TaskUpdate gain character_id: Optional[str].
- _task_to_dict echoes character_id back so the form can hydrate on
edit. Update endpoint stores '' as None to allow clearing.
Runner (src/task_scheduler.py)
- When task.character_id is set and matches a built-in persona, prepend
the persona prompt to the task system prompt so the model speaks in
that voice while still knowing it's running a scheduled task.
- crew_member.personality still wins as the base; character_id stacks
on top.
Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:
Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
with a cross-link to AI Defaults. Provider hints use real clickable
anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
carries only Endpoint+Model with a cross-link to Search; Vision /
Default / Utility fallbacks unified under one numbered-row design
matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
Permissions toggle that expands the scope-edit panel, in-field
copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
under the button (drop-up on tight viewports); each integration
form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
row at the bottom that loops every category.
AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
persona's system prompt; empty/unknown falls back to warm-neutral.
Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
so dismissing a popup no longer closes the whole Settings modal.
Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
search/integrations/reminders for card h2 icons + the Added Models
sub-section icons.
Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
blocks; shorter subtitle wording.
Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
every integration form.
- Provider logos render inline next to the search fallback selects
and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
button.
- Image generation default flipped to off.
* fix(chat): stabilize system prompt, sequence memory extraction, send stable session id to preserve KV cache
Fixes#2927. As diagnosed in the issue, three things in Odysseus's request
pattern actively destroyed local backends' (llama.cpp / LM Studio) KV-cache
continuity, forcing a full prompt re-evaluation (15-30s+) on every turn:
1. Dynamic content folded into the system prompt every turn. Both the chat
preface (ChatProcessor.build_context_preface) and the agent system prompt
(_build_system_prompt) injected current_datetime_prompt() — text that
changes every minute — directly into system-role messages, which llm_core
then concatenates into the single system message sent as the cached
prefix. Any byte difference there invalidates the entire cache. Moved this
to a new current_datetime_context_message() helper that returns a
standalone user-role message, inserted near the end of the array (right
before the latest user turn) instead of mixed into the system prompt. The
static system prefix (preset prompt + safety policy + agent base prompt)
now stays byte-identical across turns of the same session.
2. Memory/skill extraction side-requests competed with the main completion.
run_post_response_tasks fired extract_and_store / maybe_extract_skill via
asyncio.create_task — fire-and-forget coroutines that could overlap the
next turn's main request and steal llama.cpp's limited processing slots,
evicting the cached checkpoint. They're now queued through a new
_run_extraction_jobs_sequentially helper that waits for the session's
stream to go idle and runs the jobs strictly one at a time.
3. No stable session identifier was sent to local backends, so llama.cpp
assigned a new processing slot via LRU every turn ("session_id=<empty>
server-selected (LCP/LRU)"), losing slot affinity. Added
_apply_local_cache_affinity() in llm_core, which sets session_id and
cache_prompt: true on outgoing payloads — gated to self-hosted
OpenAI-compatible endpoints only (never api.openai.com or other cloud
providers, which reject unrecognized request fields with a 400). Threaded
session_id through stream_llm / llm_call_async / stream_agent_loop from
the existing Odysseus session id.
Tests in tests/test_kv_cache_invalidation_2927.py exercise the real payload-
assembly and scheduling code paths: byte-identical system prefix across two
turns of the same session (with a regression check that genuinely changed
instructions DO still change it), the dynamic time block landing as a
user-role message, extraction jobs waiting for the stream to go idle and
running sequentially, and the outgoing payload carrying a stable session_id
(same across turns of one session, different across sessions) only for
self-hosted endpoints. Updated tests/test_user_time.py for the new message
placement.
* fix(tests): accept owner= kwarg in normalize_model_id monkeypatch
The upstream normalize_model_id signature now takes an owner= keyword
argument, and chat_helpers.py passes owner=getattr(sess, "owner", None)
at the call site. Update the test stub lambda to **kwargs so it handles
the new argument without breaking, and update chat_helpers.py to forward
the owner parameter consistently.
---------
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* merged two delete_calander functions performing the same thing
* added proper 404 raise when nothing is found
* removed 404 HTTPException and jus reverted it back to raise
The module derived its state file path as Path(os.environ.get("DATA_DIR", "data"))
/ "cookbook_state.json". The correct env var is ODYSSEUS_DATA_DIR, which is
already read by src/constants.py and exported as COOKBOOK_STATE_FILE. When
ODYSSEUS_DATA_DIR is set (Docker, custom installs), the old code read the wrong
env var and silently wrote state to data/cookbook_state.json relative to CWD
while every other file resolved under the custom data directory.
Fixes#3621
* Add consolidated service health endpoint for degraded-state reporting
ROADMAP (High Priority) asks for "Better degraded-state reporting for
ChromaDB, SearXNG, email, ntfy, and provider probes." Until now there was no
single readout of which subsystems are actually working: /api/health is only a
liveness ping and each subsystem's signal lives in a different module, so a
misconfigured self-host install gives no consolidated picture.
This adds an admin-only GET /api/diagnostics/services endpoint backed by a new
src/service_health.py aggregator. Each subsystem reports a uniform
{name, status, detail, meta} where status is ok | degraded | down | disabled,
and the response rolls up an overall verdict (worst non-disabled status).
Probes are deliberately non-intrusive and safe to poll:
- ChromaDB: reads the .healthy flags on the RAG and memory vector stores.
- SearXNG: GET /healthz (2xx), falling back to the instance root (<500). No
search query is run.
- ntfy: GET the server's built-in /v1/health. No test notification is sent.
- email: short IMAP connect+logout per configured account (no credentials in
meta).
- providers: probe each enabled ModelEndpoint's model list (no api_key in meta).
Probe functions take their inputs as parameters and isolate the network call to
injectable callables, so they unit-test without touching the network (same
pattern as the merged provider-endpoint tests). Network probes run concurrently
off the event loop via asyncio.to_thread with bounded per-probe timeouts.
memory_vector is now passed into setup_diagnostics_routes (new optional param,
backward-compatible) so ChromaDB's vector-memory store can be reported too.
Tests: tests/test_service_health.py — 29 tests covering every status mapping
per subsystem, the overall rollup, and that no secrets leak into meta.
Verification:
python -m pytest tests/test_service_health.py -q # 29 passed
python -m py_compile src/service_health.py routes/diagnostics_routes.py app.py
python -m pytest tests/test_endpoint_resolver.py tests/test_provider_endpoints.py -q
Backend + tests only; an Admin/Settings UI badge that renders this endpoint is
a natural follow-up.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(diagnostics): bound service-health wall-clock and redact secrets
Addresses review on #964.
Blocker 1 — genuinely bounded wall-clock:
- providers_health and email_health now fan out per-item probes across a
bounded thread pool (_bounded_map) with a hard total budget (_FANOUT_BUDGET),
instead of probing endpoints/accounts sequentially. Stragglers are reported
as a controlled `timeout` and never block; the pool is shut down with
wait=False so the response returns on time regardless of endpoint/account
count.
- The IMAP connect path now honors the service-health budget: _imap_connect
gained a pass-through `timeout` param and the probe calls it with
_PROBE_TIMEOUT instead of the default 15s.
- collect_service_health runs the four network subsystems concurrently, each
under a per-subsystem deadline (_SUBSYSTEM_DEADLINE), with an overall
wait_for ceiling (_AGGREGATE_DEADLINE) as a backstop.
Blocker 2 — no secret/raw-error leakage in the response:
- _safe_url strips userinfo, query, and fragment from every URL surfaced in
meta (searxng instance, ntfy base, provider name fallback), keeping only
scheme/host/port/path.
- _classify_error maps every probe failure to a controlled category token
(timeout, connection_refused, dns_error, tls_error, network_error,
http_error, auth_or_protocol_error, …) — raw str(exception), which can embed
credentialed URLs or server text, is never returned.
Tests (tests/test_service_health.py, +tests/test_diagnostics_service_route.py):
- URL userinfo/query redaction for searxng/ntfy/providers.
- secret-bearing exception strings map to categories and don't leak.
- multiple slow providers/accounts stay bounded (single + 25-endpoint cases).
- subsystems run concurrently; aggregate deadline yields a controlled result.
- route-level unauthenticated (401) / non-admin (403) / admin (200) coverage.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* test(diagnostics): isolate route tests so they don't leak module globals
The new route tests replaced src.service_health.collect_service_health and
routes.diagnostics_routes.require_admin via direct assignment, which persisted
for the rest of the pytest session. In CI's full alphabetical run that fake
collector (returning services=[]) leaked into the later collect_service_health
tests and failed them. Switch to monkeypatch.setattr so both are restored after
each test. No production code change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com)
* feat: add NVIDIA option to provider settings dropdown and aliases
* test: add NVIDIA provider detection and endpoint tests
* Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering
- nvidia.com -> 'nvidia' curated key for proper provider routing
- _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed
- _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip,
kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse,
-embedqa, -nemoretriever
* Expand non-chat model filtering for NVIDIA embedding/guard/video models
Add _NON_CHAT_PREFIXES: embed, recurrent
Add _NON_CHAT_CONTAINS: topic-control, guard, calibration,
ai-synthetic-video, cosmos-reason2
Catches remaining unfiltered non-chat models from NVIDIA catalog:
embedding (llama-nemotron-embed, embed-qa), guard (llama-guard,
nemoguard-topic-control), calibration (ising-calibration),
video (ai-synthetic-video-detector, cosmos-reason2),
recurrent (recurrentgemma-2b)
* Filter non-chat models in _probe_endpoint via _is_chat_model()
Previously _is_chat_model() was only used in the per-model probe
and _first_chat_model(), so non-chat models still appeared in the
model picker even though they were filtered in those specific paths.
Applying the filter at _probe_endpoint() return ensures non-chat
models (embeddings, safety guards, reward, calibration, video
detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never
enter cached_models and never appear in the picker.
* Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models
Prefix checks (mid.startswith) miss models with org prefixes like
baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc.
Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught
regardless of the org prefix.
Adds: embed, bge, recurrent, starcoder, gemma-2b
* fix(model-routes): drop collision-prone substrings from global non-chat filter
The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES
and _NON_CHAT_CONTAINS tuples. These are intended to filter out
embedding, retrieval, safety, and vision models from NVIDIA's catalog
that are not chat-completions-capable. However, four of the added
substrings collide with legitimate chat models served by other providers:
- gemma-2b matches google/gemma-2b-it (instruct chat model)
- starcoder matches bigcode/starcoder2-15b (code completion model)
- recurrent matches google/recurrentgemma-2b (language model)
- guard matches meta-llama/Llama-Guard-3-8B (safety classifier)
Removing these four from the global tuples keeps the NVIDIA-specific
filtering intact (safety, embedding, retrieval, and vision models are
still caught by other tokens such as content-safety, -safety, -reward,
embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while
preventing false negatives for instruct/code models on other providers.
Tests added for gemma-2b-it, google/gemma-2b-it, and
bigcode/starcoder2-15b-instruct asserting they are recognized as chat
models.
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS
Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS
entries redundant since the prefix check runs first.
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* style: fix indentation of groq and xai test cases in test_provider_endpoints.py
---------
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
The DB owner-rename loop in rename_user patched every SQL column named
owner, but three non-SQL stores were left behind:
1. session_manager.sessions -- in-memory Session objects carry s.owner
set at server-boot time. get_sessions_for_user() does an exact
s.owner == username check, so the renamed user chat sidebar goes empty
until a server restart.
2. data/deep_research/*.json -- each completed research report is a
standalone JSON file with an owner field. research_routes filters
by d.get(owner) == user, making every report invisible to the
renamed user.
3. data/memory.json -- a flat JSON array; each entry carries an owner
field. memory_manager.load(owner=user) filters on it, so all memories
vanish from the memory panel.
Fix: after the SQL loop, patch all three:
- iterate sm.sessions and update owner in-place (exposed via app.state)
- walk data/deep_research/*.json and rewrite owner with atomic_write_json
- update matching entries in memory.json with atomic_write_json
All three use the same case-insensitive lower() comparison the SQL loop
already uses. Each step is independently wrapped so a single failure
does not abort the others or the rename itself.
Fixes#3362
Commit e6b1009 removed the workspace feature's entry point (deleted
routes/workspace_routes.py + static/js/workspace.js and dropped the
workspace-param parsing in chat_routes), but left the downstream backend
plumbing dangling: chat_routes passed a hardcoded workspace=None into
stream_agent_loop, which forwarded it to execute_tool_block, so the
workspace value was permanently None and every workspace-gated branch
was unreachable.
Remove the now-dead code (no behavior change, since workspace was always
None):
- src/tool_execution.py: drop _resolve_tool_path_in_workspace and the
workspace params/branches on execute_tool_block, _direct_fallback,
_call_mcp_tool, _do_edit_file, and _resolve_search_root; restore the
bash/python/bg cwd to _AGENT_WORKDIR.
- src/agent_loop.py: drop the workspace param on stream_agent_loop, the
dead 'ACTIVE WORKSPACE' system-prompt block, and the workspace forward.
- routes/chat_routes.py: drop the hardcoded workspace=None arg and var.
- tests: delete test_workspace_confine.py (tested the removed feature) and
the workspace assertion in test_tool_policy.py.
Full suite: 2903 passed, 1 skipped.
* Fix backup import dropping a user's skill on cross-tenant title/id collision
The skills block of import_data deduped incoming skills against
skills_manager.load_all(), which returns EVERY tenant's skills. So when
a user imports their own backup, any skill whose id or title collides
with another user's skill was silently skipped — the importing user
lost their own data. This is the same cross-tenant bug already fixed
for the memories block just above (#1743); the skills block was left
with the old pattern. Filter the dedup sets to the importing user's own
skills (owner == user); the full store is still saved back, preserving
other users' skills.
* Restore sys.modules after stubbing so backup test does not break collection of later src.* test modules
* Patch backup_routes auth helpers via monkeypatch instead of sys.modules stubs so the test is import-order robust
* Give FakeSkillsManager an add_skill method matching the disk-backed skills API
* fix(cookbook): allow spaces in model directory paths
Allow POSIX external-drive paths and Windows drive paths with spaces while keeping shell metacharacters rejected.
* fix(cookbook): also allow non-ASCII (Unicode) characters in model dir paths
The ASCII-only allowlist that rejected spaces also rejected Cyrillic,
accented Latin and CJK folder names (e.g. /Volumes/Модели,
D:\AI Models\Модели) with 400 Invalid local_dir. Switch the path
character class from [A-Za-z0-9._ -] to [\w. -] (\w is Unicode-aware on
Python 3 str patterns) so localized folder names validate, while shell
metacharacters (; & | ` $ quotes newlines) stay rejected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(cookbook): reject local_dir path segments starting with '-'
The local_dir allowlist includes '-', so a directory like /models/-rf
(or D:\models\-rf) could be parsed as a CLI flag by hf/etc. (option
injection) — and quoting does not stop a value from being read as an
option. Guard against it inside the validator so the safety stays fully
self-contained there rather than depending on consumers' quoting.
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.
Highlights:
- Ollama as a first-class backend in the Cookbook:
* Download input accepts ollama-style names (name:tag) → backend=ollama
* /api/cookbook/ollama/library (cached scrape of ollama.com + curated
fallback so classic models like qwen2.5 stay reachable)
* "Browse Ollama library" toggle below Download with size chips
* Engine=Ollama in hwfit toolbar merges the Ollama library into the
main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
pulled up 8px to align with the row's checkboxes, GGUF File dropdown
exposed for Ollama backend, GPU re-render on Edit serve restore,
_forceBackend flag so saved serveState wins over backend detection,
cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
Clear-offline buttons in Added Models toolbar, offline-pill removed
(opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
reachable (vLLM returns 404 on bare /v1, fully working endpoints
were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
Move every per-route upload byte-limit into src/upload_limits.py as a
validated, env-overridable constant via read_byte_limit_env:
- Add GALLERY_UPLOAD_MAX_BYTES, GALLERY_TRANSFORM_UPLOAD_MAX_BYTES,
MEMORY_IMPORT_MAX_BYTES, PERSONAL_UPLOAD_MAX_BYTES,
EMAIL_COMPOSE_UPLOAD_MAX_BYTES, STT_MAX_AUDIO_BYTES, ICS_MAX_BYTES.
- Routes import their constant instead of defining it locally: replaces 4
raw int(os.getenv(...)) and removes 3 hardcoded literals.
- The 3 previously-hardcoded limits (email compose, STT audio, calendar
ICS) are now env-overridable with the same ODYSSEUS_*_MAX_BYTES naming.
- Defaults unchanged, so behavior is unchanged unless an env var is set;
an invalid value now fails fast with a clear message instead of a bare
int() ValueError.
- Document all env vars in .env.example and the README.
Fixes#3364
* refactor(tools): consolidate duplicated _truncate and get_mcp_manager into src/tool_utils
Move all copies of _truncate(), get_mcp_manager(), and set_mcp_manager()
into a single leaf module (src/tool_utils.py) that imports only from
src.constants. This eliminates the lazy-import hack
('from src import agent_tools' inside function bodies) in tool_execution.py
and tool_implementations.py, and fixes a latent bug: the _truncate copy in
tool_execution.py was missing the isinstance guard and would crash on None.
Also deletes mcp_servers/_common.py — it was dead code with zero callers
anywhere in the codebase, containing its own copy of truncate() and
constants that already exist in src/constants.py.
* fix(tools): route remaining get_mcp_manager imports to src.tool_utils
The maintainer's feedback flagged src/task_scheduler.py:1857 and
routes/task_routes.py:977. A project-wide search found a third call site
in src/agent_loop.py that also imported get_mcp_manager from
src.agent_tools instead of src.tool_utils.
All three are now sourced from the canonical location in src.tool_utils.
---------
Co-authored-by: mcnoliveira <mcnoliveira@gmail.com>
Three issues combined to make the per-user 'Allowed models' checklist
unreliable (#3032):
1. admin.js _loadModelsForUser fetched /api/models, which is backed by
cached_models — endpoints that haven't been probed yet (e.g. a
freshly-added DeepSeek API endpoint) simply didn't show up in the
checklist. Switched to /api/model-endpoints, which always reflects
every configured endpoint regardless of cache state.
2. _saveModels sent allowed_models: [] both when the admin clicked
[All] (no restriction) and [None] (block everything) — the backend
had no way to distinguish the two.
3. _enforce_chat_privileges treated an empty allowed_models list as
'no restriction' (falsy -> skip the check), so [None] had no effect.
Added an explicit block_all_models privilege flag (defaulting to False,
and forced to False for admins) that admin.js now sets when zero models
are checked. _enforce_chat_privileges checks it first and 403s
regardless of allowed_models contents.
* fix(email): close IMAP socket when connect/login fails (#3174)
_imap_connect opened a live socket via _open_imap_connection and then
called conn.login() with no try/finally, and _open_imap_connection called
conn.starttls() unguarded. When auth fails (e.g. an Office 365 app password
on an MFA-enabled tenant, #3174) or STARTTLS is rejected, the already-open
socket was orphaned. Every IMAP caller funnels through _imap_connect,
including the 30-minute _auto_summarize_poller, so a persistently
misconfigured account leaked one descriptor per pass toward FD exhaustion.
The previously merged leak fixes (#1325/#1330/#1423/#1530) only guard the
post-connect body and monkeypatch _imap_connect to succeed, so this
connect-time path was uncovered. Wrap login() and starttls() so a failure
calls conn.shutdown() (low-level close; logout() can't run pre-auth) before
re-raising. Adds two regression tests that fail without the guard.
* fix(email): guard MCP IMAP+SMTP connect-time leaks too (#3174)
Folds in the sibling connect-time leaks vdmkenny flagged on #3363, so the
whole connect-then-step leak class is closed in one place:
- mcp_servers/email_server.py::_imap_connect — guard starttls() and login();
close pre-auth with conn.shutdown() before re-raising.
- mcp_servers/email_server.py::_smtp_connect — guard starttls() and login();
SMTP has no shutdown(), so close with conn.close() (socket close, no QUIT).
Routes SMTP (_send_smtp_message) is already safe via 'with smtplib.SMTP(...)'.
Adds four regression tests (one per guard), verified to fail without the fix.
* fix(presets): scope expand-prompt model resolution to owner
/api/presets/expand resolved its model endpoint with no owner, so in a
multi-user setup it could match another user's endpoint and use its URL
and decrypted api_key. Pass effective_user(request) to _resolve_model so
resolution is owner-scoped. Adds a regression test.
* fix(presets): scope teacher and audit model resolution to owner
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Alex Little <alexwilliamlittle@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
fork_session passed each source message's metadata dict by reference into the
new session. add_message() -> _persist_message() stamps _db_id (and timestamp)
onto that dict in place, so persisting the fork overwrote the SOURCE messages'
_db_id with the forked rows' ids — silently breaking edit/delete-by-id on the
original conversation.
Copy the metadata dict per message so the fork and source no longer alias.
Adds tests/test_fork_session_metadata.py asserting the source session's
message metadata is unchanged after a fork.
GET /api/models swallowed any non-HTTPException raised while checking
whether the caller is authenticated (bare except Exception: pass), so a
broken auth_manager or an exception from get_current_user silently
granted the full model list to an anonymous caller instead of rejecting
the request. Now any unexpected exception logs and returns HTTP 500.
Split out of #2360 per reviewer request to keep the deny-list and the
auth-gate fix as separate, single-purpose PRs.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
PATCH /api/tokens/{id} unconditionally recomputed scopes from
payload.get("scopes"). On a rename — body {"name": "..."} with no "scopes"
key — that is None, so _normalize_scopes(None) returned the default ["chat"]
and the handler overwrote token.scopes, silently dropping every scope the
token had been granted (e.g. email:read, calendar:write).
Only write scopes when the request actually includes them, and return the
token's real stored scopes in the response (matching the GET /tokens display
shape) instead of the recomputed default.
tests/test_api_token_routes.py: add rename-preserves-scopes,
explicit-scopes-applied, and missing-token-404 cases for the PATCH handler.