Commit Graph

356 Commits

Author SHA1 Message Date
pewdiepie-archdaemon ebd2332db4 Agent prompt builder: stop re-adding ALWAYS_AVAILABLE on top of filtered tools
Found the reason yesterday's tool-retrieval drop wasn't taking effect:
in _build_agent_prompt, when relevant_tools was provided, it computed
  tool_names = set(ALWAYS_AVAILABLE) | set(relevant_tools)
which silently re-added every tool get_tools_for_query had just
deliberately discarded. So when a 'save this for <person>' query
dropped manage_memory from the retrieved set, the prompt builder put
it right back, and the model saw both tools again.

Trust the relevant_tools set. get_tools_for_query already starts from
ALWAYS_AVAILABLE — any discard there is intentional and should
propagate. Only force-include ask_user and update_plan here as belt-
and-suspenders since the agent loop relies on those for its own
control flow.

Other callers (task_scheduler) already union ALWAYS_AVAILABLE or
ASSISTANT_ALWAYS_AVAILABLE into relevant_tools before passing it in,
so they're unaffected.
2026-06-11 09:49:20 +09:00
pewdiepie-archdaemon f5ad59317c Tool retrieval: HARD drop manage_memory when query is a contact-save pattern
Description-level steering wasn't enough — even with the explicit 'DO
NOT use for info about another person' in manage_memory's description,
models kept choosing memory over manage_contact. They can't if memory
isn't in the toolset.

New logic in ToolIndex.get_tools_for_query: detect three contact-save
patterns and discard manage_memory from the returned set (overriding
ALWAYS_AVAILABLE):

1. 'save [up to 3 words] for/to <name>' where <name> isn't a timing /
   pronoun stopword (later, tomorrow, me, you, future, etc.). Catches
   the canonical 'save this for X' and the wider 'save this address
   for X', 'save it for X'.
2. 'to/in/into (my) contacts' or 'address book'. Catches both 'add X
   to my contacts' and 'put this in my address book for X'.
3. Possessive: 'save (his/her/their) (address/phone/email/...)'.
   Stronger signal — also force-adds manage_contact to the set in
   case the keyword fallback missed it.

Verified: 8 positive contact patterns all drop memory, 10 false-
positive 'save X for later/tomorrow/me/the next thing' all keep it.
2026-06-11 09:46:34 +09:00
pewdiepie-archdaemon df47536b8d manage_memory descriptions: explicit deferral to manage_contact for person info
Even with manage_contact in the retrieved tool set, models were still
defaulting to manage_memory when the user pasted an address + 'save for
<person>'. Both tools were in front of the model and it picked memory.

Tighten both descriptions to steer at decision-time:
- agent_loop.py manage_memory description: clarify scope is facts
  about the USER, with an explicit 'DO NOT use for info about another
  person' + a 'use manage_contact instead' line.
- tool_index.py manage_memory description: same in shorter form, so the
  embedded retrieval signal is consistent with the prompt-time
  description.
2026-06-11 09:25:23 +09:00
pewdiepie-archdaemon 8a00f954a9 Tool retrieval: catch 'add X to (my) contacts' / 'address book' phrasings
The literal phrase 'add to contacts' missed when there was a name
between 'add' and 'to', e.g. 'add Pat to my contacts'. Anchor on the
tail with 'to my contacts', 'to contacts', 'to address book' so word
boundaries fire regardless of what sits in front.
2026-06-11 09:18:30 +09:00
pewdiepie-archdaemon 8632072ce0 Contacts: postal-address support via vCard ADR, keep tool prompt minimal
Closes the gap that pushed the agent into manage_memory when the user
pasted an address and said 'save this for X'. manage_contact now
accepts an optional address arg end-to-end:

- routes/contacts_routes.py:
  - _normalize_contact carries an 'address' field
  - _build_vcard emits ADR:;;<address>;;;; (street component of the
    RFC-6350 7-part ADR), only when address is non-empty
  - _parse_vcards reads ADR, joins non-empty components with ', '
  - _create_contact and _update_contact thread address through;
    update preserves existing address when caller passes empty
- src/tool_implementations.py do_manage_contact:
  - add accepts address; require at least name+address or email
    (was: email required) so address-only contacts are addable
  - update accepts address; require name OR emails OR address
- src/tool_schemas.py: schema gets a single 'address' string field
- src/tool_index.py + src/agent_loop.py: descriptions get one
  'address' arg mention and a 'use this for save-X-for-person /
  address pastes / phone-with-name' steering line. Net: a few
  bytes added, not a paragraph.

Also: removed a stray name from the schema's manage_contact example
strings ('save Jonathan's email…') — no real names in the codebase.
2026-06-11 09:14:52 +09:00
pewdiepie-archdaemon 153b788134 Tool retrieval: surface manage_contact for 'save X for <person>' patterns
When the user dumps a postal address or phone number alongside a
person's name and says 'save this for X', the vector retriever was
missing manage_contact because its description only mentioned the
literal word 'contact'. The model defaulted to manage_memory (which is
in ALWAYS_AVAILABLE), so the saved fact ended up as un-named memory
that wouldn't surface on a later 'what's X's address?' search.

- Rewrite manage_contact's index description to anchor on the
  semantics: 'save info about another person', including postal/
  mailing address, ZIP, phone, etc. Now it embeds close to address-
  paste queries.
- Extend the keyword intent-map with 'save this for', 'save it for',
  'mailing address', 'postal code', 'their address', etc. — common
  ways users say 'this belongs to a contact' without the literal word
  'contact'.
2026-06-11 08:56:42 +09:00
pewdiepie-archdaemon bc2d934b94 Agent email safety: stage drafts for user approval instead of auto-send
Closes the auto-send hole that let earlier models invent signatures
(e.g. signing 'David' for a user named Felix) and SMTP them to real
recipients before the user could review.

New setting: agent_email_confirm (default True).

When on, the MCP send_email and reply_to_email tools no longer SMTP
directly — they write the composed email to scheduled_emails with a new
status 'agent_draft' (far-future send_at so the scheduled-send poller
ignores them) and return a {pending: true, pending_id, to, subject,
body, message: ...} payload. The model surfaces that to the user.

Backend endpoints to approve / cancel:
- GET    /api/email/pending          → list staged drafts for the owner
- POST   /api/email/pending/{id}/approve → flip status to 'pending' +
                                           backdate send_at so the
                                           existing scheduled-send
                                           poller delivers immediately
- DELETE /api/email/pending/{id}     → status = 'cancelled'

UI:
- Settings / AI Defaults gets a new 'Email Safety' card with the
  toggle, default on.
- Tool descriptions for send_email and reply_to_email now include the
  pending behavior + an explicit 'DO NOT invent a signature, do not
  type a person's name' guardrail.

Pass 2 (next): inline chat card with Send / Discard buttons so the user
doesn't have to type a confirmation reply. Today's prompt + the listing
endpoint give the model a clean path to surface drafts.
2026-06-11 08:50:06 +09:00
pewdiepie-archdaemon 2bf372b41c Tasks: optional persona for LLM + research tasks (biases output voice)
Wire the existing built-in PERSONAS catalog through to scheduled tasks
the same way I wired it to reminder synthesis. Repurposes the
dormant scheduled_tasks.character_id column.

UI (static/js/tasks.js)
- New 'Persona' select in the LLM / Research task form, with the five
  built-in characters (socrates/razor/nietzsche/spark/odysseus) plus a
  default 'no persona' option. Pre-populates from existing.character_id
  on edit. Non-llm/research types explicitly clear it on save.

API (routes/task_routes.py)
- TaskCreate + TaskUpdate gain character_id: Optional[str].
- _task_to_dict echoes character_id back so the form can hydrate on
  edit. Update endpoint stores '' as None to allow clearing.

Runner (src/task_scheduler.py)
- When task.character_id is set and matches a built-in persona, prepend
  the persona prompt to the task system prompt so the model speaks in
  that voice while still knowing it's running a scheduled task.
- crew_member.personality still wins as the base; character_id stacks
  on top.
2026-06-10 23:36:18 +09:00
pewdiepie-archdaemon 4f7061fd61 Settings overhaul + UI polish pass
Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:

Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
  tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
  pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
  with a cross-link to AI Defaults. Provider hints use real clickable
  anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
  carries only Endpoint+Model with a cross-link to Search; Vision /
  Default / Utility fallbacks unified under one numbered-row design
  matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
  Permissions toggle that expands the scope-edit panel, in-field
  copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
  under the button (drop-up on tight viewports); each integration
  form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
  the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
  row at the bottom that loops every category.

AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
  server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
  persona's system prompt; empty/unknown falls back to warm-neutral.

Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
  so dismissing a popup no longer closes the whole Settings modal.

Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
  search/integrations/reminders for card h2 icons + the Added Models
  sub-section icons.

Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
  of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
  blocks; shorter subtitle wording.

Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
  every integration form.
- Provider logos render inline next to the search fallback selects
  and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
  button.
- Image generation default flipped to off.
2026-06-10 15:15:13 +09:00
pewdiepie-archdaemon 1a529d63d9 Fix remaining CI regressions 2026-06-09 10:21:56 +09:00
pewdiepie-archdaemon 37c573d865 Fix model endpoint route test regressions 2026-06-09 10:16:38 +09:00
pewdiepie-archdaemon fa8c93ec0a Cookbook UI: Ollama browser, advanced serve fold, API tokens form, diagnosis toolbar, polish
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.

Highlights:
- Ollama as a first-class backend in the Cookbook:
  * Download input accepts ollama-style names (name:tag) → backend=ollama
  * /api/cookbook/ollama/library (cached scrape of ollama.com + curated
    fallback so classic models like qwen2.5 stay reachable)
  * "Browse Ollama library" toggle below Download with size chips
  * Engine=Ollama in hwfit toolbar merges the Ollama library into the
    main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
    columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
  loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
  checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
  pulled up 8px to align with the row's checkboxes, GGUF File dropdown
  exposed for Ollama backend, GPU re-render on Edit serve restore,
  _forceBackend flag so saved serveState wins over backend detection,
  cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
  Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
  Clear-offline buttons in Added Models toolbar, offline-pill removed
  (opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
  reachable (vLLM returns 404 on bare /v1, fully working endpoints
  were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
  serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
  Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
  serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
  ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
  auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
2026-06-09 09:46:19 +09:00
pewdiepie-archdaemon 3b01760e95 Prepare tested main sync cleanup 2026-06-09 09:34:42 +09:00
Ocean Bennett db1bbfe588 fix(sessions): keep fresh chats during auto tidy (#1871)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-09 01:06:20 +01:00
Kenny Van de Maele 2404b00f18 refactor(uploads): centralize upload byte-limits in upload_limits.py (#3364) (#3518)
Move every per-route upload byte-limit into src/upload_limits.py as a
validated, env-overridable constant via read_byte_limit_env:

- Add GALLERY_UPLOAD_MAX_BYTES, GALLERY_TRANSFORM_UPLOAD_MAX_BYTES,
  MEMORY_IMPORT_MAX_BYTES, PERSONAL_UPLOAD_MAX_BYTES,
  EMAIL_COMPOSE_UPLOAD_MAX_BYTES, STT_MAX_AUDIO_BYTES, ICS_MAX_BYTES.
- Routes import their constant instead of defining it locally: replaces 4
  raw int(os.getenv(...)) and removes 3 hardcoded literals.
- The 3 previously-hardcoded limits (email compose, STT audio, calendar
  ICS) are now env-overridable with the same ODYSSEUS_*_MAX_BYTES naming.
- Defaults unchanged, so behavior is unchanged unless an env var is set;
  an invalid value now fails fast with a clear message instead of a bare
  int() ValueError.
- Document all env vars in .env.example and the README.

Fixes #3364
2026-06-09 01:24:30 +02:00
Ocean Bennett e7c1d75884 fix(models): query v1 models for llama-server endpoints (#3380)
* fix(models): query v1 models for llama-server endpoints

* test(models): accept owner kwargs in llama-server regression
2026-06-09 01:09:02 +02:00
Mateus Oliveira f7ae85590b refactor(tools): consolidate duplicated _truncate and get_mcp_manager into src/tool_utils (#3478)
* refactor(tools): consolidate duplicated _truncate and get_mcp_manager into src/tool_utils

Move all copies of _truncate(), get_mcp_manager(), and set_mcp_manager()
into a single leaf module (src/tool_utils.py) that imports only from
src.constants. This eliminates the lazy-import hack
('from src import agent_tools' inside function bodies) in tool_execution.py
and tool_implementations.py, and fixes a latent bug: the _truncate copy in
tool_execution.py was missing the isinstance guard and would crash on None.

Also deletes mcp_servers/_common.py — it was dead code with zero callers
anywhere in the codebase, containing its own copy of truncate() and
constants that already exist in src/constants.py.

* fix(tools): route remaining get_mcp_manager imports to src.tool_utils

The maintainer's feedback flagged src/task_scheduler.py:1857 and
routes/task_routes.py:977. A project-wide search found a third call site
in src/agent_loop.py that also imported get_mcp_manager from
src.agent_tools instead of src.tool_utils.

All three are now sourced from the canonical location in src.tool_utils.

---------

Co-authored-by: mcnoliveira <mcnoliveira@gmail.com>
2026-06-09 01:05:30 +02:00
Rohith Matam 049833e309 fix: skip malformed document tool call items (#3494) 2026-06-08 23:25:31 +02:00
Lucas Daniel 0a324f20d2 fix(agent): stop treating illustrative Markdown fences as tool calls for native function-calling models (#3356)
* fix(agent): stop executing illustrative Markdown fences as tool calls for native function-calling models

_resolve_tool_blocks fell back to the textual parse_tool_blocks() fenced-block
parser whenever a model produced no native tool_calls, regardless of whether
that model has a reliable native function-calling channel. Native models
(GPT/Claude/Grok/Qwen3/DeepSeek-V, etc. - _is_api_model true) commonly write
illustrative ```bash/```python/```json examples in guide-only prose; the
fallback parser matched these and executed them as real commands, sometimes
looping for several rounds as the model tried to clarify with more examples
(#3222).

Restrict the textual fenced-block fallback to non-native models, which rely
on it as their only tool-invocation channel. Native models are trusted to use
their structured tool_calls channel for real invocations; when they don't
emit one, a bare fence in their response is prose, not an action. The native
tool_calls path itself is untouched.

This sits one layer below #3088's guide-only policy enforcement: that PR
blocks tool exposure/execution on explicit no-tools requests, while this fixes
the parser so ordinary illustrative fences are never misread as calls in the
first place, on any turn.

* fix(agent): gate only the fenced-example pattern for native models, preserve DSML/invoke recovery and persistence

_resolve_tool_blocks previously short-circuited the entire textual parser
(tool_blocks = [] if is_api_model else parse_tool_blocks(...)) for native
function-calling models with no native tool_calls. That also dropped Patterns
2-5 (explicit [TOOL_CALL]/<invoke>/<tool_code>/DSML markup leaked into content
as text), which are real calls a model couldn't emit on its structured channel
(e.g. DeepSeek-V falling back to DSML), not illustrative examples.

parse_tool_blocks/strip_tool_blocks now take a skip_fenced flag that gates ONLY
Pattern 1 (the fenced ```bash/```python/```json block matcher). _resolve_tool_blocks
passes skip_fenced=is_api_model so fenced examples stop being executed for
native models while [TOOL_CALL]/<invoke>/<tool_code>/DSML stay fully active and
recoverable. cleaned_round mirrors the same gate when persisting round text, so
an illustrative fence that wasn't executed isn't stripped from saved/reloaded
history either (it was streaming once and then disappearing on reload).
2026-06-08 22:25:28 +02:00
Mazen Tamer Salah 8e494cc1c4 fix(chat): keep balanced trailing ')' when extracting URLs (#3406)
extract_urls() stripped any trailing ')' unconditionally via
`re.sub(r'[.,;:!?\)]+$', '', url)`. That corrupts URLs that legitimately
end in a parenthesis — most commonly Wikipedia disambiguation links like
https://en.wikipedia.org/wiki/Python_(programming_language), which became
...Python_(programming_language and then 404 when fetched by the web/research
tools.

Strip trailing sentence punctuation as before, but only drop a ')' when it is
unbalanced (more ')' than '('), so a prose-glued "(see https://example.com)"
still loses its closing paren while balanced URLs keep theirs.

Added tests/test_extract_urls.py covering balanced, unbalanced, nested, and
trailing-punctuation cases.
2026-06-08 21:33:29 +02:00
Alex Little a58f526992 fix(presets): scope expand-prompt model resolution to owner (#3477)
* fix(presets): scope expand-prompt model resolution to owner

/api/presets/expand resolved its model endpoint with no owner, so in a
multi-user setup it could match another user's endpoint and use its URL
and decrypted api_key. Pass effective_user(request) to _resolve_model so
resolution is owner-scoped. Adds a regression test.

* fix(presets): scope teacher and audit model resolution to owner

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alex Little <alexwilliamlittle@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-08 21:12:02 +02:00
Mazen Tamer Salah d58202d10e fix(presets): persist presets atomically to avoid corruption on crash (#2169)
PresetManager.save() used a plain open("w") + json.dump, which truncates
presets.json before writing the new content. A crash, power loss, or
serialization error mid-write leaves the file truncated/empty and every
saved preset is lost.

Route the write through core.atomic_io.atomic_write_json (tmp file +
os.replace), matching how the rest of the codebase persists JSON state.
The helper is imported lazily so this module stays free of the heavy core
package import graph at module load time.

Adds tests/test_preset_atomic_save.py covering the source contract, a
failed-write leaving the existing file intact, and a round trip.
2026-06-08 19:16:37 +02:00
Mazen Tamer Salah 1209f258d7 fix(caldav): skip the prune when any object fails to parse (#3454)
* fix(caldav): don't prune the whole window when no objects could be parsed

The post-sync prune deletes local origin=="caldav" rows in the window whose UID
the server didn't just return. With an empty seen_uids it falls back to
`uid.isnot(None)` — a match-all delete. That's right when the calendar is
genuinely empty, but when the server returns objects and every one fails to
parse (malformed iCal / an icalendar error), seen_uids is empty only because
nothing could be read, so the match-all branch silently deletes every local
event in the 90-day-back/365-day-forward window.

Track whether any object failed to parse and gate the prune with a small pure
helper `_should_prune_window(seen_uids, parse_failed)`: prune when something was
read, or when the calendar is genuinely empty (no objects, no parse errors), but
never when objects came back unreadable.

Adds tests/test_caldav_prune_parse_failure.py for the three cases.

* fix(caldav): skip the prune on any parse failure, not just total

Review follow-up (#3454): _should_prune_window returned True whenever seen_uids
was non-empty, so a partial parse failure (say 48 of 50 objects parse) still
pruned the 2 unreadable-but-still-upstream events, because their UIDs were absent
from seen_uids. Any parse failure makes seen_uids an incomplete view of the
server, so pruning against it is unsafe whether the failure is total or partial.

Skip the prune on any parse failure (return not parse_failed); only prune on a
clean read (a genuinely empty window is still safe to prune). Tradeoff: one
permanently-unparseable event pauses deletion mirroring until it is fixed, which
is the safe direction (false-keep beats false-delete).

Replace the now-incorrect "partial failure still prunes" assertion with a
partial-failure regression: one object parses, one fails, so the prune is
skipped and the unparsed event's local copy is not deleted.

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-08 18:59:14 +02:00
Mazen Tamer Salah d71284194b fix(memory): only delete memories the model explicitly drops in tidy (#3455)
* fix(memory): only delete memories the model explicitly drops in tidy

The AI memory-tidy path computed deletions as the complement of the model's
`keep` list (`if mid not in keep_ids: continue`). When the model returned a
valid response that simply omitted some existing ids — a common LLM lapse — every
omitted memory was silently deleted, even though it was neither a duplicate nor
listed in `drop`.

Honor the explicit `drop` set instead: delete only ids the model dropped (minus
any it saw only truncated), and preserve everything else, still applying cleaned
text/category from `keep`.

Adds tests/test_consolidate_memory_explicit_drops.py: a memory the model omits
from both keep and drop survives; an explicitly dropped one is removed.

* refactor(memory): remove now-dead keep_ids from tidy

After deletion switched to drop_ids and text/category rewrites to cleaned_by_id,
keep_ids was written but never read. Remove the init, the .add(mid) in the keep
loop, and the truncated .update() (its truncated-protection is already covered by
`drop_ids -= truncated_ids`). Pure deletion, no behavior change; tests stay green.

Addresses review feedback on #3455.

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
2026-06-08 18:54:45 +02:00
stocky789 1e0d9b92af feat: add ChatGPT Subscription provider (#2876)
* feat: Add ChatGPT Subscription support and related features

- Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI.
- Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status.
- Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance.
- Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints.
- Added new slash commands for managing skills, including listing, searching, and invoking skills.
- Implemented caching for skill catalog to optimize performance.
- Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing.
- Refactored existing code to accommodate new features and improve maintainability.

* refactor: share provider device-flow setup

- reuse one device-flow backend for Copilot and ChatGPT Subscription
- add one frontend device-flow helper for Settings and /setup
- put GitHub Copilot back into Add Models, now as a dropdown option
- make provider selection just select; clicking Add starts sign-in
- stop ChatGPT Subscription setup from opening auth tabs automatically
- make /setup copilot and /setup chatgpt-subscription work from chat
- show ChatGPT Subscription in the /setup suggestions
- show the real error message when setup fails
- add focused tests for the shared flow and setup UI

* feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX

Backend:
- Resolve runtime bearer for provider-auth endpoints at probe time via a
  shared _resolve_probe_key() that delegates to resolve_endpoint_runtime,
  applied across all probe/refresh call sites.
- Skip live completion probes and health pings for discovery-only providers
  (centralized behind _is_discovery_only_provider) — the Codex/Responses API
  has no such endpoints, so status is derived from cached models.
- Never persist the short lived ChatGPT bearer to the plaintext sessions
  table; proactively clear any stale bearer left by an earlier code path.
- Revoke orphaned ProviderAuthSession credentials when the last endpoint
  backing them is deleted (_delete_orphaned_provider_auth), surfaced via
  cleared_provider_auth in the delete response.

Frontend (admin.js):
- Auto-start the device-auth flow on provider selection so the authorization
  panel (code + Authorize) shows immediately instead of behind a "Sign in" click.
- Remove the redundant top button for device auth providers, move retry
  into the panel via an inline "Try again".
- Drop the self-evident hint text and add an execCommand clipboard fallback so
  Copy works in non-secure (HTTP/LAN) contexts.

* fix: harden chatgpt subscription provider

* chore: remove PR media from branch

* Fix chatgpt subscription recovery and token handling

---------

Co-authored-by: 5p00kyy <admin@5p00ky.dev>
2026-06-08 10:19:18 +02:00
Mike ac94885c84 refactor(constants): single source of truth for data dir (#3368)
* refactor(constants): single source of truth for data dir + merge core/src constants

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(contributing): use named src.constants for data paths, drop core/constants references

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 09:58:52 +02:00
Kenny Van de Maele 505d8bae5a fix(cookbook): locate cookbook_state.json via DATA_DIR, not hardcoded /app/data (#3332)
Three call sites hardcoded Path("/app/data/cookbook_state.json"), which only
exists in Docker; on a native run the real path is <repo>/data, so the state
file looked missing and cookbook serve-state was silently ignored. Two others
used os.environ.get("DATA_DIR", "data") (a relative fallback, since DATA_DIR is
never set as an env var). Route all five through core.constants.DATA_DIR so the
path is consistent and absolute on both Docker and native.

Part of #3331.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 00:13:47 +01:00
nubs 1a0e1c5d69 fix(documents): restore PDF library metadata and preview (#2483)
PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf.

Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs.

The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none.

Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker.

Fixes #2468
2026-06-07 23:23:27 +02:00
Kenny Van de Maele 76c1f42ab0 fix: route all agent loopback calls through internal_api_base() helper (#3322)
#2753 made the agent loopback base port-configurable but only for
_COOKBOOK_BASE in tool_implementations. Several other in-process loopback
calls still hardcoded http://localhost:7000 and broke off port 7000:
cookbook_serve_lifecycle (model-endpoints x2, shell/exec), builtin_actions
(model/serve), task_routes (calendar x3), and the gallery/email calls in
tool_implementations.

Extract the resolution (ODYSSEUS_INTERNAL_BASE / APP_PORT / 7000 fallback,
127.0.0.1 to avoid IPv6 ambiguity) into core.constants.internal_api_base()
and route every call site through it. Rename the now-misnamed _COOKBOOK_BASE
to _INTERNAL_BASE since it serves gallery/email/calendar/serve too. Adds a
test for the resolver plus a regression guard against reintroducing the
literal.

Part of #2752.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 22:22:09 +01:00
Gunnar Arias d85c5e335e fix(security): harden untrusted_context_message against delimiter spoofing (#3086)
* fix(security): harden untrusted_context_message against delimiter spoofing

Root cause: untrusted_context_message() did not sanitise content before
interpolating it into the <<<UNTRUSTED_SOURCE_DATA>>> / <<<END_UNTRUSTED_SOURCE_DATA>>>
delimited sandbox block. Malicious content embedding the literal delimiter
strings could prematurely close the sandbox and inject instructions that
the LLM treats as trusted.

Fix: add _escape_guard_markers() helper that replaces the guard marker
strings with structurally inert tokens (<<<_UNTRUSTED_DATA>>> and
<<<_END_UNTRUSTED_DATA>>>) before the content is wrapped. The function is
applied in untrusted_context_message() after casting content to str.

The existing ~13 call sites (chat_processor.py, agent_loop.py,
deep_research.py, chat_helpers.py, chat_routes.py) are unaffected because
they pass content through without inspecting the output delimiters.

Regression tests added in tests/test_prompt_security.py covering:
- _escape_guard_markers unit tests (open, close, both, benign passthrough)
- untrusted_context_message integration tests (delimiter spoofing
  neutralisation, type coercion, None handling, metadata preservation)

Resolves #3056

* fix(security): sanitize label for newlines and guard markers

Addresses reviewer feedback on PR #3086:
- Normalize label: strip CR/LF to prevent pre-guard line injection
- Escape guard marker literals in label via _escape_guard_markers()
- Add regression tests for label-based newline injection, GUARD_OPEN
  and GUARD_CLOSE in label, and exactly-one-structural-guard assertion

* fix(security): move Source label inside GUARD_OPEN block

The reviewer correctly identified that even after sanitizing the label,
any user-derived label text (e.g. `f"web page: {url}"`) still appeared
before GUARD_OPEN in the trusted framing zone, where the LLM treats it
as trusted instructions.

Fix: move the 'Source: {label}' line to inside the guarded block so
only the hardcoded UNTRUSTED_CONTEXT_HEADER sits before GUARD_OPEN.
The raw label is still kept in metadata["source"] for traceability.
_sanitize_label() and _escape_guard_markers() are kept for defence-in-
depth on the label stored inside the block.

Update test_label_newline_injection_is_blocked to assert no label-
derived instruction text appears before GUARD_OPEN (pre-guard zone is
now empty of any user-derived content).
2026-06-07 22:15:50 +01:00
Syed Ali Jaseem f939cb65ce refactor(tests): replace local function copies in test_endpoint_resolver with real imports (#3359)
* refactor(tests): replace local function copies in test_endpoint_resolver with real imports

The test file carried 9 verbatim copies of src/endpoint_resolver.py functions
to avoid import-pollution concerns, but these copies are a drift hazard — PR #3343
had to update both in parallel.  Replace them with direct imports so future changes
to endpoint_resolver are automatically exercised by the test suite.

Also fixes _ollama_api_root in endpoint_resolver.py: the bare-URL Ollama case
(e.g. http://nas:11434 with empty path) was already handled correctly in the test
copy but was missing from the real function, which would return /chat instead of
/api/chat for native Ollama endpoints without an explicit /api prefix.

Closes #3351

* refactor: import _ollama_api_root from llm_core instead of duplicating it

endpoint_resolver already imports _detect_provider and _host_match from
llm_core. Add _ollama_api_root to that import and remove the local copy,
collapsing two implementations to one source of truth.

llm_core's version is a superset (also strips /api/chat|tags|generate paths),
and since normalize_base already removes those suffixes upstream the result
is identical for every input used here.
2026-06-07 22:47:57 +02:00
nubs 865e61450e fix(upload): configure chat attachment size limit (#2439) 2026-06-07 22:42:24 +02:00
adabarbulescu a8859bb25c fix(llm): Properly detect remote Ollama bare URLs as native endpoints (fixes #3252) (#3343) 2026-06-07 21:19:19 +02:00
RaresKeY 3a91c11ff8 fix: block app_api access to Cookbook host controls (#3231) 2026-06-07 19:20:11 +02:00
PewDiePie c9198baa2e fix: make agent loopback base port env-configurable (#2752) (#2753)
_COOKBOOK_BASE was hardcoded to http://localhost:7000 with no env-var
override anywhere in the codebase. Tools that do an internal HTTP
loopback (app_api, trigger_research, cookbook state read/write) silently
fail with "All connection attempts failed" whenever the running uvicorn
isn't on port 7000 — which is most non-default deployments and any
side-by-side multi-instance setup. The misleading "Task triggered"
message from manage_tasks during a research request hides that the
underlying research never starts.

Resolution order, lowest to highest priority:
  1. Fallback http://127.0.0.1:7000 (preserves legacy default).
  2. APP_PORT — derive http://127.0.0.1:$APP_PORT (matches docker-compose
     which already reads APP_PORT).
  3. ODYSSEUS_INTERNAL_BASE — explicit override (e.g. behind a TLS proxy
     where loopback isn't 127.0.0.1).

127.0.0.1 instead of "localhost" avoids IPv6/DNS ambiguity for a
strictly-local call.

No API or schema change. Defaults preserved: existing setups on port
7000 are unaffected.

Caught by #2752.

Co-authored-by: pewdiepie-archdaemon <pewdiepie-archdaemon@users.noreply.github.com>
2026-06-07 18:47:47 +02:00
Sebastian Andres El Khoury Seoane 8d9d4ec9c6 feat(platform): Add support for APFEL as part of the dependencies and models for the Cookbook. (#2657)
* feat(platform): add support for Apple Silicon detection in platform compatibility

test(tests): enhance shell_routes tests for Apple Silicon compatibility

* fix issues with missing import

* fix: correct package name in package-lock.json and enhance package installation commands in shell_routes.py and cookbook.js

* feat: add Apfel startup and health checks on macOS

- bootstrap Apfel via Homebrew on arm64 macOS
- start `apfel --serve --port 11435` detached for Odysseus
- verify readiness via `/health`
- clean up the Apfel process on exit or Ctrl+C

* fix: duplicate variable declaration post-merge conflict
- Should fix `node` CI issues.

* fix: issues with the update status of the APFEL dependency.
- fixed by changing the main conditional that determines the update.

* Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file.

* Fix: whitespace issues with the model_routes file

* Fix: Remove unnecessary whitespaces and formatting for the model_routes.py file. Final

* Fix: Fixed updates using PIP for APFEL instead of custom cmd
2026-06-07 17:28:02 +02:00
Muhammad Ikhwan Fathulloh 2a6921a455 Fix logical bugs in event bus and bulk session deletion (#3139) 2026-06-07 17:08:50 +02:00
Rudra Sarker c5ac89f01f fix: preserve partial deep research findings on non-timeout errors (#2189)
* fix: preserve partial deep research findings on non-timeout errors

* fix: preserve partial deep research findings on non-timeout errors
2026-06-07 16:53:14 +02:00
Wes Huber b9a96bca1a fix(research): avoid double split() call and potential IndexError (#2229)
cat.split()[0] was called in the condition and again in the body,
wasting a second split. More importantly, if cat were ever
whitespace-only, split() returns [] and [0] raises IndexError.
Assign to a local variable and guard with a truthiness check.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-07 16:46:21 +02:00
Wes Huber 706ea6a7b7 fix: TOCTOU race in personal file delete + IndexError on whitespace cmd (#2228)
1. routes/personal_routes.py: os.path.exists() then os.remove() is a
   classic TOCTOU race — another request or cleanup can delete the
   file between the check and the remove, raising FileNotFoundError.
   Replace with try/except FileNotFoundError.

2. src/tool_implementations.py: cmd.split()[0] crashes with IndexError
   when cmd is a non-empty whitespace-only string (split() returns []).
   Guard with (cmd.split() or [''])[0].

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-07 16:44:26 +02:00
M57 12cb39cbd9 feat: add OpenCode Zen and Go as provider options (#26)
- Add OpenCode Zen (https://opencode.ai/zen/v1) and Go (https://opencode.ai/zen/go/v1)
- Add provider detection via _host_match() in llm_core.py
- Add curated model list entries in model_routes.py
- Add webhook provider URLs
- Add provider icon (providers.js) and dropdown options (index.html)
- Add auto-detection patterns and setup URLs (slashCommands.js)
- Whitelist opencode.ai in URL validation (admin.js)
- Rebased on main to fix merge conflicts with _HOST_TO_CURATED refactor

Co-authored-by: M57 <hy4ri@users.noreply.github.com>
2026-06-07 16:43:00 +02:00
max-freddyfire 43c16fc7e4 fix(context_compactor): return original messages when compaction summary fails (#2174)
On summary LLM call failure, maybe_compact was returning system_msgs+recent
(dropping the older half) with was_compacted=False, misleading the caller into
thinking the list was unchanged. Return the original messages list unchanged so
no history is lost; the next trim_for_context call handles length if needed.

Fixes #2160

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 16:40:16 +02:00
YotamPeled adbcb3763f fix(agent): don't abort legitimate tool batches as runaway loops (#3183)
The loop-breaker's runaway backstop counted per-tool-type call totals and
tripped whenever any tool was used >=15 times — treating 15+ DISTINCT calls
to one tool as a stuck loop. A real batch (e.g. "add these 18 birthdays to my
calendar" emits 18 distinct manage_calendar create_event calls in one round)
got flagged "calling manage_calendar over and over", the calls were discarded
(next round tools_sent=0), and 0 events were created.

Count IDENTICAL repeated call signatures instead (same tool AND args), via a
small, unit-testable _detect_runaway_call() helper. Genuine batches pass; a
model truly stuck repeating one call still trips the backstop. Adds a
regression test.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 16:16:17 +02:00
danielroytel 5d3e3c7053 feat(tasks): assign folder='Tasks' at creation + backfill migration (#2834)
* feat: assign folder='Tasks' to task sessions at creation

Task sessions (LLM, action, research) now set folder='Tasks' on their
DbSession row, matching the pattern used by the Assistant folder. This
enables sidebar lens filtering without changing existing session
behaviour.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add backfill script for task session folders

One-shot script to set folder='Tasks' on existing [Task]/[Research]
sessions that predate the folder assignment in task_scheduler.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: replace standalone backfill script with automatic migration

Convert scripts/backfill_task_folders.py into _migrate_backfill_task_folders()
in core/database.py, called from init_db(). The migration is idempotent (only
touches rows where folder IS NULL/empty) and runs automatically on upgrade,
so operators no longer need a manual step to tag pre-existing task sessions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 15:33:17 +02:00
RaresKeY a3784da172 fix: block app_api access to shell routes (#3225) 2026-06-07 15:19:08 +02:00
Vykos 83b0ab7cd3 Scope auxiliary LLM endpoints by owner (#2996)
* fix(auth): scope auxiliary llm endpoints by owner

* fix(auth): scope auxiliary llm fallbacks by owner
2026-06-07 14:47:44 +02:00
Vykos 2149f0fb67 fix(rag): forward owner through manager wrapper (#2991) 2026-06-07 12:56:57 +02:00
Vykos 000932a6d9 fix(auth): gate api tokens from user routes (#2992) 2026-06-07 12:55:01 +02:00
Vykos 299538ea4e Harden note reminder dispatch ownership (#2999) 2026-06-07 12:52:27 +02:00
Vykos f2a79aaf5c Tighten manage notes owner checks (#3002) 2026-06-07 12:50:10 +02:00