odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-18 10:45:31 -04:00

Author	SHA1	Message	Date
Alexandre Teixeira	ff5bcd9864	fix(agent): surface early loop-guard stops	2026-06-15 17:07:15 +01:00
Ashvin	7fd937fa57	fix(calendar): parse "mins"/"hrs" reminder offsets in manage_calendar (#4266) _reminder_minutes matched the offset with (?:m|min|minute|minutes)\b and (?:h|hr|hour|hours)\b. The trailing \b makes the common plural abbreviations "mins"/"hrs" fail to match (after "min" the "s" is a word char, so no boundary), so reminder_minutes "5 mins" or "2 hrs" returned None and the event was created with no reminder, silently. Widen the two unit regexes and the matching reminder_only description regex to a strict superset that also accepts mins/hrs. The sibling duration parser already accepts these forms (it has no \b), so this only brings the reminder parser in line.	2026-06-15 17:37:28 +02:00
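The word-boundary failure described in 7fd937fa57 can be reproduced in isolation. This is a minimal sketch: the two patterns are modelled on the alternations quoted in the commit message, and the names OLD_MINUTES, NEW_MINUTES and matches are hypothetical, not the project's actual _reminder_minutes code.

```python
import re

# Hypothetical stand-ins for the minute-unit regex before and after the fix.
# Only the alternation quoted in the commit message is taken from the source.
OLD_MINUTES = re.compile(r"(\d+)\s*(?:m|min|minute|minutes)\b")
NEW_MINUTES = re.compile(r"(\d+)\s*(?:m|min|mins|minute|minutes)\b")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True when the pattern finds a reminder offset in text."""
    return pattern.search(text) is not None


# "5 mins": after "min" comes "s", a word character, so \b cannot match there,
# and no other alternative of the old pattern fits.
print(matches(OLD_MINUTES, "5 mins"))
print(matches(NEW_MINUTES, "5 mins"))
```

The widened pattern is a superset: every string the old pattern accepted is still accepted.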
Catalin Iliescu	c41caac438	fix(cookbook): only persist successfully stopped scheduled serves (#4267) Co-authored-by: Cata <cata@bigjohn.local>	2026-06-15 17:30:18 +02:00
RaresKeY	f66a23d19d	fix(ai): validate generated image result URLs (#4289)	2026-06-15 16:40:49 +02:00
pewdiepie-archdaemon	1cc9a003fd	Fix failing post-merge tests	2026-06-15 22:49:06 +09:00
pewdiepie-archdaemon	6d507f8128	Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615 # Conflicts: # src/tool_implementations.py # static/js/research/panel.js	2026-06-15 21:20:15 +09:00
pewdiepie-archdaemon	2cbd55b8bd	Open email context for agent, email search across All Mail, cookbook serve polish - Agent: pass the open email reader (uid/folder/account/from/subject/body preview) on every chat submit so 'reply to this' / 'write email saying hi' route to ui_control open_email_reply with the right UID instead of inventing a new .md draft. Code-level enforcement (chat_routes strips create_document + send_email when active_email is set); cross-session active_doc_id is now trusted instead of being silently dropped. set_active_email/clear_active_email tool-layer helpers in tool_implementations. - ui_control open_email_reply: optional body argument so the agent can open-and-write in one call; envelope now forwards uid/folder/account/ body/panel through tool_output. Tool description sharpened and the parser rejects empty bodies on reply/reply-all (forces the agent to write rather than open an empty draft). - Email library: search now runs against [Gmail]/All Mail when the current folder is INBOX (archived emails surface). Whirlpool spinner + 'Searching…' placeholder while in flight. Each search result is stamped with its source folder so clicks open the right email instead of whatever shares its UID in INBOX. Search no longer re-applies the same text pill locally (which only checks subject/from/snippet, never body) so body-only matches don't get dropped after IMAP returns them. Initial inbox load bumped 100→500. - Email favorites: 'Favorite (pin to top)' / 'Unfavorite' in both the card menu and the open-reader more menu, backed by a new /api/email/flag/{uid}?on=true\|false endpoint. Flagged emails always bubble to the top of the grid regardless of active sort. - AI reply in doc editor: never overwrites existing draft text or the quoted history. AI suggestion is prepended; AI-generated 'On … wrote:' re-quotes are stripped so the original quote isn't visually edited. 
- Cookbook serve: pre-launch GPU driver / has_gpu / install / version- floor checks (vllm minimax_m2 needs 0.10.0+, deepseek_r1 needs 0.7.0 etc.) before the launch chain starts. Detect 'another model already running on this host' and offer Stop & launch (with graceful then force tmux kill helpers, port release wait). Per-vendor deep-link buttons (vLLM recipe / SGLang cookbook) with hardware hash. Backend picker is now a custom dropdown with accent-coloured logos for vLLM, SGLang, llama.cpp, Ollama, Diffusers; same glyphs added next to package names in Dependencies. Runtime-readiness note moved inside the panel (green when ready, red when missing) with an × dismiss. Esc collapses the expanded card; expanded card scrolls when it overflows; Trust Remote / Auto Tool / Reasoning Parser / Enforce Eager / Prefix Caching / Expert Parallel / Speculative / MoE Env on one row (Reasoning Parser auto-detected per model family). Dtype→Row 1, GPUs→Row 2 (rightmost). Removed redundant GPU 'auto' input — command builders read from the GPU button strip. Default cookbook open is Download tab. - Cookbook hwfit: 'Model (latest)' / 'Model (oldest)' header sorts by release_date; release dates can be backfilled with the new scripts/backfill_model_release_dates.py and recipe metadata pulled with scripts/import_from_vllm_recipes.py against the upstream vllm-project/recipes catalog (vllm_recipe + min_vllm_version stamped on entries). - Calendar: Quick add hint cycles a random Odysseus-themed example per open (wooden horse Friday, crew muster 10am daily, council on Ithaca, …). Typing a time like '11pm' in the event title updates the hero clock live. - Doc editor: email-mode Reply button (sparkle icon, accent) opens the same Fast/Full + context popover the email reader uses; Ctrl+Alt+M toggles markdown preview. - Memories panel: custom sort picker with per-option icons, default 'Latest', visible Enabled/Disabled toggle text matching the section description style.	2026-06-15 20:47:51 +09:00
andrewemer	cd02ac7ef6	fix(agent): skill-prescribed tools never reach the model's schema list (#4008 ) * Agent: make skill-prescribed tools actually callable The skill index and matched-skill procedures are injected into the prompt, but tool selection never followed: manage_skills wasn't in the RAG-selected schema list (so the model substituted manage_memory), and a matched skill could prescribe tools (grep, read_file) the model had no schema for. Now: - manage_skills rides along whenever the owner has any skills indexed - a Jaccard-matched skill's requires_toolsets join the selection - viewing a skill mid-turn via manage_skills unlocks its requires_toolsets for subsequent rounds - admin-intent turns send _ADMIN_TOOLS schemas, matching the prompt text _build_base_prompt already advertises - index_for(active_toolsets=None) no longer hides requires_toolsets skills from callers that don't know the active set Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Agent: validate skill requires_toolsets against known tools, not TOOL_SECTIONS grep/glob/ls ship as function schemas without a prompt-prose section, so gating on TOOL_SECTIONS silently dropped them from a skill's requires_toolsets. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-15 20:32:43 +09:00
cirim	e7abb7559d	fix(research): keep Discuss chats grounded on their report (#4006 ) * fix(research): preserve Discuss spin-off primer during context trimming trim_for_context() kept only system_msgs[:1] as essential and dropped the rest under budget pressure. A research "Discuss" spin-off seeds the report as a system message that sits after the preface system messages, so it landed in extra_system and was the first thing evicted once the chat grew — the conversation then lost its grounding and drifted off task. Treat any system message carrying research_spinoff_from metadata as essential, alongside the leading system prompt, so the seeded report survives trimming. maybe_compact already retains all system messages. Tests: tests/test_context_compactor.py::TestResearchPrimerPreserved * fix(research): ground Discuss spin-off chats on the seeded report build_chat_context injected global memory (pinned + hybrid-retrieved) and personal-doc RAG every turn, keyed off the user-level memory_enabled pref and a request-scoped use_rag flag — never the session. A research spin-off, whose primer declares the report the sole knowledge base, thus had unrelated keyword-matched facts pulled in ("wrong data") competing with the report; its rag=False flag was also ignored (use_rag defaulted on). Add _session_is_research_spinoff(sess) (detects the primer research_spinoff_from metadata; handles ChatMessage and dict forms) and, for such sessions, disable memory injection and force RAG off. Tests: tests/test_chat_helpers.py spin-off detection cases --------- Co-authored-by: Dan (cirim) <claude@cirim.org>	2026-06-15 20:31:57 +09:00
Josh Patra	f5d3e5098a	fix(llm): omit temperature for Kimi K2.5 and K2.6 (#3960)	2026-06-15 20:29:22 +09:00
Josh Patra	4ee5ed4dce	fix(memory): return complete memory lists (#3885)	2026-06-15 20:28:25 +09:00
Achilleas90	ffc0f1dccc	Harden CalDAV write-back with retries (#1193) Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-15 15:59:31 +09:00
KYDNO	955455b797	fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549) * fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(kimi): omit temperature for kimi-for-coding API calls Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-15 15:56:54 +09:00
Abhishek Kumbhar	a172522d87	fix(integrations): prevent blank API integrations (#3840) * fix(integrations): validate unified API form fields * fix(integrations): validate API integration fields server-side	2026-06-15 15:40:36 +09:00
Vishnu	d6a3c9a0fe	fix(utility): use utility model for background tasks (auto-title, memory audit) instead of chat model (#4027)	2026-06-15 15:33:19 +09:00
Dividesbyzer0	33c26bab88	fix(agent): parse raw json web search calls (#4088)	2026-06-15 15:19:38 +09:00
cyq	e52d078ea1	fix(agent): detect Polish web lookup intent (#4091)	2026-06-15 15:19:03 +09:00
nsgds	7ae6133d7f	fix(agent): don't let a materialized default budget defeat context-window scaling (#4122 ) * fix(agent): don't let a materialized default budget defeat context scaling #1230 scales agent_input_token_budget to the model's context window unless the user explicitly set a budget, detected via is_setting_overridden(). But the settings-save path materializes every DEFAULT_SETTINGS key into settings.json (load_settings merges defaults; handlers persist the merged dict), so the persisted default 6000 reads as "overridden" and the budget code takes the min(6000, ctx) branch — silently re-capping long-context models at 6000 for anyone who has ever saved a setting. This reintroduces the exact regression #1170/#1230 set out to fix. Add is_setting_customized() (saved value != default) and gate the scaling on it instead of mere presence. A persisted default is not a user choice. is_setting_overridden has exactly one consumer (this budget path), so the change is contained. Tests cover the materialized-default regression, a deliberately-chosen budget still being honoured, and the absent-key case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): rework context-budget fix per review (#4122) Address RaresKeY's review: P2 (explicitness): is_setting_customized treated a saved value equal to the default as "not explicit", which ALSO blocked a user from deliberately pinning the default budget. Reframe the default value itself as the AUTO sentinel — agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context window", any other value is an explicit cap. A materialized default still reads as auto (fixing the original regression), and any non-default value the user chooses is now honoured. Drop the now-unused is_setting_customized helper. 
P2 (fallback context): auto-scaling trusted get_context_length() even when it returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known window), over-allocating on self-hosted/proxy setups. Add get_context_length_known() (also returns whether the window was actually discovered); the budget block passes 0 when unknown so auto-scaling stays conservative instead of inflating to an unproven window. hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that contract and answered the reviewer's question rather than silently reversing it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(agent): lock the materialized-default budget regression (review on #4121) Per WGlynn's review on the issue: add an end-to-end regression that saves an UNRELATED setting (which makes the settings-save path materialize the budget default into settings.json) and asserts the budget still auto-scales rather than re-reading as an explicit 6000 cap — locking the exact reopening shut. To make the test bite the production decision (not just re-derive it), extract `budget_is_explicit()` into src/context_budget.py and use it from the agent loop. It keys off value-vs-default (the default is the auto sentinel), NOT settings presence — which is the whole point, since the save path materializes defaults. Note: after this PR's rework, is_setting_overridden has ZERO production callers, so the merged-dict materialization smell can't reach any setting through a presence check today (WGlynn's durability concern). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): bind the budget context window to its own provenance (review #4122) RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop kept only the `known` flag from get_context_length_known() and budgeted off the passed-in `context_length`, which can come from a different lookup. 
Two failures: - local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT fallback while the fresh probe proves the real (smaller) served context — we'd scale off the stale value; - callers that don't pass context_length (scheduled tasks, teacher escalation, skill test runs, bg_monitor) were capped at 6000 even when a long window is discoverable. Extract budget_context_for_model() which returns the freshly-probed window when known else 0, binding the flag to the value it proves; the agent loop uses it. Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): fix stale budget comments + tighten to the contract (review #4122) - settings.py: an explicit budget is clamped to the window only — hard_max is auto-only (#1190); drop the incorrect "and to hard_max". - is_setting_overridden docstring: drop the stale "adaptive budgets" example; point value-sensitive callers at context_budget.budget_is_explicit. - Tighten the budget-block comments to the contract (default = auto sentinel, non-default = explicit cap, hard_max = auto-only ceiling). Comment/docstring-only; no behaviour change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): correct budget issue citations (#1190 → merged #1230/#1273) The context-budget contract (auto-sentinel, explicit budgets honoured, hard_max auto-only) merged via #1230 — #1190 was the earlier, closed, superseded PR. Re-point the contract comments at #1230 (the live source, already cited for the auto-sentinel two lines up in settings.py). The configurable hard_max setting (`agent_input_token_hard_max`) was a reviewer requirement first raised on #1190, omitted from the merged #1230, and actually added in #1273 — credit #1273 for it and correct the test comment's history (it previously implied this PR completed the requirement). Comment/docstring-only; no behaviour change. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 15:17:28 +09:00
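The contract that 7ae6133d7f settles on (default value = auto sentinel, any other value = explicit cap, unknown window passed as 0) can be sketched as follows. This is an illustration under assumptions: budget_is_explicit is named in the commit message, but its body here is a guess; effective_budget and auto_fraction are hypothetical, and the hard_max ceiling is left out entirely.

```python
DEFAULT_BUDGET = 6000  # default from the commit message; doubles as the auto sentinel


def budget_is_explicit(saved_budget: int) -> bool:
    """Key off value-vs-default, not settings presence.

    A default materialized into settings.json by the save path still reads
    as "auto", because only a non-default value counts as a user choice.
    """
    return saved_budget != DEFAULT_BUDGET


def effective_budget(saved_budget: int, known_context: int,
                     auto_fraction: float = 0.5) -> int:
    """Hypothetical budget decision; known_context is 0 when undiscovered.

    auto_fraction is an invented illustration value, not the project's rule.
    """
    if budget_is_explicit(saved_budget):
        # An explicit cap is honoured, clamped to the window only if known.
        if known_context > 0:
            return min(saved_budget, known_context)
        return saved_budget
    if known_context <= 0:
        # Unknown window: stay conservative rather than inflate.
        return DEFAULT_BUDGET
    return int(known_context * auto_fraction)
```

With this shape, saving an unrelated setting cannot turn the persisted 6000 into an explicit cap, which is the regression the commit describes.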
Dividesbyzer0	589fcd314a	fix(image): patch realesrgan torchvision compatibility (#4110)	2026-06-15 15:16:41 +09:00
Max Hsu	039431f5ea	fix(mcp): detect npx cache entries before probing (#4034)	2026-06-15 15:14:48 +09:00
Dividesbyzer0	7f571c8f7e	fix(agent): keep gpt-oss on text tool mode Treat gpt-oss local OpenAI-compatible models as text/fenced-tool models unless the endpoint explicitly declares native tool support.	2026-06-15 15:11:52 +09:00
cirim	056d1fb960	fix(llm): make connect timeout configurable Use a configurable LLM_CONNECT_TIMEOUT for call and stream connect budgets instead of the previous hard-coded 3s default.	2026-06-15 15:11:38 +09:00
Muhammed Midlaj	4b0a977988	fix(models): probe /v1/models for path-less LM Studio endpoints Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.	2026-06-15 15:09:50 +09:00
Boudbois2271	54690997ec	fix(calendar): treat same-day list_events range as full day Expand zero-width or inverted list_events windows to one day so start=end single-day queries return that day's events.	2026-06-15 15:09:19 +09:00
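A minimal sketch of the window expansion in 54690997ec, assuming list_events receives datetime bounds. The helper name normalize_window is hypothetical, and the real code may anchor the one-day window differently.

```python
from datetime import datetime, timedelta


def normalize_window(start: datetime, end: datetime) -> tuple[datetime, datetime]:
    """Expand a zero-width or inverted window to one day from start."""
    if end <= start:
        end = start + timedelta(days=1)
    return start, end


# A start == end single-day query now covers that whole day.
day = datetime(2026, 6, 15)
print(normalize_window(day, day))
```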
Wes Huber	be046dd29a	fix(cookbook): preserve state during lifecycle tick Log malformed cookbook state and re-read fresh state before writing scheduled-stop mutations so concurrent UI changes are preserved.	2026-06-15 15:07:03 +09:00
holden093	4c41834dc7	fix(youtube): consolidate duplicate handler Make src.youtube_handler a compatibility wrapper around services.youtube.youtube_handler so transcript state, URL parsing, and timeout behavior no longer diverge.	2026-06-15 15:03:41 +09:00
holden093	96052c5e8a	fix(agent): add contacts domain to tool classifier Add a contacts domain rule pack and deterministic contact intent detection so contact prompts surface resolve_contact/manage_contact tools.	2026-06-15 15:03:19 +09:00
adabarbulescu	afc81bdd7b	fix: drop thinking deltas from background agent loops Skip thinking-only deltas when accumulating background, scheduled-task, and teacher captured reply text.	2026-06-15 15:03:09 +09:00
Dividesbyzer0	a07fe35936	fix(agent): honor explicit web search requests Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.	2026-06-15 15:02:10 +09:00
RaresKeY	a7766d0b7f	fix(agent): honor auth-disabled tool access after setup Check explicit auth-disabled mode before configured-admin ownership checks so single-user mode keeps full agent tool access after setup.	2026-06-15 15:01:48 +09:00
Tom	2857723e47	fix(security): restrict API-key encryption key file to 0o600 Lock the API key encryption key file to owner-only permissions on creation and when reading existing keys, with regression coverage for permissions and encryption roundtrip.	2026-06-15 15:00:11 +09:00
Michael	a633611823	fix(agent): let retrieval run for non-English low-signal queries Allow non-workspace low-signal prompts to fall through to tool retrieval so non-English requests are not limited to always-available tools.	2026-06-15 14:58:56 +09:00
muhamed hamed	3b3c0d6254	fix: detect HuggingFace token when downloading cookbook models (#3459) Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 21:53:16 +01:00
Mazen Tamer Salah	f5c1eb4b9d	fix(settings): degrade load_features to defaults on PermissionError load_settings() already catches PermissionError, but load_features() caught only FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable data/features.json (e.g. root-owned after a deploy) therefore raised instead of falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything that reads feature flags. Add PermissionError to the except tuple to match load_settings(). Adds tests/test_load_features_permission_error.py. Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 21:20:10 +01:00
Marius Popa	2a4bba2b9e	fix(api-keys): preserve encrypted keys when saving providers (#1920) * fix(api-keys): preserve encrypted keys when saving providers * test(api-keys): cover malformed raw key entries --------- Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 18:23:54 +01:00
Kenny Van de Maele	620fdd0859	feat(agent): confine agent file/shell tools to a selectable workspace (#3665 ) * feat(agent): workspace confinement via context-local binding + get_workspace tool Bind the per-turn workspace once in execute_tool_block; the shared path resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so file tools + bash/python are confined centrally and a new tool that uses the shared helpers cannot accidentally bypass it. Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory modal (reusing existing modal/button CSS), the /workspace slash command, and a get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic (realpath/normcase/commonpath) and docker-safe (container paths, no host assumptions). Reopens #2023. * ux(workspace): clarify workspace is not a sandbox Picker modal note + pill tooltip + get_workspace tool/output wording now state plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder, but bash/python only start there (cwd) and are not sandboxed. Modal note reuses the existing .muted class. * fix(agent): treat an active workspace as file-work intent A vague low-signal message (e.g. "look at the local project") matches no domain keywords, so tool retrieval is skipped and only always-available tools are offered — leaving the agent with no file access even though a workspace is set. When a workspace is active, include the file/code tools (incl. get_workspace) on low-signal turns so the agent can act on the folder. Also requires the tool index (ChromaDB) to be reachable for normal retrieval; that is an environment dependency, not part of this change. * ux(workspace): hide pill + overflow entry in chat mode Workspace only scopes the agent's file/shell tools, so the pill and the overflow 'Workspace' entry are agent-only now — hidden in chat mode like the bash toggle. 
Mode read from the DOM in syncWorkspaceIndicator; applyMode() is called from the agent/chat setMode handler. * prompt(tools): steer bash/python to defer to the dedicated file tools bash/python schema descriptions (what native-tool-calling models read) were bare and gave no steer, so models would do file ops via the shell (e.g. writing SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python in the schema + tool-index + prompt section to prefer read_file/write_file/ edit_file/grep/glob/ls and only be used for what those do not cover. * prompt(tools): keep bash/python deferral generic (no hardcoded tool names) Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc. by name, so the guidance does not go stale if those tools are renamed. * style(workspace): drop em-dashes from added code comments/strings * ux(workspace): terser non-sandbox note in picker (no tool-name list) * ux(workspace): mirror terse non-sandbox wording in pill tooltip * chore: untrack local venv symlink (run-only, not part of the feature) * prompt(workspace): keep get_workspace text generic (no hardcoded tool names) * fix(agent): low-signal + workspace surfaces only read-only file tools Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but not write_file/edit_file/bash/python -- those wait for a request that actually calls for them (RAG retrieval still adds them on a real ask). * feat(workspace): cap browse listing at 500 dirs with a truncated hint Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local _MAX_BROWSE_DIRS so a directory with thousands of children does not dump every row into the picker; the response carries a truncated flag and the modal tells the user to type a path to jump in. 
* chore: untrack local venv symlink (run-only artifact) * fix(workspace): vet the workspace root against the sensitive-path deny list at bind time The in-workspace resolver deny-lists sensitive paths inside the workspace, but the empty-path search root is the workspace itself, so a workspace of ~/.ssh could be listed via ls with no path. vet_workspace() (public, in tool_execution next to the resolvers) rejects non-directories and sensitive roots before the path is ever bound; chat_routes uses it instead of its inline isdir check. * fix(workspace): reject filesystem roots and stop showing rejected workspaces as active Review findings from #3665: P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes every absolute path 'inside' the workspace and collapses confinement into host-wide file access. A root is its own dirname, so reject when dirname(resolved) == resolved; the browse response now carries a selectable flag and the picker disables 'Use this folder' on unselectable dirs. P3: /workspace set stored any string client-side and the chat route silently dropped rejected values, so the pill could claim a confinement that was not in effect. New admin-gated /api/workspace/vet validates manual paths before they persist (canonical path returned), and when a posted workspace is rejected at send time the stream emits workspace_rejected so the client clears the stored value and toasts instead of continuing silently. * fix(workspace): check caller privilege before vetting the posted workspace Review finding: /api/chat_stream called vet_workspace() on the posted value for every caller and emitted workspace_rejected on failure, so a non-admin who can chat but cannot use file/shell tools could distinguish existing directories from missing/file/sensitive/root paths by whether the event appeared. 
The resolution now lives in _resolve_request_workspace, which drops the submitted value uniformly for non-admin callers, with no vetting and no event, before the path ever touches the filesystem. Admin and single-user behavior is unchanged. Test pins that valid and invalid paths are indistinguishable for a non-admin and that vet_workspace is never invoked for them.	2026-06-11 18:17:54 +02:00
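Two checks from 620fdd0859 are easy to illustrate: confinement via realpath/normcase/commonpath, and rejecting a filesystem root because a root is its own dirname. This is a sketch only. The function names is_filesystem_root and is_inside are hypothetical; the project's vet_workspace and path resolvers also apply a sensitive-path deny list that is not modelled here.

```python
import os


def is_filesystem_root(path: str) -> bool:
    """A root is its own parent: dirname(resolved) == resolved."""
    resolved = os.path.realpath(path)
    return os.path.dirname(resolved) == resolved


def is_inside(workspace: str, candidate: str) -> bool:
    """True when candidate resolves to the workspace or a path below it."""
    root = os.path.normcase(os.path.realpath(workspace))
    target = os.path.normcase(os.path.realpath(candidate))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # commonpath raises for paths on different drives (Windows).
        return False
```

If a root were accepted as the workspace, is_inside would be true for every absolute path on that root, which is the collapse of confinement the review finding describes.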
Michael	95c54ac3cb	fix: use _truncate for tool output display limits in agent_loop (#3831) Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit truncation indicator when content is cut. Scoped down from the original PR: only agent/tool-output display behavior, no integrations.py changes. Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 17:05:13 +01:00
Kenny Van de Maele	263d41c58a	fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers (#3945 ) * fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers _apply_local_cache_affinity adds session_id + cache_prompt for llama.cpp KV-cache slot affinity (#2927), gated on _is_self_hosted_openai_compatible, which treated any unknown OpenAI-compatible host as self-hosted. Strict cloud providers added as custom endpoints (Mistral at api.mistral.ai) reject unknown body fields, so every request failed with 422 extra_forbidden. Self-hosted now also requires the endpoint to resolve as local via model_context.is_local_endpoint: loopback/private/tailscale host, or endpoint kind explicitly configured as "local" (the escape hatch for tunneled self-hosted servers). is_local_endpoint is promoted to a public name since llm_core now shares it. Fixes #3793 * test(llm): sweep cloud OpenAI-compatible hosts in affinity gating Parametrized cases adapted from #3839 (credit: Shabablinchikow): deepseek, x.ai, together, fireworks, and the Gemini OpenAI-compat endpoint must all stay free of the llama.cpp extras, not just the Mistral host from #3793. * fix(llm): narrow the Tailscale range to 100.64.0.0/10 in is_local_endpoint Review finding on #3945: _PRIVATE_PREFIXES carried a bare "100." prefix, treating all of 100.0.0.0/8 as local while Tailscale only uses the CGNAT block 100.64.0.0/10. Public 100.x hosts (e.g. AWS ranges outside the block) were classified local and still received the llama.cpp extras this PR exists to keep away from strict providers. Match the narrowed classification routes/model_routes.py already uses, with boundary tests just below, inside, and just above the range.	2026-06-11 17:51:03 +02:00
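The range narrowing in the last part of 263d41c58a can be checked with the standard ipaddress module. A minimal sketch, assuming the host is a literal IP address; in_tailscale_range is a hypothetical name and the project's is_local_endpoint does more than this (loopback, private ranges, configured endpoint kind).

```python
import ipaddress

# The CGNAT block named in the commit: 100.64.0.0 through 100.127.255.255.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def in_tailscale_range(host: str) -> bool:
    """True only for literal addresses inside 100.64.0.0/10."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example a hostname): not classified here.
        return False
    return addr in CGNAT
```

A bare "100." prefix test would accept 100.0.0.1 and 100.128.0.1; the network membership test rejects both.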
Mazen Tamer Salah	f941db29d3	fix(search): batch FTS hit lookups into one query (N+1) (#3909) _search_fts ran the FTS MATCH query, then looked up each hit's full row with its own db.query(...).filter(id == message_id).first() inside a loop, so a search returning N hits issued N extra SELECTs. Fetch all hit rows in a single IN(...) query via _fetch_messages_by_id and reassemble results in hit (relevance) order. Adds tests/test_session_search_batch_fetch.py asserting a single batched query (and no query for empty input). Existing session-search tests stay green.	2026-06-11 16:31:54 +02:00
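The batching idea in f941db29d3 can be sketched with the standard sqlite3 module. This is not the project's code: the real _fetch_messages_by_id goes through an ORM session, and the messages(id, body) table used here is a hypothetical schema.

```python
import sqlite3


def fetch_messages_by_id(conn: sqlite3.Connection, ids: list[int]) -> list[tuple]:
    """Fetch all hit rows with one IN (...) query, returned in hit order."""
    if not ids:
        return []  # no query at all for empty input
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, body FROM messages WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) gives no ordering guarantee, so restore relevance order here.
    return [by_id[i] for i in ids if i in by_id]
```

The dictionary lookup is what keeps the relevance order of the FTS hits, since the database may return the IN (...) rows in any order.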
RaresKeY	c500bcb47d	fix(uploads): migrate upload ownership on rename (#3617)	2026-06-11 16:01:04 +02:00
Mazen Tamer Salah	f7a3605b16	fix(webhooks): keep references to in-flight delivery tasks (#3859) fire() and fire_and_forget() scheduled delivery with bare create_task()/loop.create_task() and kept no reference. asyncio holds only a weak reference to a task, so the GC could collect a delivery (or the fire() coroutine itself) before it completed, silently dropping the webhook. Track in-flight tasks in a set on the manager via a _spawn_tracked() helper that holds a strong reference for the task's lifetime and discards it on completion (add_done_callback), and route both schedule sites through it. Adds tests/test_webhook_task_refs.py.	2026-06-11 15:53:52 +02:00
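The tracked-spawn pattern in f7a3605b16 follows the approach recommended in the asyncio documentation for fire-and-forget tasks. A minimal sketch: the class TrackedSpawner and its attribute names are hypothetical stand-ins for the webhook manager and its _spawn_tracked() helper.

```python
import asyncio


class TrackedSpawner:
    """Holds a strong reference to each task until it completes."""

    def __init__(self) -> None:
        self.in_flight: set[asyncio.Task] = set()

    def spawn_tracked(self, coro) -> asyncio.Task:
        # Must be called from within a running event loop.
        task = asyncio.get_running_loop().create_task(coro)
        self.in_flight.add(task)  # strong reference for the task's lifetime
        task.add_done_callback(self.in_flight.discard)  # drop it on completion
        return task
```

Because the set owns the task, the caller is free to ignore the return value without risking the delivery being garbage-collected mid-flight.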
George Lawton	4f48cfa9ae	fix: omit temperature for Opus 4.7+ on native Anthropic path (#3117 ) Anthropic removed the sampling parameters (temperature, top_p, top_k) starting with Claude Opus 4.7 — sending temperature at all, even 0.0, returns HTTP 400. _build_anthropic_payload sent it unconditionally, so every native-Anthropic request to Opus 4.7/4.8 failed: the research probe (ResearchHandler._probe_endpoint, temperature=0) aborted runs before they started, and all DeepResearcher._llm calls 400'd. Add _anthropic_rejects_temperature (version-gates opus-N-M >= (4,7)) and omit temperature in the Anthropic builder for those models. Older Claude models (Opus 4.6 and below, Sonnet/Haiku) keep temperature and the existing [0,1] clamp. The version gate is hardened against real-world model id shapes: - a word-boundary anchor so a substring like `octopus-4-8` is not read as Opus and stripped of temperature; - a 1-2 digit minor cap so a dated id such as `claude-opus-4-20250514` (Opus 4.0, listed in ANTHROPIC_MODELS) parses as major-only and keeps temperature, while dated 4.7+ snapshots still match; - a non-string guard so a non-string model can't raise AttributeError (the previous builder never called .lower() on it). Adds regression tests covering 4.7/4.8 omission, older/dated/legacy retention, the substring overmatch, and non-string inputs. Fixes #3065 Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 16:27:40 +03:00
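The three hardening points listed in 4f48cfa9ae (boundary anchor, 1-2 digit minor cap, non-string guard) can be sketched as below. The regex and the name rejects_temperature are hypothetical; the project's _anthropic_rejects_temperature may be written differently. The 1-2 digit cap on the major component is an extra assumption of this sketch, added so that ids with a date directly after "opus-" do not parse as a version.

```python
import re

# "opus" must not be preceded by a letter or digit (so "octopus" is skipped);
# major and minor are capped at 1-2 digits so date suffixes are not versions.
_OPUS_VERSION = re.compile(
    r"(?<![a-z0-9])opus-(\d{1,2})(?!\d)(?:-(\d{1,2})(?!\d))?"
)


def rejects_temperature(model) -> bool:
    """True for Opus ids whose (major, minor) version is at least (4, 7)."""
    if not isinstance(model, str):
        return False  # non-string guard: never call .lower() on other types
    match = _OPUS_VERSION.search(model.lower())
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return (major, minor) >= (4, 7)
```

Comparing tuples keeps the gate correct across major versions: (5, 0) is greater than (4, 7) even though its minor component is smaller.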
RaresKeY	50fedff2f2	fix(email): scope learned sender signatures by owner (#3724)	2026-06-11 13:26:59 +02:00
cyq	c01034f9cb	fix(settings): scrub camelCase secret keys (#3707)	2026-06-11 12:53:33 +02:00
pewdiepie-archdaemon	ebd2332db4	Agent prompt builder: stop re-adding ALWAYS_AVAILABLE on top of filtered tools Found the reason yesterday's tool-retrieval drop wasn't taking effect: in _build_agent_prompt, when relevant_tools was provided, it computed tool_names = set(ALWAYS_AVAILABLE) | set(relevant_tools) which silently re-added every tool get_tools_for_query had just deliberately discarded. So when a 'save this for <person>' query dropped manage_memory from the retrieved set, the prompt builder put it right back, and the model saw both tools again. Trust the relevant_tools set. get_tools_for_query already starts from ALWAYS_AVAILABLE — any discard there is intentional and should propagate. Only force-include ask_user and update_plan here as belt-and-suspenders since the agent loop relies on those for its own control flow. Other callers (task_scheduler) already union ALWAYS_AVAILABLE or ASSISTANT_ALWAYS_AVAILABLE into relevant_tools before passing it in, so they're unaffected.	2026-06-11 09:49:20 +09:00
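The set arithmetic behind ebd2332db4 is small enough to show directly. In this sketch the contents of ALWAYS_AVAILABLE and the function names are hypothetical; only the union expression and the two force-included tools come from the commit message.

```python
# Hypothetical contents; the real ALWAYS_AVAILABLE list is longer.
ALWAYS_AVAILABLE = {"ask_user", "update_plan", "manage_memory"}
CONTROL_FLOW_TOOLS = {"ask_user", "update_plan"}


def tool_names_old(relevant_tools: set[str]) -> set[str]:
    """Old behaviour: the union re-adds anything retrieval had discarded."""
    return set(ALWAYS_AVAILABLE) | set(relevant_tools)


def tool_names_new(relevant_tools: set[str]) -> set[str]:
    """New behaviour: trust the filtered set, force-include control flow only."""
    return set(relevant_tools) | CONTROL_FLOW_TOOLS


# Retrieval deliberately dropped manage_memory for a contact-save query.
retrieved = {"ask_user", "update_plan", "manage_contact"}
print(sorted(tool_names_old(retrieved)))
print(sorted(tool_names_new(retrieved)))
```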
pewdiepie-archdaemon	f5ad59317c	Tool retrieval: HARD drop manage_memory when query is a contact-save pattern Description-level steering wasn't enough: even with the explicit 'DO NOT use for info about another person' in manage_memory's description, models kept choosing memory over manage_contact. They can't if memory isn't in the toolset. New logic in ToolIndex.get_tools_for_query: detect three contact-save patterns and discard manage_memory from the returned set (overriding ALWAYS_AVAILABLE): 1. 'save [up to 3 words] for/to <name>' where <name> isn't a timing / pronoun stopword (later, tomorrow, me, you, future, etc.). Catches the canonical 'save this for X' and the wider 'save this address for X', 'save it for X'. 2. 'to/in/into (my) contacts' or 'address book'. Catches both 'add X to my contacts' and 'put this in my address book for X'. 3. Possessive: 'save (his/her/their) (address/phone/email/...)'. Stronger signal, so it also force-adds manage_contact to the set in case the keyword fallback missed it. Verified: 8 positive contact patterns all drop memory, 10 false-positive 'save X for later/tomorrow/me/the next thing' all keep it.	2026-06-11 09:46:34 +09:00
pewdiepie-archdaemon	df47536b8d	manage_memory descriptions: explicit deferral to manage_contact for person info Even with manage_contact in the retrieved tool set, models were still defaulting to manage_memory when the user pasted an address + 'save for <person>'. Both tools were in front of the model and it picked memory. Tighten both descriptions to steer at decision-time: - agent_loop.py manage_memory description: clarify scope is facts about the USER, with an explicit 'DO NOT use for info about another person' + a 'use manage_contact instead' line. - tool_index.py manage_memory description: same in shorter form, so the embedded retrieval signal is consistent with the prompt-time description.	2026-06-11 09:25:23 +09:00
pewdiepie-archdaemon	8a00f954a9	Tool retrieval: catch 'add X to (my) contacts' / 'address book' phrasings The literal phrase 'add to contacts' missed when there was a name between 'add' and 'to', e.g. 'add Pat to my contacts'. Anchor on the tail with 'to my contacts', 'to contacts', 'to address book' so word boundaries fire regardless of what sits in front.	2026-06-11 09:18:30 +09:00
pewdiepie-archdaemon	8632072ce0	Contacts: postal-address support via vCard ADR, keep tool prompt minimal Closes the gap that pushed the agent into manage_memory when the user pasted an address and said 'save this for X'. manage_contact now accepts an optional address arg end-to-end: - routes/contacts_routes.py: - _normalize_contact carries an 'address' field - _build_vcard emits ADR:;;<address>;;;; (street component of the RFC-6350 7-part ADR), only when address is non-empty - _parse_vcards reads ADR, joins non-empty components with ', ' - _create_contact and _update_contact thread address through; update preserves existing address when caller passes empty - src/tool_implementations.py do_manage_contact: - add accepts address; require at least name+address or email (was: email required) so address-only contacts are addable - update accepts address; require name OR emails OR address - src/tool_schemas.py: schema gets a single 'address' string field - src/tool_index.py + src/agent_loop.py: descriptions get one 'address' arg mention and a 'use this for save-X-for-person / address pastes / phone-with-name' steering line. Net: a few bytes added, not a paragraph. Also: removed a stray name from the schema's manage_contact example strings ('save Jonathan's email…') — no real names in the codebase.	2026-06-11 09:14:52 +09:00
pewdiepie-archdaemon	153b788134	Tool retrieval: surface manage_contact for 'save X for <person>' patterns When the user dumps a postal address or phone number alongside a person's name and says 'save this for X', the vector retriever was missing manage_contact because its description only mentioned the literal word 'contact'. The model defaulted to manage_memory (which is in ALWAYS_AVAILABLE), so the saved fact ended up as un-named memory that wouldn't surface on a later 'what's X's address?' search. - Rewrite manage_contact's index description to anchor on the semantics: 'save info about another person', including postal/ mailing address, ZIP, phone, etc. Now it embeds close to address-paste queries. - Extend the keyword intent-map with 'save this for', 'save it for', 'mailing address', 'postal code', 'their address', etc.: common ways users say 'this belongs to a contact' without the literal word 'contact'.	2026-06-11 08:56:42 +09:00
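
The three contact-save patterns listed in f5ad59317c can be sketched roughly as below. This is an illustration under stated assumptions, not the code in ToolIndex.get_tools_for_query: the function name `filter_tools`, the exact regexes, and the stopword list (the commit names only later, tomorrow, me, you, future; "the" is added here to cover 'the next thing') are all hypothetical.

```python
import re

# Assumed stopword list: words after for/to that are not a person's name.
_NOT_A_NAME = {"later", "tomorrow", "me", "you", "future", "the"}

# Pattern 1: 'save [up to 3 words] for/to <name>'.
_SAVE_FOR = re.compile(r"\bsave(?:\s+\w+){0,3}?\s+(?:for|to)\s+(\w+)", re.IGNORECASE)
# Pattern 2: 'to/in/into (my) contacts' or 'address book'.
_CONTACTS_TAIL = re.compile(
    r"\b(?:to|in|into)\s+(?:my\s+)?(?:contacts|address\s+book)\b", re.IGNORECASE
)
# Pattern 3: possessive 'save his/her/their address/phone/email'.
_POSSESSIVE = re.compile(
    r"\bsave\s+(?:his|her|their)\s+(?:address|phone|email)\b", re.IGNORECASE
)


def filter_tools(query, tools):
    """Return a copy of `tools` with manage_memory dropped on contact-save queries."""
    tools = set(tools)
    possessive = bool(_POSSESSIVE.search(query))
    match = _SAVE_FOR.search(query)
    save_for = bool(match) and match.group(1).lower() not in _NOT_A_NAME
    if save_for or possessive or _CONTACTS_TAIL.search(query):
        tools.discard("manage_memory")
    if possessive:
        # Stronger signal: make sure the contact tool is present.
        tools.add("manage_contact")
    return tools
```

Under these assumptions, `filter_tools("save this for Pat", ...)` drops manage_memory, while `filter_tools("save this for later", ...)` leaves it in place because "later" is on the stopword list.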

422 Commits
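
For readers following 4f48cfa9ae, the version gate it describes might look something like the sketch below. The names `rejects_temperature` and `build_payload` and the exact regex are assumptions made for illustration; the real helper is _anthropic_rejects_temperature and may differ. The sketch also caps the major version at 1-2 digits (an added assumption) so a legacy id with a date directly after "opus-" is not read as a version.

```python
import re

# Lookbehind rejects 'octopus'; each version part is 1-2 digits and must not
# be followed by another digit, so an 8-digit date never parses as a version.
_OPUS_VERSION = re.compile(
    r"(?<![a-z0-9])opus-(\d{1,2})(?!\d)(?:-(\d{1,2})(?!\d))?"
)


def rejects_temperature(model):
    """True when the model id looks like Opus 4.7 or newer (hypothetical helper)."""
    if not isinstance(model, str):  # non-string guard
        return False
    match = _OPUS_VERSION.search(model.lower())
    if match is None:
        return False
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) else 0
    return (major, minor) >= (4, 7)


def build_payload(model, messages, temperature=None):
    """Illustrative payload builder; only the temperature handling is shown."""
    payload = {"model": model, "messages": messages}
    if temperature is not None and not rejects_temperature(model):
        # Older models keep temperature, clamped to [0, 1].
        payload["temperature"] = min(max(temperature, 0.0), 1.0)
    return payload
```

With this sketch, `claude-opus-4-20250514` parses as major-only (4, 0) and keeps temperature, while `claude-opus-4-7` and a dated 4.7+ snapshot have it omitted.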