odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-16 01:35:36 -04:00

Author	SHA1	Message	Date
Lucas Daniel	f4e8990635	chore: add warnings to silent except Exception blocks (#3212 ) * log(app): add warnings to silent except Exception blocks - Internal tool auth header failure now logs a warning instead of silently passing, making auth bypass easier to spot in logs. - Token last_used_at update failure now logs at DEBUG (fire-and-forget, non-critical, but useful when debugging token tracking issues). - Image ownership verification failure now logs a warning so unexpected access-check errors surface instead of silently allowing the request. * log(chat_routes): add warnings to silent except Exception blocks - clear_orphaned_session_endpoint: log before rollback so failures appear in traces when users see stale/deleted model options. - _endpoint_has_model (JSON parse): log malformed cached_models instead of silently treating endpoint as valid. - _has_any_visible_model (JSON parse): log malformed cached_models instead of silently returning empty list. - timezone header parse: log failure so time-zone-related tool bugs (wrong scheduled times, calendar events) are traceable. - attachments JSON parse: log failure so silently-dropped attachments are visible in server logs. * log(email_routes): add warnings to silent except Exception blocks - Email alias resolution failure now logs a warning instead of silently returning an empty list, making broken account configs diagnosable. * log(document_routes): add warnings to silent except Exception blocks - Export ZIP request body parse failure now logs a warning so empty exports caused by malformed requests are diagnosable. - clear_active_document failure on detach now logs a warning to help trace doc re-injection bugs like #1160. * log(agent_loop): add warnings to silent except Exception blocks - builtin tool overrides load failure now logs a warning so misconfigured settings don't silently fall back to defaults without a trace. - Timezone context injection failure now logs a warning to help debug incorrect scheduled times in agent-created tasks. - PDF form-backed document detection failure now logs a warning so broken form-doc UI is traceable to the root cause. * log(llm_core): add warnings to silent except Exception blocks - Malformed URL in _is_ollama_native_url now logs a warning so bad endpoint configs are traceable instead of silently returning False. - Model list fetch failure now logs a warning with the endpoint URL so endpoints that silently vanish from the model picker are diagnosable. * log: pass exception via exc_info instead of string interpolation * fix(logging): avoid logging raw URLs in llm_core error paths Drop the raw url/base_chat_url from the Ollama-detection and model-list-fetch warning logs added by this sweep, since these values can contain private hostnames, internal IPs, credentials, or other deployment details. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 17:49:27 +01:00
pewdiepie-archdaemon	6d507f8128	Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615 # Conflicts: # src/tool_implementations.py # static/js/research/panel.js	2026-06-15 21:20:15 +09:00
pewdiepie-archdaemon	2cbd55b8bd	Open email context for agent, email search across All Mail, cookbook serve polish - Agent: pass the open email reader (uid/folder/account/from/subject/body preview) on every chat submit so 'reply to this' / 'write email saying hi' route to ui_control open_email_reply with the right UID instead of inventing a new .md draft. Code-level enforcement (chat_routes strips create_document + send_email when active_email is set); cross-session active_doc_id is now trusted instead of being silently dropped. set_active_email/clear_active_email tool-layer helpers in tool_implementations. - ui_control open_email_reply: optional body argument so the agent can open-and-write in one call; envelope now forwards uid/folder/account/ body/panel through tool_output. Tool description sharpened and the parser rejects empty bodies on reply/reply-all (forces the agent to write rather than open an empty draft). - Email library: search now runs against [Gmail]/All Mail when the current folder is INBOX (archived emails surface). Whirlpool spinner + 'Searching…' placeholder while in flight. Each search result is stamped with its source folder so clicks open the right email instead of whatever shares its UID in INBOX. Search no longer re-applies the same text pill locally (which only checks subject/from/snippet, never body) so body-only matches don't get dropped after IMAP returns them. Initial inbox load bumped 100→500. - Email favorites: 'Favorite (pin to top)' / 'Unfavorite' in both the card menu and the open-reader more menu, backed by a new /api/email/flag/{uid}?on=true\|false endpoint. Flagged emails always bubble to the top of the grid regardless of active sort. - AI reply in doc editor: never overwrites existing draft text or the quoted history. AI suggestion is prepended; AI-generated 'On … wrote:' re-quotes are stripped so the original quote isn't visually edited. - Cookbook serve: pre-launch GPU driver / has_gpu / install / version- floor checks (vllm minimax_m2 needs 0.10.0+, deepseek_r1 needs 0.7.0 etc.) before the launch chain starts. Detect 'another model already running on this host' and offer Stop & launch (with graceful then force tmux kill helpers, port release wait). Per-vendor deep-link buttons (vLLM recipe / SGLang cookbook) with hardware hash. Backend picker is now a custom dropdown with accent-coloured logos for vLLM, SGLang, llama.cpp, Ollama, Diffusers; same glyphs added next to package names in Dependencies. Runtime-readiness note moved inside the panel (green when ready, red when missing) with an × dismiss. Esc collapses the expanded card; expanded card scrolls when it overflows; Trust Remote / Auto Tool / Reasoning Parser / Enforce Eager / Prefix Caching / Expert Parallel / Speculative / MoE Env on one row (Reasoning Parser auto-detected per model family). Dtype→Row 1, GPUs→Row 2 (rightmost). Removed redundant GPU 'auto' input — command builders read from the GPU button strip. Default cookbook open is Download tab. - Cookbook hwfit: 'Model (latest)' / 'Model (oldest)' header sorts by release_date; release dates can be backfilled with the new scripts/backfill_model_release_dates.py and recipe metadata pulled with scripts/import_from_vllm_recipes.py against the upstream vllm-project/recipes catalog (vllm_recipe + min_vllm_version stamped on entries). - Calendar: Quick add hint cycles a random Odysseus-themed example per open (wooden horse Friday, crew muster 10am daily, council on Ithaca, …). Typing a time like '11pm' in the event title updates the hero clock live. - Doc editor: email-mode Reply button (sparkle icon, accent) opens the same Fast/Full + context popover the email reader uses; Ctrl+Alt+M toggles markdown preview. - Memories panel: custom sort picker with per-option icons, default 'Latest', visible Enabled/Disabled toggle text matching the section description style.	2026-06-15 20:47:51 +09:00
andrewemer	cd02ac7ef6	fix(agent): skill-prescribed tools never reach the model's schema list (#4008 ) * Agent: make skill-prescribed tools actually callable The skill index and matched-skill procedures are injected into the prompt, but tool selection never followed: manage_skills wasn't in the RAG-selected schema list (so the model substituted manage_memory), and a matched skill could prescribe tools (grep, read_file) the model had no schema for. Now: - manage_skills rides along whenever the owner has any skills indexed - a Jaccard-matched skill's requires_toolsets join the selection - viewing a skill mid-turn via manage_skills unlocks its requires_toolsets for subsequent rounds - admin-intent turns send _ADMIN_TOOLS schemas, matching the prompt text _build_base_prompt already advertises - index_for(active_toolsets=None) no longer hides requires_toolsets skills from callers that don't know the active set Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Agent: validate skill requires_toolsets against known tools, not TOOL_SECTIONS grep/glob/ls ship as function schemas without a prompt-prose section, so gating on TOOL_SECTIONS silently dropped them from a skill's requires_toolsets. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-15 20:32:43 +09:00
KYDNO	955455b797	fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549 ) * fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(kimi): omit temperature for kimi-for-coding API calls Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-15 15:56:54 +09:00
cyq	e52d078ea1	fix(agent): detect Polish web lookup intent (#4091 )	2026-06-15 15:19:03 +09:00
nsgds	7ae6133d7f	fix(agent): don't let a materialized default budget defeat context-window scaling (#4122 ) * fix(agent): don't let a materialized default budget defeat context scaling #1230 scales agent_input_token_budget to the model's context window unless the user explicitly set a budget, detected via is_setting_overridden(). But the settings-save path materializes every DEFAULT_SETTINGS key into settings.json (load_settings merges defaults; handlers persist the merged dict), so the persisted default 6000 reads as "overridden" and the budget code takes the min(6000, ctx) branch — silently re-capping long-context models at 6000 for anyone who has ever saved a setting. This reintroduces the exact regression #1170/#1230 set out to fix. Add is_setting_customized() (saved value != default) and gate the scaling on it instead of mere presence. A persisted default is not a user choice. is_setting_overridden has exactly one consumer (this budget path), so the change is contained. Tests cover the materialized-default regression, a deliberately-chosen budget still being honoured, and the absent-key case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): rework context-budget fix per review (#4122) Address RaresKeY's review: P2 (explicitness): is_setting_customized treated a saved value equal to the default as "not explicit", which ALSO blocked a user from deliberately pinning the default budget. Reframe the default value itself as the AUTO sentinel — agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context window", any other value is an explicit cap. A materialized default still reads as auto (fixing the original regression), and any non-default value the user chooses is now honoured. Drop the now-unused is_setting_customized helper. P2 (fallback context): auto-scaling trusted get_context_length() even when it returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known window), over-allocating on self-hosted/proxy setups. Add get_context_length_known() (also returns whether the window was actually discovered); the budget block passes 0 when unknown so auto-scaling stays conservative instead of inflating to an unproven window. hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that contract and answered the reviewer's question rather than silently reversing it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(agent): lock the materialized-default budget regression (review on #4121) Per WGlynn's review on the issue: add an end-to-end regression that saves an UNRELATED setting (which makes the settings-save path materialize the budget default into settings.json) and asserts the budget still auto-scales rather than re-reading as an explicit 6000 cap — locking the exact reopening shut. To make the test bite the production decision (not just re-derive it), extract `budget_is_explicit()` into src/context_budget.py and use it from the agent loop. It keys off value-vs-default (the default is the auto sentinel), NOT settings presence — which is the whole point, since the save path materializes defaults. Note: after this PR's rework, is_setting_overridden has ZERO production callers, so the merged-dict materialization smell can't reach any setting through a presence check today (WGlynn's durability concern). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): bind the budget context window to its own provenance (review #4122) RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop kept only the `known` flag from get_context_length_known() and budgeted off the passed-in `context_length`, which can come from a different lookup. Two failures: - local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT fallback while the fresh probe proves the real (smaller) served context — we'd scale off the stale value; - callers that don't pass context_length (scheduled tasks, teacher escalation, skill test runs, bg_monitor) were capped at 6000 even when a long window is discoverable. Extract budget_context_for_model() which returns the freshly-probed window when known else 0, binding the flag to the value it proves; the agent loop uses it. Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): fix stale budget comments + tighten to the contract (review #4122) - settings.py: an explicit budget is clamped to the window only — hard_max is auto-only (#1190); drop the incorrect "and to hard_max". - is_setting_overridden docstring: drop the stale "adaptive budgets" example; point value-sensitive callers at context_budget.budget_is_explicit. - Tighten the budget-block comments to the contract (default = auto sentinel, non-default = explicit cap, hard_max = auto-only ceiling). Comment/docstring-only; no behaviour change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): correct budget issue citations (#1190 → merged #1230/#1273) The context-budget contract (auto-sentinel, explicit budgets honoured, hard_max auto-only) merged via #1230 — #1190 was the earlier, closed, superseded PR. Re-point the contract comments at #1230 (the live source, already cited for the auto-sentinel two lines up in settings.py). The configurable hard_max setting (`agent_input_token_hard_max`) was a reviewer requirement first raised on #1190, omitted from the merged #1230, and actually added in #1273 — credit #1273 for it and correct the test comment's history (it previously implied this PR completed the requirement). Comment/docstring-only; no behaviour change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 15:17:28 +09:00
Dividesbyzer0	7f571c8f7e	fix(agent): keep gpt-oss on text tool mode Treat gpt-oss local OpenAI-compatible models as text/fenced-tool models unless the endpoint explicitly declares native tool support.	2026-06-15 15:11:52 +09:00
holden093	96052c5e8a	fix(agent): add contacts domain to tool classifier Add a contacts domain rule pack and deterministic contact intent detection so contact prompts surface resolve_contact/manage_contact tools.	2026-06-15 15:03:19 +09:00
Dividesbyzer0	a07fe35936	fix(agent): honor explicit web search requests Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.	2026-06-15 15:02:10 +09:00
Michael	a633611823	fix(agent): let retrieval run for non-English low-signal queries Allow non-workspace low-signal prompts to fall through to tool retrieval so non-English requests are not limited to always-available tools.	2026-06-15 14:58:56 +09:00
Kenny Van de Maele	620fdd0859	feat(agent): confine agent file/shell tools to a selectable workspace (#3665 ) * feat(agent): workspace confinement via context-local binding + get_workspace tool Bind the per-turn workspace once in execute_tool_block; the shared path resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so file tools + bash/python are confined centrally and a new tool that uses the shared helpers cannot accidentally bypass it. Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory modal (reusing existing modal/button CSS), the /workspace slash command, and a get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic (realpath/normcase/commonpath) and docker-safe (container paths, no host assumptions). Reopens #2023. * ux(workspace): clarify workspace is not a sandbox Picker modal note + pill tooltip + get_workspace tool/output wording now state plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder, but bash/python only start there (cwd) and are not sandboxed. Modal note reuses the existing .muted class. * fix(agent): treat an active workspace as file-work intent A vague low-signal message (e.g. "look at the local project") matches no domain keywords, so tool retrieval is skipped and only always-available tools are offered — leaving the agent with no file access even though a workspace is set. When a workspace is active, include the file/code tools (incl. get_workspace) on low-signal turns so the agent can act on the folder. Also requires the tool index (ChromaDB) to be reachable for normal retrieval; that is an environment dependency, not part of this change. * ux(workspace): hide pill + overflow entry in chat mode Workspace only scopes the agent's file/shell tools, so the pill and the overflow 'Workspace' entry are agent-only now — hidden in chat mode like the bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is called from the agent/chat setMode handler. * prompt(tools): steer bash/python to defer to the dedicated file tools bash/python schema descriptions (what native-tool-calling models read) were bare and gave no steer, so models would do file ops via the shell (e.g. writing SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python in the schema + tool-index + prompt section to prefer read_file/write_file/ edit_file/grep/glob/ls and only be used for what those do not cover. * prompt(tools): keep bash/python deferral generic (no hardcoded tool names) Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc. by name, so the guidance does not go stale if those tools are renamed. * style(workspace): drop em-dashes from added code comments/strings * ux(workspace): terser non-sandbox note in picker (no tool-name list) * ux(workspace): mirror terse non-sandbox wording in pill tooltip * chore: untrack local venv symlink (run-only, not part of the feature) * prompt(workspace): keep get_workspace text generic (no hardcoded tool names) * fix(agent): low-signal + workspace surfaces only read-only file tools Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but not write_file/edit_file/bash/python -- those wait for a request that actually calls for them (RAG retrieval still adds them on a real ask). * feat(workspace): cap browse listing at 500 dirs with a truncated hint Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local _MAX_BROWSE_DIRS so a directory with thousands of children does not dump every row into the picker; the response carries a truncated flag and the modal tells the user to type a path to jump in. * chore: untrack local venv symlink (run-only artifact) * fix(workspace): vet the workspace root against the sensitive-path deny list at bind time The in-workspace resolver deny-lists sensitive paths inside the workspace, but the empty-path search root is the workspace itself, so a workspace of ~/.ssh could be listed via ls with no path. vet_workspace() (public, in tool_execution next to the resolvers) rejects non-directories and sensitive roots before the path is ever bound; chat_routes uses it instead of its inline isdir check. * fix(workspace): reject filesystem roots and stop showing rejected workspaces as active Review findings from #3665: P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes every absolute path 'inside' the workspace and collapses confinement into host-wide file access. A root is its own dirname, so reject when dirname(resolved) == resolved; the browse response now carries a selectable flag and the picker disables 'Use this folder' on unselectable dirs. P3: /workspace set stored any string client-side and the chat route silently dropped rejected values, so the pill could claim a confinement that was not in effect. New admin-gated /api/workspace/vet validates manual paths before they persist (canonical path returned), and when a posted workspace is rejected at send time the stream emits workspace_rejected so the client clears the stored value and toasts instead of continuing silently. * fix(workspace): check caller privilege before vetting the posted workspace Review finding: /api/chat_stream called vet_workspace() on the posted value for every caller and emitted workspace_rejected on failure, so a non-admin who can chat but cannot use file/shell tools could distinguish existing directories from missing/file/sensitive/root paths by whether the event appeared. The resolution now lives in _resolve_request_workspace, which drops the submitted value uniformly for non-admin callers, with no vetting and no event, before the path ever touches the filesystem. Admin and single-user behavior is unchanged. Test pins that valid and invalid paths are indistinguishable for a non-admin and that vet_workspace is never invoked for them.	2026-06-11 18:17:54 +02:00
Michael	95c54ac3cb	fix: use _truncate for tool output display limits in agent_loop (#3831 ) Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit truncation indicator when content is cut. Scoped down from the original PR: only agent/tool-output display behavior, no integrations.py changes. Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 17:05:13 +01:00
pewdiepie-archdaemon	ebd2332db4	Agent prompt builder: stop re-adding ALWAYS_AVAILABLE on top of filtered tools Found the reason yesterday's tool-retrieval drop wasn't taking effect: in _build_agent_prompt, when relevant_tools was provided, it computed tool_names = set(ALWAYS_AVAILABLE) \| set(relevant_tools) which silently re-added every tool get_tools_for_query had just deliberately discarded. So when a 'save this for <person>' query dropped manage_memory from the retrieved set, the prompt builder put it right back, and the model saw both tools again. Trust the relevant_tools set. get_tools_for_query already starts from ALWAYS_AVAILABLE — any discard there is intentional and should propagate. Only force-include ask_user and update_plan here as belt- and-suspenders since the agent loop relies on those for its own control flow. Other callers (task_scheduler) already union ALWAYS_AVAILABLE or ASSISTANT_ALWAYS_AVAILABLE into relevant_tools before passing it in, so they're unaffected.	2026-06-11 09:49:20 +09:00
pewdiepie-archdaemon	df47536b8d	manage_memory descriptions: explicit deferral to manage_contact for person info Even with manage_contact in the retrieved tool set, models were still defaulting to manage_memory when the user pasted an address + 'save for <person>'. Both tools were in front of the model and it picked memory. Tighten both descriptions to steer at decision-time: - agent_loop.py manage_memory description: clarify scope is facts about the USER, with an explicit 'DO NOT use for info about another person' + a 'use manage_contact instead' line. - tool_index.py manage_memory description: same in shorter form, so the embedded retrieval signal is consistent with the prompt-time description.	2026-06-11 09:25:23 +09:00
pewdiepie-archdaemon	8632072ce0	Contacts: postal-address support via vCard ADR, keep tool prompt minimal Closes the gap that pushed the agent into manage_memory when the user pasted an address and said 'save this for X'. manage_contact now accepts an optional address arg end-to-end: - routes/contacts_routes.py: - _normalize_contact carries an 'address' field - _build_vcard emits ADR:;;<address>;;;; (street component of the RFC-6350 7-part ADR), only when address is non-empty - _parse_vcards reads ADR, joins non-empty components with ', ' - _create_contact and _update_contact thread address through; update preserves existing address when caller passes empty - src/tool_implementations.py do_manage_contact: - add accepts address; require at least name+address or email (was: email required) so address-only contacts are addable - update accepts address; require name OR emails OR address - src/tool_schemas.py: schema gets a single 'address' string field - src/tool_index.py + src/agent_loop.py: descriptions get one 'address' arg mention and a 'use this for save-X-for-person / address pastes / phone-with-name' steering line. Net: a few bytes added, not a paragraph. Also: removed a stray name from the schema's manage_contact example strings ('save Jonathan's email…') — no real names in the codebase.	2026-06-11 09:14:52 +09:00
pewdiepie-archdaemon	bc2d934b94	Agent email safety: stage drafts for user approval instead of auto-send Closes the auto-send hole that let earlier models invent signatures (e.g. signing 'David' for a user named Felix) and SMTP them to real recipients before the user could review. New setting: agent_email_confirm (default True). When on, the MCP send_email and reply_to_email tools no longer SMTP directly — they write the composed email to scheduled_emails with a new status 'agent_draft' (far-future send_at so the scheduled-send poller ignores them) and return a {pending: true, pending_id, to, subject, body, message: ...} payload. The model surfaces that to the user. Backend endpoints to approve / cancel: - GET /api/email/pending → list staged drafts for the owner - POST /api/email/pending/{id}/approve → flip status to 'pending' + backdate send_at so the existing scheduled-send poller delivers immediately - DELETE /api/email/pending/{id} → status = 'cancelled' UI: - Settings / AI Defaults gets a new 'Email Safety' card with the toggle, default on. - Tool descriptions for send_email and reply_to_email now include the pending behavior + an explicit 'DO NOT invent a signature, do not type a person's name' guardrail. Pass 2 (next): inline chat card with Send / Discard buttons so the user doesn't have to type a confirmation reply. Today's prompt + the listing endpoint give the model a clean path to surface drafts.	2026-06-11 08:50:06 +09:00
pewdiepie-archdaemon	4f7061fd61	Settings overhaul + UI polish pass Two months of iteration on the Settings panel, integration forms, and small visual nudges across the app. Highlights: Settings restructure - Add Models: split into separate Local + API cards (no more in-card tabs); each fuses Type/Provider with the URL input. - Added Models: new dedicated sidebar tab, with Probe + Clear-offline pulled into its header; Local/API sub-section icons accent-tinted. - Search: Web Search and a new Deep Research card (Model + tuning), with a cross-link to AI Defaults. Provider hints use real clickable anchors; Web Search Test button shows a whirlpool spinner. - AI Defaults: Image Generation card returns; Research Model card carries only Endpoint+Model with a cross-link to Search; Vision / Default / Utility fallbacks unified under one numbered-row design matching Search's chain. - API Permissions (was 'API Tokens'): per-row rename, inline Permissions toggle that expands the scope-edit panel, in-field copy icons (icon→check on success). Empty state accent-tinted. - Integrations: + Add Integration drops a type-picker menu directly under the button (drop-up on tight viewports); each integration form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses the same accent-outlined Save/Test/Cancel buttons right-aligned. - Danger Zone: Wipe→Delete with trash icons; new 'Delete everything' row at the bottom that loops every category. AI Synthesis (Reminders) - Persona dropdown sourced from PROMPT_TEMPLATES + custom preset. - src/reminder_personas.py mirrors the five built-ins for the server-side synthesis path. - dispatch_reminder() reads reminder_llm_persona and uses the persona's system prompt; empty/unknown falls back to warm-neutral. Esc handling - Kebab menus and the provider picker intercept Esc in capture phase so dismissing a popup no longer closes the whole Settings modal. Accent tinting - Scoped CSS rule across data-settings-panel=ai/services/added-models/ search/integrations/reminders for card h2 icons + the Added Models sub-section icons. Codex/Claude integration form - No more auto-creation on form open — explicit Create token button. - New tokens start with every scope granted; existing tokens move out of the integration form into the API Permissions card. - Setup reveal: copy buttons inline inside the token + setup code blocks; shorter subtitle wording. Misc visual polish - Save/Test/Cancel uniformly accent-outlined and right-aligned on every integration form. - Provider logos render inline next to the search fallback selects and the Deep Research Search dropdown. - Trash icons in fallback rows bumped to 20x20 so they fill the 32px button. - Image generation default flipped to off.	2026-06-10 15:15:13 +09:00
Lucas Daniel	55ff22c6d5	fix(chat): stabilize system prompt, sequence memory extraction, and send stable session id to preserve KV cache (#3360 ) * fix(chat): stabilize system prompt, sequence memory extraction, send stable session id to preserve KV cache Fixes #2927. As diagnosed in the issue, three things in Odysseus's request pattern actively destroyed local backends' (llama.cpp / LM Studio) KV-cache continuity, forcing a full prompt re-evaluation (15-30s+) on every turn: 1. Dynamic content folded into the system prompt every turn. Both the chat preface (ChatProcessor.build_context_preface) and the agent system prompt (_build_system_prompt) injected current_datetime_prompt() — text that changes every minute — directly into system-role messages, which llm_core then concatenates into the single system message sent as the cached prefix. Any byte difference there invalidates the entire cache. Moved this to a new current_datetime_context_message() helper that returns a standalone user-role message, inserted near the end of the array (right before the latest user turn) instead of mixed into the system prompt. The static system prefix (preset prompt + safety policy + agent base prompt) now stays byte-identical across turns of the same session. 2. Memory/skill extraction side-requests competed with the main completion. run_post_response_tasks fired extract_and_store / maybe_extract_skill via asyncio.create_task — fire-and-forget coroutines that could overlap the next turn's main request and steal llama.cpp's limited processing slots, evicting the cached checkpoint. They're now queued through a new _run_extraction_jobs_sequentially helper that waits for the session's stream to go idle and runs the jobs strictly one at a time. 3. No stable session identifier was sent to local backends, so llama.cpp assigned a new processing slot via LRU every turn ("session_id=<empty> server-selected (LCP/LRU)"), losing slot affinity. Added _apply_local_cache_affinity() in llm_core, which sets session_id and cache_prompt: true on outgoing payloads — gated to self-hosted OpenAI-compatible endpoints only (never api.openai.com or other cloud providers, which reject unrecognized request fields with a 400). Threaded session_id through stream_llm / llm_call_async / stream_agent_loop from the existing Odysseus session id. Tests in tests/test_kv_cache_invalidation_2927.py exercise the real payload- assembly and scheduling code paths: byte-identical system prefix across two turns of the same session (with a regression check that genuinely changed instructions DO still change it), the dynamic time block landing as a user-role message, extraction jobs waiting for the stream to go idle and running sequentially, and the outgoing payload carrying a stable session_id (same across turns of one session, different across sessions) only for self-hosted endpoints. Updated tests/test_user_time.py for the new message placement. * fix(tests): accept owner= kwarg in normalize_model_id monkeypatch The upstream normalize_model_id signature now takes an owner= keyword argument, and chat_helpers.py passes owner=getattr(sess, "owner", None) at the call site. Update the test stub lambda to **kwargs so it handles the new argument without breaking, and update chat_helpers.py to forward the owner parameter consistently. --------- Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-09 22:46:54 +01:00
nubs	f1cda91683	fix(agent): scope skill index to owner (#2404 ) Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>	2026-06-09 09:51:29 +02:00
Kenny Van de Maele	0aba00f4cf	refactor(tools): remove dead workspace-confinement plumbing (#3590 ) Commit `e6b1009` removed the workspace feature's entry point (deleted routes/workspace_routes.py + static/js/workspace.js and dropped the workspace-param parsing in chat_routes), but left the downstream backend plumbing dangling: chat_routes passed a hardcoded workspace=None into stream_agent_loop, which forwarded it to execute_tool_block, so the workspace value was permanently None and every workspace-gated branch was unreachable. Remove the now-dead code (no behavior change, since workspace was always None): - src/tool_execution.py: drop _resolve_tool_path_in_workspace and the workspace params/branches on execute_tool_block, _direct_fallback, _call_mcp_tool, _do_edit_file, and _resolve_search_root; restore the bash/python/bg cwd to _AGENT_WORKDIR. - src/agent_loop.py: drop the workspace param on stream_agent_loop, the dead 'ACTIVE WORKSPACE' system-prompt block, and the workspace forward. - routes/chat_routes.py: drop the hardcoded workspace=None arg and var. - tests: delete test_workspace_confine.py (tested the removed feature) and the workspace assertion in test_tool_policy.py. Full suite: 2903 passed, 1 skipped.	2026-06-09 08:30:50 +02:00
pewdiepie-archdaemon	3b01760e95	Prepare tested main sync cleanup	2026-06-09 09:34:42 +09:00
Mateus Oliveira	f7ae85590b	refactor(tools): consolidate duplicated _truncate and get_mcp_manager into src/tool_utils (#3478 ) * refactor(tools): consolidate duplicated _truncate and get_mcp_manager into src/tool_utils Move all copies of _truncate(), get_mcp_manager(), and set_mcp_manager() into a single leaf module (src/tool_utils.py) that imports only from src.constants. This eliminates the lazy-import hack ('from src import agent_tools' inside function bodies) in tool_execution.py and tool_implementations.py, and fixes a latent bug: the _truncate copy in tool_execution.py was missing the isinstance guard and would crash on None. Also deletes mcp_servers/_common.py — it was dead code with zero callers anywhere in the codebase, containing its own copy of truncate() and constants that already exist in src/constants.py. * fix(tools): route remaining get_mcp_manager imports to src.tool_utils The maintainer's feedback flagged src/task_scheduler.py:1857 and routes/task_routes.py:977. A project-wide search found a third call site in src/agent_loop.py that also imported get_mcp_manager from src.agent_tools instead of src.tool_utils. All three are now sourced from the canonical location in src.tool_utils. --------- Co-authored-by: mcnoliveira <mcnoliveira@gmail.com>	2026-06-09 01:05:30 +02:00
Lucas Daniel	0a324f20d2	fix(agent): stop treating illustrative Markdown fences as tool calls for native function-calling models (#3356 ) * fix(agent): stop executing illustrative Markdown fences as tool calls for native function-calling models _resolve_tool_blocks fell back to the textual parse_tool_blocks() fenced-block parser whenever a model produced no native tool_calls, regardless of whether that model has a reliable native function-calling channel. Native models (GPT/Claude/Grok/Qwen3/DeepSeek-V, etc. - _is_api_model true) commonly write illustrative ```bash/```python/```json examples in guide-only prose; the fallback parser matched these and executed them as real commands, sometimes looping for several rounds as the model tried to clarify with more examples (#3222). Restrict the textual fenced-block fallback to non-native models, which rely on it as their only tool-invocation channel. Native models are trusted to use their structured tool_calls channel for real invocations; when they don't emit one, a bare fence in their response is prose, not an action. The native tool_calls path itself is untouched. This sits one layer below #3088's guide-only policy enforcement: that PR blocks tool exposure/execution on explicit no-tools requests, while this fixes the parser so ordinary illustrative fences are never misread as calls in the first place, on any turn. * fix(agent): gate only the fenced-example pattern for native models, preserve DSML/invoke recovery and persistence _resolve_tool_blocks previously short-circuited the entire textual parser (tool_blocks = [] if is_api_model else parse_tool_blocks(...)) for native function-calling models with no native tool_calls. That also dropped Patterns 2-5 (explicit [TOOL_CALL]/<invoke>/<tool_code>/DSML markup leaked into content as text), which are real calls a model couldn't emit on its structured channel (e.g. DeepSeek-V falling back to DSML), not illustrative examples. parse_tool_blocks/strip_tool_blocks now take a skip_fenced flag that gates ONLY Pattern 1 (the fenced ```bash/```python/```json block matcher). _resolve_tool_blocks passes skip_fenced=is_api_model so fenced examples stop being executed for native models while [TOOL_CALL]/<invoke>/<tool_code>/DSML stay fully active and recoverable. cleaned_round mirrors the same gate when persisting round text, so an illustrative fence that wasn't executed isn't stripped from saved/reloaded history either (it was streaming once and then disappearing on reload).	2026-06-08 22:25:28 +02:00
RaresKeY	3a91c11ff8	fix: block app_api access to Cookbook host controls (#3231 )	2026-06-07 19:20:11 +02:00
YotamPeled	adbcb3763f	fix(agent): don't abort legitimate tool batches as runaway loops (#3183 ) The loop-breaker's runaway backstop counted per-tool-type call totals and tripped whenever any tool was used >=15 times — treating 15+ DISTINCT calls to one tool as a stuck loop. A real batch (e.g. "add these 18 birthdays to my calendar" emits 18 distinct manage_calendar create_event calls in one round) got flagged "calling manage_calendar over and over", the calls were discarded (next round tools_sent=0), and 0 events were created. Count IDENTICAL repeated call signatures instead (same tool AND args), via a small, unit-testable _detect_runaway_call() helper. Genuine batches pass; a model truly stuck repeating one call still trips the backstop. Adds a regression test. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 16:16:17 +02:00
RaresKeY	a3784da172	fix: block app_api access to shell routes (#3225 )	2026-06-07 15:19:08 +02:00
Nicholai	a3cb15d0a1	fix(agent): enforce guide-only tool policy (#3088 )	2026-06-06 18:48:24 -06:00
Mohammed Riaz	6ccd4500d7	fix(chat): show requested and actual reply models Show requested and actual reply models in chat labels when fallback or provider routing changes the responding model.	2026-06-06 04:30:16 -06:00
Ocean Bennett	fb9c7cf3da	fix(calendar): accept list event range aliases	2026-06-06 03:47:18 -06:00
Nicholai	33edc40eae	fix: route misfenced web lookups to web tools Fixes #3067	2026-06-06 03:46:31 -06:00
Nicholai	463713c2c6	feat(search): unify session transcript search (#2877 )	2026-06-05 18:08:31 -06:00
Kenny Van de Maele	8ce945d338	feat: Add plan mode to the chat agent (#638 ) * feat: Add plan mode to the chat agent Adds a plan mode: the agent investigates read-only, proposes a checklist, and waits for approval before changing anything. On approval it runs with full tools and checks items off as it goes. Enforcement reuses the existing disabled_tools gate. Includes a slash command: `/plan [on\|off]` (and `/toggle plan`) to flip the plan toggle from the chat input. - src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP). - src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the plan directive, force agent mode. - static/: plan toggle pill, Approve & Run, dockable plan window, task-list checkboxes, and the /plan slash command. - tests/test_plan_mode.py. * Plan mode: persistent re-referenceable plan + agent write-back Three improvements so a long plan survives a weak model and stays in reach: 1. Re-reference the plan (out-of-context fix). On the execution turn the frontend sends the approved checklist back (`approved_plan`); the backend pins it as a top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so the agent can always re-read the plan instead of losing the thread on a long run. New `build_active_plan_note()` (unit-tested). 2. Re-open / dock the plan anytime. The plan checklist is stored per-session (localStorage). When a plan exists, the plan-mode button opens a small menu ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan window — so it can stay docked while the agent works. The window live-refreshes as the plan changes. 3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps `- [x]` after finishing them, or to revise steps when the user asks. Marker tool (no I/O) → `plan_update` SSE event → the stored plan + docked window update live. The ACTIVE PLAN note instructs the agent to use it. Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb), src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse `approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools / tool_index (always-available, not admin-gated). Frontend: static/js/chat.js (plan store, send `approved_plan`, handle `plan_update`, capture restated checklists), static/app.js (plan-button menu), static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key). Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py. * Plan mode: drop bash/python, rely on read-only discovery tools Shell can mutate (write files, hit the network) and can't be constrained to read-only at the tool layer, so plan mode no longer relies on a prompt to keep it well-behaved — bash/python are removed from the read-only allowlist and added to the fail-closed block set. Discovery is covered by the dedicated read-only tools (read_file, grep, glob, ls) instead. Rewrites the plan-mode directive to state shell is disabled and lists the available read-only tools positively. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Comment: note _MCP_READONLY_VERBS are prefixes not whole words Clarifies that entries like "summar" are intentional stems matched via startswith (covers summarise/summarize/summary), not typos. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Plan mode: clarify why gating inverts the allowlist into a denylist Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an allowlist, so it returns the inverse (all known tool names minus the allowlist). The static mutator set is a backstop for the schema-derived name list, which misses XML-only tools and can fail to import. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Plan mode: stop hardcoding the read-only tool list in the directive The model is already shown its available (read-only) tools by _assemble_prompt, which removes every disabled tool. Enumerating them again in the directive only duplicated that list and would drift as tools change. Point at the tools listed below instead. Addresses review feedback on #638.	2026-06-05 16:32:25 +02:00
Kenny Van de Maele	0a2adc9c96	Add ask_user tool: agent-posed multiple-choice questions (#2111 ) Let the agent pause and ask the user a multiple-choice question when a task is genuinely ambiguous and the answer changes what it does next — choosing between approaches, confirming an assumption, picking a target — instead of guessing. Modeled on the existing `ui_control` marker pattern: the `ask_user` tool returns an `ask_user` payload that the agent loop emits as an SSE event and then ends the turn. The frontend renders the question with clickable option buttons, a free-text "Other" input, and an x to dismiss; the user's choice is sent as the next message and the agent resumes with it in context. - src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O. Validates a non-empty question + 2..6 options, normalizes string/object options, returns the payload. - src/agent_loop.py: emit the `ask_user` event and break the round loop so the turn ends and waits for the user's selection. Stream the question as assistant text so it persists/replays (prevents a re-ask loop). - Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS, FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any user can be asked); the structured args serialize via the default json.dumps path. - routes/chat_routes.py: relay the `ask_user` event to the client. - static/js/chat.js + static/style.css: render the question card (options + free-text Other + dismiss x; removed once answered). Reuses CSS vars and the .modal-close button; emoji go through the monochrome-SVG pipeline. Bump chat.js cache pin. - tests/test_ask_user_tool.py: payload, multi flag, string options, option cap, validation errors, serializer round-trip, registration.	2026-06-05 11:49:11 +02:00
pewdiepie-archdaemon	fbd34334a5	Calendar overnight-event rendering + clickable [View note] link from chat - Calendar overnight events render proportionally across day boundaries via --start-frac / --end-frac CSS vars instead of bleeding as full-day on day 2. - Recurring-event delete strips the master uid + all master::* sibling instances optimistically so the row clears immediately instead of waiting for the next sync re-render. - manage_notes(create) now returns note_id + open_url, and agent_loop appends a markdown [View note](#note-<id>) link mirroring the deep-research pattern. - chatRenderer's hash-link router (already wired for #note-id) reaches the new notes.openNote(id) helper, which force-closes/reopens the Notes panel, polls for the target card, and runs a brief outline flash so the user can locate it on long lists.	2026-06-05 14:41:48 +09:00
pewdiepie-archdaemon	f8aaeab245	Merge remote-tracking branch 'origin/dev'	2026-06-05 12:14:34 +09:00
pewdiepie-archdaemon	f19ac6ed03	Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus # Conflicts: # static/js/cookbookRunning.js	2026-06-05 11:23:15 +09:00
Kenny Van de Maele	2be3779e6e	feat: Add workspace: confine agent tools to a folder (#1103 ) * feat: Add workspace: confine agent tools to a folder Pick a server folder as the agent's workspace so its file/shell tools work there and don't touch files outside it. File tools are hard-confined; bash/ python run with cwd set to the folder. Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` / `clear` / `pick` (open the directory browser). - routes/workspace_routes.py: GET /api/workspace/browse (admin-only). - src/tool_execution.py: hard path confinement for read_file/write_file; bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block. - src/agent_loop.py: workspace note prepended to the system prompt. - static/: overflow menu item, input-bar pill, directory-browser modal, and the /workspace slash command. - tests/test_workspace_confine.py. * Wire workspace confinement into tools that landed after this PR edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was written, so they bypassed the workspace boundary. Thread the workspace through: - edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace - grep/glob/ls: _resolve_search_root confines to the workspace (root + paths) - bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir default when no workspace is set) Tests cover edit_file + grep/ls confinement (inside ok, outside rejected). * Workspace picker: editable path bar + modal style cohesion + cross-platform hardening - Make the current-folder strip an editable address bar: type/paste a full path and press Enter to navigate (also reaches other Windows drives and hidden dirs the up-only browser cannot). - Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn* in favour of base .modal-content/.modal-body and the .confirm-btn button family; separators/hover use var(--border). Net -31 CSS lines. - Fix the path field overflowing the modal right edge (flex stretch + margin vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin. - Cross-platform confinement: normcase the workspace commonpath check so containment holds on case-insensitive filesystems (Windows/macOS). - Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd() instead of pwd. 5 pass.	2026-06-05 00:06:37 +02:00
Michiel Van de Velde	7ddc5eaef4	Merge pull request #2529 from NubsCarson/codex/2509-mcp-tool-input-params fix(mcp): expose MCP tool input parameters to the agent	2026-06-04 23:07:42 +02:00
Kenny Van de Maele	64d65b73c1	feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999 ) * feat: round-limit handling — Continue affordance at the cap + configurable cap When the agent loop runs out of rounds (per-message step cap, default 20) while still actively using tools, it stopped silently mid-task. Now: 1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows a "Continue" pill at the bottom of the chat that resumes the task from where it left off. Repeated cap-hits each get a fresh Continue (multiple continues in a row). 2. The cap is configurable in Settings → Agent ("Max steps per message"), validated on the client, at the save endpoint, and at the read site. - src/agent_loop.py: track `_exhausted_rounds` (set only when a full tool-executing round completes on the last allowed round — i.e. the agent wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged). - routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as `max_rounds`; forward the new event through the SSE relay. - routes/auth_routes.py: validate numeric settings on save (int + clamp; agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int). - src/settings.py: default `agent_max_rounds = 20`. - static/: Settings input + client-side clamp; the Continue pill (reuses the existing .stopped-indicator / .continue-btn classes and theme vars --border/--fg/--bg/--accent); appended to the chat container so it survives the message re-render at stream finalize. chat.js cache version bumped. * test: cover rounds_exhausted emission (cap-hit vs normal finish) Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings: a tool block every round exhausts the cap and must emit rounds_exhausted; a plain answer hits the done-break and must not. Guards the for/else logic.	2026-06-04 22:36:05 +02:00
Kenny Van de Maele	1cd0aa2b8c	feat(provider): add GitHub Copilot provider with device-flow auth (#1480 ) * feat(provider): add GitHub Copilot provider with device-flow auth Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5, Claude, Gemini, …) work through the normal chat + agent loop, incl. native tool calling and vision. Auth is one-click via the GitHub OAuth device flow; the access token is stored as the endpoint's (encrypted) api_key and sent directly as `Authorization: Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only provider-specific behaviour is a small set of required request headers, injected centrally. Sign-in is available from Settings → model endpoints ("Connect GitHub Copilot") and from chat via `/setup copilot`. - src/copilot.py (new), routes/copilot_routes.py (new): constants, header builders, device-flow start/poll, model discovery, owner-scoped endpoint provisioning. - src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers, per-request x-initiator/vision. - src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas. - src/model_context.py: known context windows for Copilot (no unauthenticated /models probe). - static/, README, tests/test_copilot.py. Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process	2026-06-04 21:13:14 +02:00
Kenny Van de Maele	7443c36bd9	feat: Add edit_file tool + file-change diffs (#1239 ) * Add edit_file tool + file-change diffs edit_file is an exact old_string -> new_string replacement on a file on disk (fails if old_string is missing or non-unique unless replace_all); write_file also returns a unified diff. Diffs render collapsed in the tool bubble (filename + +adds/-dels, theme colors); the raw JSON command box is hidden. Security: edit_file is a sensitive filesystem-write tool, treated everywhere write_file is — - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner), so on auth-enabled deployments a non-admin cannot run it; execute_tool_block refuses it for non-admin owners. - confined by the same path policy as read_file/write_file (allowlist + sensitive-file deny) via _resolve_tool_path. Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the only way to write files (they show a diff) — never edit_document (editor panel) or a bash heredoc/redirect. Tests (tests/test_edit_file.py): non-admin block (policy + execution gate), successful edit, not-found old_string, non-unique old_string (+ replace_all), and path outside the allowed roots. Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py, src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css, tests/test_edit_file.py. * Drop redundant import os in write_file closure os is already imported at module top.	2026-06-04 18:29:10 +02:00
pewdiepie-archdaemon	9112861d8e	cookbook agent debug loop: persistent log files, auto-adopt orphan tmux, Codex/Claude skill parity Three converging fixes so the chat agent + external Codex/Claude skills can actually debug a crashed serve instead of staring at a post-crash neofetch banner: * Serves now `tee` to /tmp/odysseus-tmux/SESSION.log on the host running them. Runner saves fds 3/4 before the tee and restores them right before `exec ${SHELL}`, so the post-crash interactive zsh banner does NOT pollute the log file. * `tail_serve_output` (chat agent) and `/api/codex/cookbook/output/{sid}` (Codex+Claude skills) both prefer the persistent log file over the tmux pane. Pane is fallback for sessions predating the tee runner. Default tail bumped 150 -> 400. * `list_served_models` "recent log" snippet seeks to the Traceback line instead of showing the last 6 lines (which was always the bash prompt). Cookbook auto-adoption sweep on `/api/cookbook/tasks/status`: every 20s (rate-limited) the cookbook SSHes each configured server, finds `serve-` / `cookbook-` tmux sessions running an actual model process (vllm/python/llama-server/etc., filtered via `pane_current_command`), and writes them into state.tasks. So when the agent falls back to raw ssh+tmux, the session appears in the Cookbook UI on the next poll. `serve_model` error path now reads `data["detail"]` in addition to `data["error"]` so the FastAPI HTTPException message ("Invalid characters in cmd") actually reaches the agent instead of being swallowed as a generic "Serve failed". Tool description updated to warn against `cd …`/`source …`/`&&` prefixes. Intent-without-action supervisor in agent_loop: when the model writes "Let me tail the output" / "I'll check the logs" / "Let me investigate" and ends the turn without emitting a tool call, the loop injects a sharp system nudge ("You said you would X — DO IT NOW") and continues. Capped at 2 nudges per chat so a model that genuinely cannot use the tool does not pin the loop. Codex/Claude skill parity: adds `/cookbook/cached`, `/cookbook/presets`, `/cookbook/preset/{name}`, `/cookbook/adopt` so external agents have the same surface as the chat agent. SKILL.md docs + odysseus_api.py wrapper updated for both bundles. `adopt_served_model` promoted to the always-on tool set so the agent has a documented fallback when serve_model rejects a cmd. Also various cookbook UI tweaks accumulated alongside the above (cookbook.js, cookbookRunning.js, cookbookServe.js, cookbook-diagnosis.js, settings.js, style.css).	2026-06-04 23:27:18 +09:00
NubsCarson	39825867a4	fix(mcp): route literal MCP requests to external schemas	2026-06-04 13:00:17 +00:00
Alexander Kenley	7b45a94b6d	Fix calendar routing and user-local time context (#408 ) * fix(chat): add user-local time context * fix(chat): route calendar follow-up phrasing * refactor(chat): log tool intent routing reasons * test(chat): align user time prompt shim --------- Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>	2026-06-04 13:20:04 +01:00
Giuseppe	f6a5f6592f	fix: log warnings on silently swallowed agent and endpoint failures (#2367 ) get_builtin_overrides() was swallowing all exceptions with a bare `except Exception: pass`, so misconfigured tool-description overrides would silently produce wrong agent behaviour with no log trace. The background endpoint refresh loop had the same pattern: any probe failure was silently ignored, giving operators no signal that the refresh was broken. Also removes a circular self-import (`from src.agent_loop import _build_base_prompt`) inside _build_system_prompt; the function is already in scope and the import created a latent circular reference risk. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-04 12:29:31 +01:00
Marius Popa	dc365a1b27	Fix Ollama agent single-token responses (#1591 ) Agent mode treated local /v1 endpoints, including Ollama on :11434, as native-tool-capable by host/model heuristics. On Ollama's OpenAI-compatible surface some models that advertise tool support stop after a single token when schemas are sent (issue #1567). Default local Ollama /v1 back to fenced tool blocks unless the endpoint explicitly has supports_tools=True. Also compare both the runtime chat URL and the normalized endpoint base when reading ModelEndpoint.supports_tools. That keeps a saved base URL such as http://localhost:11434/v1 effective when the active session URL is /v1/chat/completions. Tests: .venv/bin/python -m pytest tests/test_tool_support_heuristic.py	2026-06-04 11:45:10 +01:00
Lucas Daniel	68da800dcb	fix(agent): stop sending tool schemas to native Ollama endpoints (#1765 ) Models like gemma4, qwen3.5, and ministral served via Ollama's native /api/chat respond to OpenAI-style tool schemas by emitting a single native tool_call chunk and then stopping. The agent loop receives 1 token of round_response and no recognised ToolBlock, so the round ends immediately — the user sees a one-token response. Root cause: _is_api_model was True for any endpoint whose host appears in _API_HOSTS (which includes "host.docker.internal" and "localhost") OR whose model name matches a keyword like "gemma". Native Ollama endpoints were never excluded from this path. Fix: import _is_ollama_native_url from llm_core and treat native Ollama endpoints (/api/chat, port 11434) as text-only by default — falling back to the fenced-block tool path the local models are tuned for. The per-endpoint supports_tools=True toggle (Settings → Endpoints) still overrides this for users who have explicitly opted in. Fixes #1567	2026-06-03 13:23:42 +09:00
lekt8	b6843c7621	Route "read that report" to manage_research instead of the HTML render (#1375 ) After a deep-research job completes, a follow-up like "check it out" / "read that report" had the agent web_fetch the /api/research/report/{id} HTML render (and then drift into unrelated searches) instead of reading the saved report (issue #1363). The report text is already available via the manage_research tool (action read), and action list returns ids most-recent-first, so the agent can resolve "the recent report" itself. Strengthen the manage_research instructions: read a finished report via action list -> action read; do NOT web_fetch/app_api the report URL (it renders HTML, not clean text) and do NOT start a fresh web_search just to read an existing report. Annotate the app_api endpoint list to say the same. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 03:24:09 +09:00
lekt8	80de69ebb0	feat: document rrule in the manage_calendar tool schema (#1320 ) (#1324 ) * feat: document rrule in the manage_calendar tool schema (#1320) The create_event handler already persists `rrule` (a single event carrying an iCalendar RRULE), but the manage_calendar tool schema didn't list it, so the agent had no documented way to make a recurring event and took a roundabout path. Add `rrule?` to the create_event field list with examples (FREQ=WEEKLY;BYDAY=MO etc.) and an explicit note to create ONE event with the rule rather than looping. Covered by tests/test_calendar_rrule.py: do_manage_calendar create_event with an rrule stores one event with that recurrence; without it, the event is single. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: restore SessionLocal via monkeypatch in #1320 rrule test (review) Per review: the test patched core.database.SessionLocal at module import and never restored it, which could leak the temp DB into later tests in the same process. Move the patch into an autouse monkeypatch fixture so it is restored after each test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 02:37:45 +09:00

1 2

71 Commits