odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-18 02:35:23 -04:00

Author	SHA1	Message	Date
RaresKeY	33fe7276be	fix(endpoints): normalize URL handling (#4338)	2026-06-16 03:59:18 +01:00
RaresKeY	4d10c16d02	fix(auth): clean up rename and null-owner ownership (#4340)	2026-06-16 03:33:02 +01:00
TheDragonTail	0f966d6b9f	fix(embeddings): fall back to default cache dir when FASTEMBED_CACHE_PATH is empty (#3434 ) docker-compose.yml injects FASTEMBED_CACHE_PATH=${FASTEMBED_CACHE_PATH:-}, which sets the variable to an empty string when the host has not defined it. FASTEMBED_CACHE_DIR used os.getenv("FASTEMBED_CACHE_PATH", default), and os.getenv only returns the default when the variable is ABSENT -- so the empty value won and FASTEMBED_CACHE_DIR became "". os.makedirs("") then raised [Errno 2] No such file or directory: '', FastEmbed failed to initialise, and every vector feature (RAG, semantic memory, tool index) silently degraded on the default Docker stack. Treat an empty value like an absent one via `os.getenv(...) or default`. Add a regression test covering the empty, unset, and explicit cases. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 03:11:48 +01:00
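The empty-versus-absent distinction described in this commit can be reproduced in a few lines. This is a minimal sketch, not the project's code: `DEFAULT_CACHE_DIR` and `resolve_cache_dir` are hypothetical names, and the environment is passed in as a mapping so the three cases can be exercised without touching the real process environment.

```python
import os

# Hypothetical default; the real default path is not given in the commit message.
DEFAULT_CACHE_DIR = os.path.join("data", "fastembed_cache")


def resolve_cache_dir_buggy(env=os.environ):
    # The default is used only when the key is ABSENT, so "" wins.
    return env.get("FASTEMBED_CACHE_PATH", DEFAULT_CACHE_DIR)


def resolve_cache_dir(env=os.environ):
    # `or` treats an empty string the same as a missing variable.
    return env.get("FASTEMBED_CACHE_PATH") or DEFAULT_CACHE_DIR
```

With `{"FASTEMBED_CACHE_PATH": ""}` the first function returns an empty string, which is the value that later breaks `os.makedirs`; the second falls back to the default.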
Afonso Coutinho	7b09491557	fix: check-in calendar digest leaks every user's events (missing owner scope) (#1925 ) * fix: check-in calendar digest leaks every user's events (no owner scope) * Seed dtend on calendar events in digest test so the NOT NULL column is satisfied	2026-06-16 02:42:41 +01:00
Kenny Van de Maele	fafaf089c5	refactor(search): centralize the web-scraping User-Agent into one constant (#4325 ) The outbound UA for web_fetch / web_search was inlined in four places with two different values and nothing keeping them current: content.py pinned a mid-2021 Chrome 91 build, and providers.py sent a bare Mozilla/5.0 in three spots. Some sites serve a degraded or blocked page to a UA that old. Add WEB_FETCH_USER_AGENT to src/constants.py (env-overridable, matching the existing Copilot/Kimi UA-constant pattern) and import it in content.py and providers.py. Default to a current, common desktop UA so pages return their normal HTML: the market-leading desktop OS (Windows; NT 10.0 covers Windows 10 and 11) and browser (Chrome) on a current stable build. The version is now bumped in one place. Service-specific self-identifying agents (Copilot, Kimi, webhooks, cookbook) are intentionally left separate. Adds a regression pinning the constant shape, the env override, and a guard against a new inline Mozilla literal in the search sources. Closes #4324	2026-06-16 01:33:47 +00:00
holden093	dd2e23c9af	fix(agent): report phone numbers from resolve_contact when a matched contact has no email (#4327 ) When a CardDAV contact matched the search query but had no email address (only phone numbers), the tool silently dropped it and returned 'No contacts found'. Fall back to the contact's phone number(s) so the caller still receives usable information. Refs: #4178 (the contacts-domain classifier fix that made the model actually call resolve_contact for contacts queries, surfacing this pre-existing gap)	2026-06-16 00:03:33 +02:00
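The fallback described here might look roughly like the following. It is an illustrative sketch only: the contact dictionaries, the `describe_contact` helper, and the output wording are assumptions, and only the tool name `resolve_contact` and the 'No contacts found' message come from the commit text.

```python
def describe_contact(contact):
    """Return one display line for a contact, or None if nothing is usable."""
    name = contact.get("name", "(unnamed)")
    emails = contact.get("emails") or []
    phones = contact.get("phones") or []
    if emails:
        return f"{name}: {', '.join(emails)}"
    if phones:
        # Fallback: a matched contact without email is still reported.
        return f"{name}: {', '.join(phones)} (no email on file)"
    return None


def resolve_contact(query, contacts):
    """Case-insensitive name match over an in-memory contact list."""
    needle = query.lower()
    lines = []
    for contact in contacts:
        if needle in contact.get("name", "").lower():
            line = describe_contact(contact)
            if line:
                lines.append(line)
    return "\n".join(lines) if lines else "No contacts found"
```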
Kenny Van de Maele	074a1e6eff	fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955 ) * fix(search): add download budgets to web_fetch with truncation notice and hard ceiling MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content buffered and cached the entire response body first, so a large or hostile URL could pull arbitrarily many bytes into memory and the content cache. The fetch is now a capped streaming GET (SSRF redirect guard unchanged): a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB), a per-call override via full/max_bytes on the web_fetch tool, and a hard ceiling (WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed. When Content-Length already declares a body over the ceiling the fetch is refused before any body bytes are buffered. Truncated results carry truncated/fetched_bytes/total_bytes, the tool output leads with a partial-content notice telling the model how to re-fetch with full=true, and the tool schema documents the flag. A truncated PDF is reported as a budget error since a cut PDF is unparseable. The effective cap is part of the content-cache key so a truncated fetch is never served to a full-budget request. Existing tests that faked httpx.get or the old _get_public_url signature are adapted to the streaming interface; behavior pins are unchanged. Fixes #3812 * fix(search): close compressed-body cap bypass and protect the partial notice Addresses RaresKeY's review on #3955: - Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the wire bytes (and Content-Length) can be a fraction of the decoded body, so a tiny compressed response could pass the hard-cap preflight and then expand past the ceiling in a single decoded chunk before the streamed cap could slice it. Identity makes Content-Length the true body size and keeps each streamed chunk bounded by the network read, so the hard ceiling actually bounds memory. 
- Lead web_fetch output with the partial-content notice and cap the page title. The notice is the user-facing contract for partial fetches, but the title is untrusted, uncapped page content; placed ahead of the notice a giant title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads and the title is capped as a second guard. Adds regressions: the fetch advertises identity encoding, and a truncated result with an oversized title still surfaces the partial notice. * fix(search): reject compressed responses that ignore the identity request Requesting Accept-Encoding: identity is not enough on its own: a server can ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would decode that, so a tiny compressed body could balloon into one decoded chunk far past the hard cap before the streamed loop slices it (and Content-Length, the compressed wire length, makes the preflight and size metadata unreliable). Refuse a non-identity Content-Encoding before reading the body. Adds a regression where the server ignores the identity request and returns gzip; the fetch is refused before any body is decoded.	2026-06-15 17:38:09 +00:00
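The budget logic in this commit (preflight on Content-Length, refusal of non-identity Content-Encoding, then a capped streaming read) can be sketched independently of any HTTP client. The function below is an assumption-laden illustration: `read_capped`, `FetchRefused`, and the plain-dict headers with lowercase keys are hypothetical, and the chunk iterable stands in for whatever the streaming response yields. Only the 2 MB and 20 MB figures and the `truncated`/`fetched_bytes`/`total_bytes` field names come from the commit message.

```python
SOFT_MAX_BYTES = 2 * 1024 * 1024    # soft default budget (2 MB)
HARD_MAX_BYTES = 20 * 1024 * 1024   # hard ceiling (20 MB)


class FetchRefused(Exception):
    """Raised before any body bytes are read."""


def read_capped(headers, chunks, max_bytes=SOFT_MAX_BYTES):
    # A per-call override can raise the budget but never past the ceiling.
    cap = min(max_bytes, HARD_MAX_BYTES)

    # A compressed body can expand far past its wire size, so refuse it.
    encoding = headers.get("content-encoding", "identity").lower()
    if encoding != "identity":
        raise FetchRefused(f"unexpected Content-Encoding: {encoding}")

    declared = headers.get("content-length")
    total = int(declared) if declared and declared.isdigit() else None
    if total is not None and total > HARD_MAX_BYTES:
        raise FetchRefused("declared body exceeds the hard ceiling")

    buf = bytearray()
    truncated = False
    for chunk in chunks:
        room = cap - len(buf)
        if len(chunk) > room:
            buf += chunk[:room]   # keep only what still fits in the budget
            truncated = True
            break
        buf += chunk
    return {
        "body": bytes(buf),
        "truncated": truncated,
        "fetched_bytes": len(buf),
        "total_bytes": total,
    }
```

Both refusals happen before the chunk iterable is touched, which is the property the commit relies on to bound memory.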
Lucas Daniel	f4e8990635	chore: add warnings to silent except Exception blocks (#3212 ) * log(app): add warnings to silent except Exception blocks - Internal tool auth header failure now logs a warning instead of silently passing, making auth bypass easier to spot in logs. - Token last_used_at update failure now logs at DEBUG (fire-and-forget, non-critical, but useful when debugging token tracking issues). - Image ownership verification failure now logs a warning so unexpected access-check errors surface instead of silently allowing the request. * log(chat_routes): add warnings to silent except Exception blocks - clear_orphaned_session_endpoint: log before rollback so failures appear in traces when users see stale/deleted model options. - _endpoint_has_model (JSON parse): log malformed cached_models instead of silently treating endpoint as valid. - _has_any_visible_model (JSON parse): log malformed cached_models instead of silently returning empty list. - timezone header parse: log failure so time-zone-related tool bugs (wrong scheduled times, calendar events) are traceable. - attachments JSON parse: log failure so silently-dropped attachments are visible in server logs. * log(email_routes): add warnings to silent except Exception blocks - Email alias resolution failure now logs a warning instead of silently returning an empty list, making broken account configs diagnosable. * log(document_routes): add warnings to silent except Exception blocks - Export ZIP request body parse failure now logs a warning so empty exports caused by malformed requests are diagnosable. - clear_active_document failure on detach now logs a warning to help trace doc re-injection bugs like #1160. * log(agent_loop): add warnings to silent except Exception blocks - builtin tool overrides load failure now logs a warning so misconfigured settings don't silently fall back to defaults without a trace. 
- Timezone context injection failure now logs a warning to help debug incorrect scheduled times in agent-created tasks. - PDF form-backed document detection failure now logs a warning so broken form-doc UI is traceable to the root cause. * log(llm_core): add warnings to silent except Exception blocks - Malformed URL in _is_ollama_native_url now logs a warning so bad endpoint configs are traceable instead of silently returning False. - Model list fetch failure now logs a warning with the endpoint URL so endpoints that silently vanish from the model picker are diagnosable. * log: pass exception via exc_info instead of string interpolation * fix(logging): avoid logging raw URLs in llm_core error paths Drop the raw url/base_chat_url from the Ollama-detection and model-list-fetch warning logs added by this sweep, since these values can contain private hostnames, internal IPs, credentials, or other deployment details. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 17:49:27 +01:00
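As a small illustration of the pattern this sweep applies (replace a silent `except Exception` with a warning, and pass the exception through `exc_info` rather than interpolating it into the message), consider the sketch below. The logger name and the `parse_cached_models` helper are hypothetical; only the standard `logging` and `json` calls are real APIs.

```python
import json
import logging

logger = logging.getLogger("odysseus.sketch")  # hypothetical logger name


def parse_cached_models(raw):
    """Parse a cached model list, logging instead of failing silently."""
    try:
        return json.loads(raw)
    except Exception as exc:
        # exc_info attaches the traceback; the message itself stays free of
        # raw input such as URLs that could leak deployment details.
        logger.warning("Malformed cached_models; treating as empty", exc_info=exc)
        return []
```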
Kfir Sadeh	fc3a5e555e	feat(paths): abstract runtime path logic for frozen distribution packages (#969) * feat(core): abstract runtime path logic for frozen distribution packages * Address review feedback: revert browser MCP check, persistent data dir default when frozen, and add path tests	2026-06-15 17:44:10 +01:00
Ashvin	7fd937fa57	fix(calendar): parse "mins"/"hrs" reminder offsets in manage_calendar (#4266) _reminder_minutes matched the offset with (?:m|min|minute|minutes)\b and (?:h|hr|hour|hours)\b. The trailing \b makes the common plural abbreviations "mins"/"hrs" fail to match (after "min" the "s" is a word char, so no boundary), so reminder_minutes "5 mins" or "2 hrs" returned None and the event was created with no reminder, silently. Widen the two unit regexes and the matching reminder_only description regex to a strict superset that also accepts mins/hrs. The sibling duration parser already accepts these forms (it has no \b), so this only brings the reminder parser in line.	2026-06-15 17:37:28 +02:00
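The word-boundary failure is easy to reproduce. The sketch below assumes a leading digit group and case-insensitive matching, neither of which is stated in the commit message; the unit alternations are the ones quoted above, with `mins` and `hrs` added in the widened versions. The function name is reused for readability only and is not the project's implementation.

```python
import re

# Old unit pattern: "mins" fails because no alternative ends at a word boundary.
OLD_MINUTES = re.compile(r"(\d+)\s*(?:m|min|minute|minutes)\b", re.IGNORECASE)

# Widened patterns: strict supersets that also accept "mins" / "hrs".
MINUTES = re.compile(r"(\d+)\s*(?:m|min|mins|minute|minutes)\b", re.IGNORECASE)
HOURS = re.compile(r"(\d+)\s*(?:h|hr|hrs|hour|hours)\b", re.IGNORECASE)


def reminder_minutes(text):
    """Return the reminder offset in minutes, or None if nothing parses."""
    minutes = MINUTES.search(text)
    if minutes:
        return int(minutes.group(1))
    hours = HOURS.search(text)
    if hours:
        return int(hours.group(1)) * 60
    return None
```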
Catalin Iliescu	c41caac438	fix(cookbook): only persist successfully stopped scheduled serves (#4267) Co-authored-by: Cata <cata@bigjohn.local>	2026-06-15 17:30:18 +02:00
RaresKeY	f66a23d19d	fix(ai): validate generated image result URLs (#4289)	2026-06-15 16:40:49 +02:00
pewdiepie-archdaemon	1cc9a003fd	Fix failing post-merge tests	2026-06-15 22:49:06 +09:00
pewdiepie-archdaemon	6d507f8128	Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615 # Conflicts: # src/tool_implementations.py # static/js/research/panel.js	2026-06-15 21:20:15 +09:00
pewdiepie-archdaemon	2cbd55b8bd	Open email context for agent, email search across All Mail, cookbook serve polish - Agent: pass the open email reader (uid/folder/account/from/subject/body preview) on every chat submit so 'reply to this' / 'write email saying hi' route to ui_control open_email_reply with the right UID instead of inventing a new .md draft. Code-level enforcement (chat_routes strips create_document + send_email when active_email is set); cross-session active_doc_id is now trusted instead of being silently dropped. set_active_email/clear_active_email tool-layer helpers in tool_implementations. - ui_control open_email_reply: optional body argument so the agent can open-and-write in one call; envelope now forwards uid/folder/account/ body/panel through tool_output. Tool description sharpened and the parser rejects empty bodies on reply/reply-all (forces the agent to write rather than open an empty draft). - Email library: search now runs against [Gmail]/All Mail when the current folder is INBOX (archived emails surface). Whirlpool spinner + 'Searching…' placeholder while in flight. Each search result is stamped with its source folder so clicks open the right email instead of whatever shares its UID in INBOX. Search no longer re-applies the same text pill locally (which only checks subject/from/snippet, never body) so body-only matches don't get dropped after IMAP returns them. Initial inbox load bumped 100→500. - Email favorites: 'Favorite (pin to top)' / 'Unfavorite' in both the card menu and the open-reader more menu, backed by a new /api/email/flag/{uid}?on=true\|false endpoint. Flagged emails always bubble to the top of the grid regardless of active sort. - AI reply in doc editor: never overwrites existing draft text or the quoted history. AI suggestion is prepended; AI-generated 'On … wrote:' re-quotes are stripped so the original quote isn't visually edited. 
- Cookbook serve: pre-launch GPU driver / has_gpu / install / version- floor checks (vllm minimax_m2 needs 0.10.0+, deepseek_r1 needs 0.7.0 etc.) before the launch chain starts. Detect 'another model already running on this host' and offer Stop & launch (with graceful then force tmux kill helpers, port release wait). Per-vendor deep-link buttons (vLLM recipe / SGLang cookbook) with hardware hash. Backend picker is now a custom dropdown with accent-coloured logos for vLLM, SGLang, llama.cpp, Ollama, Diffusers; same glyphs added next to package names in Dependencies. Runtime-readiness note moved inside the panel (green when ready, red when missing) with an × dismiss. Esc collapses the expanded card; expanded card scrolls when it overflows; Trust Remote / Auto Tool / Reasoning Parser / Enforce Eager / Prefix Caching / Expert Parallel / Speculative / MoE Env on one row (Reasoning Parser auto-detected per model family). Dtype→Row 1, GPUs→Row 2 (rightmost). Removed redundant GPU 'auto' input — command builders read from the GPU button strip. Default cookbook open is Download tab. - Cookbook hwfit: 'Model (latest)' / 'Model (oldest)' header sorts by release_date; release dates can be backfilled with the new scripts/backfill_model_release_dates.py and recipe metadata pulled with scripts/import_from_vllm_recipes.py against the upstream vllm-project/recipes catalog (vllm_recipe + min_vllm_version stamped on entries). - Calendar: Quick add hint cycles a random Odysseus-themed example per open (wooden horse Friday, crew muster 10am daily, council on Ithaca, …). Typing a time like '11pm' in the event title updates the hero clock live. - Doc editor: email-mode Reply button (sparkle icon, accent) opens the same Fast/Full + context popover the email reader uses; Ctrl+Alt+M toggles markdown preview. - Memories panel: custom sort picker with per-option icons, default 'Latest', visible Enabled/Disabled toggle text matching the section description style.	2026-06-15 20:47:51 +09:00
andrewemer	cd02ac7ef6	fix(agent): skill-prescribed tools never reach the model's schema list (#4008 ) * Agent: make skill-prescribed tools actually callable The skill index and matched-skill procedures are injected into the prompt, but tool selection never followed: manage_skills wasn't in the RAG-selected schema list (so the model substituted manage_memory), and a matched skill could prescribe tools (grep, read_file) the model had no schema for. Now: - manage_skills rides along whenever the owner has any skills indexed - a Jaccard-matched skill's requires_toolsets join the selection - viewing a skill mid-turn via manage_skills unlocks its requires_toolsets for subsequent rounds - admin-intent turns send _ADMIN_TOOLS schemas, matching the prompt text _build_base_prompt already advertises - index_for(active_toolsets=None) no longer hides requires_toolsets skills from callers that don't know the active set Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Agent: validate skill requires_toolsets against known tools, not TOOL_SECTIONS grep/glob/ls ship as function schemas without a prompt-prose section, so gating on TOOL_SECTIONS silently dropped them from a skill's requires_toolsets. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-15 20:32:43 +09:00
cirim	e7abb7559d	fix(research): keep Discuss chats grounded on their report (#4006 ) * fix(research): preserve Discuss spin-off primer during context trimming trim_for_context() kept only system_msgs[:1] as essential and dropped the rest under budget pressure. A research "Discuss" spin-off seeds the report as a system message that sits after the preface system messages, so it landed in extra_system and was the first thing evicted once the chat grew — the conversation then lost its grounding and drifted off task. Treat any system message carrying research_spinoff_from metadata as essential, alongside the leading system prompt, so the seeded report survives trimming. maybe_compact already retains all system messages. Tests: tests/test_context_compactor.py::TestResearchPrimerPreserved * fix(research): ground Discuss spin-off chats on the seeded report build_chat_context injected global memory (pinned + hybrid-retrieved) and personal-doc RAG every turn, keyed off the user-level memory_enabled pref and a request-scoped use_rag flag — never the session. A research spin-off, whose primer declares the report the sole knowledge base, thus had unrelated keyword-matched facts pulled in ("wrong data") competing with the report; its rag=False flag was also ignored (use_rag defaulted on). Add _session_is_research_spinoff(sess) (detects the primer research_spinoff_from metadata; handles ChatMessage and dict forms) and, for such sessions, disable memory injection and force RAG off. Tests: tests/test_chat_helpers.py spin-off detection cases --------- Co-authored-by: Dan (cirim) <claude@cirim.org>	2026-06-15 20:31:57 +09:00
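The trimming rule in the first half of this commit (keep the leading system prompt plus any system message carrying `research_spinoff_from` metadata) could be sketched as below. Messages are modelled as plain dicts and `split_system_messages` is a hypothetical helper; the real `trim_for_context` also handles budgets and non-system messages, which are left out here.

```python
def split_system_messages(messages):
    """Partition system messages into (essential, evictable)."""
    system = [m for m in messages if m.get("role") == "system"]
    essential, extra = [], []
    for index, message in enumerate(system):
        metadata = message.get("metadata") or {}
        is_primer = bool(metadata.get("research_spinoff_from"))
        if index == 0 or is_primer:
            essential.append(message)   # survives trimming
        else:
            extra.append(message)       # first to go under budget pressure
    return essential, extra
```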
Josh Patra	f5d3e5098a	fix(llm): omit temperature for Kimi K2.5 and K2.6 (#3960)	2026-06-15 20:29:22 +09:00
Josh Patra	4ee5ed4dce	fix(memory): return complete memory lists (#3885)	2026-06-15 20:28:25 +09:00
Achilleas90	ffc0f1dccc	Harden CalDAV write-back with retries (#1193) Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-15 15:59:31 +09:00
KYDNO	955455b797	fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions (#3549 ) * fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(kimi): omit temperature for kimi-for-coding API calls Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-06-15 15:56:54 +09:00
Abhishek Kumbhar	a172522d87	fix(integrations): prevent blank API integrations (#3840) * fix(integrations): validate unified API form fields * fix(integrations): validate API integration fields server-side	2026-06-15 15:40:36 +09:00
Vishnu	d6a3c9a0fe	fix(utility): use utility model for background tasks (auto-title, memory audit) instead of chat model (#4027)	2026-06-15 15:33:19 +09:00
Dividesbyzer0	33c26bab88	fix(agent): parse raw json web search calls (#4088)	2026-06-15 15:19:38 +09:00
cyq	e52d078ea1	fix(agent): detect Polish web lookup intent (#4091)	2026-06-15 15:19:03 +09:00
nsgds	7ae6133d7f	fix(agent): don't let a materialized default budget defeat context-window scaling (#4122 ) * fix(agent): don't let a materialized default budget defeat context scaling #1230 scales agent_input_token_budget to the model's context window unless the user explicitly set a budget, detected via is_setting_overridden(). But the settings-save path materializes every DEFAULT_SETTINGS key into settings.json (load_settings merges defaults; handlers persist the merged dict), so the persisted default 6000 reads as "overridden" and the budget code takes the min(6000, ctx) branch — silently re-capping long-context models at 6000 for anyone who has ever saved a setting. This reintroduces the exact regression #1170/#1230 set out to fix. Add is_setting_customized() (saved value != default) and gate the scaling on it instead of mere presence. A persisted default is not a user choice. is_setting_overridden has exactly one consumer (this budget path), so the change is contained. Tests cover the materialized-default regression, a deliberately-chosen budget still being honoured, and the absent-key case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): rework context-budget fix per review (#4122) Address RaresKeY's review: P2 (explicitness): is_setting_customized treated a saved value equal to the default as "not explicit", which ALSO blocked a user from deliberately pinning the default budget. Reframe the default value itself as the AUTO sentinel — agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context window", any other value is an explicit cap. A materialized default still reads as auto (fixing the original regression), and any non-default value the user chooses is now honoured. Drop the now-unused is_setting_customized helper. 
P2 (fallback context): auto-scaling trusted get_context_length() even when it returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known window), over-allocating on self-hosted/proxy setups. Add get_context_length_known() (also returns whether the window was actually discovered); the budget block passes 0 when unknown so auto-scaling stays conservative instead of inflating to an unproven window. hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that contract and answered the reviewer's question rather than silently reversing it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(agent): lock the materialized-default budget regression (review on #4121) Per WGlynn's review on the issue: add an end-to-end regression that saves an UNRELATED setting (which makes the settings-save path materialize the budget default into settings.json) and asserts the budget still auto-scales rather than re-reading as an explicit 6000 cap — locking the exact reopening shut. To make the test bite the production decision (not just re-derive it), extract `budget_is_explicit()` into src/context_budget.py and use it from the agent loop. It keys off value-vs-default (the default is the auto sentinel), NOT settings presence — which is the whole point, since the save path materializes defaults. Note: after this PR's rework, is_setting_overridden has ZERO production callers, so the merged-dict materialization smell can't reach any setting through a presence check today (WGlynn's durability concern). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(agent): bind the budget context window to its own provenance (review #4122) RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop kept only the `known` flag from get_context_length_known() and budgeted off the passed-in `context_length`, which can come from a different lookup. 
Two failures: - local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT fallback while the fresh probe proves the real (smaller) served context — we'd scale off the stale value; - callers that don't pass context_length (scheduled tasks, teacher escalation, skill test runs, bg_monitor) were capped at 6000 even when a long window is discoverable. Extract budget_context_for_model() which returns the freshly-probed window when known else 0, binding the flag to the value it proves; the agent loop uses it. Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): fix stale budget comments + tighten to the contract (review #4122) - settings.py: an explicit budget is clamped to the window only — hard_max is auto-only (#1190); drop the incorrect "and to hard_max". - is_setting_overridden docstring: drop the stale "adaptive budgets" example; point value-sensitive callers at context_budget.budget_is_explicit. - Tighten the budget-block comments to the contract (default = auto sentinel, non-default = explicit cap, hard_max = auto-only ceiling). Comment/docstring-only; no behaviour change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(agent): correct budget issue citations (#1190 → merged #1230/#1273) The context-budget contract (auto-sentinel, explicit budgets honoured, hard_max auto-only) merged via #1230 — #1190 was the earlier, closed, superseded PR. Re-point the contract comments at #1230 (the live source, already cited for the auto-sentinel two lines up in settings.py). The configurable hard_max setting (`agent_input_token_hard_max`) was a reviewer requirement first raised on #1190, omitted from the merged #1230, and actually added in #1273 — credit #1273 for it and correct the test comment's history (it previously implied this PR completed the requirement). Comment/docstring-only; no behaviour change. 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 15:17:28 +09:00
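The contract this PR converges on (default value = auto sentinel, any other value = explicit cap clamped to the window only, hard_max applied to auto-scaling only, unknown window passed as 0) can be summarised in a short sketch. The name `budget_is_explicit` and the 6000 default come from the commit text; `effective_budget`, the 100000 `hard_max` default, and the exact auto-scaling rule (here simply "use the known window, capped by hard_max") are assumptions made for illustration.

```python
DEFAULT_BUDGET = 6000  # doubles as the "auto" sentinel


def budget_is_explicit(saved_budget):
    # Keyed off value-vs-default, not presence in settings.json, because the
    # save path materializes defaults into the file.
    return saved_budget != DEFAULT_BUDGET


def effective_budget(saved_budget, known_context, hard_max=100_000):
    """known_context is 0 when the window was not actually discovered."""
    if budget_is_explicit(saved_budget):
        # Explicit cap: clamped to the window only, never to hard_max.
        return min(saved_budget, known_context) if known_context else saved_budget
    if not known_context:
        return DEFAULT_BUDGET  # stay conservative on an unproven window
    return min(known_context, hard_max)  # placeholder scaling rule
```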
Dividesbyzer0	589fcd314a	fix(image): patch realesrgan torchvision compatibility (#4110)	2026-06-15 15:16:41 +09:00
Max Hsu	039431f5ea	fix(mcp): detect npx cache entries before probing (#4034)	2026-06-15 15:14:48 +09:00
Dividesbyzer0	7f571c8f7e	fix(agent): keep gpt-oss on text tool mode Treat gpt-oss local OpenAI-compatible models as text/fenced-tool models unless the endpoint explicitly declares native tool support.	2026-06-15 15:11:52 +09:00
cirim	056d1fb960	fix(llm): make connect timeout configurable Use a configurable LLM_CONNECT_TIMEOUT for call and stream connect budgets instead of the previous hard-coded 3s default.	2026-06-15 15:11:38 +09:00
Muhammed Midlaj	4b0a977988	fix(models): probe /v1/models for path-less LM Studio endpoints Probe /v1/models for path-less OpenAI-compatible model endpoints and surface clearer LM Studio diagnostics with the actual probed URL.	2026-06-15 15:09:50 +09:00
Boudbois2271	54690997ec	fix(calendar): treat same-day list_events range as full day Expand zero-width or inverted list_events windows to one day so start=end single-day queries return that day's events.	2026-06-15 15:09:19 +09:00
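A minimal sketch of the window expansion, assuming the range is represented as two `datetime` values; `normalize_window` is a hypothetical name, not the project's function.

```python
from datetime import datetime, timedelta


def normalize_window(start, end):
    """Expand a zero-width or inverted range to one full day from start."""
    if end <= start:
        end = start + timedelta(days=1)
    return start, end
```

With this rule a query such as start=end=2026-06-15 covers that whole day instead of matching nothing.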
Wes Huber	be046dd29a	fix(cookbook): preserve state during lifecycle tick Log malformed cookbook state and re-read fresh state before writing scheduled-stop mutations so concurrent UI changes are preserved.	2026-06-15 15:07:03 +09:00
holden093	4c41834dc7	fix(youtube): consolidate duplicate handler Make src.youtube_handler a compatibility wrapper around services.youtube.youtube_handler so transcript state, URL parsing, and timeout behavior no longer diverge.	2026-06-15 15:03:41 +09:00
holden093	96052c5e8a	fix(agent): add contacts domain to tool classifier Add a contacts domain rule pack and deterministic contact intent detection so contact prompts surface resolve_contact/manage_contact tools.	2026-06-15 15:03:19 +09:00
adabarbulescu	afc81bdd7b	fix: drop thinking deltas from background agent loops Skip thinking-only deltas when accumulating background, scheduled-task, and teacher captured reply text.	2026-06-15 15:03:09 +09:00
Dividesbyzer0	a07fe35936	fix(agent): honor explicit web search requests Promote explicit web-search phrasing to tool use and keep web_search/web_fetch available for that turn even when the stale web toggle is false.	2026-06-15 15:02:10 +09:00
RaresKeY	a7766d0b7f	fix(agent): honor auth-disabled tool access after setup Check explicit auth-disabled mode before configured-admin ownership checks so single-user mode keeps full agent tool access after setup.	2026-06-15 15:01:48 +09:00
Tom	2857723e47	fix(security): restrict API-key encryption key file to 0o600 Lock the API key encryption key file to owner-only permissions on creation and when reading existing keys, with regression coverage for permissions and encryption roundtrip.	2026-06-15 15:00:11 +09:00
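One way to get owner-only permissions both on creation and on read is sketched below for POSIX systems. The helper names are hypothetical and the project's actual key handling is not shown in the commit message; the `os` and `stat` calls are standard library.

```python
import os
import stat


def write_key_file(path, key):
    # Create with 0o600 so the file is never world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    # The mode passed to os.open is ignored for a pre-existing file,
    # so enforce it explicitly as well.
    os.chmod(path, 0o600)


def read_key_file(path):
    # Tighten permissions on keys written by older versions.
    if stat.S_IMODE(os.stat(path).st_mode) != 0o600:
        os.chmod(path, 0o600)
    with open(path, "rb") as handle:
        return handle.read()
```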
Michael	a633611823	fix(agent): let retrieval run for non-English low-signal queries Allow non-workspace low-signal prompts to fall through to tool retrieval so non-English requests are not limited to always-available tools.	2026-06-15 14:58:56 +09:00
muhamed hamed	3b3c0d6254	fix: detect HuggingFace token when downloading cookbook models (#3459) Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 21:53:16 +01:00
Mazen Tamer Salah	f5c1eb4b9d	fix(settings): degrade load_features to defaults on PermissionError load_settings() already catches PermissionError, but load_features() caught only FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable data/features.json (e.g. root-owned after a deploy) therefore raised instead of falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything that reads feature flags. Add PermissionError to the except tuple to match load_settings(). Adds tests/test_load_features_permission_error.py. Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 21:20:10 +01:00
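The change amounts to one extra entry in the except tuple. The sketch below is illustrative: the contents of `DEFAULT_FEATURES`, the merge with defaults, and the injectable `opener` parameter (added only so the unreadable-file case can be simulated) are assumptions.

```python
import json

DEFAULT_FEATURES = {"example_flag": False}  # hypothetical flag set


def load_features(path, opener=open):
    try:
        with opener(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
        if not isinstance(data, dict):
            raise ValueError("features file must contain a JSON object")
        return {**DEFAULT_FEATURES, **data}
    except (FileNotFoundError, json.JSONDecodeError, ValueError, PermissionError):
        # An existing-but-unreadable file now degrades like a missing one.
        return dict(DEFAULT_FEATURES)
```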
Marius Popa	2a4bba2b9e	fix(api-keys): preserve encrypted keys when saving providers (#1920) * fix(api-keys): preserve encrypted keys when saving providers * test(api-keys): cover malformed raw key entries --------- Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 18:23:54 +01:00
Kenny Van de Maele	620fdd0859	feat(agent): confine agent file/shell tools to a selectable workspace (#3665 ) * feat(agent): workspace confinement via context-local binding + get_workspace tool Bind the per-turn workspace once in execute_tool_block; the shared path resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so file tools + bash/python are confined centrally and a new tool that uses the shared helpers cannot accidentally bypass it. Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory modal (reusing existing modal/button CSS), the /workspace slash command, and a get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic (realpath/normcase/commonpath) and docker-safe (container paths, no host assumptions). Reopens #2023. * ux(workspace): clarify workspace is not a sandbox Picker modal note + pill tooltip + get_workspace tool/output wording now state plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder, but bash/python only start there (cwd) and are not sandboxed. Modal note reuses the existing .muted class. * fix(agent): treat an active workspace as file-work intent A vague low-signal message (e.g. "look at the local project") matches no domain keywords, so tool retrieval is skipped and only always-available tools are offered — leaving the agent with no file access even though a workspace is set. When a workspace is active, include the file/code tools (incl. get_workspace) on low-signal turns so the agent can act on the folder. Also requires the tool index (ChromaDB) to be reachable for normal retrieval; that is an environment dependency, not part of this change. * ux(workspace): hide pill + overflow entry in chat mode Workspace only scopes the agent's file/shell tools, so the pill and the overflow 'Workspace' entry are agent-only now — hidden in chat mode like the bash toggle. 
Mode read from the DOM in syncWorkspaceIndicator; applyMode() is called from the agent/chat setMode handler. * prompt(tools): steer bash/python to defer to the dedicated file tools bash/python schema descriptions (what native-tool-calling models read) were bare and gave no steer, so models would do file ops via the shell (e.g. writing SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python in the schema + tool-index + prompt section to prefer read_file/write_file/ edit_file/grep/glob/ls and only be used for what those do not cover. * prompt(tools): keep bash/python deferral generic (no hardcoded tool names) Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc. by name, so the guidance does not go stale if those tools are renamed. * style(workspace): drop em-dashes from added code comments/strings * ux(workspace): terser non-sandbox note in picker (no tool-name list) * ux(workspace): mirror terse non-sandbox wording in pill tooltip * chore: untrack local venv symlink (run-only, not part of the feature) * prompt(workspace): keep get_workspace text generic (no hardcoded tool names) * fix(agent): low-signal + workspace surfaces only read-only file tools Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but not write_file/edit_file/bash/python -- those wait for a request that actually calls for them (RAG retrieval still adds them on a real ask). * feat(workspace): cap browse listing at 500 dirs with a truncated hint Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local _MAX_BROWSE_DIRS so a directory with thousands of children does not dump every row into the picker; the response carries a truncated flag and the modal tells the user to type a path to jump in. 
* chore: untrack local venv symlink (run-only artifact) * fix(workspace): vet the workspace root against the sensitive-path deny list at bind time The in-workspace resolver deny-lists sensitive paths inside the workspace, but the empty-path search root is the workspace itself, so a workspace of ~/.ssh could be listed via ls with no path. vet_workspace() (public, in tool_execution next to the resolvers) rejects non-directories and sensitive roots before the path is ever bound; chat_routes uses it instead of its inline isdir check. * fix(workspace): reject filesystem roots and stop showing rejected workspaces as active Review findings from #3665: P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes every absolute path 'inside' the workspace and collapses confinement into host-wide file access. A root is its own dirname, so reject when dirname(resolved) == resolved; the browse response now carries a selectable flag and the picker disables 'Use this folder' on unselectable dirs. P3: /workspace set stored any string client-side and the chat route silently dropped rejected values, so the pill could claim a confinement that was not in effect. New admin-gated /api/workspace/vet validates manual paths before they persist (canonical path returned), and when a posted workspace is rejected at send time the stream emits workspace_rejected so the client clears the stored value and toasts instead of continuing silently. * fix(workspace): check caller privilege before vetting the posted workspace Review finding: /api/chat_stream called vet_workspace() on the posted value for every caller and emitted workspace_rejected on failure, so a non-admin who can chat but cannot use file/shell tools could distinguish existing directories from missing/file/sensitive/root paths by whether the event appeared. 
The resolution now lives in _resolve_request_workspace, which drops the submitted value uniformly for non-admin callers, with no vetting and no event, before the path ever touches the filesystem. Admin and single-user behavior is unchanged. Test pins that valid and invalid paths are indistinguishable for a non-admin and that vet_workspace is never invoked for them.	2026-06-11 18:17:54 +02:00
Michael	95c54ac3cb	fix: use _truncate for tool output display limits in agent_loop (#3831 ) Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit truncation indicator when content is cut. Scoped down from the original PR: only agent/tool-output display behavior, no integrations.py changes. Co-authored-by: michaelxer <michaelxer@users.noreply.github.com> Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-11 17:05:13 +01:00
Kenny Van de Maele	263d41c58a	fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers (#3945 ) * fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers _apply_local_cache_affinity adds session_id + cache_prompt for llama.cpp KV-cache slot affinity (#2927), gated on _is_self_hosted_openai_compatible, which treated any unknown OpenAI-compatible host as self-hosted. Strict cloud providers added as custom endpoints (Mistral at api.mistral.ai) reject unknown body fields, so every request failed with 422 extra_forbidden. Self-hosted now also requires the endpoint to resolve as local via model_context.is_local_endpoint: loopback/private/tailscale host, or endpoint kind explicitly configured as "local" (the escape hatch for tunneled self-hosted servers). is_local_endpoint is promoted to a public name since llm_core now shares it. Fixes #3793 * test(llm): sweep cloud OpenAI-compatible hosts in affinity gating Parametrized cases adapted from #3839 (credit: Shabablinchikow): deepseek, x.ai, together, fireworks, and the Gemini OpenAI-compat endpoint must all stay free of the llama.cpp extras, not just the Mistral host from #3793. * fix(llm): narrow the Tailscale range to 100.64.0.0/10 in is_local_endpoint Review finding on #3945: _PRIVATE_PREFIXES carried a bare "100." prefix, treating all of 100.0.0.0/8 as local while Tailscale only uses the CGNAT block 100.64.0.0/10. Public 100.x hosts (e.g. AWS ranges outside the block) were classified local and still received the llama.cpp extras this PR exists to keep away from strict providers. Match the narrowed classification routes/model_routes.py already uses, with boundary tests just below, inside, and just above the range.	2026-06-11 17:51:03 +02:00
Mazen Tamer Salah	f941db29d3	fix(search): batch FTS hit lookups into one query (N+1) (#3909 ) _search_fts ran the FTS MATCH query, then looked up each hit's full row with its own db.query(...).filter(id == message_id).first() inside a loop, so a search returning N hits issued N extra SELECTs. Fetch all hit rows in a single IN(...) query via _fetch_messages_by_id and reassemble results in hit (relevance) order. Adds tests/test_session_search_batch_fetch.py asserting a single batched query (and no query for empty input). Existing session-search tests stay green.	2026-06-11 16:31:54 +02:00
RaresKeY	c500bcb47d	fix(uploads): migrate upload ownership on rename (#3617 )	2026-06-11 16:01:04 +02:00
Mazen Tamer Salah	f7a3605b16	fix(webhooks): keep references to in-flight delivery tasks (#3859 ) fire() and fire_and_forget() scheduled delivery with bare create_task()/ loop.create_task() and kept no reference. asyncio holds only a weak reference to a task, so the GC could collect a delivery (or the fire() coroutine itself) before it completed, silently dropping the webhook. Track in-flight tasks in a set on the manager via a _spawn_tracked() helper that holds a strong reference for the task's lifetime and discards it on completion (add_done_callback), and route both schedule sites through it. Adds tests/test_webhook_task_refs.py.	2026-06-11 15:53:52 +02:00
George Lawton	4f48cfa9ae	fix: omit temperature for Opus 4.7+ on native Anthropic path (#3117 ) Anthropic removed the sampling parameters (temperature, top_p, top_k) starting with Claude Opus 4.7 — sending temperature at all, even 0.0, returns HTTP 400. _build_anthropic_payload sent it unconditionally, so every native-Anthropic request to Opus 4.7/4.8 failed: the research probe (ResearchHandler._probe_endpoint, temperature=0) aborted runs before they started, and all DeepResearcher._llm calls 400'd. Add _anthropic_rejects_temperature (version-gates opus-N-M >= (4,7)) and omit temperature in the Anthropic builder for those models. Older Claude models (Opus 4.6 and below, Sonnet/Haiku) keep temperature and the existing [0,1] clamp. The version gate is hardened against real-world model id shapes: - a word-boundary anchor so a substring like `octopus-4-8` is not read as Opus and stripped of temperature; - a 1-2 digit minor cap so a dated id such as `claude-opus-4-20250514` (Opus 4.0, listed in ANTHROPIC_MODELS) parses as major-only and keeps temperature, while dated 4.7+ snapshots still match; - a non-string guard so a non-string model can't raise AttributeError (the previous builder never called .lower() on it). Adds regression tests covering 4.7/4.8 omission, older/dated/legacy retention, the substring overmatch, and non-string inputs. Fixes #3065 Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 16:27:40 +03:00
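Note on 620fdd0859: the confinement that commit describes (realpath/normcase/commonpath, plus rejecting a root because "a root is its own dirname") can be sketched as below. This is a minimal illustration under stated assumptions, not the repository's vet_workspace or _resolve_tool_path. The helper names is_filesystem_root and is_inside are hypothetical, and the sensitive-path deny list mentioned in the commit is not modeled.

```python
import os


def is_filesystem_root(path: str) -> bool:
    """Hypothetical helper: a filesystem root is its own dirname after resolution."""
    resolved = os.path.realpath(path)
    return os.path.dirname(resolved) == resolved


def is_inside(workspace: str, candidate: str) -> bool:
    """Hypothetical helper: True if candidate resolves to a path within workspace."""
    root = os.path.normcase(os.path.realpath(workspace))
    # Joining first means a relative candidate is interpreted against the workspace;
    # an absolute candidate replaces the workspace prefix entirely.
    target = os.path.normcase(os.path.realpath(os.path.join(workspace, candidate)))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised, for example, when the two paths sit on different Windows drives.
        return False
```

If the workspace were a filesystem root, every absolute path would pass is_inside, which is why the commit rejects roots before the workspace is bound.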


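Note on 263d41c58a: the narrowing from a bare "100." prefix to 100.64.0.0/10 can be checked with the standard-library ipaddress module. This is a sketch only. The function name in_tailscale_range is hypothetical, and the commit message does not show how the fixed is_local_endpoint actually performs the match.

```python
import ipaddress

# The CGNAT block that the commit message says Tailscale uses.
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def in_tailscale_range(host: str) -> bool:
    """Hypothetical helper: True only for IP literals inside 100.64.0.0/10."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (for example a hostname); this sketch does no DNS lookup.
        return False
    return addr in TAILSCALE_CGNAT
```

The block spans 100.64.0.0 through 100.127.255.255, so the boundary cases the commit mentions (just below, inside, just above) are 100.63.255.255, any address in between, and 100.128.0.0.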
430 Commits
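Note on f7a3605b16: the pattern that commit describes (hold a strong reference to each in-flight task in a set and discard it on completion via add_done_callback) looks roughly like this. The TaskTracker class, spawn_tracked method, and demo coroutine are hypothetical stand-ins for illustration, not the webhook manager's actual _spawn_tracked() code.

```python
import asyncio


class TaskTracker:
    """Hypothetical stand-in for the manager's in-flight task bookkeeping."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def spawn_tracked(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        # The set holds a strong reference, so the task cannot be
        # garbage-collected while it is still running.
        self._tasks.add(task)
        # Drop the reference once the task finishes, whatever the outcome.
        task.add_done_callback(self._tasks.discard)
        return task


async def demo() -> tuple[int, int]:
    tracker = TaskTracker()
    delivered: list[int] = []

    async def deliver(n: int) -> None:
        await asyncio.sleep(0)  # stand-in for the outbound HTTP call
        delivered.append(n)

    for n in range(3):
        tracker.spawn_tracked(deliver(n))
    in_flight = len(tracker._tasks)
    await asyncio.gather(*tracker._tasks)
    await asyncio.sleep(0)  # give any remaining done callbacks a turn
    return in_flight, len(tracker._tasks)
```

Running asyncio.run(demo()) reports how many tasks were tracked while in flight and how many remain after completion.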