On a large Gmail mailbox the email-summary poller's SINCE scan often finds
nothing (INTERNALDATE/date-header quirks), so it falls back to SEARCH ALL. That
returns one enormous UID line; the socket read can time out mid-response, and the
exception was swallowed — leaving the unread '* SEARCH 325188 …' bytes on the
socket. The next command (the downstream re-select) then read those leftover
bytes and failed with 'EXAMINE => unexpected response: b'325188 …''.
Extract the fallback into _latest_inbox_fallback_uids(conn, reconnect): on a
failed SEARCH ALL it logs out the poisoned connection and reconnects, returning
the fresh connection for downstream use. Reconnecting is correct by construction
— a new connection cannot carry the old one's leftover bytes — so the re-select
always runs on a clean socket.
The same SEARCH ALL + reuse pattern also exists in mcp_servers/email_server.py
and routes/email_routes.py; left for a separate change to keep this surgical.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.
Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.
- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME
Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
POST /api/calendar/test issues a single PROPFIND with raw httpx
Basic auth. CalDAV servers configured for Digest (Baïkal default,
SabreDAV-based servers, Radicale with htdigest) reject Basic with
401, so the UI "Test connection" button surfaces "Auth failed —
check username/password" even when the URL and credentials are
correct.
src/caldav_sync.py (the real sync path) uses caldav.DAVClient,
which negotiates the scheme via niquests, so production sync
already works against these servers. The test endpoint just
doesn't match. Bring it to parity: keep the cheap Basic first
attempt, and on a 401-with-Digest-challenge retry once with
httpx.DigestAuth before deciding it's an auth failure.
Repro: configure CalDAV against a stock Baïkal install — test
button returns 401, sync succeeds.
Co-authored-by: Shatti2 <codered5678@gmail.com>
On Windows, Python defaults to the active code page (cp1252) for
subprocess I/O. HuggingFace CLI outputs U+2713 (✓) when validating
tokens, which cp1252 cannot encode, crashing the download process.
Set PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 in the subprocess
environment so Unicode output from hf/pip/llama-server is handled
correctly.
Fixes#1543
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 'from urllib.parse import quote as _q' at line 734 shadows the
module-level _q (istrstrstrstrstrstrIMAPutility) imported from email_helpers, causing
UnboundLocalError at lines 191 and 278 where _q is used before the
local import executes. This silently breaks the entire auto-summarize
pass.
The AI document-tidy endpoint parses verdicts from LLM JSON output
and calls .lower().strip() directly. If the model returns null or a
non-string element, this crashes with AttributeError. Coerce to str
so malformed output is treated as 'keep' instead of crashing.
Every other uid.decode() call in this function uses
'uid.decode() if isinstance(uid, bytes) else str(uid)' but the
warning at line 832 does bare uid.decode(), crashing with
AttributeError when uid is already a string.
The shell pattern 'if [ $? -eq 0 ]; ... else ... echo DOWNLOAD_FAILED (exit $?)' always reports 'exit 1' because $? inside the else branch is the exit code of the [ test command, not the download. Capture into _ec first.
`_auto_summarize_pass_single` in `routes/email_pollers.py` opens a
long-lived IMAP connection at line 172 and then performs ~700 lines of
work — IMAP `select`/`FETCH`/`SEARCH`, network POSTs to the LLM
endpoint, SQLite writes, and per-uid awaits. The only `conn.logout()`
calls were on three safe paths (early `"No recent emails"`, early
`"No model configured"`, and the happy path at the very end). If any
exception fired between `conn` being created and the final happy path,
the outer `except` block at line 921 caught it, logged, and returned —
without ever calling `conn.logout()`. The IMAP socket leaked until
the server's idle timeout killed it.
This is the same shape as the just-merged upstream fixes#1325
(`_imap_move` in `routes/email_helpers.py`) and #1330 (`_list_emails_sync`
in `routes/email_routes.py`), but in the *background* poller path —
`_auto_summarize_poller` invokes it every 30 min, so the leak
accumulates on every crashed pass instead of being a transient
request-path leak.
The fix is the exact try/finally pattern from #1330:
1. initialize `conn = None` before the try
2. let the try-block assign `conn = _imap_connect(...)`
3. drop the three explicit `conn.logout()` calls on safe paths
4. add a `finally:` block that calls `conn.logout()` if `conn` was set
Tests in `tests/test_email_polly_imap_leak.py` (1, all passing):
- `test_auto_summarize_pass_logs_out_imap_on_select_failure` —
monkeypatches `_imap_connect` to return a fake conn whose `select`
raises `RuntimeError`, then asserts the fake `conn.logout` was
called exactly once and the function returned an `Error: ...`
string. Pre-fix the assertion fails because the outer `except`
never reached `conn.logout`; post-fix the `finally` block
guarantees it on every exit path.
Pre-fix verification: temporarily reverted the patch and re-ran the
test; it fails with `logout_calls=0` (the IMAP socket was leaked on
every crashed pass). Post-fix: `logout_calls=1`.
Uniqueness:
- `git log --all --oneline -S 'conn.logout' -- routes/email_pollers.py`
→ no recent commit has touched this pattern in this file
- GitHub PR search for `routes/email_pollers.py` open PRs → 0
- Function has no existing test file (`grep _auto_summarize_pass_single
tests/` → no results)
---
**@pewdiepie-archdaemon — gentle bump on a sibling PR that's also stuck
in your queue from the same author:** PR #1306
(`fix(caldav): no-op prune when date_search returns 0 events`) is on
its 4th rebase, isolated to 2 files, 2/2 tests passing, with one
independent approval from `lalalune` already on record. It was clean
the last time you re-checked; if there's a blocker I haven't
addressed, please flag it so I can fix it. Otherwise, both #1306 and
this one are ready to merge.
Co-authored-by: isharak7m <192635824+isharak7m@users.noreply.github.com>
* Stop multi-file uploads from tripping the per-IP concurrency guard
The /api/upload concurrency check summed its condition over `files`, but the
condition didn't reference the loop variable — so it collapsed to len(files)
whenever the IP had any recent upload. A single multi-file batch sent right
after another upload therefore counted itself as N concurrent uploads and hit
max_concurrent_uploads (3), returning 429. The browser swallows the 429 (no
`files` in the body) and sends the chat with no attachments, so the model
"doesn't even see" them (issue #1346).
Count genuine recent upload events instead, via a pure count_recent_uploads()
helper, independent of the current batch's file count. save_upload still
enforces the per-minute sliding-window rate limit per file, so throttling is
preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Also reconcile the per-minute upload rate limit with the batch cap
Follow-up within #1346: even after the concurrency-guard fix, a 6+ file batch
still failed because save_upload() counts each file against upload_rate_limit
(was 5/min) while the composer allows MAX_FILES=10 per batch — the reporter saw
"5 attachments work, 6 fail". Raise the per-minute file cap to 60 so a single
full batch (and a few of them) isn't self-rejected; burst abuse stays bounded by
max_concurrent_uploads. Add a real 6-file regression + a config guard that the
cap exceeds the frontend MAX_FILES.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
POST /api/skills/{id}/markdown set sk.name = slugify(sk.name or match['name']),
taking the name parsed from the edited markdown frontmatter. A changed name
makes update_skill() move the skill directory on disk and re-key its usage
sidecar, orphaning the original id. The UI still holds that original id, so the
next DELETE /api/skills/{id} fails the name/id lookup and 404s — 'can't delete
them now'.
The audit save path (_apply_skill_md) already guards against exactly this with
sk.name = name and an explicit 'must NEVER rename the skill' comment. Apply the
same pin here: keep the stored name on markdown save (content edits still take
effect; only the rename is suppressed). Drops the now-unused slugify import.
Adds tests/test_skill_save_no_rename.py: saving markdown whose frontmatter
renames the skill keeps the original name and applies the edit, and a
subsequent delete-by-original-id succeeds. Pure unit test — calls the route
handlers directly with a mock Request (no server/network), like
test_skills_delete_owner.py.
Co-authored-by: lalalune <shawgotbags@gmail.com>
If any exception occurred after conn was created but before the
explicit conn.logout() call, the IMAP connection leaked. Use
try/finally to guarantee cleanup on all exit paths.
If c.store() or c.expunge() raised an exception, the connection was
never logged out. Use try/finally to ensure c.logout() is always
called regardless of how the function exits.
* fix: pass owner to start_research in chat stream path
Research launched from the chat stream omits the owner parameter,
causing those research sessions to never appear in the user's
research library (which filters by owner). All other start_research
call sites in this file already pass owner=_user.
* test: assert all start_research calls in chat_routes pass owner
Uses AST inspection to verify every start_research() call site
includes the owner= keyword argument, preventing regressions where
new call sites forget to scope research by user.
* fix: Cookbook local GGUF serving inside Docker
Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint.
Fixes included here:
use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub
fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts
preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal
Before this change, Docker local serves could fail in several ways:
Cookbook pointed llama.cpp at the wrong GGUF path
generated serve runner scripts crashed before launch with a shell syntax error
successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1
This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration
* test: add test for docker-local endpoint rewrites
* fix: closed document no longer stays active and leaks into new chats (#1160)
Closing a document tab calls _detachDocFromSession: a doc with content is
PATCHed to session_id="" (unlinked, session_id -> NULL, is_active stays True),
an empty one is DELETEd. But the in-memory active-document pointer
(tool_implementations._active_document_id) was never cleared on either path.
The chat doc-injection last-resort looks up that pointer by id and injects it
when `not cand.session_id or cand.session_id == session`. An unlinked doc has
session_id NULL, so the stale pointer re-surfaced a closed document in later,
unrelated chats — the agent kept reading/suggesting edits to a doc the user
had closed.
Fix: add clear_active_document(doc_id) and call it when a document is unlinked
(PATCH session_id="") or deleted, so the pointer no longer resurrects a closed
document. clear_active_document only clears when the id matches (or no id), so a
different active doc is left untouched.
Covered by tests/test_active_document_clear.py (4 cases).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: add route-level regression for #1160 (detach/delete clears active doc)
Per review: prove the actual API path, not just the helper. Drives
PATCH /api/document/{id} (session_id="") and DELETE /api/document/{id}
through TestClient against a temp SQLite DB under real owner routing, and
asserts get_active_document() is cleared (and untouched when a different
document is closed).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: make #1160 route regression hang-proof and dev-DB-independent
The route test could hang in other environments: it set DATABASE_URL at import
time, which is ignored if core.database was already imported, so it fell back to
the real dev DB and could contend for its locks (maintainer saw it hang, exit
124).
Rebind to a DEDICATED temporary SQLite engine (NullPool) and patch the document
route module's SessionLocal to it via an autouse fixture — so the test never
touches the dev DB and is independent of import order. Runs in ~0.3s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: drive #1160 route regression without TestClient (fixes local hang)
The route test used Starlette TestClient (middleware app + threadpool), which
hung in the maintainer's environment. Rework it to call the async route handlers
directly — extracted from the router — with a minimal fake request against a
temp-SQLite-patched SessionLocal. Same real coverage (handler + DB + owner
routing), but it completes reliably (~0.3s) with no TestClient/threadpool.
Verified the maintainer's exact batch now passes:
pytest tests/test_document_close_clears_active_route.py \
tests/test_active_document_clear.py \
tests/test_document_tool_owner_scope.py -> 14 passed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat: CalDAV write-back — push local event create/update/delete to the remote (#800)
CalDAV sync was pull-only (src/caldav_sync.py), so events created, edited, or
deleted in Odysseus on a CalDAV-backed calendar only changed local SQLite and
never reached the server — they silently vanished on the next pull and never
appeared on the user's phone (iCloud, etc.).
This adds the missing write half:
- src/caldav_writeback.py builds the VEVENT, re-discovers the remote calendar by
the same URL-hash the local id was derived from (the remote URL isn't stored),
and PUTs/DELETEs the event by UID via the caldav lib. The pure pieces
(build_event_ical, find_remote_calendar, push_event) take inputs by argument so
they unit-test against a fake client with no network.
- create/update/delete event handlers (routes/calendar_routes.py) call it
best-effort for caldav-sourced calendars only: the local DB stays the source of
truth, a remote failure is logged, never fatal, and local calendars are untouched.
Tests: tests/test_caldav_writeback.py (9, pure logic incl. iCal serialization,
hash discovery, create/update/delete orchestration) and
tests/test_caldav_writeback_route.py (3, route-level: a caldav calendar pushes,
a local one does not, delete pushes a delete). 12 passed.
Note: write-back re-discovers the remote calendar per write (the URL isn't
persisted locally); a follow-up could cache it. Live-iCloud verification needs a
real account — flagging for a maintainer pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: drive #800 route regression without TestClient (fixes local hang)
Same fix as the document route test: the CalDAV write-back route regression used
Starlette TestClient (middleware app + threadpool) which hung in the maintainer's
environment. Rework it to call the async create/delete calendar handlers directly
— extracted from the router — with a minimal fake request, temp-SQLite-patched
SessionLocal, and writeback_event stubbed to record calls. Same coverage (a
caldav calendar pushes, a local one does not, delete pushes a delete), completes
in ~0.3s with no TestClient.
Verified the maintainer's exact batch:
pytest tests/test_caldav_writeback.py tests/test_caldav_writeback_route.py -> 12 passed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dispatch_reminder call on line 699 references _gcu(request) which is
never defined. The local helper wrapping get_current_user is _owner.
Every POST to /api/notes/fire-reminder raises NameError and returns 500.
POST /api/embeddings/endpoint takes a user-supplied URL and immediately
makes an outbound httpx request to it with no validation. The admin gate
added earlier (PR #80) closed the unauthenticated-access part of #132; this
addresses the remaining request: validate the URL before fetching it.
Odysseus is local-first, so pointing the embedding endpoint at a loopback or
LAN server (local vLLM / llama.cpp / Ollama) is a normal setup — a blanket
private-IP block would break the primary use case. So the guard:
- always rejects non-HTTP(S) schemes (file://, gopher://, ftp:// …),
- always rejects the link-local range (169.254.0.0/16, incl. the cloud
instance-metadata 169.254.169.254 exfil vector) plus multicast /
reserved / unspecified, and IPv4-mapped-IPv6 forms of the above,
- keeps loopback/LAN allowed by default, and
- adds EMBEDDING_BLOCK_PRIVATE_IPS=true for full SSRF lockdown on exposed
multi-tenant deployments.
Logic lives in src/url_safety.py (stdlib only, resolver injectable) so it is
unit-testable without real DNS; the route calls it before the health-check
request. Covered by tests/test_url_safety.py (8 cases).
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Use func.lower() when updating SQL owner columns, match prefs keys
case-insensitively, and normalize session usernames before comparing
during rename. Prevents silently skipping legacy mixed-case owner data.
Fixes#1165
The Cookbook's manual hardware simulator ("what if I had this setup") let users
pick a backend, but _apply_manual_hardware only accepted cuda/rocm/cpu_x86/
cpu_arm and silently coerced anything else to cuda. So selecting Apple/Metal
simulated a CUDA box instead — and ranked safetensors-only repos a Mac can't
serve, even though the rest of hwfit (services.hwfit.fit, the serve-command
generation) already supports Metal as GGUF-only via llama.cpp/Ollama.
Add "metal" to the accepted backends (now a named _MANUAL_BACKENDS set, kept a
subset of what fit.py understands) and set unified_memory=True for it — Apple
Silicon shares one memory pool with the GPU — while clearing that flag for the
discrete (cuda/rocm) and CPU backends. _apply_manual_hardware is lifted to
module scope so it is directly unit-testable; both route call sites are
unchanged.
Adds tests/test_hwfit_manual_backend.py, including an end-to-end check that a
simulated Metal box only recommends GGUF-servable models.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the Venice provider PR. Wire api.venice.ai into the three
host allowlists so Venice behaves like the other paid OpenAI-compatible
clouds:
- agent_loop: add api.venice.ai to _API_HOSTS so the agent sends native
OpenAI tool-call schemas (Venice supports function calling) instead of
degrading to fenced-block parsing.
- teacher_escalation: add api.venice.ai to _SOTA_HOSTS so the escalation
loop stays OFF for Venice (it's a paid top-tier API; no need to add
teacher-model latency).
- webhook_routes: add venice to KNOWN_PROVIDERS so the sync chat webhook
can auto-resolve base_url from provider=venice.
Tests: tests/test_venice_hosts.py pins tool-host matching + SOTA
classification for Venice; py_compile on touched modules.
Co-authored-by: Cursor <cursoragent@cursor.com>
Two bugs in the export_ics path:
1. X-WR-CALNAME was written raw: calendar names containing commas,
semicolons or backslashes produced invalid ICS (RFC 5545 §3.3.11
requires those characters to be escaped as \, \; and \\).
Fix: wrap cal.name in the existing _ics_escape() helper, which is
already used for SUMMARY, DESCRIPTION, and LOCATION on the lines
immediately below.
2. DTSTART and DTEND on non-all-day events always emitted the naive
ISO string (e.g. 20260602T100000) regardless of CalendarEvent.is_utc.
Consumers treat a naive datetime as floating/local time, so UTC
events imported into Google Calendar or Apple Calendar shifted by
the user's timezone offset. Fix: append 'Z' when is_utc is True,
matching the pattern already used by the serialise_event() helper
at line 408.
The Cookbook → Dependencies tab reported llama-cpp-python[server] as "not
installed" even when it was installed and usable for serving. The local check
looked up distribution metadata as pkg["name"].replace("_", "-") — for the
import name `llama_cpp` that yields "llama-cpp", but the module ships in the
`llama-cpp-python` distribution. importlib.metadata.version("llama-cpp") then
raised PackageNotFoundError and the package was marked missing (the import
itself succeeds, which is why serving still worked).
Derive the distribution name from the package's declared pip spec instead
(stripping [extras] and version markers), falling back to the munged import
name only when no pip spec is declared. New _pip_dist_name() helper.
Adds tests/test_cookbook_package_detection.py covering the llama_cpp mapping,
extras/marker stripping, plain names, the no-pip-spec fallback, and that the
route wires the helper in (guarding against the exact regression).