Replace the flat dump of every model in the chat-input picker with a
quick-switch. Opening the picker now shows a search box, an auto-tracked
Recent list (last 5 picks), and a manual Favorites list instead of every
available model crammed into a 280px dropdown. With large catalogs
(e.g. OpenRouter's 350+ models) this was unusable as both a quick-switch
and a browser.
- Recent: each pick is recorded most-recent-first (capped at 5) under a
new odysseus-model-recent key, so the next open has it one click away.
- Favorites: an inline star on every row toggles favorite state and
writes the existing odysseus-model-favorites key, so the sidebar Models
section stays in sync. The star toggles only — it never picks the model.
- Search filters a flat list across the whole catalog; favorited rows
keep their filled star while filtered.
- Small catalogs (<=12 models) still list everything in browse mode so
tiny installs aren't forced to search for a model.
- Touch friendly: stars are always visible (no hover-reveal) and tap
targets grow on narrow screens.
No changes to sidebar visibility defaults.
Closes#399
The global Escape arbiter in ui.js only sees `.modal` elements, so the many
ad-hoc dropdowns and context popups that are built on the fly and appended to
<body> ignored Escape entirely: document-library card/chat menus, chat
context/stats/overflow popups, cookbook serve & running menus, calendar event
menus, and compare pane menus.
Add a small DOM-free dismissal registry (static/js/escMenuStack.js). Menus
register a dismiss callback while open, and the arbiter closes the
most-recently-opened one first, so a menu opened over a modal closes before the
modal. bindMenuDismiss() wires the ubiquitous "append-to-body, close on outside
click" idiom to both the outside-click listener and the Escape stack in one
call, and dismissOrRemove() lets the pre-existing bulk removers (scroll/swipe/
modal-dismiss cleanup, reopen sweeps) tear a menu down through its real teardown
instead of orphaning its stack entry.
Covers ~14 menus across documentLibrary, chatRenderer, cookbookServe,
cookbookRunning, calendar, and compare/panes. Every teardown path — item click,
outside click, swipe, toggle, rebuild, bulk cleanup — routes through the
registry so no entry is ever stranded.
tests/test_esc_menu_stack_js.py pins the registry's LIFO and
exactly-one-per-press guarantees (node-driven; skips when node is absent).
- Turn the "/setup" text on the welcome screen and fallback state into a clickable link that automatically runs the setup command.
- Add an interactive down-arrow "Use in Chat" button next to copy button on typewriter-generated setup code blocks.
- Programmatically trim the "..." placeholder when inserting API keys, focusing the cursor right after "sk-".
- Implement click-delegation for supported provider spans and raw code elements inside the setup guide to instantly pre-populate the input bar.
Library, Notes, and the other floating tool windows (Tasks, Calendar,
Gallery, Email, Cookbook, Brain, Settings, Theme, Compare, Research,
Sessions) could be moved and snapped but never resized — there were no
resize handles and dragging the edges did nothing.
Add a shared makeWindowResizable() helper and wire it into the existing
makeWindowDraggable() so every draggable window gains native-style
edge/corner resizing from one place:
- Grab any of the four edges or four corners to resize; the cursor
reflects the active handle (ew/ns/nwse/nesw-resize).
- Detects pointer proximity to the border instead of injecting handle
elements, so it works regardless of each window's overflow model
(.modal-content scrolls its body; .notes-pane scrolls an inner el).
- Min-size clamp (320x200) and viewport clamping so a window can't be
collapsed to nothing or dragged off-screen.
- Per-window size is remembered and restored on reopen.
- Disabled on mobile (windows are full-screen sheets there) and while a
window is docked or fullscreen-snapped.
- Touch supported at tablet width and up; self-heals a missed pointer-up
so a lost mouseup can't leave a window stuck in resize mode.
The photo-detail view is an absolutely-positioned (inset:0) overlay
inside .gallery-images-container, so its height resolved to the photo
grid sitting behind it. When the library has only a few photos that grid
is short, which crushed the detail view: the image was clipped and the
metadata sidebar (overflow-y:auto) was squeezed into a tiny,
internally-scrolling strip. With a large library the grid is tall, which
is why the panel looked fine in the demo video but cramped for users with
few photos.
When the detail view is open, hide the grid-view siblings and drop the
overlay into normal flow so the container -- and the window, up to its
existing 92vh max-height -- sizes to the detail's own content (image +
metadata). Nothing is clipped or squeezed regardless of how many photos
exist. Works on both desktop and the mobile full-screen sheet; the grid,
albums and editor views keep sizing to their own content.
Also add before/after comparison screenshots (docs/gallery-314-*.png).
The Settings window inherited the base `.modal` vertical centering
(`align-items:center`). Its height is content-driven, so every tab is a
different height — and a vertically centered window grows and shrinks
around its own midpoint, making the in-modal nav rail (and the whole
window) appear to jump vertically when switching between pages.
Top-anchor the Settings window on desktop (`align-items:flex-start` plus
a fixed `margin-top`) so the top edge stays put and the panel only ever
grows downward. Scoped to desktop only — on mobile the panel is a
full-height bottom sheet that is already stable. Opening and dragging the
window both clear the inline margin/top, so window placement is otherwise
unchanged.
Fixes#208
Two bugs hid the popup that opens on double-click (or right-click) of
a GPU button in the Serve panel:
1. z-index 240 vs the cookbook modal at 260 — popup rendered behind
the modal it was spawned from.
2. Horizontal position was just `button.left`, with no clamp against
the viewport. GPU buttons sit near the right edge of the modal, so
the popup got anchored at a left that pushed most of its body past
the viewport's right edge.
Switch the popup to position:fixed (escapes scrolling / transform
stacking contexts on any ancestor), bump z-index to 10010 (above the
themed-confirm / overlay layer that sits around 9000-10000), and
clamp left/top after measuring the rendered size — including flipping
above the button if there isn't room below. The popup is now fully
visible regardless of which GPU button it's anchored to or how
narrow the viewport is.
The collapse handler waited a fixed itemCount*25+230ms for the
section-domino-out keyframes, but the CSS rule only targeted .list-item.
#models-section uses .models-row, so the rule matched nothing: no
animation played and itemCount was 0, leaving a flat ~230ms pause before
the section snapped shut.
- CSS: the collapse/expand animation rules now match
:is(.list-item, .models-row) so the Models rows actually animate.
- JS: drive the collapse off the real animations via getAnimations()
instead of a hard-coded timeout. Wait only on the section-domino-out
keyframes (ignoring unrelated/infinite animations); collapse
immediately when nothing animates so there is never a dead pause. A
generation token neutralizes stale callbacks from rapid toggles, with
a 600ms safety net so a section can't get stuck open.
The email auto-calendar pass (settings.email_auto_calendar / the
extract_email_events task) scans recently received mail and lets an LLM
create / update / cancel calendar events. Two problems made it a cross-tenant,
remotely triggerable hole:
1. No owner scoping. _auto_summarize_pass(account_id=None) fans out over EVERY
enabled account of EVERY user. For each message it fetched an upcoming-events
snapshot with NO owner filter (all tenants' events) and handed those uids +
titles to the extraction LLM, then executed the model's ops via
do_manage_calendar(...) with owner=None. do_manage_calendar only filters by
owner when owner is not None, so create/update/delete ran across ALL users'
calendars. Net: every user's event titles/times were disclosed to the model,
and the model could cancel/move/duplicate any tenant's events by uid.
2. No prompt-injection wrapping. The raw email From/Subject/body were
interpolated straight into an instruction-shaped extraction prompt (unlike
the chat path, which wraps external text via src/prompt_security). Anyone
who can email a user whose instance has auto-calendar enabled could inject
operations: create attacker-controlled "meeting" events (the path even
auto-harvests URLs from the body into the event location/description — a
phishing primitive) or cancel/modify the victim's real events, with zero
human in the loop.
Fix:
- Add core.database.get_upcoming_events(owner) and use it for the snapshot, so
the LLM only ever sees the processed account owner's events.
- Look up the EmailAccount owner in _auto_summarize_pass_single and pass owner=
to every do_manage_calendar call, so create/update/delete are scoped to that
user (owner=None stays the single-user / legacy escape hatch).
- Tell the extraction model the email is untrusted data and not to follow
instructions inside it (defense-in-depth against injection).
Add tests/test_calendar_owner_scope.py: get_upcoming_events returns only the
given owner's events (and everything when owner is None). Fails against the old
unscoped query.
* fix: run bcrypt off the event loop in auth routes
The auth routes are async, but each bcrypt call ran synchronously on the event
loop. bcrypt (checkpw/hashpw) is intentionally CPU-expensive (~100-300 ms), so
every login / signup / setup / change-password froze the single event loop for
that window, stalling all other in-flight requests (chat streams, polling, ...).
/api/auth/login is the worst case: it is reachable unauthenticated, runs bcrypt
twice (verify_password, then create_session re-verifies), and is rate-limited
only per-IP. A burst of login attempts serializes the whole server — cheap
DoS amplification.
Offload the bcrypt-bearing AuthManager calls (setup, signup/create_user,
login's verify_password + create_session, change_password) via
asyncio.to_thread, matching how the codebase already offloads blocking work
(e.g. src/builtin_actions._run_subprocess, email summarize). The event loop
stays responsive while bcrypt runs on a worker thread.
Add tests/test_auth_event_loop.py: asserts login runs verify_password and
create_session on a worker thread, not the loop thread. Fails if those calls
are awaited inline again.
* test: isolate auth event-loop test from heavy core/* import chain
The regression test imported routes.auth_routes, which pulls in
core.auth and so triggers core/__init__.py — transitively importing
src.llm_core (hangs at import under the project venv) and the SQLAlchemy
declarative models (metaclass error on a bare core.database import / under
the conftest sqlalchemy stubs). Reported by the maintainer: collection
failed on system Python and hung under the venv.
Stub core.auth/core.database before the import, mirroring the existing
_ensure_stub pattern in test_auth_regressions.py and test_null_owner_gates.py.
AuthManager is only a type hint here and the handler is exercised with a
MagicMock, so no real core machinery is needed. Test now imports cleanly
and passes in <0.3s without bcrypt/sqlalchemy installed.
/api/auth/settings is auth-exempt (the frontend + the pre-login page read it for
keybinds/TTS prefs), so non-admin and unauthenticated callers get a scrubbed
copy. The previous scrub only blanked TOP-LEVEL string values whose key matched a
short suffix list — so a secret nested under a non-secret parent key, or stored
under a key outside the list, would leak. A real exposure when the app is
reachable over a Cloudflare tunnel / reverse proxy.
- src/settings_scrub.py: NEW stdlib-only module with the scrub helpers (deep/
recursive; broadened secret-key patterns). Kept separate from auth_routes so it
imports + unit-tests WITHOUT pulling the FastAPI / auth / database chain
(addresses review: the test no longer fails at collection on the DB import).
- routes/auth_routes.py: import scrub_settings from the module.
- tests/test_settings_scrub.py: import the tiny module directly.
Ran: pytest tests/test_settings_scrub.py (8 passed); verified the test pulls no
db/auth modules into sys.modules; py_compile routes/auth_routes.py.
Co-authored-by: Kanaru92 <107661007+Kanaru92@users.noreply.github.com>
The context compactor computed split_point against convo_msgs (system
messages filtered out) but applied it directly to session.history which
includes the system messages. After compaction, the original system
prompt was dropped and replaced by an off-by-N slice of the full history.
This silently dropped the system prompt (preset, persona, RAG context)
from every compacted session — the model would lose persona, RAG, and
preset guidance on the next turn after a long conversation.
The split in maybe_compact does:
convo_msgs = [m for m in messages if m['role'] != 'system']
split_point = len(convo_msgs) // 2
so split_point is indexed against the system-stripped list. But the
helper _update_session_history took (session, split_point, summary) and
did session.history[split_point:]. session.history is the full list
including the leading system messages, so this dropped the first
system_msg_count messages.
Fix: pass system_msg_count=len(system_msgs) into _update_session_history
and use session.history[system_msg_count + split_point:] as the recent
slice, with session.history[:system_msg_count] prepended to preserve
persona/preset/RAG system messages.
Validated: tests/test_compactor_data_loss.py both tests now pass (were
failing). tests/test_context_compactor.py 12 pre-existing tests still
pass.
Symptom was: post-compaction history = [summary] + assistant_1 + user_2
+ assistant_2 (system_A was lost).
Co-authored-by: Ernest Hysa <ernest@example.com>
* fix: extract full year in research query entities, not just the century
* fix: same year capture-group bug in the services search copy
* test: research query extracts the full year
get_tool_index() calls index_builtin_tools() on first init
(src/tool_index.py:469-470), and _warmup_tool_index then calls it
explicitly right after. Every cold boot embeds all 58 built-in tools
twice and double-upserts them into the ChromaDB collection.
The remaining get_tools_for_query call still pre-warms the query path.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
Small update to the styles that bothered me, i noticed in the window/modal for calendar when editing a day the time icons had a mask that overlapped the icon. I simply added 'background-image: none' prop to it/
Two bugs caused GPU inference to silently fall back to CPU inside the
Odysseus Docker container even when the GPU was correctly passed through.
## entrypoint.sh — CUDA_HOME detection only covered CUDA 13.x wheels
The nvcc glob only searched
vidia/cu13, which matches the
vidia-nvcc-cu13 pip wheel layout. CUDA 12.x wheels install nvcc to
vidia/cuda_nvcc/bin/nvcc (nvidia-cuda-nvcc-cu12) or
vidia/cu12
(nvidia-nvcc-cu12) — completely different paths. The glob found nothing,
so CUDA_HOME was never set.
Worse, VLLM_USE_FLASHINFER_SAMPLER=0 was inside the same if-block, so it
was never set either. vLLM then tried to JIT-compile the FlashInfer
sampler at startup, failed with 'Could not find nvcc', and crashed — even
though the GPU was fully visible to the container.
Fix: expand the search to also check nvidia/cu12 and nvidia/cuda_nvcc.
Move VLLM_USE_FLASHINFER_SAMPLER=0 to an unconditional export after the
loop (it is sampler-only, no impact on the attention path, and the correct
setting for any container where CUDA headers may be incomplete).
## cookbook_routes.py — llama.cpp Linux source build silently fell back to CPU
The cmake invocation was:
cmake -B build -DGGML_CUDA=ON 2>/dev/null || cmake -B build
2>/dev/null suppressed all configure errors. When nvcc is absent (the
slim base image has no CUDA toolkit — intentional), cmake fails silently,
then the || fallback re-runs without -DGGML_CUDA=ON. A CPU-only binary is
produced with no warning. Additionally, a stale CMakeCache.txt from the
failed CUDA attempt was reused (no rm -rf build), poisoning the next
configure run. The macOS branch already did rm -rf build for exactly this
reason; the Linux branch did not.
Fix: before cmake, detect pip-installed nvcc across the same three path
patterns as entrypoint.sh and expose it via CUDA_HOME/PATH. If nvcc is
found, run a clean CUDA build with full error visibility. If not, fall
back to a CPU build with an explicit warning telling the user how to get
a GPU build (install vLLM via Cookbook -> Dependencies, which brings the
CUDA wheels including nvcc, then re-launch).
## .env.example — document Windows COMPOSE_FILE separator
Added a comment showing the semicolon separator required on Windows
Docker Desktop alongside the existing colon-separator (Linux) example.
* fix: drop over-broad 'cookie'/'copyright' low-quality markers
* fix: detect cookie/copyright boilerplate via phrases, not bare words
* test: keep research findings that merely mention cookies or copyright
The memory import-review list (.memory-suggestions) is shown inside the
overflow:hidden .admin-card but, unlike the sibling .memory-list, it had
no scroll bounding of its own (no flex:1 / min-height:0 / overflow-y).
A long review list therefore grew past the card and was clipped, leaving
lower entries and their controls unreachable with no usable scroll area.
Give .memory-suggestions the same flex:1 + min-height:0 + overflow-y:auto
bounding the memories list already uses so the review list scrolls
internally within the modal. Pin the review header (the title and the
save all / back controls) with position:sticky so they stay visible while
the items scroll under them, and add a small scrollbar gutter so the bar
does not sit flush against the item cards.
Fixes#455
* fix: fail fast when ChromaDB is unreachable instead of blocking startup
* fix: only cache the ChromaDB client after a successful heartbeat
* test: cover ChromaDB fast-fail preflight and no-cache-on-failure
Hardens issues found in a security review of the current tree (separate from
the cookbook SSH PR):
- Email thread rendering (static/js/emailLibrary.js): the flat read path runs
inbound HTML through the allowlist sanitizer, but the two threaded paths
(_renderTurnsAsBubbles / _renderTurnsFromServer — the default view) injected
server-parsed `body_html` raw into the DOM. A crafted inbound email could
inject arbitrary markup (phishing/form/credential-capture/tracking; full XSS
if a deployment relaxes the script CSP). Now sanitized on all paths.
- Attachment extraction (routes/email_routes.py, routes/email_helpers.py): the
on-disk extraction dir was `ATTACHMENTS_DIR / f"{folder}_{uid}"` with
user-controlled folder/uid and no containment, so a folder like `../../tmp`
could escape ATTACHMENTS_DIR. New attachment_extract_dir() flattens both to a
single safe segment and asserts containment.
- Diagnostics routes (routes/diagnostics_routes.py): /api/db/stats,
/api/rag/stats, /api/test/youtube, /api/test-research relied only on the
global session check (any logged-in user). Now require_admin-gated.
- Defense-in-depth HTML escaping: session HTML export escapes the session name
(routes/session_routes.py); the MCP OAuth page escapes the reflected Host
header / server_id (routes/mcp_routes.py).
- Internal-tool token now compared with secrets.compare_digest (constant time)
in core/middleware.py and app.py.
Adds regression tests in tests/test_security_regressions.py.