Wrap the height:28px / top:0 rule in @media (min-width:769px) so it
can't leak into mobile, where a different touch-friendly variant
already sets min-height:36px + top:-2px.
Base .memory-toolbar-btn is 24px tall at top:-4px. Bump the compose
button alone to 28px (4px taller) and top:-2px (moves down 2px) so
it reads as the primary action in the toolbar without affecting
Select/Refresh.
Transparent at rest, accent gradient animates in on hover with a 0.18s
ease transition. Drag affordance + col-resize cursor still work; the
stripe just stops bothering you when not touched.
Right-side handle mirrors the gradient direction (left-to-right
gradient flipped to right-to-left).
Gmail composer chips arrive with inline border:1px solid #ddd + an
assumed white background, so on dark themes they read as a barely-
visible white box with the filename invisible. Override .gmail_chip /
.gmail_drive_chip inside .email-reader-body:
- Strip inline width:386px / height:20px (use auto + max-width:100%),
- Re-flow as inline-flex with a 6px gap so icon + filename align.
- Background tinted with var(--fg) 4%, border = var(--border).
- Anchor uses var(--accent) and the filename span uses var(--fg) so
text is always legible regardless of theme.
- Icon img clamped to 16x16.
When the email-answered event fires (user just sent a reply, so the
source email auto-marks as done), the row was getting the .active
class instantly with no visible cue beyond the checkbox tick. Add a
brief .email-auto-done-flash class on the row that runs two keyframe
animations:
- email-auto-done-row: tints the row background with the accent for
~1.2s then fades to transparent.
- email-auto-done-check: pops the done checkbox to 1.4× with an
accent ring that expands outward over 0.6s.
Class self-removes after 1.2s so it doesn't replay on re-renders.
- Revert the email row layout — sender/date stay above subject again,
matching the original two-line item that the user actually wanted.
- The account filter chips (#email-lib-accounts) wrapped onto multiple
rows on desktop. Promote the mobile-only horizontal-scroll rule to
apply at every breakpoint so the chips always sit on one row with
overflow scroll, regardless of screen size.
Group row's auto-sort-sessions-btn padding-left 6→4, and
.sort-dropdown-item left padding 8→6 so 'Last Active', 'Newest First',
'By Folder', '↑↓ Rearrange', '● Select' all shift in by the same
amount, matching the Group nudge.
Subject was on its own line below sender/date. Move it inline so each
email occupies one row: sender capped at 35% width (ellipsis), subject
takes the remaining space (ellipsis), date pins to the right. Tighter
list density at the cost of dropping the spare line for snippet text
(none was being rendered anyway).
Two related fixes for the Chats section header:
- The 'manage' label only slid out when the button itself was hovered.
Add section-header-flex:hover to the reveal rule so hovering the sort
icon (or anywhere in the section header) also opens the label.
- Parent-hover opacity bumped 0.45 → 0.85 so the 'manage' text reads
much more clearly when revealed. Direct hover on the button still
pushes to full opacity 1.
The shared .section-header-btn:hover rule paints a tinted background
across all section header buttons. On the Chats manage button this
showed as a box behind the sliding 'manage' label, which the user
didn't want. Override background to transparent for that one button.
Email's 'new' label is absolutely positioned to the LEFT of the '+'
icon, which works there because the '+' is the visible/clickable
anchor. The chats manage button has no visible glyph at rest, so the
label was rendered outside the button's bounding box — hovering
'manage' lost the :hover state and clicking it missed.
Override list-item-plus-label inside chats-manage-btn:
position: static (in flex flow) + max-width:0 / max-width:80px
expand-on-hover so the button's clickable rect grows alongside the
text. Hover stays sticky; click hits.
Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:
Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
with a cross-link to AI Defaults. Provider hints use real clickable
anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
carries only Endpoint+Model with a cross-link to Search; Vision /
Default / Utility fallbacks unified under one numbered-row design
matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
Permissions toggle that expands the scope-edit panel, in-field
copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
under the button (drop-up on tight viewports); each integration
form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
row at the bottom that loops every category.
AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
persona's system prompt; empty/unknown falls back to warm-neutral.
Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
so dismissing a popup no longer closes the whole Settings modal.
Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
search/integrations/reminders for card h2 icons + the Added Models
sub-section icons.
Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
blocks; shorter subtitle wording.
Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
every integration form.
- Provider logos render inline next to the search fallback selects
and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
button.
- Image generation default flipped to off.
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.
Highlights:
- Ollama as a first-class backend in the Cookbook:
* Download input accepts ollama-style names (name:tag) → backend=ollama
* /api/cookbook/ollama/library (cached scrape of ollama.com + curated
fallback so classic models like qwen2.5 stay reachable)
* "Browse Ollama library" toggle below Download with size chips
* Engine=Ollama in hwfit toolbar merges the Ollama library into the
main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
pulled up 8px to align with the row's checkboxes, GGUF File dropdown
exposed for Ollama backend, GPU re-render on Edit serve restore,
_forceBackend flag so saved serveState wins over backend detection,
cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
Clear-offline buttons in Added Models toolbar, offline-pill removed
(opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
reachable (vLLM returns 404 on bare /v1, fully working endpoints
were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
* fix: expose supports_tools toggle for local endpoints in UI
Local endpoints (Ollama, vLLM, etc.) default to fenced tool blocks
when supports_tools is not set, which breaks tool calling for models
that support native function calling. The backend already supports
per-endpoint supports_tools overrides via the PATCH API, but there
was no UI to set it.
Add a 'Tools: Auto/On/Off' toggle button for local endpoints that
cycles through the three states:
- Auto (null): use the existing heuristic
- On (true): always use native function calling
- Off (false): always use fenced tool blocks
Fixes#3141
* docs: add screenshot of supports_tools toggle showing Auto/On/Off states
* Add Tools toggle screenshot for PR #3195
* refactor: convert Tools toggle to select dropdown per review feedback
Replace cycle-through button with a <select> dropdown for the
supports_tools tri-state setting. Options: Auto / On / Off with
explicit labels. Uses existing admin select styling. Fires PATCH
on change event. Same API contract (Auto=null, On=true, Off=false).
* Update Tools toggle screenshot (now dropdown select)
* fix: remove orphan screenshot and move Tools dropdown below button row
- Remove docs/screenshots/tools-toggle-three-states.png (unreferenced image causing test_no_orphan_images_in_docs to fail)
- Move Tools dropdown to its own line below Disable/Delete buttons, aligned right
- Keep Disable and Delete buttons grouped together per maintainer feedback
* fix: move Tools select onto same row left of Disable/Delete, use CSS class
Per vdmkenny feedback: move the Tools dropdown select from its own row below
the button group onto the same row, to the left of the Disable and Delete
buttons (which stay adjacent on the right). Replace inline style on the
button row with the existing .admin-ep-actions CSS class, adding
align-items:center for proper vertical alignment.
* chore: remove committed screenshots from tree
Screenshots should be in PR description/comments, not in repo history.
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
* fix: hide Select buttons in memory/skills tabs when list is empty
* fix: disable Select buttons instead of hiding them when list is empty
* fix: dim disabled Select button and remove focus outline
* fix: reload skills after single deletion so count and toolbar stay in sync
* fix: lower minimized-dock z-index from 10020 to 100 so modals stack above it
* Revert "fix: lower minimized-dock z-index from 10020 to 100 so modals stack above it"
This reverts commit 5b092ee6cd.
* feat(skills): import SKILL.md bundles from public GitHub URLs
Supports GitHub tree/blob/raw links and skills.sh pages that resolve to GitHub.
Installs SKILL.md plus sibling text assets under data/skills/imported/.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): admin-gate URL import and validate redirect hosts
- require_admin on POST /api/skills/import-from-url (matches other skill admin routes)
- reject cross-host redirects after httpx follow_redirects
- test for redirect host validation
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): match Brain Add panel import/submit button styles
- Skill URL Import: theme-io-btn + download icon (same as memory Import)
- Add Skill submit: confirm-btn confirm-btn-primary
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): allow api.github.com during directory import
Real imports hit the GitHub contents API after redirects; whitelist
api.github.com and add regression tests. Shrink Import button with flex:none.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): align skill Import button with URL input row
Match memory-add-input height (28px) in memory-add-row and center the
download icon with flexbox instead of vertical-align hacks.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): cancel modal-body margin on skill Import button
The skill Import button sits in .memory-add-row beside an input; the
global .modal-body button { margin-top: 6px } rule only affected buttons,
pushing Import down and misaligning the download icon. Reset margin-top
and match Memory Import SVG markup at 28px row height.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(skills): surface GitHub API errors on URL import
Pass through GitHub response messages (especially 403 rate limits) as
SkillImportError instead of a generic download failure.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
* Make edge-docked windows resizable
Add draggable resize seams for left and right docked windows.
Keep the main chat area from getting too narrow and remember each window's dock width.
* Show emoji shortcodes as icons by default
Keep text-only emoji mode opt-in so model output like 😊 goes through the normal emoji renderer.
* Fix dock resize seams and left dock layout
Hide the resize seam when another floating modal is open, and keep the left-docked window from covering the chat area.
* Keep narrow modal tabs usable
* Fix split layout with both edge docks
* Fix left snap after right dock
* Enable left edge snap for all windows
* Tighten dock resize handle observers
* Use edge docking for settings window
* feat: Add plan mode to the chat agent
Adds a plan mode: the agent investigates read-only, proposes a checklist, and
waits for approval before changing anything. On approval it runs with full
tools and checks items off as it goes. Enforcement reuses the existing
disabled_tools gate.
Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the
plan toggle from the chat input.
- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the
plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list
checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.
* Plan mode: persistent re-referenceable plan + agent write-back
Three improvements so a long plan survives a weak model and stays in reach:
1. Re-reference the plan (out-of-context fix). On the execution turn the frontend
sends the approved checklist back (`approved_plan`); the backend pins it as a
top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so
the agent can always re-read the plan instead of losing the thread on a long
run. New `build_active_plan_note()` (unit-tested).
2. Re-open / dock the plan anytime. The plan checklist is stored per-session
(localStorage). When a plan exists, the plan-mode button opens a small menu
("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan
window — so it can stay docked while the agent works. The window live-refreshes
as the plan changes.
3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps
`- [x]` after finishing them, or to revise steps when the user asks. Marker
tool (no I/O) → `plan_update` SSE event → the stored plan + docked window
update live. The ACTIVE PLAN note instructs the agent to use it.
Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb),
src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse
`approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools
/ tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle
`plan_update`, capture restated checklists), static/app.js (plan-button menu),
static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.
* Plan mode: drop bash/python, rely on read-only discovery tools
Shell can mutate (write files, hit the network) and can't be constrained to
read-only at the tool layer, so plan mode no longer relies on a prompt to keep
it well-behaved — bash/python are removed from the read-only allowlist and added
to the fail-closed block set. Discovery is covered by the dedicated read-only
tools (read_file, grep, glob, ls) instead.
Rewrites the plan-mode directive to state shell is disabled and lists the
available read-only tools positively. Addresses review feedback on #638.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Comment: note _MCP_READONLY_VERBS are prefixes not whole words
Clarifies that entries like "summar" are intentional stems matched via
startswith (covers summarise/summarize/summary), not typos. Addresses review
feedback on #638.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Plan mode: clarify why gating inverts the allowlist into a denylist
Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the
comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an
allowlist, so it returns the inverse (all known tool names minus the allowlist).
The static mutator set is a backstop for the schema-derived name list, which
misses XML-only tools and can fail to import. Addresses review feedback on #638.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* Plan mode: stop hardcoding the read-only tool list in the directive
The model is already shown its available (read-only) tools by _assemble_prompt,
which removes every disabled tool. Enumerating them again in the directive only
duplicated that list and would drift as tools change. Point at the tools listed
below instead. Addresses review feedback on #638.
Let the agent pause and ask the user a multiple-choice question when a
task is genuinely ambiguous and the answer changes what it does next —
choosing between approaches, confirming an assumption, picking a target —
instead of guessing.
Modeled on the existing `ui_control` marker pattern: the `ask_user` tool
returns an `ask_user` payload that the agent loop emits as an SSE event
and then ends the turn. The frontend renders the question with clickable
option buttons, a free-text "Other" input, and an x to dismiss; the user's
choice is sent as the next message and the agent resumes with it in
context.
- src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O.
Validates a non-empty question + 2..6 options, normalizes string/object
options, returns the payload.
- src/agent_loop.py: emit the `ask_user` event and break the round loop so
the turn ends and waits for the user's selection. Stream the question as
assistant text so it persists/replays (prevents a re-ask loop).
- Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS,
FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any
user can be asked); the structured args serialize via the default
json.dumps path.
- routes/chat_routes.py: relay the `ask_user` event to the client.
- static/js/chat.js + static/style.css: render the question card (options +
free-text Other + dismiss x; removed once answered). Reuses CSS vars and
the .modal-close button; emoji go through the monochrome-SVG pipeline.
Bump chat.js cache pin.
- tests/test_ask_user_tool.py: payload, multi flag, string options, option
cap, validation errors, serializer round-trip, registration.
- Schedule cookbook serves through the existing ScheduledTask system: the
serve preset gets a ^ button next to Launch that opens a daily/hourly/
weekly form mirroring the admin-switch style; the schedule action runs
action_cookbook_serve, which delegates to /api/model/serve and stamps
the resulting task with _scheduledStopAtMs. A background
cookbook_serve_lifecycle loop ticks every 60s and kills any serve
whose window has ended, also dropping the auto-registered endpoint
so the model picker doesn't keep pointing at a dead server.
- Stop and remove on a Running serve now awaits the SSH/tmux kill,
re-checks tmux has-session, and surfaces an error toast (leaving the
row) when the kill failed. Previously fire-and-forget, so a failed
SSH/tmux call silently left the live serve running while the row
vanished from the UI.
- Cookbook tasks/status orphan-adoption sweep no longer requires the
serve-/cookbook- session-id prefix; any tmux session whose pane is
running a known model-server process gets auto-pulled into Running.
Without this loosening, a cookbook-launched serve whose tmux id
fell back to a bare number was invisible — you couldn't see it,
let alone stop it.
- Ollama serve always launches a fresh process under cookbook's tmux
(no more monitor-mode reattach to a systemd/Docker ollama Stop can't
reach). The handler pre-picks a free port by probing the target
host over SSH and mutates req.cmd's OLLAMA_HOST so the runner script
AND the auto-registered endpoint agree on the same bind port.
- Auto-register uses host.docker.internal (when running inside Docker)
instead of localhost, matching the URL /setup adds for Ollama by
hand. Local cookbook serves now produce a chat-reachable endpoint
on first launch.
- Cascade-delete: removing a scheduled cookbook task also deletes any
linked calendar event (cookbook_task_id marker in the description).
- Tasks list groups cookbook_serve under a "Cookbook" category that
sorts above the rest, so scheduler-launched serves are easy to find.
* feat: Add workspace: confine agent tools to a folder
Pick a server folder as the agent's workspace so its file/shell tools work
there and don't touch files outside it. File tools are hard-confined; bash/
python run with cwd set to the folder.
Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` /
`clear` / `pick` (open the directory browser).
- routes/workspace_routes.py: GET /api/workspace/browse (admin-only).
- src/tool_execution.py: hard path confinement for read_file/write_file;
bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block.
- src/agent_loop.py: workspace note prepended to the system prompt.
- static/: overflow menu item, input-bar pill, directory-browser modal, and
the /workspace slash command.
- tests/test_workspace_confine.py.
* Wire workspace confinement into tools that landed after this PR
edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was
written, so they bypassed the workspace boundary. Thread the workspace through:
- edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace
- grep/glob/ls: _resolve_search_root confines to the workspace (root + paths)
- bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir
default when no workspace is set)
Tests cover edit_file + grep/ls confinement (inside ok, outside rejected).
* Workspace picker: editable path bar + modal style cohesion + cross-platform hardening
- Make the current-folder strip an editable address bar: type/paste a full
path and press Enter to navigate (also reaches other Windows drives and
hidden dirs the up-only browser cannot).
- Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn*
in favour of base .modal-content/.modal-body and the .confirm-btn button
family; separators/hover use var(--border). Net -31 CSS lines.
- Fix the path field overflowing the modal right edge (flex stretch + margin
vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin.
- Cross-platform confinement: normcase the workspace commonpath check so
containment holds on case-insensitive filesystems (Windows/macOS).
- Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd()
instead of pwd. 5 pass.
Reverts b98ee04 + 4ed48ba + a19b6d2.
Calendar events turned out to be the wrong abstraction for scheduling model serve windows. Pivoting to the existing ScheduledTask infrastructure (cron / daily / weekly recurrence, next_run tracking, edit-from-Tasks-tab UI) in a follow-up commit. The ScheduledTask path:
- reuses dispatch logic the rest of the app already understands
- drops the calendar dependency entirely (no auto-created "Cookbook" calendar, no calendar.js hook)
- shows up in the Tasks UI that already exists for everything else
What this revert removes:
- src/cookbook_scheduler.py — calendar reconciler
- routes/cookbook_schedule_routes.py — /api/cookbook/schedule/* endpoints
- static/js/cookbookSchedule.js — Schedule modal / settings card
- cookbook_scheduler_enabled + cookbook_schedule_calendar_href settings keys
- The window.cookbookOpenScheduleForm hook in calendar.js
- The Schedule button + paired-button CSS in cookbookServe.js + style.css
* feat: round-limit handling — Continue affordance at the cap + configurable cap
When the agent loop runs out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:
1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
a "Continue" pill at the bottom of the chat that resumes the task from where
it left off. Repeated cap-hits each get a fresh Continue (multiple continues
in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
validated on the client, at the save endpoint, and at the read site.
- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
tool-executing round completes on the last allowed round — i.e. the agent
wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
`max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
existing .stopped-indicator / .continue-btn classes and theme vars
--border/--fg/--bg/--accent); appended to the chat container so it survives
the message re-render at stream finalize. chat.js cache version bumped.
* test: cover rounds_exhausted emission (cap-hit vs normal finish)
Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
* feat(provider): add GitHub Copilot provider with device-flow auth
Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5,
Claude, Gemini, …) work through the normal chat + agent loop, incl. native
tool calling and vision.
Auth is one-click via the GitHub OAuth device flow; the access token is stored
as the endpoint's (encrypted) api_key and sent directly as `Authorization:
Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to
the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only
provider-specific behaviour is a small set of required request headers,
injected centrally.
Sign-in is available from Settings → model endpoints ("Connect GitHub
Copilot") and from chat via `/setup copilot`.
- src/copilot.py (new), routes/copilot_routes.py (new): constants, header
builders, device-flow start/poll, model discovery, owner-scoped endpoint
provisioning.
- src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers,
per-request x-initiator/vision.
- src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas.
- src/model_context.py: known context windows for Copilot (no unauthenticated
/models probe).
- static/, README, tests/test_copilot*.py.
* Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process
Drop the custom Schedule modal in favor of opening the calendar's existing event-creation form pre-filled with the model's name + cookbook YAML in the description. The user lands in the same event editor they already know from regular calendar use, just pointed at the auto-created "Cookbook" calendar.
Backend:
- POST /api/cookbook/schedule/ensure-calendar — idempotent: creates a calendar named "Cookbook" if one doesn't exist for the current user, saves its href into cookbook_schedule_calendar_href, flips cookbook_scheduler_enabled on. Verifies the saved href against /api/calendar/calendars on every call so a manually-deleted calendar self-heals.
Frontend:
- calendar.js: expose window.cookbookOpenScheduleForm(draft) which opens the calendar modal (if not open), calls _showEventForm, then pre-fills summary / description / rrule / calendar dropdown. Force-expands the "Add details" section so the user can see which calendar it's heading into.
- cookbookSchedule.js: Schedule-button click now calls ensure-calendar, builds the cookbook: YAML block, and routes to window.cookbookOpenScheduleForm instead of openModal(). The legacy custom modal stays as a fallback for the case where calendar.js hasn't loaded.
UX tweak:
- cookbookServe.js: replace the standalone "Schedule…" text button with a small icon-only button (clock SVG) glued to the right edge of Launch. The pair forms one visual unit — Launch on the left, schedule-now on the right — sharing a thin divider. CSS handles the rounded corners + divider.
Add a calendar-driven scheduler so a user can pick a model in Cookbook, click "Schedule…" instead of "Launch", choose time windows + days of the week + (optional) end date, and have Odysseus auto-launch the serve when the window starts and hard-kill it when the window ends. The calendar IS the source of truth — events on a designated calendar are interpreted as serve schedules, so editing the event in the calendar UI immediately changes the schedule.
Whole feature is gated by setting `cookbook_scheduler_enabled` (default False). Disabling the setting silences the reconciler and the API refuses requests; setting + three new files = entire surface, easy to revert.
New files:
- src/cookbook_scheduler.py — background reconciler: ticks every 60s, reads next ±90s of calendar events on the designated calendar, launches/kills serves to match. Honors "refuse if GPUs busy" (skips with reason, no retry). Adopts pre-existing manual serves matching the event's model so window-end cleanup still applies. Tags scheduler-owned tasks with `_scheduledBy: <event_uid>` so it never kills serves it doesn't own.
- routes/cookbook_schedule_routes.py — POST /api/cookbook/schedule/from-cookbook builds RRULE+ICS events from the modal's input (model, slots[], days[], until). GET /upcoming returns the next 24h with per-event status (scheduled / running / adopted / skipped / failed / ended) for the UI. POST /reconcile-now manually kicks the reconciler.
- static/js/cookbookSchedule.js — Schedule button click handler + modal. Daily/hourly time slot picker, multi-slot ("+ add another time slot"), weekday chips with Weekdays/Weekend/Every-day quicksets, optional Until date. Calls /from-cookbook on save. Whole module is a single IIFE; deleting the file plus its <script> tag removes the UI surface.
Existing files touched (minimal):
- app.py: register the new router + add the reconcile loop as a startup task (~10 lines, all in one block). Reconcile loop checks the feature flag on every tick, so leaving it running with the flag off costs ~one settings lookup per minute.
- static/index.html: one new <script> tag for cookbookSchedule.js.
- static/js/cookbookServe.js: add a "Schedule…" button next to the existing Launch button. Hidden by default; cookbookSchedule.js reveals it after confirming the feature flag is on.
- static/style.css: ~80 lines for the modal styles (mobile-aware via @media).
User choices baked in:
- Calendar events are the source of truth.
- Refuse to launch if GPUs busy (skip + log reason in scheduler.events[uid].reason).
- Hard kill at event end.
- No retry on a skipped event within the window.
- Multi-slot per day supported (one calendar event per slot, shared RRULE).
- Pre-existing manual serves get adopted at window start so they're killed at end.
Known follow-ups (not in this commit):
- Settings UI to pick the schedule calendar + toggle the feature flag.
- Calendar event color/badge for status (running/skipped/failed).
- "Lazy launch on first request" — currently launches at event start. Replacing _launch_serve with a proxy that defers vllm until the first chat request is a contained future change.
* Add edit_file tool + file-change diffs
edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.
Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
- added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
refuses it for non-admin owners.
- confined by the same path policy as read_file/write_file (allowlist +
sensitive-file deny) via _resolve_tool_path.
Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.
Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.
Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.
* Drop redundant import os in write_file closure
os is already imported at module top.
Three converging fixes so the chat agent + external Codex/Claude skills can actually debug a crashed serve instead of staring at a post-crash neofetch banner:
* Serves now `tee` to /tmp/odysseus-tmux/SESSION.log on the host running them. Runner saves fds 3/4 before the tee and restores them right before `exec ${SHELL}`, so the post-crash interactive zsh banner does NOT pollute the log file.
* `tail_serve_output` (chat agent) and `/api/codex/cookbook/output/{sid}` (Codex+Claude skills) both prefer the persistent log file over the tmux pane. Pane is fallback for sessions predating the tee runner. Default tail bumped 150 -> 400.
* `list_served_models` "recent log" snippet seeks to the Traceback line instead of showing the last 6 lines (which was always the bash prompt).
Cookbook auto-adoption sweep on `/api/cookbook/tasks/status`: every 20s (rate-limited) the cookbook SSHes each configured server, finds `serve-*` / `cookbook-*` tmux sessions running an actual model process (vllm/python/llama-server/etc., filtered via `pane_current_command`), and writes them into state.tasks. So when the agent falls back to raw ssh+tmux, the session appears in the Cookbook UI on the next poll.
`serve_model` error path now reads `data["detail"]` in addition to `data["error"]` so the FastAPI HTTPException message ("Invalid characters in cmd") actually reaches the agent instead of being swallowed as a generic "Serve failed". Tool description updated to warn against `cd …`/`source …`/`&&` prefixes.
Intent-without-action supervisor in agent_loop: when the model writes "Let me tail the output" / "I'll check the logs" / "Let me investigate" and ends the turn without emitting a tool call, the loop injects a sharp system nudge ("You said you would X — DO IT NOW") and continues. Capped at 2 nudges per chat so a model that genuinely cannot use the tool does not pin the loop.
Codex/Claude skill parity: adds `/cookbook/cached`, `/cookbook/presets`, `/cookbook/preset/{name}`, `/cookbook/adopt` so external agents have the same surface as the chat agent. SKILL.md docs + odysseus_api.py wrapper updated for both bundles.
`adopt_served_model` promoted to the always-on tool set so the agent has a documented fallback when serve_model rejects a cmd.
Also various cookbook UI tweaks accumulated alongside the above (cookbook.js, cookbookRunning.js, cookbookServe.js, cookbook-diagnosis.js, settings.js, style.css).
- Claude Agent integration: AGENT_CONFIGS.claude, INTG_TYPES.claude,
setup_claude_routes + integrations/claude/ skill bundle. Wired in
app.py alongside the existing Codex integration; same scope-gated
/api/codex/* backend; agent form has new description so users know
it's setup for an external CLI, not an agent streamed inside Odysseus.
- Remove mark_email_boundaries action: not good enough yet. Stripped
from task UI, scheduler defaults, registry, tool schema, clear-cache
route. Added to RETIRED_HOUSEKEEPING_ACTIONS so existing rows + their
task_runs auto-purge on startup.
- Cookbook download reliability: "Reconnect" fix button in the crash
diagnosis runs _reconnectTask after probing has-session. 30s confirm
window before marking a download "done" — kills the Finished/Downloading
flicker when tmux briefly drops between captures.
- Mobile UX: tap anywhere on a note card body opens the editor;
Update button morphs to Archive when no text was edited; bell icon
accent-colored; chip-trashing notif pills fade so only the icon
rotates into the trash zone.
- Settings integrations: SVG-per-provider in email + API preset
dropdowns, custom drop-up-aware menus, accent sub-header icons
(IMAP/SMTP), consistent card styling between list + edit, contacts
Edit/Delete icons, agent form description copy.
Backend (services/hwfit + routes):
- VRAM column sort now shows global highest first (was special-cased to
ascending then truncated top-N, which made "highest VRAM" mathematically
unreachable). Every column path uses reverse=True for the truncation.
- Hardware probe cache TTL 30min -> 24h so changing filters doesn't keep
re-probing the rig during a session; Rescan button still forces fresh.
- Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang can't serve them);
default non-prequantized to BF16 on 2+ GPUs.
- AWQ / AWQ-8bit / GPTQ-8bit get a -1.0 quality penalty so FP8 wins ties.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above M2.5.
- hf_models.json: zai-org/GLM-5.1 added; zai-org/GLM-5 quantization flipped
Q4_K_M -> BF16. DeepSeek-V4-Flash / -Pro + their -Base variants registered
with new FP4-MoE-Mixed / FP8-Mixed quant keys (calibrated BPP from the
actual 156 GB / 284 GB disk footprints).
- New FP4-MoE-Mixed + FP8-Mixed entries in QUANT_BPP / QUANT_SPEED_MULT /
QUANT_QUALITY_PENALTY / QUANT_BYTES_PER_PARAM / PREQUANTIZED_PREFIXES.
Frontend — Scan/Download:
- Engine + Quant swapped in the toolbar; Quant defaults to "All".
- Ctx (range slider) ported from origin/main: 8k/16k/32k/50k/128k/Max. Drag
re-sorts by vram ascending (smallest fitting first); back to Max → score.
- Ctx slider rail now visible — was background:transparent in a duplicate
later-cascade rule. Hardcoded grey + !important.
- Search input moved to the far right of the toolbar.
- Type/Standard default; "Context" not uppercased; Search placeholder dimmed.
- Engine "?" + Quant "?" inline help chips inside their dropdown boxes.
- Fit-column dot toggles fit-only filter; un-toggling re-sorts by VRAM desc.
- Quant column truncates to 9 chars + ellipsis ("FP4-MoE-M..."), full in
tooltip. Smart title-suffix strips the parts already in the repo name
(QuantTrio/MiniMax-M2-AWQ + quant AWQ-4bit -> just "(4bit)").
- Conditional warning for safetensors models on non-GPU rigs only.
- Dependency Install / Installed / Installed▾ / N/A all 75.85px wide.
- Rebuild llama.cpp moved into the llama_cpp dep row, styled as a tag.
- Foldable Download admin-card (h2 chevron); line under h2 only when folded.
- HF token save gets a green ✓ + "Saved" flash.
- Cached scan no longer counts stalled rows as downloaded.
- Footer: "Request it →" link with GitHub mark to the public discussion
(#1962) for model-add requests.
Frontend — Running tab:
- Strict download-finish check (DOWNLOAD_OK or /snapshots/, not bare
"Download complete"). True overall % for multi-shard downloads:
((N-1)+frac)/total instead of hf_transfer's per-shard aggregate.
- ETA in the uptime ticker: "downloading: 12m 34s · ETA 1h 23m".
- Clear button kills the tmux session too; if the output still shows a
live shard line, the pill is hidden + relabels as "reconnect" + revives
on click.
- Self-heal: on cookbook open AND every bg-monitor cycle (10s, throttled
to 8s), scan persisted done/error/crashed downloads and probe their
tmux session — if alive, flip status back to running and reattach.
- Per-launch zombie probe: clicking Download on a model whose persisted
state is done but tmux is still alive revives the existing task and
refuses to start a duplicate.
- Pre-launch GPU probe: vllm / sglang / diffusers serve check
/api/cookbook/gpus first; warns + confirms if no GPU is visible.
- Server-side state guard: rejects "done" POSTs for downloads lacking
DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
shard is N<total — stale tabs can't poison persisted state any more.
- Running count includes tasks whose output looks active even if persisted
status got stuck. Dir text on the running row, font matched to uptime.
Serve panel:
- Ctx text input always resets to model max on open (default 20000 when
metadata is missing).
- Max Seqs default 8 -> 4. KV Cache dtype select 32px tall.
- Lightning icon on Launch (same as Action toggle).
- Diagnosis card simplified (no fold/copy/dismiss), suggestion font
matches body; action buttons get icons on the left (Retry/Copy/Edit/
Install/Kill/Switch/etc.).
- Incomplete-download serve warning when model status is
downloading / stalled / has_incomplete.
- MTP "?" tooltip ("supported on a few model families … up to ~3× faster").
Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
"stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
Q4_K_M -> BF16 (it is the native base, not a quant).
Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
required; bare "Download complete" no longer marks the task done after
the first config file. Clear button now sends tmux kill-session too.
True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.
CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
The empty-state tip ("Add an AI endpoint from Settings...") shares a 60px
max-height ceiling with the one-line .welcome-sub / .welcome-version. On
narrow phones the welcome block shrink-wraps and the tip wraps to 4-5 lines
(~67px), so the shared ceiling clipped its last line ("...key into the
chat.") - the only setup hint a first-run user gets.
Give .welcome-tip its own taller max-height (120px), placed above the
@media (max-height: 650px) block so that rule's max-height:0 still collapses
the tip on short viewports. .welcome-sub / .welcome-version are untouched,
and desktop is unchanged (the tip is ~50px there, well under the ceiling).
Sidebar section titles were vertically clipped in Chromium/Edge (fine in
Firefox). Raise line-height 1 → 1.3, mirroring the existing .list-item fix.
The titles are flex-centred in a fixed-height (29px) header, so this adds
glyph headroom without any reflow.
iOS Safari auto-zooms when a focused input has font-size < 16px. Bump
text-entry controls to 16px under (hover: none) and (pointer: coarse) so
desktop sizing is untouched. Date/time inputs and selects are excluded —
they open native pickers and never zoom.
Doc-editor tiers keep their size hierarchy: Large lands at 18px (above the
16px threshold) instead of collapsing onto Medium, and the email rich-body
Large (17px) is left alone since it was already zoom-safe. All three editor
layers (textarea, highlight overlay, line numbers) move together so the
syntax overlay stays metrically aligned.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>