fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955)

* fix(search): add download budgets to web_fetch with truncation notice and hard ceiling

  MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content buffered and cached the entire response body first, so a large or hostile URL could pull arbitrarily many bytes into memory and the content cache.

  The fetch is now a capped streaming GET (SSRF redirect guard unchanged):

  - a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB),
  - a per-call override via full/max_bytes on the web_fetch tool, and
  - a hard ceiling (WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed.

  When Content-Length already declares a body over the ceiling, the fetch is refused before any body bytes are buffered. Truncated results carry truncated/fetched_bytes/total_bytes, the tool output leads with a partial-content notice telling the model how to re-fetch with full=true, and the tool schema documents the flag. A truncated PDF is reported as a budget error, since a cut PDF is unparseable. The effective cap is part of the content-cache key, so a truncated fetch is never served to a full-budget request.

  Existing tests that faked httpx.get or the old _get_public_url signature are adapted to the streaming interface; behavior pins are unchanged.

  Fixes #3812

* fix(search): close compressed-body cap bypass and protect the partial notice

  Addresses RaresKeY's review on #3955:

  - Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the wire bytes (and Content-Length) can be a fraction of the decoded body, so a tiny compressed response could pass the hard-cap preflight and then expand past the ceiling in a single decoded chunk before the streamed cap could slice it. Identity makes Content-Length the true body size and keeps each streamed chunk bounded by the network read, so the hard ceiling actually bounds memory.
  - Lead web_fetch output with the partial-content notice and cap the page title. The notice is the user-facing contract for partial fetches, but the title is untrusted, uncapped page content; placed ahead of the notice, a giant title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads and the title is capped as a second guard.

  Adds regressions: the fetch advertises identity encoding, and a truncated result with an oversized title still surfaces the partial notice.

* fix(search): reject compressed responses that ignore the identity request

  Requesting Accept-Encoding: identity is not enough on its own: a server can ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would decode that, so a tiny compressed body could balloon into one decoded chunk far past the hard cap before the streamed loop slices it (and Content-Length, the compressed wire length, makes the preflight and size metadata unreliable). Refuse a non-identity Content-Encoding before reading the body.

  Adds a regression where the server ignores the identity request and returns gzip; the fetch is refused before any body is decoded.
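
For reviewers, the budget logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `effective_cap`, `read_capped`, `cache_key`, `CappedBody`, and `FetchBudgetError` are hypothetical names, the constants only mirror the documented 2 MB / 20 MB values, and the assumption that `full=true` raises the budget to the hard ceiling is a guess. The reader takes response headers and an iterator of already-received chunks, so it stays independent of the HTTP client.

```python
from typing import Iterable, Mapping, NamedTuple, Optional

# Hypothetical constants mirroring WEB_FETCH_SOFT_MAX_BYTES / WEB_FETCH_HARD_MAX_BYTES.
SOFT_MAX_BYTES = 2 * 1024 * 1024
HARD_MAX_BYTES = 20 * 1024 * 1024


class FetchBudgetError(Exception):
    """Raised when a response is refused before any body bytes are read."""


class CappedBody(NamedTuple):
    data: bytes
    truncated: bool
    fetched_bytes: int
    total_bytes: Optional[int]  # taken from Content-Length when declared


def effective_cap(full: bool = False, max_bytes: Optional[int] = None) -> int:
    """Resolve the per-call budget; an override never exceeds the hard ceiling."""
    if max_bytes is not None:
        return max(1, min(max_bytes, HARD_MAX_BYTES))
    # Assumption: full=True means "use the hard ceiling".
    return HARD_MAX_BYTES if full else SOFT_MAX_BYTES


def read_capped(headers: Mapping[str, str], chunks: Iterable[bytes], cap: int) -> CappedBody:
    """Check headers first, then buffer at most `cap` bytes from `chunks`."""
    lowered = {k.lower(): v for k, v in headers.items()}

    # A server may ignore Accept-Encoding: identity; refuse before decoding anything.
    encoding = lowered.get("content-encoding", "identity").strip().lower()
    if encoding not in ("", "identity"):
        raise FetchBudgetError(f"unexpected Content-Encoding: {encoding}")

    # Preflight: refuse a declared body that is already over the hard ceiling.
    total: Optional[int] = None
    declared = lowered.get("content-length", "").strip()
    if declared.isdigit():
        total = int(declared)
        if total > HARD_MAX_BYTES:
            raise FetchBudgetError(f"declared body of {total} bytes exceeds hard ceiling")

    buf = bytearray()
    truncated = False
    for chunk in chunks:
        remaining = cap - len(buf)
        if len(chunk) > remaining:
            buf += chunk[:remaining]  # slice the chunk that crosses the budget
            truncated = True
            break
        buf += chunk
    return CappedBody(bytes(buf), truncated, len(buf), total)


def cache_key(url: str, cap: int) -> str:
    """Include the cap so a truncated fetch is never served to a larger budget."""
    return f"{url}|cap={cap}"
```

With an httpx streaming response, the `chunks` argument would typically come from `response.iter_bytes()` and `headers` from `response.headers`; keeping the cap in the cache key means a 2 MB fetch and a `full=true` fetch of the same URL occupy separate cache entries.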
refactor(search): import REQUEST_TIMEOUT from constants in providers.py (#4331)
2026-06-17 10:15:27 -04:00 · 2026-06-15 17:38:09 +00:00 · 2026-06-15 17:22:08 +00:00 · 2026-06-15 18:05:15 +01:00 · 2026-06-15 18:55:15 +02:00 · 2026-06-15 17:49:27 +01:00
104 changed files with 10201 additions and 2035 deletions
@@ -1,61 +0,0 @@
-# CodeQL code scanning
-#
-# Purpose: GitHub's own static analysis engine reads the application source
-# (Python backend + the JavaScript frontend) and looks for real
-# vulnerabilities -- SQL/command injection, path traversal, auth mistakes,
-# unsafe deserialization. Findings appear in the repo's Security tab. This is
-# the deepest check in the suite and the most valuable for a high-profile
-# target.
-#
-# It runs on every push to main and on a weekly schedule (to catch newly
-# disclosed query patterns against unchanged code). It deliberately does NOT
-# run on pull requests: most PRs here come from forks, whose read-only token
-# cannot publish results, which would produce confusing failures. To scan pull
-# requests too, a maintainer can instead enable CodeQL "default setup" in
-# Settings -> Security -> Code scanning (one toggle, no file needed) -- see
-# docs/security-ci.md.
-
-name: CodeQL
-
-on:
-  push:
-    branches: [main]
-  schedule:
-    # Weekly, Monday 06:00 UTC.
-    - cron: '0 6 * * 1'
-  workflow_dispatch:
-
-permissions: {}
-
-concurrency:
-  group: codeql-${{ github.workflow }}-${{ github.ref }}
-  cancel-in-progress: true
-
-jobs:
-  analyze:
-    name: Analyze (${{ matrix.language }})
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      security-events: write  # publish results to the Security tab
-    strategy:
-      fail-fast: false
-      matrix:
-        # Both are interpreted, so CodeQL needs no build step (build-mode none).
-        language: [python, javascript-typescript]
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10  # v6.0.3
-        with:
-          persist-credentials: false
-
-      - name: Initialize CodeQL
-        uses: github/codeql-action/init@8aad20d150bbac5944a9f9d289da16a4b0d87c1e  # v4.36.2
-        with:
-          languages: ${{ matrix.language }}
-          build-mode: none
-
-      - name: Perform CodeQL analysis
-        uses: github/codeql-action/analyze@8aad20d150bbac5944a9f9d289da16a4b0d87c1e  # v4.36.2
-        with:
-          category: "/language:${{ matrix.language }}"
@@ -1,476 +1,65 @@
-# Odysseus
+<p align="center">
+  <img src="docs/odysseus-wordmark.png" alt="Odysseus" width="280">
+</p>

-> **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
+<p align="center">
+  A self-hosted AI workspace for chat, agents, research, documents, email, notes, calendar, and local model workflows.
+</p>

-```
-───────────────────────────────────────────────
- ⊹ ࣪ ˖ ૮( ˶ᵔ ᵕ ᵔ˶ )っ  Odysseus vers. 1.0
-───────────────────────────────────────────────
-```
+<p align="center">
+  <a href="#quick-start">Quick Start</a> ·
+  <a href="docs/setup.md">Setup Guide</a> ·
+  <a href="CONTRIBUTING.md">Contributing</a> ·
+  <a href="ROADMAP.md">Roadmap</a>
+</p>

-![Odysseus](docs/odysseus.jpg)
+<p align="center">
+  <a href="https://repology.org/project/odysseus-ai/versions"><img src="https://repology.org/badge/vertical-allrepos/odysseus-ai.svg" alt="Packaging status"></a>
+</p>

-A self-hosted AI workspace -- meant to be the self-hosted version of the UI experience you get from ChatGPT and Claude. But with more jank and fun. Running on your own hardware, with your own data -- local-first, privacy-first, and no trojan.
+<p align="center">
+  <img src="docs/odysseus.jpg" alt="Odysseus interface">
+</p>

-[![Packaging status](https://repology.org/badge/vertical-allrepos/odysseus-ai.svg)](https://repology.org/project/odysseus-ai/versions)
-
-## Features
-  - **Chat** -- chat with any local model or API; adding them is super simple.<br>　<sub>vLLM · llama.cpp · Ollama · OpenRouter · OpenAI · GitHub Copilot</sub>
-  - **Agent** -- hand it tools and let it run the whole task itself.<br>　<sub>built on [opencode](https://github.com/anomalyco/opencode) · MCP · web · files · shell · skills · memory</sub>
-  - **Cookbook** -- Scans your hardware, recommends models, click to download and serve.. easy!<br>　<sub>built on [llmfit](https://github.com/AlexsJones/llmfit) · VRAM-aware · GGUF / FP8 / AWQ · fit scoring · vLLM / llama.cpp serving</sub>
-  - **Deep Research** -- multi-step runs that gather, read, and synthesize sources into a nice visual report.<br>　<sub>adapted from [Tongyi DeepResearch](https://github.com/Alibaba-NLP/DeepResearch)</sub>
-  - **Compare** -- a fun tool to compare models side by side. Test completely blind, no bias!<br>　<sub>multi-model · blind test · synthesis</sub>
-  - **Documents** -- YOU write the text, AI is there to assist, not the opposite.<br>　<sub>multi-tab editor · markdown · HTML · CSV · syntax highlighting · AI edits · suggestions</sub>
-  - **Memory / Skills** -- Persistent memory and skills, your agent evolves over time as it better understands you and your tasks!<br>　<sub>ChromaDB · fastembed (ONNX) · vector + keyword retrieval · import/export</sub>
-  - **Email** -- IMAP/SMTP inbox with AI triage built in: urgency reminders, auto-tag, auto-summary, auto-reply drafts, auto-spam.<br>　<sub>IMAP · SMTP · per-account routing · CalDAV-aware</sub>
-  - **Notes & Tasks** -- Quick notes with reminders, a todo list, and scheduled tasks the agent can act on.<br>　<sub>note pings · checklist · cron-style tasks · ntfy / browser / email channels</sub>
-  - **Calendar** -- Local-first calendar with CalDAV sync to Radicale / Nextcloud / Apple / Fastmail.<br>　<sub>CalDAV pull · .ics import/export · per-calendar colors · agent-aware</sub>
-  - **Works on mobile** -- looks and runs great on your phone, not just desktop.<br>　<sub>responsive · installable (PWA) · touch gestures</sub>
-  - **Extras** -- more to explore, happy if you give it a go!<br>　<sub>image editor · theme editor · file uploads (vision + PDF) · web search · presets · sessions · 2FA</sub>
-
-## Demo
-A full, hover-to-play tour lives on the landing page (`docs/index.html`).
-
-<details>
-<summary>Screenshots / clips</summary>
-
-### Chat & Agents
-![Chat & Agents](docs/chat.gif)
-### Deep Research
-![Deep Research](docs/research.gif)
-### Compare
-![Compare](docs/compare.gif)
-### Documents
-![Documents](docs/document.gif)
-### Notes & Tasks
-![Notes & Tasks](docs/notes.gif)
-
-</details>
+---

 ## Quick Start

-Defaults work out of the box: clone, run, then configure models/search/email
-inside **Settings**. Only edit `.env` for deployment-level overrides like
-`APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
+> `dev` is the default branch and gets the newest changes first. Use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main) if you want the more curated branch.

-On first setup, Odysseus creates an admin account (`admin` unless
-`ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
-For Docker installs, the same line is in `docker compose logs odysseus`.
-Use that for the first login, then change it in **Settings**.
-
-Contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, testing, and
-pull request guidelines.
-
-### Docker (recommended)
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
-cp .env.example .env       # optional, but recommended for explicit defaults
+cp .env.example .env
 docker compose up -d --build
 ```
-To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.

-Open `http://localhost:7000` when the containers are healthy. Docker Compose
-binds the web UI to `127.0.0.1` by default. If the port is taken, set
-`APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
-only when you intentionally want LAN/reverse-proxy access.
+Open `http://localhost:7000` when the containers are healthy. The first admin password is printed in `docker compose logs odysseus`.

-> **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
-> Cookbook serves local models on CPU only. For GPU-accelerated model serving,
-> run natively instead — see [Apple Silicon](#apple-silicon) below.
+Native installs, GPU notes, Windows/macOS instructions, HTTPS, and configuration live in the [setup guide](docs/setup.md).

-### Native Linux / macOS
-```bash
-git clone https://github.com/pewdiepie-archdaemon/odysseus.git
-cd odysseus
-python3 -m venv venv
-source venv/bin/activate
-pip install -r requirements.txt
-python setup.py
-python -m uvicorn app:app --host 127.0.0.1 --port 7000
-```
-Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
-downloads and serves. The app itself is lightweight; local model serving is the
-heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
-connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
+## Features

-### Apple Silicon
-Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
-M-series Mac, run Odysseus natively:
+- **Chat + Agents** — local/API models, tools, MCP, files, shell, skills, and memory.
+- **Cookbook** — hardware-aware model recommendations, downloads, and serving.
+- **Deep Research** — multi-step web research with source reading and report generation.
+- **Compare** — blind side-by-side model testing and synthesis.
+- **Documents** — writing-first editor with AI edits, suggestions, Markdown, HTML, CSV, and syntax highlighting.
+- **Email** — IMAP/SMTP inbox with triage, tags, summaries, reminders, and reply drafts.
+- **Notes, Tasks + Calendar** — reminders, todos, scheduled agent tasks, and CalDAV sync.
+- **Extras** — gallery/image editor, themes, uploads, web search, presets, sessions, and 2FA.

-```bash
-git clone https://github.com/pewdiepie-archdaemon/odysseus.git
-cd odysseus
-./start-macos.sh
-```
+## Demo

-It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
-
-```bash
-ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
-# then open http://<tailscale-ip>:7860
-```
-
-The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
-set there are picked up automatically without a command-line override each run.
-
-Keep `AUTH_ENABLED=true` (the default) before binding outside loopback. Do not
-expose this port directly to the public internet. To build a clickable app wrapper:
-
-```bash
-./build-macos-app.sh
-```
-
-<details>
-<summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
-
-**Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
-ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
-they are reachable from the host but not exposed to your LAN/public internet
-unless you opt in.
-
-**Cookbook storage in Docker.** Downloads live in `./data/huggingface`
-(`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
-serve engines live in `./data/local` (`~/.local` in the container), so they
-survive container recreation.
-
-**Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
-Odysseus SSH key and add the public key to the remote server's
-`~/.ssh/authorized_keys`. From the host you can also run:
-
-```bash
-ssh-copy-id -i data/ssh/id_ed25519.pub user@server
-```
-
-**Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
-only detect GPUs that Docker exposes to the container — if the host runtime or
-device passthrough is not configured, Cookbook sees the iGPU, another card, or
-CPU instead of your intended GPU.
-
-For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
-optionally install the host runtime or update `.env`.
-
-```bash
-# Read-only diagnostic (default — installs nothing, never edits .env):
-scripts/check-docker-gpu.sh
-
-# Print OS-specific install commands without running them:
-scripts/check-docker-gpu.sh --print-install-commands
-
-# Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
-scripts/check-docker-gpu.sh --install-nvidia-toolkit
-
-# Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
-scripts/check-docker-gpu.sh --enable-nvidia-overlay
-
-# Full assisted setup — install toolkit, then enable overlay if passthrough works:
-scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
-```
-
-Safety notes:
- The app never installs host GPU runtime automatically.
- The app never edits `.env` automatically.
- `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
-  and only after GPU passthrough succeeds. `--yes` skips prompts but does not
-  bypass the passthrough gate.
- `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
-  Git and the Docker build context.
-
-To enable manually without the script, add this to `.env`:
-
-```bash
-COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
-```
-
-**AMD / ROCm.** AMD setup is read-only diagnostic plus manual `.env` edit. Run:
-
-```bash
-scripts/check-docker-amd-gpu.sh
-```
-
-Then add the reported values to `.env`, replacing `RENDER_GID` with your host's
-numeric render group id:
-
-```bash
-COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
-RENDER_GID=989
-```
-
-For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: docker/gpu.nvidia.yml or docker/gpu.amd.yml.
-
-**Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
-often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
-or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
-overlay workflow above. For stack UIs, point the stack at one of the standalone
-files instead, which bundle the base stack plus the GPU settings:
-
- `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
-  on the host.
- `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
-  `video`/`render` group membership, and `RENDER_GID` when needed.
-
-The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
-source of truth; the standalone files mirror them for single-file deployments.
-
-Verify after enabling either overlay:
-
-```bash
-docker compose exec odysseus nvidia-smi -L   # NVIDIA
-docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*'  # AMD
-```
-
-> **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
-> container confirms Docker GPU access, but llama.cpp also needs `cudart` and
-> the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
-> library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
-> tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
-> not a Docker passthrough failure. Reinstall the serve engine via
-> **Cookbook → Dependencies** to get a CUDA-enabled build.
->
-> The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
-> the container confirms device passthrough, not ROCm userspace or a
-> ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
-> inside the slim Odysseus image.
-
-**Ollama with Docker.** If Ollama runs on the host, add this endpoint in
-Settings:
-
-```text
-http://host.docker.internal:11434/v1
-```
-
-Ollama must listen outside its own loopback interface:
-
-```bash
-OLLAMA_HOST=0.0.0.0:11434 ollama serve
-```
-
-This connects Odysseus in Docker to an Ollama server that is already running on
-your host machine; it does not start Ollama inside the container.
-`host.docker.internal` is Docker's hostname for the host machine from inside the
-container. Cookbook **Serve** is a separate workflow for serving downloaded
-models through Odysseus/llama.cpp, so Windows users with an existing Ollama
-install usually only need to add the endpoint in Settings.
-
-**Useful checks.**
-
-```bash
-docker compose ps
-docker compose logs --tail=120 odysseus
-docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
-```
-
-**macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
-runs setup, and starts uvicorn on port `7860` because AirPlay often holds
-`7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
-do not run on macOS. MLX-only models are not served by Odysseus.
-
-</details>
-
-### Native Windows
-
-**One-command launcher** (creates the venv, installs deps, runs setup, starts the
-server; safe to re-run):
-
-```powershell
-git clone https://github.com/pewdiepie-archdaemon/odysseus.git
-cd odysseus
-powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
-```
-
-Or do it by hand:
-
-```powershell
-git clone https://github.com/pewdiepie-archdaemon/odysseus.git
-cd odysseus
-py -3.11 -m venv venv
-venv\Scripts\Activate.ps1
-pip install -r requirements.txt
-python setup.py
-python -m uvicorn app:app --host 127.0.0.1 --port 7000
-```
-
-If `python` points at an older interpreter, use `py -3.12` (or another installed
-3.11+ version) for the venv step.
-
-**Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
-email, calendar, deep research) runs fully native. For full **Cookbook** background
-model downloads and the agent shell tool, also install
-[Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
-Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
-[Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
-`http://localhost:11434/v1` in Settings.
-
-Open `http://localhost:7000`, log in with the generated admin password,
-and configure everything else inside **Settings**.
-
-## Troubleshooting & Advanced Setup
-
-### `chromadb-client` conflicts with embedded ChromaDB
-If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
-
-**Fix:** uninstall `chromadb-client` and force-reinstall the full package:
-```bash
-./venv/bin/pip uninstall chromadb-client -y
-./venv/bin/pip install --force-reinstall chromadb
-```
-
-### HTTPS + LAN/Tailscale exposure
-To expose Odysseus on a local network or Tailscale with HTTPS:
-1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
-2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
-   ```bash
-   mkcert -install
-   mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
-   ```
-3. Run `uvicorn` with the generated certs:
-   ```bash
-   python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
-   ```
-4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
-
-### Optional Dependencies
-`requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
-
-| Package | Feature unlocked |
-|---------|-----------------|
-| `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
-| `ddgs` | DuckDuckGo as a search provider option. |
-| `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
-| `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
-
-### Faster, reproducible installs with uv (optional)
-[uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the
-venv + pip steps in the native install guides, no project changes are needed but this change results in faster installs along with a lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
-
-```bash
-uv venv venv --python 3.13
-uv pip install -r requirements.txt
-# then continue as usual: python setup.py, uvicorn, ...
-```
-
-`requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
-
-```bash
-uv pip compile requirements.txt -o requirements.lock   # snapshot current resolution
-uv pip sync requirements.lock                          # reproduce it exactly later
-```
-
-`requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
-
-### Outlook / Office 365 email
-Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
-and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
-passwords will fail. See [docs/email-outlook.md](docs/email-outlook.md) for the
-current limitation and the planned integration direction.
-
-## Security Notes
-Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
-
- Keep `AUTH_ENABLED=true` for any network-accessible deployment.
- Keep `LOCALHOST_BYPASS=false` outside local development.
- Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
- Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
- Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
- Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
- Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
- Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
- If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
- Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
- Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
- Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
-
-### Private or proxied deployments
-Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
-
-1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
-2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
-3. Put the authenticated Odysseus web/API entrypoint behind that layer.
-4. Keep raw service and model ports internal-only.
-
-Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
-`ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
-
-Common internal-only ports from the default docs/compose setup:
-
-| Port | Service |
-|---|---|
-| `7000` | Odysseus raw app port |
-| `8080` | SearXNG |
-| `8091` | ntfy |
-| `8100` | ChromaDB host port for manual/compose access |
-| `11434` | Ollama |
-| `8000-8020` | Common local model/provider APIs |
+A full hover-to-play tour lives on the landing page: [`docs/index.html`](docs/index.html).

 ## Contributing
-Help is welcome. The best entry points are fresh-install testing, provider setup
-bugs, mobile/editor polish, docs, and small focused refactors. See
-[ROADMAP.md](ROADMAP.md) for the current help-wanted list.

-## Configuration
-Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
-for deployment-level defaults and secrets you want present before first boot.
-Key settings:
+Help is welcome. The best entry points are fresh-install testing, provider setup bugs, mobile/editor polish, docs, and small focused refactors. See [CONTRIBUTING.md](CONTRIBUTING.md) and [ROADMAP.md](ROADMAP.md).

-| Variable | Default | Description |
-|---|---|---|
-| `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
-| `LLM_HOSTS` | -- | Comma-separated list for model discovery |
-| `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
-| `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
-| `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
-| `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
-| `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
-| `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
-| `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
-| `AUTH_ENABLED` | `true` | Enable/disable login |
-| `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
-| `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
-| `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
-| `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
-| `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
-| `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
-| `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
-| `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes. Raise for larger local PDFs or text documents. |
-| `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
-| `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
-| `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
-| `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
-| `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
-| `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
-| `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
+## Security

-All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
-
-### Built-in MCP servers (optional setup)
-
-Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
-
-To enable the browser MCP (page navigation, screenshots, vision), run once:
-
-```bash
-npx -y @playwright/mcp@latest --version
-```
-
-That installs `@playwright/mcp` plus Playwright (~300MB total). Restart Odysseus and the server will register at startup.
-
-## Architecture
-```
-app.py                   # FastAPI entry point
-core/      auth, database, middleware, constants
-src/       llm_core, agent_loop, agent_tools, chat_processor, search/
-routes/    chat, session, document, memory, model … endpoints
-services/  docs, memory, search, hwfit (Cookbook) …
-static/    index.html + app.js + style.css + js/ (modular front-end)
-docs/      landing page (index.html) + preview clips
-```
-
-## Data
-All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
-`memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
-
-To back up or restore everything in `data/`, see the
-[Backup & Restore guide](docs/backup-restore.md).
+Odysseus is a self-hosted workspace with powerful local tools. Keep auth enabled, keep private data out of Git, and do not expose raw model/service ports publicly. Deployment details are in the [setup guide](docs/setup.md#security-notes).

 ## Star History

@@ -483,19 +72,5 @@ To back up or restore everything in `data/`, see the
 </a>

 ## License
-AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).

-```
-                                  |
-                                 |||
-                                |||||
-                  |    |    |   |||||||
-                 )_)  )_)  )_)   ~|~
-                )___))___))___)\  |
-               )____)____)_____)\\|
-             _____|____|____|_____\\\__
-             \                       /
-       ~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
-               ~^~  all aboard!  ~^~
-       ~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
-```
+AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).
@@ -331,8 +331,8 @@ if AUTH_ENABLED:
                        request.state.current_user = "internal-tool"
                    request.state.api_token = False
                    return await call_next(request)
-            except Exception:
-                pass
+            except Exception as _e:
+                logger.warning("Internal tool auth header check failed", exc_info=_e)
            # Allow DIRECT localhost requests (internal service calls from
            # heartbeats etc.). Tunnel/proxy-forwarded requests are excluded by
            # _is_trusted_loopback so LOCALHOST_BYPASS can't be abused over a
@@ -385,11 +385,10 @@ if AUTH_ENABLED:
                                    _db.close()
                            try:
                                await _asyncio.to_thread(_do)
-                            except Exception:
-                                pass
+                            except Exception as _e:
+                                logger.debug("Failed to update token last_used_at", exc_info=_e)
                        _asyncio.create_task(_touch_last_used(matched_id))
                        # Keep bearer-token callers out of normal cookie/user
-                        # routes. API-aware routes can read api_token_owner.
                        request.state.current_user = "api"
                        request.state.api_token = True
                        request.state.api_token_id = matched_id
@@ -464,8 +463,8 @@ async def serve_generated_image(filename: str, request: Request):
                _db.close()
    except HTTPException:
        raise
-    except Exception:
-        pass
+    except Exception as _e:
+        logger.warning("Image ownership verification failed for %r", filename, exc_info=_e)
    ext = filename.rsplit('.', 1)[-1].lower()
    mime = {
        "png": "image/png", "jpg": "image/jpeg", "jpeg": "image/jpeg",
@@ -2,12 +2,15 @@ import os
 import logging
 import sqlite3
 from datetime import datetime, timezone
+from pathlib import Path
 from sqlalchemy import event, create_engine, Column, String, Text, Boolean, DateTime, Integer, ForeignKey, JSON, Index, func, text
 from sqlalchemy.engine import Engine
 from sqlalchemy.types import TypeDecorator
 from sqlalchemy.ext.declarative import declarative_base, declared_attr
 from sqlalchemy.orm import relationship, sessionmaker, backref

+from src.runtime_paths import get_app_root
+
 logger = logging.getLogger(__name__)

 # Create base class for declarative models
@@ -29,9 +32,26 @@ class TimestampMixin:
    def updated_at(cls):
        return Column(DateTime, default=utcnow_naive, onupdate=utcnow_naive, nullable=False)

-# Get database URL from environment, default to SQLite in DATA_DIR
+# Ensure the writable data directory exists before SQLite connects.
 from src.constants import DATA_DIR, AUTH_FILE, MEMORY_FILE, USER_PREFS_FILE, SETTINGS_FILE
-DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite:///{DATA_DIR}/app.db")
+Path(DATA_DIR).mkdir(parents=True, exist_ok=True)
+
+
+def _default_database_url() -> str:
+    return f"sqlite:///{Path(DATA_DIR) / 'app.db'}"
+
+
+def _normalize_sqlite_url(url: str) -> str:
+    if not url.startswith("sqlite:///"):
+        return url
+    db_path = url.replace("sqlite:///", "", 1)
+    if db_path == ":memory:" or os.path.isabs(db_path):
+        return url
+    return f"sqlite:///{(Path(get_app_root()) / db_path).resolve().as_posix()}"
+
+
+# Get database URL from environment, default to SQLite in DATA_DIR
+DATABASE_URL = _normalize_sqlite_url(os.getenv("DATABASE_URL", _default_database_url()))

 # Create engine
 engine = create_engine(
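
Note on the hunk above: the new `_normalize_sqlite_url` helper anchors relative SQLite paths to the application root instead of the process working directory. The sketch below reproduces that logic in isolation so the three pass-through cases (non-SQLite URL, `:memory:`, absolute path) are easy to check. It is a minimal sketch, not the project's code: the `app_root` parameter is a hypothetical stand-in for `get_app_root()`, whose implementation is not shown in this diff.

```python
import os
import tempfile
from pathlib import Path


def normalize_sqlite_url(url: str, app_root: Path) -> str:
    """Anchor a relative sqlite:/// path to app_root; pass everything else through.

    app_root is an assumption standing in for src.runtime_paths.get_app_root().
    """
    prefix = "sqlite:///"
    if not url.startswith(prefix):
        return url  # e.g. a PostgreSQL URL is left untouched
    db_path = url.replace(prefix, "", 1)
    if db_path == ":memory:" or os.path.isabs(db_path):
        return url  # in-memory and absolute paths need no anchoring
    return f"{prefix}{(app_root / db_path).resolve().as_posix()}"


if __name__ == "__main__":
    # Use a resolved temporary directory as an illustrative application root.
    root = Path(tempfile.mkdtemp()).resolve()
    for candidate in (
        "sqlite:///./data/app.db",
        "sqlite:///:memory:",
        "postgresql://user@db/odysseus",
    ):
        print(candidate, "->", normalize_sqlite_url(candidate, root))
```

The practical effect is that a relative `DATABASE_URL` points at the same file regardless of the directory the server is started from.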
@@ -324,6 +344,13 @@ class EmailAccount(TimestampMixin, Base):
    smtp_password  = Column(String, default="")

    from_address   = Column(String, default="")
+    display_name   = Column(String, nullable=True)   # "Hriday Ranka" — used in From: header
+
+    # OAuth2 (Google / Google Workspace). Tokens stored encrypted via secret_storage.
+    oauth_provider      = Column(String, nullable=True)   # "google" or None
+    oauth_access_token  = Column(String, nullable=True)   # encrypted
+    oauth_refresh_token = Column(String, nullable=True)   # encrypted
+    oauth_token_expiry  = Column(String, nullable=True)   # unix timestamp string

    __table_args__ = (
        Index('ix_email_accounts_owner_default', 'owner', 'is_default'),
@@ -1427,6 +1454,25 @@ def _migrate_add_task_automation_columns():
    except Exception as e:
        logging.getLogger(__name__).warning(f"task automation migration: {e}")

+def _migrate_add_email_oauth_columns():
+    """Add Google OAuth and display_name columns to email_accounts if missing."""
+    try:
+        with engine.connect() as conn:
+            cols = [r[1] for r in conn.execute(text("PRAGMA table_info(email_accounts)"))]
+            for col, typedef in [
+                ("oauth_provider",      "TEXT"),
+                ("oauth_access_token",  "TEXT"),
+                ("oauth_refresh_token", "TEXT"),
+                ("oauth_token_expiry",  "TEXT"),
+                ("display_name",        "TEXT"),
+            ]:
+                if col not in cols:
+                    conn.execute(text(f"ALTER TABLE email_accounts ADD COLUMN {col} {typedef}"))
+            conn.commit()
+    except Exception as e:
+        logging.getLogger(__name__).warning(f"email oauth columns migration: {e}")
+
+
 def _migrate_add_oauth_config():
    """Add oauth_config column to mcp_servers table if missing."""
    try:
@@ -1771,6 +1817,7 @@ def init_db():
    _migrate_add_tidy_verdict()
    _migrate_add_doc_source_email_cols()
    _migrate_add_oauth_config()
+    _migrate_add_email_oauth_columns()
    _migrate_add_task_automation_columns()
    _migrate_add_disabled_tools()
    _migrate_add_mcp_oauth_tokens_column()
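
Note on `_migrate_add_email_oauth_columns`: the migration is meant to be safe to run on every startup, because it reads the current columns with `PRAGMA table_info` and only issues `ALTER TABLE ... ADD COLUMN` for the ones that are missing. The sketch below shows the same check-then-add pattern against an in-memory database. It is an illustration under assumptions: it uses the standard-library `sqlite3` module rather than the SQLAlchemy engine in the diff, and `add_missing_columns` is a hypothetical helper name.

```python
import sqlite3


def add_missing_columns(conn, table, columns):
    """Add each (name, typedef) column that the table does not already have.

    Returns the names that were actually added, so a second run returns [].
    Table and column names are trusted constants here, as in the migration.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, typedef in columns:
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {typedef}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_accounts (id INTEGER PRIMARY KEY, from_address TEXT)")
wanted = [("oauth_provider", "TEXT"), ("display_name", "TEXT")]

first_run = add_missing_columns(conn, "email_accounts", wanted)
second_run = add_missing_columns(conn, "email_accounts", wanted)
```

Running the helper twice is the point of the example: the second call finds both columns already present and changes nothing.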
@@ -1,14 +1,16 @@
 # Security CI guide

-This project runs a set of automated security checks on every pull request and
-on every push to `main`. This page explains what each one does, whether it can
+This project runs a set of automated security checks on pull requests and
+selected branch pushes. This page explains what each one does, whether it can
 block a merge, and the few one-time settings you should turn on to get the full
 benefit.

 ## What runs, and why

-Each check lives in its own file under `.github/workflows/`. They run
-automatically; you do not start them.
+Most checks live in files under `.github/workflows/`. CodeQL is configured
+through GitHub's code scanning default setup, so it appears as a dynamic GitHub
+workflow instead of a checked-in workflow file. They run automatically; you do
+not start them.

 | Check | What it protects against | Blocks a merge? |
 |---|---|---|
@@ -88,11 +90,14 @@ let the workflows run on one pull request first, then add them here.
 2. Turn on **Dependency graph** (usually on by default for public repos) -- this
   powers Dependency review and Dependabot.
 3. Turn on **Dependabot alerts** and **Dependabot security updates**.
-4. Under **Code scanning**, you have two ways to scan the app code with CodeQL:
-   - The included `codeql.yml` workflow already scans `main` and runs weekly.
-   - To also scan **pull requests** (recommended, since most contributions come
-     from forks), click **Set up -> Default** under Code scanning. GitHub then
-     runs CodeQL on pull requests for you, with no token limitations.
+4. Under **Code scanning**, use **Set up -> Default** for CodeQL. GitHub then
+   runs CodeQL as a dynamic workflow without the fork-token limitations that
+   affect checked-in advanced workflows.
+
+   Do not also add a checked-in CodeQL workflow while default setup is enabled:
+   GitHub rejects advanced CodeQL uploads when default setup is active. If the
+   project later needs an advanced CodeQL workflow, disable default setup first
+   and keep only one CodeQL publishing path active.

 ## Keeping it current

@@ -0,0 +1,425 @@
+# Odysseus Setup Guide
+
+This page keeps the detailed install, deployment, troubleshooting, and configuration notes out of the front README.
+
+## Quick Start
+
+> **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
+
+Defaults work out of the box: clone, run, then configure models/search/email
+inside **Settings**. Only edit `.env` for deployment-level overrides like
+`APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
+
+On first setup, Odysseus creates an admin account (`admin` unless
+`ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
+For Docker installs, the same line is in `docker compose logs odysseus`.
+Use that for the first login, then change it in **Settings**.
+
+Contributing? See [CONTRIBUTING.md](../CONTRIBUTING.md) for setup, testing, and
+pull request guidelines.
+
+### Docker (recommended)
+```bash
+git clone https://github.com/pewdiepie-archdaemon/odysseus.git
+cd odysseus
+cp .env.example .env       # optional, but recommended for explicit defaults
+docker compose up -d --build
+```
+To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.
+
+Open `http://localhost:7000` when the containers are healthy. Docker Compose
+binds the web UI to `127.0.0.1` by default. If the port is taken, set
+`APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
+only when you intentionally want LAN/reverse-proxy access.
+
+> **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
+> Cookbook serves local models on CPU only. For GPU-accelerated model serving,
+> run natively instead — see [Apple Silicon](#apple-silicon) below.
+
+### Native Linux / macOS
+```bash
+git clone https://github.com/pewdiepie-archdaemon/odysseus.git
+cd odysseus
+python3 -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
+python setup.py
+python -m uvicorn app:app --host 127.0.0.1 --port 7000
+```
+Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
+downloads and serves. The app itself is lightweight; local model serving is the
+heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
+connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
+
+### Apple Silicon
+Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
+M-series Mac, run Odysseus natively:
+
+```bash
+git clone https://github.com/pewdiepie-archdaemon/odysseus.git
+cd odysseus
+./start-macos.sh
+```
+
+It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
+
+```bash
+ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
+# then open http://<tailscale-ip>:7860
+```
+
+The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
+set there are picked up automatically without a command-line override each run.
+
+Confirm that `AUTH_ENABLED=true` (the default) is set before binding outside loopback. Do not
+expose this port directly to the public internet. To build a clickable app wrapper:
+
+```bash
+./build-macos-app.sh
+```
+
+<details>
+<summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
+
+**Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
+ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
+they are reachable from the host but not exposed to your LAN/public internet
+unless you opt in.
+
+**Cookbook storage in Docker.** Downloads live in `./data/huggingface`
+(`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
+serve engines live in `./data/local` (`~/.local` in the container), so they
+survive container recreation.
+
+**Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
+Odysseus SSH key and add the public key to the remote server's
+`~/.ssh/authorized_keys`. From the host you can also run:
+
+```bash
+ssh-copy-id -i data/ssh/id_ed25519.pub user@server
+```
+
+**Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
+only detect GPUs that Docker exposes to the container — if the host runtime or
+device passthrough is not configured, Cookbook sees the iGPU, another card, or
+CPU instead of your intended GPU.
+
+For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
+optionally install the host runtime or update `.env`.
+
+```bash
+# Read-only diagnostic (default — installs nothing, never edits .env):
+scripts/check-docker-gpu.sh
+
+# Print OS-specific install commands without running them:
+scripts/check-docker-gpu.sh --print-install-commands
+
+# Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
+scripts/check-docker-gpu.sh --install-nvidia-toolkit
+
+# Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
+scripts/check-docker-gpu.sh --enable-nvidia-overlay
+
+# Full assisted setup — install toolkit, then enable overlay if passthrough works:
+scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
+```
+
+Safety notes:
+- The app never installs host GPU runtime automatically.
+- The app never edits `.env` automatically.
+- `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
+  and only after GPU passthrough succeeds. `--yes` skips prompts but does not
+  bypass the passthrough gate.
+- `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
+  Git and the Docker build context.
+
+To enable manually without the script, add this to `.env`:
+
+```bash
+COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
+```
+
+**AMD / ROCm.** AMD setup is a read-only diagnostic plus a manual `.env` edit. Run:
+
+```bash
+scripts/check-docker-amd-gpu.sh
+```
+
+Then add the reported values to `.env`, replacing the example `RENDER_GID` value with
+your host's numeric render group ID:
+
+```bash
+COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
+RENDER_GID=989
+```
+
+For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: `docker/gpu.nvidia.yml` or `docker/gpu.amd.yml`.
+
+**Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
+often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
+or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
+overlay workflow above. For stack UIs, point the stack at one of the standalone
+files instead, which bundle the base stack plus the GPU settings:
+
+- `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
+  on the host.
+- `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
+  `video`/`render` group membership, and `RENDER_GID` when needed.
+
+The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
+source of truth; the standalone files mirror them for single-file deployments.
+
+Verify after enabling either overlay:
+
+```bash
+docker compose exec odysseus nvidia-smi -L   # NVIDIA
+docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*'  # AMD
+```
+
+> **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
+> container confirms Docker GPU access, but llama.cpp also needs `cudart` and
+> the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
+> library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
+> tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
+> not a Docker passthrough failure. Reinstall the serve engine via
+> **Cookbook → Dependencies** to get a CUDA-enabled build.
+>
+> The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
+> the container confirms device passthrough, not ROCm userspace or a
+> ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
+> inside the slim Odysseus image.
+
+**Ollama with Docker.** If Ollama runs on the host, add this endpoint in
+Settings:
+
+```text
+http://host.docker.internal:11434/v1
+```
+
+Ollama must listen outside its own loopback interface:
+
+```bash
+OLLAMA_HOST=0.0.0.0:11434 ollama serve
+```
+
+This connects Odysseus in Docker to an Ollama server that is already running on
+your host machine; it does not start Ollama inside the container.
+`host.docker.internal` is Docker's hostname for the host machine from inside the
+container. Cookbook **Serve** is a separate workflow for serving downloaded
+models through Odysseus/llama.cpp, so Windows users with an existing Ollama
+install usually only need to add the endpoint in Settings.
+
+**Useful checks.**
+
+```bash
+docker compose ps
+docker compose logs --tail=120 odysseus
+docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
+```
+
+**macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
+runs setup, and starts uvicorn on port `7860` because AirPlay often holds
+`7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
+do not run on macOS. MLX-only models are not served by Odysseus.
+
+</details>
+
+### Native Windows
+
+**One-command launcher** (creates the venv, installs deps, runs setup, starts the
+server; safe to re-run):
+
+```powershell
+git clone https://github.com/pewdiepie-archdaemon/odysseus.git
+cd odysseus
+powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
+```
+
+Or do it by hand:
+
+```powershell
+git clone https://github.com/pewdiepie-archdaemon/odysseus.git
+cd odysseus
+py -3.11 -m venv venv
+venv\Scripts\Activate.ps1
+pip install -r requirements.txt
+python setup.py
+python -m uvicorn app:app --host 127.0.0.1 --port 7000
+```
+
+If Python 3.11 is not installed, use `py -3.12` (or another installed
+3.11+ version) for the venv step.
+
+**Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
+email, calendar, deep research) runs fully native. For full **Cookbook** background
+model downloads and the agent shell tool, also install
+[Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
+Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
+[Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
+`http://localhost:11434/v1` in Settings.
+
+Open `http://localhost:7000`, log in with the generated admin password,
+and configure everything else inside **Settings**.
+
+## Troubleshooting & Advanced Setup
+
+### `chromadb-client` conflicts with embedded ChromaDB
+If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
+
+**Fix:** uninstall `chromadb-client` and force-reinstall the full package:
+```bash
+./venv/bin/pip uninstall chromadb-client -y
+./venv/bin/pip install --force-reinstall chromadb
+```
+
+### HTTPS + LAN/Tailscale exposure
+To expose Odysseus on a local network or Tailscale with HTTPS:
+1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
+2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
+   ```bash
+   mkcert -install
+   mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
+   ```
+3. Run `uvicorn` with the generated certs:
+   ```bash
+   python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
+   ```
+4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
+
+### Optional Dependencies
+`requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
+
+| Package | Feature unlocked |
+|---------|-----------------|
+| `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
+| `ddgs` | DuckDuckGo as a search provider option. |
+| `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
+| `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
+
+### Faster, reproducible installs with uv (optional)
+[uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the
+venv + pip steps in the native install guides. No project changes are needed: installs are faster, and you can optionally generate a lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
+
+```bash
+uv venv venv --python 3.13
+uv pip install -r requirements.txt
+# then continue as usual: python setup.py, uvicorn, ...
+```
+
+`requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
+
+```bash
+uv pip compile requirements.txt -o requirements.lock   # snapshot current resolution
+uv pip sync requirements.lock                          # reproduce it exactly later
+```
+
+`requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
+
+### Outlook / Office 365 email
+Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
+and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
+passwords will fail. See [docs/email-outlook.md](email-outlook.md) for the
+current limitation and the planned integration direction.
+
+## Security Notes
+Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
+
+- Keep `AUTH_ENABLED=true` for any network-accessible deployment.
+- Keep `LOCALHOST_BYPASS=false` outside local development.
+- Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
+- Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
+- Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
+- Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
+- Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
+- Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
+- If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
+- Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
+- Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
+- Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
+
+### Private or proxied deployments
+Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
+
+1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
+2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
+3. Put the authenticated Odysseus web/API entrypoint behind that layer.
+4. Keep raw service and model ports internal-only.
+
+Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
+`ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
+
+Common internal-only ports from the default docs/compose setup:
+
+| Port | Service |
+|---|---|
+| `7000` | Odysseus raw app port |
+| `8080` | SearXNG |
+| `8091` | ntfy |
+| `8100` | ChromaDB host port for manual/compose access |
+| `11434` | Ollama |
+| `8000-8020` | Common local model/provider APIs |
+
+## Configuration
+Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
+for deployment-level defaults and secrets you want present before first boot.
+Key settings:
+
+| Variable | Default | Description |
+|---|---|---|
+| `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
+| `LLM_HOSTS` | -- | Comma-separated list for model discovery |
+| `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
+| `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
+| `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
+| `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
+| `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
+| `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
+| `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
+| `AUTH_ENABLED` | `true` | Enable/disable login |
+| `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
+| `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
+| `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
+| `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
+| `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
+| `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
+| `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
+| `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes. Raise for larger local PDFs or text documents. |
+| `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
+| `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
+| `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
+| `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
+| `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
+| `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
+| `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
+
+All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
+
+### Built-in MCP servers (optional setup)
+
+Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
+
+To enable the browser MCP (page navigation, screenshots, vision), run once:
+
+```bash
+npx -y @playwright/mcp@latest --version
+```
+
+That installs `@playwright/mcp` plus Playwright (~300 MB total). Restart Odysseus and the server will register at startup.
+
+## Architecture
+```
+app.py                   # FastAPI entry point
+core/      auth, database, middleware, constants
+src/       llm_core, agent_loop, agent_tools, chat_processor, search/
+routes/    chat, session, document, memory, model … endpoints
+services/  docs, memory, search, hwfit (Cookbook) …
+static/    index.html + app.js + style.css + js/ (modular front-end)
+docs/      landing page (index.html) + preview clips
+```
+
+## Data
+All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
+`memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
+
+To back up or restore everything in `data/`, see the
+[Backup & Restore guide](backup-restore.md).
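
Note on the Configuration table in the setup guide: the upload-limit variables are described as optional, required to be positive integers, and failing fast at startup when invalid. The sketch below shows one way such a check could look. It is only an assumption about the shape of the validation: `read_positive_int_env` is a hypothetical helper, and the project's actual implementation is not part of this diff.

```python
import os


def read_positive_int_env(name, default, environ=None):
    """Return the integer value of an optional env var, or default when unset.

    Raises ValueError for non-integer, zero, or negative values so a bad
    setting stops startup instead of being silently ignored.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default  # optional: fall back to the documented default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


# Unset variable: the documented 10 MB default for .ics imports applies.
ics_cap = read_positive_int_env("ODYSSEUS_ICS_MAX_BYTES", 10485760, environ={})

# Explicit override supplied through a mapping that stands in for os.environ.
chat_cap = read_positive_int_env(
    "ODYSSEUS_CHAT_UPLOAD_MAX_BYTES",
    10485760,
    environ={"ODYSSEUS_CHAT_UPLOAD_MAX_BYTES": "52428800"},
)
```

Passing the environment as a mapping keeps the check easy to test without mutating the real process environment.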
@@ -102,6 +102,7 @@ python3 ~/.claude/skills/odysseus/scripts/odysseus_api.py POST /api/codex/memory

 ## Email draft + send

+- Prefer `POST /api/codex/emails/draft-document` for agent-written email replies. It creates an editable Odysseus Document with `language: "email"` and does not touch IMAP/send.
 - `POST /api/codex/emails/draft` — body matches `SendEmailRequest` (`to`, `cc`, `bcc`, `subject`, `body`, `body_html`, `attachments`, `account_id`, `in_reply_to`, `references`). Requires `email:draft` (or `email:send`).
 - `POST /api/codex/emails/send` — same body. Requires `email:send`. Never send without explicit user instruction.

@@ -17,6 +17,11 @@ def _usage() -> int:
    print("  odysseus_api.py todos add TITLE", file=sys.stderr)
    print("  odysseus_api.py emails list [limit]", file=sys.stderr)
    print("  odysseus_api.py emails read UID", file=sys.stderr)
+    print("  odysseus_api.py emails draft-doc JSON_PAYLOAD", file=sys.stderr)
+    print("  odysseus_api.py documents list [limit]", file=sys.stderr)
+    print("  odysseus_api.py documents read DOC_ID", file=sys.stderr)
+    print("  odysseus_api.py documents create JSON_PAYLOAD", file=sys.stderr)
+    print("  odysseus_api.py documents delete DOC_ID", file=sys.stderr)
    print("  odysseus_api.py cookbook tasks", file=sys.stderr)
    print("  odysseus_api.py cookbook servers", file=sys.stderr)
    print("  odysseus_api.py cookbook cached [HOST]", file=sys.stderr)
@@ -79,6 +84,33 @@ def main() -> int:
            method = "GET"
            path = f"/api/codex/emails/{sys.argv[3]}"
            body = None
+        elif action in ("draft-doc", "draft_document") and len(sys.argv) >= 4:
+            method = "POST"
+            path = "/api/codex/emails/draft-document"
+            body = " ".join(sys.argv[3:])
+        else:
+            return _usage()
+    elif command in ("documents", "docs"):
+        if len(sys.argv) < 3:
+            return _usage()
+        action = sys.argv[2].lower()
+        if action == "list":
+            method = "GET"
+            limit = sys.argv[3] if len(sys.argv) >= 4 else "50"
+            path = f"/api/codex/documents?limit={limit}"
+            body = None
+        elif action == "read" and len(sys.argv) >= 4:
+            method = "GET"
+            path = f"/api/codex/documents/{sys.argv[3]}"
+            body = None
+        elif action == "create" and len(sys.argv) >= 4:
+            method = "POST"
+            path = "/api/codex/documents"
+            body = " ".join(sys.argv[3:])
+        elif action == "delete" and len(sys.argv) >= 4:
+            method = "DELETE"
+            path = f"/api/codex/documents/{sys.argv[3]}"
+            body = None
        else:
            return _usage()
    elif command == "cookbook":
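The new `emails draft-doc` and `documents create` actions take their JSON payload from `sys.argv[3:]` joined with single spaces. A minimal sketch of how a caller might build that command line follows; `draft_doc_argv` is a hypothetical helper, and the `to`/`subject`/`body` keys are taken from the `SendEmailRequest` field list quoted in the skill notes. Passing the JSON as one argv element keeps it intact, because joining a single element with spaces returns it unchanged.

```python
import json
from typing import List


def draft_doc_argv(to: str, subject: str, body: str,
                   script: str = "odysseus_api.py") -> List[str]:
    """Build an argv list for `odysseus_api.py emails draft-doc JSON_PAYLOAD`.

    Hypothetical helper for illustration; only the subcommand names come
    from the diff above.
    """
    payload = json.dumps({"to": to, "subject": subject, "body": body})
    # One argv element for the whole payload: no shell quoting issues when
    # this list is handed to subprocess.run(...) without shell=True.
    return ["python3", script, "emails", "draft-doc", payload]


def payload_as_seen_by_cli(argv: List[str]) -> str:
    """Mirror the CLI's own `" ".join(sys.argv[3:])` on the list above."""
    sys_argv = argv[1:]  # drop the interpreter, as Python does for sys.argv
    return " ".join(sys_argv[3:])
```

If the payload is instead typed unquoted in a shell, the shell splits it on whitespace and the join only restores single spaces, so runs of whitespace inside string values would not survive.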
@@ -17,6 +17,11 @@ def _usage() -> int:
    print("  odysseus_api.py todos add TITLE", file=sys.stderr)
    print("  odysseus_api.py emails list [limit]", file=sys.stderr)
    print("  odysseus_api.py emails read UID", file=sys.stderr)
+    print("  odysseus_api.py emails draft-doc JSON_PAYLOAD", file=sys.stderr)
+    print("  odysseus_api.py documents list [limit]", file=sys.stderr)
+    print("  odysseus_api.py documents read DOC_ID", file=sys.stderr)
+    print("  odysseus_api.py documents create JSON_PAYLOAD", file=sys.stderr)
+    print("  odysseus_api.py documents delete DOC_ID", file=sys.stderr)
    print("  odysseus_api.py cookbook tasks", file=sys.stderr)
    print("  odysseus_api.py cookbook servers", file=sys.stderr)
    print("  odysseus_api.py cookbook cached [HOST]", file=sys.stderr)
@@ -79,6 +84,33 @@ def main() -> int:
            method = "GET"
            path = f"/api/codex/emails/{sys.argv[3]}"
            body = None
+        elif action in ("draft-doc", "draft_document") and len(sys.argv) >= 4:
+            method = "POST"
+            path = "/api/codex/emails/draft-document"
+            body = " ".join(sys.argv[3:])
+        else:
+            return _usage()
+    elif command in ("documents", "docs"):
+        if len(sys.argv) < 3:
+            return _usage()
+        action = sys.argv[2].lower()
+        if action == "list":
+            method = "GET"
+            limit = sys.argv[3] if len(sys.argv) >= 4 else "50"
+            path = f"/api/codex/documents?limit={limit}"
+            body = None
+        elif action == "read" and len(sys.argv) >= 4:
+            method = "GET"
+            path = f"/api/codex/documents/{sys.argv[3]}"
+            body = None
+        elif action == "create" and len(sys.argv) >= 4:
+            method = "POST"
+            path = "/api/codex/documents"
+            body = " ".join(sys.argv[3:])
+        elif action == "delete" and len(sys.argv) >= 4:
+            method = "DELETE"
+            path = f"/api/codex/documents/{sys.argv[3]}"
+            body = None
        else:
            return _usage()
    elif command == "cookbook":
@@ -102,6 +102,7 @@ python3 integrations/codex/scripts/odysseus_api.py POST /api/codex/memory '{"tex

 ## Email draft + send

+- Prefer `POST /api/codex/emails/draft-document` for Codex-written email replies. It creates an editable Odysseus Document with `language: "email"` and does not touch IMAP/send.
 - `POST /api/codex/emails/draft` — body matches `SendEmailRequest` (`to`, `cc`, `bcc`, `subject`, `body`, `body_html`, `attachments`, `account_id`, `in_reply_to`, `references`). Requires `email:draft` (or `email:send`).
 - `POST /api/codex/emails/send` — same body. Requires `email:send`. Never send without explicit user instruction.

@@ -885,8 +885,109 @@ def _smtp_connect(account=None, cfg=None):
    return conn


+def _read_agent_email_confirm_setting() -> bool:
+    """True if the user wants agent send_email/reply_to_email calls to be
+    queued for manual approval instead of SMTPed immediately. Defaults to
+    True so a fresh install is safe — agents have been observed inventing
+    signatures and sending to real recipients without the user's review."""
+    try:
+        from src.settings import get_setting
+        return bool(get_setting("agent_email_confirm", True))
+    except Exception:
+        return True
+
+
+def _stash_agent_draft(*, to, subject, body, in_reply_to=None, references=None,
+                      cc=None, bcc=None, account=None) -> dict:
+    """Insert the composed email into scheduled_emails with status
+    'agent_draft' and a far-future send_at so the scheduled-send poller
+    never picks it up. Returns the pending payload the model surfaces to
+    the user (and that the chat UI can render as an approval card)."""
+    try:
+        from src.constants import SCHEDULED_EMAILS_DB
+    except Exception:
+        return {"success": False, "error": "Pending-email storage unavailable"}
+    pending_id = uuid.uuid4().hex[:16]
+    far_future = "9999-12-31T00:00:00"
+    now = datetime.utcnow().isoformat()
+    try:
+        conn = sqlite3.connect(SCHEDULED_EMAILS_DB)
+        # Touch the schema in case the email-routes init hasn't run yet
+        # (MCP server can boot independently).
+        conn.execute("""
+            CREATE TABLE IF NOT EXISTS scheduled_emails (
+                id TEXT PRIMARY KEY,
+                to_addr TEXT NOT NULL,
+                cc TEXT,
+                bcc TEXT,
+                subject TEXT,
+                body TEXT NOT NULL,
+                in_reply_to TEXT,
+                references_hdr TEXT,
+                attachments TEXT,
+                send_at TEXT NOT NULL,
+                created_at TEXT NOT NULL,
+                status TEXT NOT NULL DEFAULT 'pending',
+                error TEXT,
+                owner TEXT DEFAULT '',
+                account_id TEXT,
+                odysseus_kind TEXT
+            )
+        """)
+        conn.execute("""
+            INSERT INTO scheduled_emails
+            (id, to_addr, cc, bcc, subject, body, in_reply_to, references_hdr,
+             attachments, send_at, created_at, status, account_id, odysseus_kind, owner)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 'agent_draft', ?, ?, ?)
+        """, (
+            pending_id,
+            to if isinstance(to, str) else ", ".join(to),
+            cc if isinstance(cc, str) else (", ".join(cc) if cc else None),
+            bcc if isinstance(bcc, str) else (", ".join(bcc) if bcc else None),
+            subject or "",
+            body or "",
+            in_reply_to or None,
+            references if isinstance(references, str) else (" ".join(references) if references else None),
+            "[]",
+            far_future,
+            now,
+            account or None,
+            "agent_draft",
+            "",
+        ))
+        conn.commit()
+        conn.close()
+    except Exception as e:
+        return {"success": False, "error": f"Failed to stash draft: {e}"}
+    return {
+        "success": True,
+        "pending": True,
+        "pending_id": pending_id,
+        "to": to if isinstance(to, str) else ", ".join(to),
+        "subject": subject or "",
+        "body": body or "",
+        "message": (
+            "✋ Draft staged for your approval — nothing has been sent yet.\n"
+            "Review the To/Subject/Body above. Reply 'send' to deliver, or "
+            "'cancel' to discard."
+        ),
+    }
+
+
 def _send_email(to, subject, body, in_reply_to=None, references=None, cc=None, bcc=None, account=None):
-    """Send an email via SMTP. Returns dict with status."""
+    """Send an email via SMTP. Returns dict with status.
+
+    When the `agent_email_confirm` setting is on (the default), the email
+    is NOT SMTPed — instead it lands in scheduled_emails as an
+    `agent_draft` row and the user reviews + approves it from the chat
+    UI. This closes the auto-send hole that let earlier models invent
+    signatures and ship them to real recipients without confirmation."""
+    if _read_agent_email_confirm_setting():
+        return _stash_agent_draft(
+            to=to, subject=subject, body=body,
+            in_reply_to=in_reply_to, references=references,
+            cc=cc, bcc=bcc, account=account,
+        )
    send_account, cfg = _resolve_send_config(account)
    msg = EmailMessage()
    msg["From"] = _clean_header_value(cfg["from_address"])
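The far-future `send_at` plus a distinct `agent_draft` status is what keeps a staged draft away from the scheduled-send poller. The sketch below reproduces that idea against an in-memory SQLite database with a reduced table. The poller query in `due_emails` and the `approve` step are assumptions for illustration; the diff does not show how the real poller selects rows or how the chat UI promotes a draft.

```python
import sqlite3
import uuid
from typing import List

FAR_FUTURE = "9999-12-31T00:00:00"  # same sentinel as in the diff


def open_db() -> sqlite3.Connection:
    """Reduced scheduled_emails table, enough to show the staging pattern."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE scheduled_emails (
            id TEXT PRIMARY KEY,
            to_addr TEXT NOT NULL,
            subject TEXT,
            body TEXT NOT NULL,
            send_at TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )
    return conn


def stash_draft(conn: sqlite3.Connection, to: str, subject: str, body: str) -> str:
    """Stage a draft: status 'agent_draft' and a send_at that never comes due."""
    pending_id = uuid.uuid4().hex[:16]
    conn.execute(
        "INSERT INTO scheduled_emails (id, to_addr, subject, body, send_at, status) "
        "VALUES (?, ?, ?, ?, ?, 'agent_draft')",
        (pending_id, to, subject or "", body or "", FAR_FUTURE),
    )
    conn.commit()
    return pending_id


def due_emails(conn: sqlite3.Connection, now_iso: str) -> List[str]:
    """Hypothetical poller query: pending rows whose send_at has passed."""
    rows = conn.execute(
        "SELECT id FROM scheduled_emails WHERE status = 'pending' AND send_at <= ?",
        (now_iso,),
    ).fetchall()
    return [r[0] for r in rows]


def approve(conn: sqlite3.Connection, pending_id: str, now_iso: str) -> None:
    """Hypothetical approval step: promote the draft so the poller sends it."""
    conn.execute(
        "UPDATE scheduled_emails SET status = 'pending', send_at = ? "
        "WHERE id = ? AND status = 'agent_draft'",
        (now_iso, pending_id),
    )
    conn.commit()
```

Either guard alone would hide the row from a query shaped like this one; using both means a poller that filters on only one of the two columns still skips the draft.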
@@ -31,6 +31,7 @@ ALLOWED_SCOPES = {
 TOKEN_PROFILES = {
    "chat": ["chat"],
    "codex_todos": ["todos:read", "todos:write"],
+    "codex_documents": ["documents:read", "documents:write"],
    "codex_email_drafts": ["email:read", "email:draft", "documents:read", "documents:write"],
 }

@@ -159,6 +160,8 @@ def setup_api_token_routes() -> APIRouter:
            payload = await request.json()
        except Exception:
            payload = {}
+        if not isinstance(payload, dict):
+            payload = {}
        with get_db_session() as db:
            token = db.query(ApiToken).filter(ApiToken.id == token_id).first()
            if not token:
@@ -6,7 +6,7 @@ import os
 import time
 import logging
 from datetime import datetime
-from typing import Dict, Any, AsyncGenerator, List
+from typing import Dict, Any, AsyncGenerator, List, Optional

 from fastapi import APIRouter, Request, HTTPException, Form, Query
 from fastapi.responses import StreamingResponse
@@ -126,7 +126,8 @@ def _clear_orphaned_session_endpoint(sess, owner: str | None = None) -> bool:
        sess.model = ""
        sess.headers = {}
        return True
-    except Exception:
+    except Exception as e:
+        logger.warning("Failed to clear orphaned session endpoint", exc_info=e)
        db.rollback()
        return False
    finally:
@@ -144,7 +145,8 @@ def _endpoint_cache_contains_model(endpoint, model: str) -> bool:
        return True
    try:
        models = json.loads(raw) if isinstance(raw, str) else raw
-    except Exception:
+    except Exception as e:
+        logger.warning("Failed to parse cached models list, treating as containing model", exc_info=e)
        return True
    if not isinstance(models, list) or not models:
        return True
@@ -236,7 +238,8 @@ def _recover_empty_session_model(sess, session_id: str, owner: str | None = None
                is_chatgpt_subscription = False
        try:
            cached = json.loads(ep.cached_models) if isinstance(ep.cached_models, str) else (ep.cached_models or [])
-        except Exception:
+        except Exception as e:
+            logger.warning("Failed to parse cached_models for endpoint %r", getattr(ep, "id", "?"), exc_info=e)
            cached = []
        if not cached:
            visible = []
@@ -526,6 +529,66 @@ def setup_chat_routes(
        active_doc_id = form_data.get("active_doc_id", "").strip()
        logger.info(f"[doc-inject] chat_mode={chat_mode}, active_doc_id={active_doc_id!r}")

+        # Active email reader — when the user has an email open in the UI, the
+        # frontend passes its uid/folder/account so "reply", "summarize this",
+        # etc. resolve to the real email instead of the agent inventing a
+        # fake markdown draft.
+        active_email_uid = form_data.get("active_email_uid", "").strip()
+        active_email_folder = form_data.get("active_email_folder", "INBOX").strip() or "INBOX"
+        active_email_account = form_data.get("active_email_account", "").strip()
+        active_email_ctx: Optional[Dict[str, str]] = None
+        # Always reset between requests so a stale active-email pointer from
+        # a previous turn (different reader closed, different account, etc.)
+        # can't leak in when the user has no email open this turn.
+        try:
+            from src.tool_implementations import clear_active_email
+            clear_active_email()
+        except Exception:
+            pass
+        if active_email_uid:
+            active_email_ctx = {
+                "uid": active_email_uid,
+                "folder": active_email_folder,
+                "account": active_email_account,
+            }
+            # Try to enrich with subject + from so the agent's system prompt
+            # block can quote them. Best-effort: a stale cache is fine, a
+            # missing email just means we pass uid/folder/account only.
+            try:
+                from routes.email_routes import _read_cache_get, _read_cache_key
+                _ck = _read_cache_key(active_email_account or None, active_email_folder, active_email_uid, owner=get_current_user(request))
+                _cached_email = _read_cache_get(_ck)
+                if _cached_email and isinstance(_cached_email, dict):
+                    active_email_ctx["subject"] = str(_cached_email.get("subject") or "")
+                    active_email_ctx["from"] = str(
+                        _cached_email.get("from_address")
+                        or _cached_email.get("from")
+                        or _cached_email.get("from_name")
+                        or ""
+                    )
+                    _body_preview = (_cached_email.get("body") or "")[:2000]
+                    if _body_preview:
+                        active_email_ctx["body_preview"] = _body_preview
+            except Exception as _e:
+                logger.debug(f"[email-inject] cache enrich skipped: {_e}")
+            # Stash so email tools can resolve "this email" without UID guessing.
+            try:
+                from src.tool_implementations import set_active_email
+                set_active_email(
+                    uid=active_email_uid,
+                    folder=active_email_folder,
+                    account=active_email_account or None,
+                    subject=active_email_ctx.get("subject"),
+                    sender=active_email_ctx.get("from"),
+                )
+            except Exception as _e:
+                logger.debug(f"[email-inject] set_active_email failed: {_e}")
+            logger.info(
+                "[email-inject] active_email uid=%s folder=%s account=%s subject=%r",
+                active_email_uid, active_email_folder, active_email_account or "(default)",
+                active_email_ctx.get("subject", ""),
+            )
+
        try:
            # Attachment-only sends: skip the message-required check when the
            # user has attached one or more files (the attachment IS the action).
@@ -586,8 +649,8 @@ def setup_chat_routes(
        elif attachments:
            try:
                att_ids = [str(x) for x in json.loads(attachments)]
-            except Exception:
-                pass
+            except Exception as e:
+                logger.warning("Failed to parse attachments JSON, ignoring attachments", exc_info=e)

        no_memory = str(form_data.get("no_memory", "")).lower() == "true"
        pre_context_tool_policy = build_effective_tool_policy(
@@ -641,15 +704,27 @@ def setup_chat_routes(
                            active_doc_id,
                        )
                        active_doc = None
-                    elif doc_session and doc_session != session:
-                        logger.warning(
-                            "[doc-inject] ignoring stale active_doc_id %s from session %s while in session %s",
-                            active_doc_id,
-                            doc_session,
-                            session,
-                        )
-                        active_doc = None
                    else:
+                        # NOTE: previously dropped the doc when doc.session_id
+                        # != current chat session — but that broke the common
+                        # case of "open an email draft from one chat, ask a
+                        # different chat to write into it". The frontend only
+                        # sends active_doc_id for docs currently visible in
+                        # the UI, and we already owner-checked above, so trust
+                        # the explicit signal. We just log the mismatch and
+                        # re-bind the doc to the current session so future
+                        # turns find it via the session-fallback path too.
+                        if doc_session and doc_session != session:
+                            logger.info(
+                                "[doc-inject] cross-session active_doc_id %s (was session %s, now %s) — accepting and rebinding",
+                                active_doc_id, doc_session, session,
+                            )
+                            try:
+                                active_doc.session_id = session
+                                _doc_db.commit()
+                            except Exception as _e:
+                                _doc_db.rollback()
+                                logger.warning(f"[doc-inject] session rebind failed: {_e}")
                        logger.info(f"[doc-inject] found by ID: title={active_doc.title!r}, lang={active_doc.language!r}, is_active={active_doc.is_active}, content_len={len(active_doc.current_content or '')}")
                else:
                    logger.warning(f"[doc-inject] NOT FOUND by ID {active_doc_id}")
@@ -714,6 +789,21 @@ def setup_chat_routes(
                "manage_skills",      # skill presets tied to user
            })

+        # Active email reader open → strip the tools that let the agent
+        # "drift" to a new compose: create_document (writes a fake email-
+        # shaped .md file) and send_email (sends fresh to a recipient the
+        # agent invented). With those gone, the only paths left for "write
+        # email saying X" are ui_control open_email_reply (draft) and
+        # reply_to_email (immediate send) — both of which use the open
+        # email's UID. Code-level enforcement instead of relying on a
+        # prompt rule the model can ignore.
+        if active_email_ctx and active_email_ctx.get("uid"):
+            disabled_tools.update({
+                "create_document",
+                "send_email",
+                "mcp__email__send_email",
+            })
+
        # Enforce per-user privileges
        _privs = {}
        _user = ctx.user
@@ -1181,6 +1271,7 @@ def setup_chat_routes(
                        max_rounds=_max_rounds,
                        context_length=ctx.context_length,
                        active_document=active_doc,
+                        active_email=active_email_ctx,
                        session_id=session,
                        disabled_tools=disabled_tools if disabled_tools else None,
                        tool_policy=tool_policy,
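The tool-stripping rule in the hunk above amounts to a set update keyed on whether an email is open. A minimal sketch under that reading, with a hypothetical `effective_tools` helper standing in for the real policy plumbing (`build_effective_tool_policy` and the agent loop are not reproduced here):

```python
from typing import Dict, Iterable, List, Optional

# Tool names copied from the diff above.
COMPOSE_DRIFT_TOOLS = frozenset({
    "create_document",
    "send_email",
    "mcp__email__send_email",
})


def effective_tools(all_tools: Iterable[str],
                    active_email: Optional[Dict[str, str]]) -> List[str]:
    """Return the tools left for this turn (hypothetical helper).

    When an email reader is open (a context dict with a non-empty "uid"),
    the compose-from-scratch tools are removed, so only reply paths bound
    to the open email remain.
    """
    disabled = set()
    if active_email and active_email.get("uid"):
        disabled.update(COMPOSE_DRIFT_TOOLS)
    return [t for t in all_tools if t not in disabled]
```

Filtering the tool list in code means the restriction holds even when the model disregards the corresponding prompt instruction.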
@@ -91,6 +91,20 @@ def _scope_owner(request: Request, allowed: set[str]) -> str:
    return require_user(request)


+def _scope_owner_all(request: Request, required: set[str]) -> str:
+    """Return owner only when an API token has every required scope."""
+    if getattr(request.state, "api_token", False):
+        scopes = set(getattr(request.state, "api_token_scopes", []) or [])
+        missing = required - scopes
+        if missing:
+            raise HTTPException(403, f"API token missing required scope: {' and '.join(sorted(missing))}")
+        owner = getattr(request.state, "api_token_owner", None)
+        if not owner:
+            raise HTTPException(403, "API token has no owner")
+        return owner
+    return require_user(request)
+
+
 def _find_endpoint(router: APIRouter | None, method: str, path: str):
    if router is None:
        return None
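`_scope_owner_all` differs from the existing helper in requiring every listed scope rather than any one of them. The sketch below shows that difference as plain set operations, using the `TOKEN_PROFILES` entries from this PR. It assumes the existing `_scope_owner` accepts a token holding any one of the allowed scopes, which its `allowed` parameter name suggests but the diff does not show; `missing_scopes` and `has_any` are hypothetical names.

```python
from typing import Iterable, Set

# Profiles copied from the TOKEN_PROFILES hunk in this PR.
TOKEN_PROFILES = {
    "codex_documents": ["documents:read", "documents:write"],
    "codex_email_drafts": ["email:read", "email:draft",
                           "documents:read", "documents:write"],
}


def missing_scopes(token_scopes: Iterable[str], required: Set[str]) -> Set[str]:
    """All-of check: the scopes the token still lacks (empty set means allowed)."""
    return set(required) - set(token_scopes)


def has_any(token_scopes: Iterable[str], allowed: Set[str]) -> bool:
    """Any-of check (assumed behaviour of the older helper)."""
    return bool(set(token_scopes) & set(allowed))
```

For `/emails/draft-document`, which requires both `email:draft` and `documents:write`, a `codex_documents` token passes an any-of check yet fails the all-of check with `email:draft` missing.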
@@ -138,7 +152,7 @@ def setup_codex_routes(
                    "read": scoped(EMAIL_READ_SCOPES),
                    "draft": scoped(EMAIL_DRAFT_SCOPES),
                    "send": scoped(EMAIL_SEND_SCOPES),
-                    "actions": ["list", "read", "draft", "send"],
+                    "actions": ["list", "read", "draft_document", "draft", "send"],
                },
                "memory": {
                    "read": scoped(MEMORY_READ_SCOPES),
@@ -262,6 +276,56 @@ def setup_codex_routes(
    # Both handlers in routes/email_routes.py already accept `owner=` via
    # FastAPI Depends, so we call them directly without patching state.

+    def _email_draft_document_content(body: dict[str, Any]) -> str:
+        def clean(v: Any) -> str:
+            if isinstance(v, list):
+                return ", ".join(str(x).strip() for x in v if str(x).strip())
+            return str(v or "").strip()
+
+        to = clean(body.get("to"))
+        cc = clean(body.get("cc"))
+        bcc = clean(body.get("bcc"))
+        subject = clean(body.get("subject"))
+        in_reply_to = clean(body.get("in_reply_to"))
+        references = clean(body.get("references"))
+        body_text = str(body.get("body") or body.get("body_html") or "").strip()
+        lines = [
+            f"To: {to}",
+        ]
+        if cc:
+            lines.append(f"Cc: {cc}")
+        if bcc:
+            lines.append(f"Bcc: {bcc}")
+        lines.append(f"Subject: {subject}")
+        if in_reply_to:
+            lines.append(f"In-Reply-To: {in_reply_to}")
+        if references:
+            lines.append(f"References: {references}")
+        lines.extend(["---", body_text])
+        return "\n".join(lines).rstrip() + "\n"
+
+    @router.post("/emails/draft-document")
+    async def codex_email_draft_document(request: Request, body: dict[str, Any] = Body(default_factory=dict)):
+        owner = _scope_owner_all(request, {"email:draft", "documents:write"})
+        if documents_create_endpoint is None:
+            raise HTTPException(503, "Documents integration is not available")
+        from routes.document_routes import DocumentCreate
+
+        subject = str(body.get("subject") or "Email draft").strip() or "Email draft"
+        title = str(body.get("title") or subject).strip() or "Email draft"
+        req = DocumentCreate(
+            session_id=body.get("session_id"),
+            title=title,
+            language="email",
+            content=_email_draft_document_content(body),
+        )
+        result = await _as_owner(request, owner, documents_create_endpoint, request, req)
+        if isinstance(result, dict):
+            result = dict(result)
+            result["draft_type"] = "document"
+            result["send_required_confirmation"] = True
+        return result
+
    @router.post("/emails/draft")
    async def codex_email_draft(request: Request, body: dict[str, Any] = Body(default_factory=dict)):
        owner = _scope_owner(request, EMAIL_DRAFT_SCOPES)
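The draft document produced by `_email_draft_document_content` is a header block, a `---` separator line, then the body. A later send step would have to read that layout back. The parser below is a hypothetical inverse written for illustration; nothing in the diff confirms Odysseus parses the document this way.

```python
from typing import Dict, Tuple


def parse_email_document(content: str) -> Tuple[Dict[str, str], str]:
    """Split an email-draft document into (headers, body).

    Hypothetical helper: assumes the "Header: value" lines, a line holding
    only "---", then the body, as emitted by the draft-document endpoint.
    """
    head, sep, body = content.partition("\n---\n")
    if not sep:
        raise ValueError("missing '---' separator between headers and body")
    headers: Dict[str, str] = {}
    for line in head.splitlines():
        # Split on the first colon only, so values such as
        # "Re: status" or "<id@host>" keep any later colons.
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip()] = value.strip()
    return headers, body.rstrip("\n")
```

Only the first separator counts, so a body that itself contains a `---` line stays intact.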
@@ -726,7 +790,7 @@ def setup_codex_routes(
        norm = dict(body or {})
        sess = (norm.get("tmux_session") or norm.get("session_id") or "").strip()
        model = (norm.get("model") or norm.get("repo_id") or "").strip()
-        host = (norm.get("host") or norm.get("remote_host") or "").strip()
+        host = validate_remote_host((norm.get("host") or norm.get("remote_host") or "").strip() or None) or ""
        port = norm.get("port") or 8000
        import re as _re
        if not sess or not _re.fullmatch(r"[a-zA-Z0-9_-]+", sess):
@@ -12,6 +12,7 @@ import json
 import csv
 import io
 import os
+import inspect
 import httpx
 from pathlib import Path
 from datetime import datetime
@@ -90,11 +91,13 @@ def _normalize_contact(contact: Dict) -> Dict:
    name = str(contact.get("name") or "").strip()
    if not name and emails:
        name = emails[0].split("@")[0]
+    address = str(contact.get("address") or "").strip()
    return {
        "uid": str(contact.get("uid") or uuid.uuid4()),
        "name": name,
        "emails": emails,
        "phones": phones,
+        "address": address,
    }


@@ -150,7 +153,7 @@ def _parse_vcards(text: str) -> List[Dict]:
    for block in re.split(r"BEGIN:VCARD", text):
        if not block.strip():
            continue
-        contact = {"name": "", "emails": [], "phones": [], "uid": ""}
+        contact = {"name": "", "emails": [], "phones": [], "uid": "", "address": ""}
        for line in block.split("\n"):
            line = line.strip()
            # Strip an optional RFC 6350 group prefix (e.g. "item1.EMAIL;...")
@@ -173,6 +176,15 @@ def _parse_vcards(text: str) -> List[Dict]:
                    phone = _vunesc(name_part.split(":", 1)[1])
                    if phone and phone not in contact["phones"]:
                        contact["phones"].append(phone)
+            elif name_part.startswith("ADR"):
+                # vCard ADR is 7 semicolon-separated components:
+                # post-office-box;extended-address;street;locality;region;postal-code;country.
+                # Recover a human-readable string by joining non-empty
+                # components with ", ".
+                if ":" in name_part:
+                    raw = name_part.split(":", 1)[1]
+                    parts = [_vunesc(p).strip() for p in raw.split(";")]
+                    contact["address"] = ", ".join(p for p in parts if p)
            elif name_part.startswith("UID:"):
                contact["uid"] = _vunesc(name_part[4:])
        if contact["name"] or contact["emails"]:
@@ -197,7 +209,8 @@ def _vesc(value: str) -> str:

 def _build_vcard(name: str, email: str, uid: Optional[str] = None,
                 emails: Optional[List[str]] = None,
-                 phones: Optional[List[str]] = None) -> str:
+                 phones: Optional[List[str]] = None,
+                 address: Optional[str] = None) -> str:
    """Build a vCard. Accepts either a single `email` (legacy callers) or
    full `emails`/`phones` lists (edit path). The first email is marked
    PREF=1. All values are RFC-6350-escaped."""
@@ -230,6 +243,12 @@ def _build_vcard(name: str, email: str, uid: Optional[str] = None,
        lines.append(f"EMAIL;PREF=1:{_vesc(em)}" if i == 0 else f"EMAIL:{_vesc(em)}")
    for ph in phone_list:
        lines.append(f"TEL:{_vesc(ph)}")
+    # Address: stuff the whole human-readable string into the street
+    # component of ADR. vCard ADR has 7 semicolon-separated components:
+    # post-office-box;extended-address;street;locality;region;postal-code;country.
+    addr = (address or "").strip()
+    if addr:
+        lines.append(f"ADR:;;{_vesc(addr)};;;;")
    lines.append("END:VCARD")
    return "\r\n".join(lines) + "\r\n"
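Writing the free-text address into the street slot of `ADR` and reading it back by joining non-empty components gives a round trip that can be sketched in isolation. The helpers below are stand-ins: `vesc` and `split_unescaped` are assumptions written for this example, since the repository's `_vesc`/`_vunesc` are not shown in the diff. The sketch splits on unescaped semicolons only, because a plain `split(";")` also cuts at an escaped `\;` inside a component.

```python
from typing import List


def vesc(value: str) -> str:
    """Escape backslash, semicolon, comma, and newline for a vCard value."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def split_unescaped(raw: str, sep: str = ";") -> List[str]:
    """Split on `sep`, honouring backslash escapes, and unescape each part."""
    parts: List[str] = []
    buf: List[str] = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            buf.append("\n" if nxt in "nN" else nxt)  # \n, \N mean newline
            i += 2
            continue
        if ch == sep:
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        i += 1
    parts.append("".join(buf))
    return parts


def build_adr(address: str) -> str:
    """Put the whole human-readable address into the street component."""
    return f"ADR:;;{vesc(address)};;;;"


def parse_adr(line: str) -> str:
    """Join the non-empty ADR components with ', '."""
    raw = line.split(":", 1)[1]
    return ", ".join(p.strip() for p in split_unescaped(raw) if p.strip())
```

A structured `ADR` written by another client, with street, locality, region, postal code, and country in separate slots, flattens to one comma-separated string and would be stored back in the street slot on the next save.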

@@ -366,7 +385,7 @@ def _resolve_resource_url(uid: str) -> str:
    return _lookup() or _vcard_url(uid)


-def _create_contact(name: str, email: str) -> bool:
+def _create_contact(name: str, email: str, address: str = "") -> bool:
    """Add a new contact via CardDAV or local contacts."""
    cfg = _get_carddav_config()
    if not _carddav_configured(cfg):
@@ -375,12 +394,12 @@ def _create_contact(name: str, email: str) -> bool:
        for c in contacts:
            if email_l and email_l in [e.lower() for e in c.get("emails", [])]:
                return True
-        contacts.append(_normalize_contact({"name": name, "emails": [email]}))
+        contacts.append(_normalize_contact({"name": name, "emails": [email], "address": address}))
        _save_local_contacts(contacts)
        return True

    contact_uid = str(uuid.uuid4())
-    vcard = _build_vcard(name, email, contact_uid)
+    vcard = _build_vcard(name, email, contact_uid, address=address)
    try:
        url = _carddav_base_url(cfg) + "/" + contact_uid + ".vcf"
        auth = None
@@ -613,7 +632,7 @@ def _contacts_to_csv(contacts: List[Dict]) -> str:
    return out.getvalue()


-def _update_contact(uid: str, name: str, emails: List[str], phones: List[str]) -> bool:
+def _update_contact(uid: str, name: str, emails: List[str], phones: List[str], address: str = "") -> bool:
    """Rewrite an existing contact via CardDAV or local contacts."""
    cfg = _get_carddav_config()
    if not _carddav_configured(cfg):
@@ -622,16 +641,19 @@ def _update_contact(uid: str, name: str, emails: List[str], phones: List[str]) -
        out = []
        for c in contacts:
            if c.get("uid") == uid:
-                out.append(_normalize_contact({"uid": uid, "name": name, "emails": emails, "phones": phones}))
+                # Preserve existing address when caller passes "" (only
+                # updating name/emails/phones, not touching address).
+                addr = address if address else c.get("address", "")
+                out.append(_normalize_contact({"uid": uid, "name": name, "emails": emails, "phones": phones, "address": addr}))
                found = True
            else:
                out.append(c)
        if not found:
-            out.append(_normalize_contact({"uid": uid, "name": name, "emails": emails, "phones": phones}))
+            out.append(_normalize_contact({"uid": uid, "name": name, "emails": emails, "phones": phones, "address": address}))
        _save_local_contacts(out)
        return True

-    vcard = _build_vcard(name, "", uid=uid, emails=emails, phones=phones)
+    vcard = _build_vcard(name, "", uid=uid, emails=emails, phones=phones, address=address)
    # Use the real resource href (handles externally-created contacts whose
    # filename != UID); falls back to the <uid>.vcf guess.
    try:
@@ -718,16 +740,39 @@ def setup_contacts_routes():
        """Add a new contact."""
        name = (data.get("name") or "").strip()
        email = (data.get("email") or "").strip()
+        phone = (data.get("phone") or "").strip()
+        address = (data.get("address") or "").strip()
        if not email:
            return {"success": False, "error": "Email required"}
-        # Check if already exists
-        contacts = _fetch_contacts()
-        for c in contacts:
-            if email.lower() in [e.lower() for e in c["emails"]]:
-                return {"success": True, "message": "Already exists", "contact": c}
+        # Check if already exists by email
+        if email:
+            contacts = _fetch_contacts()
+            for c in contacts:
+                if email.lower() in [e.lower() for e in c["emails"]]:
+                    return {"success": True, "message": "Already exists", "contact": c}
        if not name:
            name = email.split("@")[0]
-        ok = _create_contact(name, email)
+        create_params = inspect.signature(_create_contact).parameters
+        if len(create_params) >= 3:
+            ok = _create_contact(name, email, address)
+        else:
+            ok = _create_contact(name, email)
+        # If a phone was provided, do an immediate update to thread it
+        # through (the simple _create_contact signature only takes name +
+        # email + address; phones happen via update).
+        if ok and phone:
+            try:
+                fresh = _fetch_contacts(force=True)
+                created = next((c for c in fresh if name == c.get("name") and (not email or email in c.get("emails", []))), None)
+                if created:
+                    _update_contact(
+                        created["uid"], name,
+                        created.get("emails", []),
+                        [phone],
+                        address,
+                    )
+            except Exception:
+                pass
        return {"success": ok}

    @router.post("/import")
@@ -810,7 +855,7 @@ def setup_contacts_routes():
    # match PUT /{uid} with uid="config".
    @router.put("/{uid}")
    async def edit_contact(uid: str, data: dict, _admin: str = Depends(require_admin)):
-        """Edit an existing contact — name / emails / phones."""
+        """Edit an existing contact — name / emails / phones / address."""
        name = (data.get("name") or "").strip()
        emails = data.get("emails")
        phones = data.get("phones")
@@ -818,11 +863,12 @@ def setup_contacts_routes():
            emails = [data["email"]]
        emails = [e.strip() for e in (emails or []) if e and e.strip()]
        phones = [p.strip() for p in (phones or []) if p and p.strip()]
-        if not name and not emails:
-            return {"success": False, "error": "Name or email required"}
+        address = (data.get("address") or "").strip()
+        if not name and not emails and not address:
+            return {"success": False, "error": "Name, email, or address required"}
        if not name and emails:
            name = emails[0].split("@")[0]
-        ok = _update_contact(uid, name, emails, phones)
+        ok = _update_contact(uid, name, emails, phones, address)
        return {"success": ok}

    @router.delete("/{uid}")
@@ -676,7 +676,7 @@ def setup_cookbook_routes() -> APIRouter:
            _spf = f"-p {_port} " if _port and _port != "22" else ""
            setup_cmd = (
                f"scp -O {_pf}-q '{runner_path}' {remote}:{remote_runner} && "
-                f"ssh {_spf}{remote} 'chmod +x {remote_runner} && tmux new-session -d -s {session_id} \"./{remote_runner}\"'"
+                f"ssh {_spf}{remote} 'tmux set-option -g history-limit 100000 2>/dev/null; chmod +x {remote_runner} && tmux new-session -d -s {session_id} \"./{remote_runner}\"'"
            )
        else:
            # Local: run hf download in the background (tmux on POSIX, a detached
@@ -708,7 +708,7 @@ def setup_cookbook_routes() -> APIRouter:
                lines.append('exec "${SHELL:-/bin/bash}"')
                wrapper_script.write_text("\n".join(lines) + "\n", encoding="utf-8")
                wrapper_script.chmod(0o755)
-            setup_cmd = None if IS_WINDOWS else f"tmux new-session -d -s {session_id} {shlex.quote(str(wrapper_script))}"
+            setup_cmd = None if IS_WINDOWS else f"tmux set-option -g history-limit 100000 2>/dev/null; tmux new-session -d -s {session_id} {shlex.quote(str(wrapper_script))}"

        logger.info(f"Model download: {req.repo_id} (backend={'ollama' if is_ollama_download else 'hf'}, include={req.include}, session={session_id}, remote={remote})")
        logger.info(f"Download setup_cmd: {setup_cmd}")
@@ -984,9 +984,9 @@ def setup_cookbook_routes() -> APIRouter:
            ssh_args = ["ssh"]
            if ssh_port and ssh_port != "22":
                ssh_args.extend(["-p", str(ssh_port)])
-            capture_cmd = ssh_args + [remote, "tmux", "capture-pane", "-t", session_id, "-p", "-S", "-200"]
+            capture_cmd = ssh_args + [remote, "tmux", "capture-pane", "-t", session_id, "-p", "-S", "-2000"]
        else:
-            capture_cmd = ["tmux", "capture-pane", "-t", session_id, "-p", "-S", "-200"]
+            capture_cmd = ["tmux", "capture-pane", "-t", session_id, "-p", "-S", "-2000"]

        _exit_re = re.compile(r"=== Process exited with code (-?\d+) ===")
        for wait_s in _waits:
@@ -1577,10 +1577,10 @@ def setup_cookbook_routes() -> APIRouter:
                setup_cmd = (
                    f"{scp_extras}"
                    f"scp -O {_Pf}-q '{runner_path}' {remote}:{remote_runner} && "
-                    f"ssh {_pf}{remote} 'chmod +x {remote_runner} && tmux new-session -d -s {session_id} \"./{remote_runner}\"'"
+                    f"ssh {_pf}{remote} 'tmux set-option -g history-limit 100000 2>/dev/null; chmod +x {remote_runner} && tmux new-session -d -s {session_id} \"./{remote_runner}\"'"
                )
            else:
-                setup_cmd = f"tmux new-session -d -s {session_id} {shlex.quote(str(runner_path))}"
+                setup_cmd = f"tmux set-option -g history-limit 100000 2>/dev/null; tmux new-session -d -s {session_id} {shlex.quote(str(runner_path))}"

        if setup_cmd is None:
            # LOCAL Windows: launch the bash runner detached; no tmux setup_cmd.
@@ -2625,6 +2625,193 @@ def setup_cookbook_routes() -> APIRouter:
            "error": _ollama_library_cache["error"],
        }

+    # ── vLLM recipe scraper ─────────────────────────────────────────────
+    # Fetches the official YAML recipe for a model from vllm-project/recipes
+    # and normalizes it into a small JSON the frontend can consume. Cached
+    # per-repo so the GitHub raw endpoint isn't hammered.
+    _vllm_recipe_cache: dict[str, tuple[float, dict | None]] = {}
+    # Manifest of all <org>/<model> ids that have a recipe in the upstream
+    # repo. Cheap to fetch (one Git Tree API call), so we cache the whole
+    # set for ~12h. Per-row "does this model have a recipe?" lookups hit
+    # this set instead of doing 912 individual recipe fetches.
+    _vllm_recipe_manifest: dict = {"fetched_at": 0.0, "models": set(), "error": ""}
+
+    @router.get("/api/cookbook/vllm-recipe-manifest")
+    async def vllm_recipe_manifest(refresh: int = 0):
+        """Return the set of <org>/<model> ids known to have a vLLM recipe.
+        One GitHub Tree API call, 12h cache. The frontend uses this to badge
+        rows in the model list before the user expands them."""
+        import time as _time
+        import httpx as _httpx
+        TTL = 12 * 3600.0
+        now = _time.time()
+        if (
+            refresh
+            or (now - _vllm_recipe_manifest["fetched_at"]) > TTL
+            or not _vllm_recipe_manifest["models"]
+        ):
+            url = (
+                "https://api.github.com/repos/vllm-project/recipes/"
+                "git/trees/main?recursive=1"
+            )
+            def _fetch_sync() -> tuple[int, dict | None, str]:
+                try:
+                    headers = {"Accept": "application/vnd.github+json"}
+                    with _httpx.Client(timeout=10.0, follow_redirects=True) as client:
+                        r = client.get(url, headers=headers)
+                        if r.status_code != 200:
+                            return r.status_code, None, r.text[:200]
+                        return 200, r.json(), ""
+                except Exception as e:
+                    return 0, None, f"fetch error: {e}"
+            status, data, err = await asyncio.to_thread(_fetch_sync)
+            if status == 200 and isinstance(data, dict):
+                models: set[str] = set()
+                for entry in data.get("tree") or []:
+                    path = (entry or {}).get("path") or ""
+                    if not path.startswith("models/") or not path.endswith(".yaml"):
+                        continue
+                    # path = "models/<org>/<model>.yaml" → "<org>/<model>"
+                    body = path[len("models/"):-len(".yaml")]
+                    if "/" in body:
+                        models.add(body)
+                _vllm_recipe_manifest["models"] = models
+                _vllm_recipe_manifest["fetched_at"] = now
+                _vllm_recipe_manifest["error"] = ""
+            else:
+                _vllm_recipe_manifest["error"] = (
+                    f"HTTP {status}: {err}" if status else err
+                )
+                # Don't clobber a stale-but-usable list on transient failures.
+                if not _vllm_recipe_manifest["models"]:
+                    return {
+                        "models": [],
+                        "count": 0,
+                        "error": _vllm_recipe_manifest["error"],
+                    }
+        return {
+            "models": sorted(_vllm_recipe_manifest["models"]),
+            "count": len(_vllm_recipe_manifest["models"]),
+            "fetched_at": _vllm_recipe_manifest["fetched_at"],
+            "error": _vllm_recipe_manifest["error"],
+        }
+
+    @router.get("/api/cookbook/vllm-recipe")
+    async def vllm_recipe(repo: str, refresh: int = 0):
+        """Return the vLLM official recipe for a HuggingFace repo, if one
+        exists at vllm-project/recipes. `repo` is the full HF id like
+        'MiniMaxAI/MiniMax-M2'. Cached 6h."""
+        import time as _time
+        import httpx as _httpx
+        import yaml as _yaml
+
+        TTL = 6 * 3600.0
+        now = _time.time()
+        repo = (repo or "").strip().strip("/")
+        if "/" not in repo:
+            return {"exists": False, "error": "repo must be <org>/<model>"}
+
+        cached = _vllm_recipe_cache.get(repo)
+        if cached and not refresh and (now - cached[0]) < TTL:
+            return cached[1] or {"exists": False, "cached": True}
+
+        url = (
+            f"https://raw.githubusercontent.com/vllm-project/recipes/"
+            f"main/models/{repo}.yaml"
+        )
+
+        def _fetch_sync() -> tuple[int, str]:
+            try:
+                with _httpx.Client(timeout=8.0, follow_redirects=True) as client:
+                    r = client.get(url)
+                    return r.status_code, r.text
+            except Exception as e:
+                return 0, f"fetch error: {e}"
+
+        status, text = await asyncio.to_thread(_fetch_sync)
+        if status == 404:
+            _vllm_recipe_cache[repo] = (now, {"exists": False})
+            return {"exists": False}
+        if status != 200:
+            return {"exists": False, "error": f"HTTP {status}", "transient": True}
+
+        try:
+            doc = _yaml.safe_load(text) or {}
+        except Exception as e:
+            return {"exists": False, "error": f"yaml parse: {e}"}
+
+        meta = doc.get("meta") or {}
+        model = doc.get("model") or {}
+        features = doc.get("features") or {}
+        deps = doc.get("dependencies") or []
+        variants = doc.get("variants") or {}
+        hw_overrides = doc.get("hardware_overrides") or {}
+        strat_overrides = doc.get("strategy_overrides") or {}
+
+        # Tool-call + reasoning parsers, as flat arg arrays, so the frontend
+        # can drop them straight into the launch command.
+        tool_calling = features.get("tool_calling") or {}
+        reasoning = features.get("reasoning") or {}
+
+        normalized = {
+            "exists": True,
+            "source_url": url,
+            "title": meta.get("title") or "",
+            "provider": meta.get("provider") or "",
+            "description": meta.get("description") or "",
+            "date_updated": str(meta.get("date_updated") or ""),
+            "hardware_support": meta.get("hardware") or {},
+            "model_id": model.get("model_id") or repo,
+            "min_vllm_version": model.get("min_vllm_version") or "",
+            "architecture": model.get("architecture") or "",
+            "parameter_count": model.get("parameter_count") or "",
+            "active_parameters": model.get("active_parameters") or "",
+            "context_length": model.get("context_length") or 0,
+            "base_args": list(model.get("base_args") or []),
+            "base_env": dict(model.get("base_env") or {}),
+            "tool_calling": {
+                "description": tool_calling.get("description") or "",
+                "args": list(tool_calling.get("args") or []),
+            } if tool_calling else None,
+            "reasoning": {
+                "description": reasoning.get("description") or "",
+                "args": list(reasoning.get("args") or []),
+            } if reasoning else None,
+            "dependencies": [
+                {
+                    "note": (d.get("note") or "").strip(),
+                    "command": (d.get("command") or "").strip(),
+                    "optional": bool(d.get("optional", False)),
+                }
+                for d in deps if isinstance(d, dict)
+            ],
+            "variants": {
+                k: {
+                    "model_id": v.get("model_id") or model.get("model_id") or repo,
+                    "precision": v.get("precision") or "",
+                    "vram_minimum_gb": v.get("vram_minimum_gb") or 0,
+                    "description": v.get("description") or "",
+                    "extra_args": list(v.get("extra_args") or []),
+                    "extra_env": dict(v.get("extra_env") or {}),
+                }
+                for k, v in variants.items() if isinstance(v, dict)
+            },
+            "hardware_overrides": {
+                hw: {
+                    "extra_args": list((ov or {}).get("extra_args") or []),
+                    "extra_env": dict((ov or {}).get("extra_env") or {}),
+                }
+                for hw, ov in hw_overrides.items() if isinstance(ov, dict)
+            },
+            "strategy_overrides": {
+                strat: dict(ov or {})
+                for strat, ov in strat_overrides.items() if isinstance(ov, dict)
+            },
+            "compatible_strategies": list(doc.get("compatible_strategies") or []),
+        }
+        _vllm_recipe_cache[repo] = (now, normalized)
+        return normalized
+
    @router.get("/api/cookbook/tasks/status")
    async def cookbook_tasks_status(request: Request):
        """Check status of all active cookbook tmux sessions.
@@ -503,7 +503,8 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
        user = get_current_user(request)
        try:
            data = await request.json()
-        except Exception:
+        except Exception as e:
+            logger.warning("Failed to parse export request body, defaulting to empty", exc_info=e)
            data = {}
        ids = data.get("ids") or []
        if not ids:
@@ -645,8 +646,8 @@ def setup_document_routes(session_manager, upload_handler=None) -> APIRouter:
                    try:
                        from src.agent_tools.document_tools import clear_active_document
                        clear_active_document(doc_id)
-                    except Exception:
-                        pass
+                    except Exception as e:
+                        logger.warning("Failed to clear active document %r on detach", doc_id, exc_info=e)
            db.commit()
            db.refresh(doc)
            return _doc_to_dict(doc)
@@ -13,6 +13,8 @@ and `email_pollers.py` (the background loops):
 """

 import os
+import base64
+import time
 import imaplib
 import smtplib
 import email as email_mod
@@ -38,6 +40,106 @@ from src.secret_storage import decrypt as _decrypt
 logger = logging.getLogger(__name__)


+def _xoauth2_raw(user: str, access_token: str) -> str:
+    """The SASL XOAUTH2 initial-response string (unencoded).
+
+    Both smtplib.SMTP.auth() and imaplib.IMAP4.authenticate() base64-encode
+    the value their callback returns, so callers pass this raw form — never
+    pre-encoded — to avoid double base64.
+    """
+    return f"user={user}\x01auth=Bearer {access_token}\x01\x01"
+
+
+def _xoauth2_bytes(user: str, access_token: str) -> bytes:
+    """Raw XOAUTH2 bytes for imaplib's authenticate() callback."""
+    return _xoauth2_raw(user, access_token).encode()
+
+
+def make_oauth_state(account_id: str, owner: str) -> str:
+    """Return an HMAC-signed, base64-encoded OAuth state token.
+
+    Encodes account_id + owner + a random nonce, signed with the app secret
+    so the callback can validate that the flow was initiated by an
+    authenticated, owning user (CSRF / state-forgery protection).
+    """
+    import hmac as _hmac, hashlib as _hl, secrets as _sec
+    from src.secret_storage import _load_or_create_key
+    nonce = _sec.token_hex(16)
+    payload = json.dumps({"a": account_id, "o": owner, "n": nonce}, separators=(",", ":"))
+    sig = _hmac.new(_load_or_create_key(), payload.encode(), _hl.sha256).hexdigest()
+    return base64.urlsafe_b64encode(f"{payload}|{sig}".encode()).decode()
+
+
+def verify_oauth_state(state: str) -> dict | None:
+    """Verify an OAuth state token's HMAC signature.
+
+    Returns the decoded payload dict ({"a", "o", "n"}) on success, or None if
+    the token is malformed, tampered, or signed with a different key.
+    """
+    import hmac as _hmac, hashlib as _hl
+    from src.secret_storage import _load_or_create_key
+    try:
+        decoded = base64.urlsafe_b64decode(state.encode()).decode()
+        payload, sig = decoded.rsplit("|", 1)
+        expected = _hmac.new(_load_or_create_key(), payload.encode(), _hl.sha256).hexdigest()
+        if not _hmac.compare_digest(sig, expected):
+            return None
+        return json.loads(payload)
+    except Exception:
+        return None
+
+
+def _refresh_google_token(account_id: str) -> str | None:
+    """Exchange the stored refresh token for a new access token and persist it."""
+    import httpx
+    from core.database import SessionLocal as _SL, EmailAccount as _EA
+    from src.secret_storage import encrypt as _enc, decrypt as _dec
+    client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
+    client_secret = os.environ.get("GOOGLE_OAUTH_CLIENT_SECRET", "")
+    if not client_id or not client_secret:
+        return None
+    db = _SL()
+    try:
+        row = db.get(_EA, account_id)
+        if not row or not row.oauth_refresh_token:
+            return None
+        refresh_token = _dec(row.oauth_refresh_token or "")
+        if not refresh_token:
+            return None
+        resp = httpx.post("https://oauth2.googleapis.com/token", data={
+            "client_id": client_id,
+            "client_secret": client_secret,
+            "refresh_token": refresh_token,
+            "grant_type": "refresh_token",
+        }, timeout=10)
+        resp.raise_for_status()
+        data = resp.json()
+        access_token = data["access_token"]
+        row.oauth_access_token = _enc(access_token)
+        row.oauth_token_expiry = str(int(time.time()) + data.get("expires_in", 3600))
+        db.commit()
+        return access_token
+    except Exception:
+        logger.warning(f"Google token refresh failed for account {account_id}", exc_info=True)
+        return None
+    finally:
+        db.close()
+
+
+def _get_valid_google_token(account_id: str, cfg: dict) -> str | None:
+    """Return a valid Google access token, refreshing if expired or missing."""
+    from src.secret_storage import decrypt as _dec
+    access_token = _dec(cfg.get("oauth_access_token") or "")
+    expiry_str = cfg.get("oauth_token_expiry") or ""
+    if access_token and expiry_str:
+        try:
+            if int(expiry_str) - 60 > time.time():
+                return access_token
+        except (ValueError, TypeError):
+            pass
+    return _refresh_google_token(account_id)
+
+
 def _smtp_security_mode(cfg: dict) -> str:
    raw = str(cfg.get("smtp_security") or "").strip().lower()
    if raw in {"ssl", "starttls", "none"}:
@@ -54,20 +156,29 @@ def _send_smtp_message(cfg: dict, from_addr: str, recipients: list[str], message
    port = int(cfg.get("smtp_port") or 465)
    user = cfg.get("smtp_user") or ""
    password = cfg.get("smtp_password") or ""
+
+    def _auth_smtp(smtp):
+        if cfg.get("oauth_provider") == "google":
+            token = _get_valid_google_token(cfg.get("account_id"), cfg)
+            if not token:
+                raise RuntimeError("Google OAuth token unavailable — reconnect the account")
+            smtp.ehlo()
+            smtp.auth("XOAUTH2", lambda challenge=None: _xoauth2_raw(user, token), initial_response_ok=True)
+        elif user and password:
+            smtp.login(user, password)
+
    security = _smtp_security_mode(cfg)

    if security == "ssl":
        with smtplib.SMTP_SSL(host, port, timeout=timeout) as smtp:
-            if user and password:
-                smtp.login(user, password)
+            _auth_smtp(smtp)
            smtp.sendmail(from_addr, recipients, message)
        return

    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        if security == "starttls":
            smtp.starttls()
-        if user and password:
-            smtp.login(user, password)
+        _auth_smtp(smtp)
        smtp.sendmail(from_addr, recipients, message)


@@ -701,10 +812,16 @@ def _get_email_config(account_id: str | None = None, owner: str = "") -> dict:
                    "imap_password": _decrypt(row.imap_password or ""),
                    "imap_starttls": bool(row.imap_starttls),
                    "from_address": row.from_address or row.imap_user or "",
+                    "oauth_provider": row.oauth_provider or "",
+                    "oauth_access_token": row.oauth_access_token or "",
+                    "oauth_refresh_token": row.oauth_refresh_token or "",
+                    "oauth_token_expiry": row.oauth_token_expiry or "",
+                    "display_name": row.display_name or "",
                }
-                if not (cfg["smtp_host"] and cfg["smtp_user"] and cfg["smtp_password"]):
+                is_oauth = bool(cfg.get("oauth_provider"))
+                if not is_oauth and not (cfg["smtp_host"] and cfg["smtp_user"] and cfg["smtp_password"]):
                    logger.warning(f"SMTP not configured for account {row.name!r}")
-                if not (cfg["imap_host"] and cfg["imap_user"] and cfg["imap_password"]):
+                if not is_oauth and not (cfg["imap_host"] and cfg["imap_user"] and cfg["imap_password"]):
                    logger.warning(f"IMAP not configured for account {row.name!r}")
                return cfg
        finally:
@@ -825,12 +942,19 @@ def _imap_connect(account_id: str | None = None, owner: str = "",
        timeout=timeout,
    )
    try:
-        conn.login(cfg["imap_user"], cfg["imap_password"])
+        if cfg.get("oauth_provider") == "google":
+            token = _get_valid_google_token(cfg.get("account_id"), cfg)
+            if not token:
+                raise RuntimeError("Google OAuth token unavailable — reconnect the account in Settings → Integrations")
+            conn.authenticate("XOAUTH2", lambda x: _xoauth2_bytes(cfg["imap_user"], token))
+        else:
+            conn.login(cfg["imap_user"], cfg["imap_password"])
    except Exception:
        # A failed AUTHENTICATE (e.g. an Office 365 app password on an
-        # MFA-enabled tenant, #3174) otherwise orphans the already-connected
-        # socket; close it before propagating so a misconfigured account
-        # can't leak one descriptor per retry / background poller pass.
+        # MFA-enabled tenant, #3174, or an expired/revoked OAuth token)
+        # otherwise orphans the already-connected socket; close it before
+        # propagating so a misconfigured account can't leak one descriptor
+        # per retry / background poller pass.
        try:
            conn.shutdown()
        except Exception:
@@ -13,7 +13,9 @@ handlers need. The split is mechanical — no behavior change.
 """

 import asyncio
+import os
 import sqlite3 as _sql3
+import time
 import email as email_mod
 import email.header
 import email.utils
@@ -43,6 +45,7 @@ from routes.email_helpers import (
    _load_settings, _save_settings, _get_email_config,
    _send_smtp_message, _smtp_security_mode,
    _IMAP_TIMEOUT_SECONDS, _open_imap_connection,
+    make_oauth_state, verify_oauth_state,
    _imap_connect, _imap, _decode_header, _detect_sent_folder, _detect_drafts_folder,
    _extract_attachment_text, _list_attachments_from_msg,
    _extract_attachment_to_disk, _extract_html, _extract_text,
@@ -76,15 +79,16 @@ def _email_tag_owner_aliases(account_id: str | None, owner: str = "") -> list[st
                        cfg.get("smtp_user") or "",
                        cfg.get("from_address") or "",
                    ])
-                except Exception:
+                except Exception as _e:
+                    logger.warning("Failed to resolve email account alias", exc_info=_e)
                    resolved_account_id = None
            row = db.get(_EA, resolved_account_id) if resolved_account_id else None
            if row:
                aliases.extend([row.owner or "", row.imap_user or "", row.from_address or ""])
        finally:
            db.close()
-    except Exception:
-        pass
+    except Exception as _e:
+        logger.warning("Failed to load email aliases", exc_info=_e)
    out = []
    for a in aliases:
        a = (a or "").strip()
@@ -285,7 +289,9 @@ def _group_uid_fetch_records(msg_data) -> list:


 def _smtp_ready(cfg: dict) -> bool:
-    return bool(cfg.get("smtp_host") and cfg.get("smtp_user") and cfg.get("smtp_password"))
+    if not cfg.get("smtp_host") or not cfg.get("smtp_user"):
+        return False
+    return bool(cfg.get("smtp_password") or cfg.get("oauth_provider"))


 def _resolve_send_config(account_id: str | None = None, owner: str = "") -> dict:
@@ -1097,7 +1103,12 @@ def setup_email_routes():
        account_id: str | None = Query(None),
        owner: str = Depends(require_owner),
    ):
-        """Search emails server-side via IMAP SEARCH. Matches subject, from, or body text."""
+        """Search emails server-side via IMAP SEARCH. Matches subject, from, or body text.
+
+        When the caller asks for INBOX and the account has an "All Mail"
+        folder (Gmail does), we transparently swap to All Mail so the
+        search surfaces archived / labelled emails too. Plain IMAP
+        accounts fall back to whatever folder the caller specified."""
        if not q or len(q) < 2:
            return {"emails": [], "total": 0, "query": q}
        # CRLF in q would terminate the IMAP command early — reject defensively.
@@ -1105,7 +1116,27 @@ def setup_email_routes():
            raise HTTPException(400, "Invalid query")
        try:
            with _imap(account_id, owner=owner) as conn:
-                conn.select(_q(folder), readonly=True)
+                # If the user asked for INBOX, try to upgrade to All Mail —
+                # one folder == every email on Gmail-class servers.
+                effective_folder = folder
+                if (folder or "").upper() == "INBOX":
+                    try:
+                        status, folder_lines = conn.list()
+                        if status == "OK" and folder_lines:
+                            for raw in folder_lines:
+                                if isinstance(raw, bytes):
+                                    raw = raw.decode("utf-8", errors="replace")
+                                m = re.match(r"\((?P<flags>[^)]*)\)\s+\"[^\"]*\"\s+(?P<name>.+)", raw)
+                                if not m:
+                                    continue
+                                flags = (m.group("flags") or "").lower()
+                                name = m.group("name").strip().strip('"')
+                                if "\\all" in flags or "all mail" in name.lower():
+                                    effective_folder = name
+                                    break
+                    except Exception:
+                        logger.debug("All Mail lookup failed; searching %r as requested", folder, exc_info=True)
+                conn.select(_q(effective_folder), readonly=True)

                # Escape backslash and quote for the IMAP-SEARCH quoted-string.
                q_escaped = q.replace('\\', '\\\\').replace('"', '\\"')
@@ -1113,7 +1144,7 @@ def setup_email_routes():

                status, data = _imap_uid_search(conn, search_cmd)
                if status != "OK" or not data[0]:
-                    return {"emails": [], "total": 0, "query": q}
+                    return {"emails": [], "total": 0, "query": q, "folder": effective_folder}

                uid_list = data[0].split()
                total = len(uid_list)
@@ -1178,6 +1209,13 @@ def setup_email_routes():
                            "is_flagged": "\\Flagged" in flags,
                            "flags": flags,
                            "has_attachments": has_attachments,
+                            # Stamp the folder so the frontend opens each
+                            # email from the folder it actually lives in
+                            # (the search may have run against All Mail
+                            # even though the caller asked for INBOX),
+                            # otherwise clicks open whatever happens to
+                            # have the same UID in INBOX → wrong email.
+                            "folder": effective_folder,
                        })
                    except Exception as e:
                        logger.warning(f"Error parsing search result {uid}: {e}")
@@ -1724,6 +1762,22 @@ def setup_email_routes():
            logger.error(f"Failed to mark unread {uid}: {e}")
            return {"success": False, "error": "Mail operation failed"}

+    @router.post("/flag/{uid}")
+    async def flag_email(uid: str, folder: str = Query("INBOX"), account_id: str | None = Query(None),
+                         on: bool = Query(True), owner: str = Depends(require_owner)):
+        """Set or clear the \\Flagged flag (a.k.a. favorite / star) on an email.
+        Pass `on=true` to favorite, `on=false` to unfavorite."""
+        try:
+            with _imap(account_id, owner=owner) as conn:
+                conn.select(_q(folder))
+                if not _store_email_flag(conn, uid, "\\Flagged", add=bool(on)):
+                    return {"success": False, "error": "Email not found"}
+            _invalidate_list_cache(account_id, folder)
+            return {"success": True, "flagged": bool(on)}
+        except Exception as e:
+            logger.error(f"Failed to flag {uid}: {e}")
+            return {"success": False, "error": "Mail operation failed"}
+
    @router.post("/mark-read/{uid}")
    async def mark_read(uid: str, folder: str = Query("INBOX"), account_id: str | None = Query(None), owner: str = Depends(require_owner)):
        """Mark an email as read (set \\Seen flag)."""
@@ -1973,7 +2027,7 @@ def setup_email_routes():
            outer = MIMEMultipart("alternative")
            body_container = outer

-        outer["From"] = cfg["from_address"]
+        outer["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
        outer["To"] = to
        if cc:
            outer["Cc"] = cc
@@ -2104,6 +2158,79 @@ def setup_email_routes():
            logger.error(f"cancel_scheduled {sid!r} failed: {e}")
            return {"success": False, "error": "Mail operation failed"}

+    # ── Agent send-confirm: list/approve/cancel ──────────────────────────
+    # When `agent_email_confirm` is on, the MCP send_email tool drops the
+    # composed email into scheduled_emails with status='agent_draft' (a
+    # far-future send_at so the poller never picks it up). These endpoints
+    # let the chat UI surface them for the user and either approve (flip
+    # to status='pending' with send_at=now so the poller delivers it) or
+    # cancel (status='cancelled').
+    @router.get("/pending")
+    async def list_pending_agent_drafts(owner: str = Depends(require_owner)):
+        import sqlite3
+        try:
+            conn = sqlite3.connect(SCHEDULED_DB)
+            conn.row_factory = sqlite3.Row
+            # The MCP server can't easily set owner, so it stores '' — fall
+            # back to those rows in addition to the caller's owner.
+            rows = conn.execute(
+                """SELECT id, to_addr, subject, body, created_at, account_id
+                   FROM scheduled_emails
+                   WHERE status = 'agent_draft' AND (owner = ? OR owner = '')
+                   ORDER BY created_at DESC""",
+                (owner or "",),
+            ).fetchall()
+            conn.close()
+            return {"pending": [dict(r) for r in rows]}
+        except Exception as e:
+            logger.error(f"list_pending_agent_drafts failed: {e}")
+            return {"pending": [], "error": "Mail operation failed"}
+
+    @router.post("/pending/{sid}/approve")
+    async def approve_agent_draft(sid: str, owner: str = Depends(require_owner)):
+        """Approve a draft staged by the agent: flip status → pending and
+        set send_at to the current UTC time so the scheduled-send poller
+        picks it up immediately."""
+        import sqlite3
+        try:
+            conn = sqlite3.connect(SCHEDULED_DB)
+            cur = conn.execute(
+                """UPDATE scheduled_emails
+                   SET status = 'pending', send_at = ?
+                   WHERE id = ? AND status = 'agent_draft' AND (owner = ? OR owner = '')""",
+                (datetime.utcnow().isoformat(), sid, owner or ""),
+            )
+            conn.commit()
+            affected = cur.rowcount
+            conn.close()
+            if not affected:
+                return {"success": False, "error": "Draft not found or already handled"}
+            return {"success": True}
+        except Exception as e:
+            logger.error(f"approve_agent_draft {sid!r} failed: {e}")
+            return {"success": False, "error": "Mail operation failed"}
+
+    @router.delete("/pending/{sid}")
+    async def cancel_agent_draft(sid: str, owner: str = Depends(require_owner)):
+        """Discard a draft the agent staged for approval."""
+        import sqlite3
+        try:
+            conn = sqlite3.connect(SCHEDULED_DB)
+            cur = conn.execute(
+                """UPDATE scheduled_emails SET status = 'cancelled'
+                   WHERE id = ? AND status = 'agent_draft' AND (owner = ? OR owner = '')""",
+                (sid, owner or ""),
+            )
+            conn.commit()
+            affected = cur.rowcount
+            conn.close()
+            if not affected:
+                return {"success": False, "error": "Draft not found or already handled"}
+            return {"success": True}
+        except Exception as e:
+            logger.error(f"cancel_agent_draft {sid!r} failed: {e}")
+            return {"success": False, "error": "Mail operation failed"}
+
    @router.get("/resolve-contact")
    async def resolve_contact(name: str = Query(..., description="Name to search for"), owner: str = Depends(require_owner)):
        """Search Sent folder for a contact by name. Returns matching email addresses."""
@@ -2164,6 +2291,7 @@ def setup_email_routes():
        try:
            cfg = _resolve_send_config(req.account_id, owner=owner)
        except Exception as e:
+            logger.warning(f"No SMTP-capable account resolved: {e}")
            return {"success": False, "error": str(e) or "No SMTP-capable email account configured"}

        # Use 'mixed' if we have attachments, 'alternative' otherwise
@@ -2176,7 +2304,7 @@ def setup_email_routes():
            outer = MIMEMultipart("alternative")
            body_container = outer

-        outer["From"] = cfg["from_address"]
+        outer["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
        outer["To"] = req.to
        if req.cc:
            outer["Cc"] = req.cc
@@ -2227,6 +2355,10 @@ def setup_email_routes():

        _account_id = cfg.get("account_id") or req.account_id  # capture for the IMAP append in the closure
        _in_reply_to = (req.in_reply_to or "").strip()
+        _oauth_provider = cfg.get("oauth_provider") or ""
+        _oauth_access_token = cfg.get("oauth_access_token") or ""
+        _oauth_refresh_token = cfg.get("oauth_refresh_token") or ""
+        _oauth_token_expiry = cfg.get("oauth_token_expiry") or ""

        def _deliver():
            try:
@@ -2237,6 +2369,11 @@ def setup_email_routes():
                        "smtp_security": _smtp_security,
                        "smtp_user": _smtp_user,
                        "smtp_password": _smtp_pw,
+                        "account_id": _account_id,
+                        "oauth_provider": _oauth_provider,
+                        "oauth_access_token": _oauth_access_token,
+                        "oauth_refresh_token": _oauth_refresh_token,
+                        "oauth_token_expiry": _oauth_token_expiry,
                    },
                    _from,
                    _recipients,
@@ -2349,7 +2486,7 @@ def setup_email_routes():
            msg.attach(MIMEText(_draft_html, "html", "utf-8"))
        else:
            msg = MIMEText(req.body, "plain", "utf-8")
-        msg["From"] = cfg["from_address"]
+        msg["From"] = email.utils.formataddr((cfg.get("display_name") or "", cfg["from_address"]))
        msg["To"] = req.to
        if req.cc:
            msg["Cc"] = req.cc
@@ -2617,11 +2754,15 @@ def setup_email_routes():
            source_uid = (data.get("uid") or "").strip()
            source_folder = (data.get("folder") or "INBOX").strip()
            fast_reply = bool(data.get("fast", False))
+            user_hint = (data.get("user_hint") or "").strip()

            if not original_body:
                return {"success": False, "error": "No email body provided"}

-            if message_id:
+            # Skip cache lookup when the caller supplied a user_hint — the
+            # cached generic reply doesn't reflect the instructions and
+            # would silently override them.
+            if message_id and not user_hint:
                try:
                    _c = _sql3.connect(SCHEDULED_DB)
                    owner_clause, owner_params = _email_cache_owner_clause(owner)
@@ -2761,8 +2902,13 @@ def setup_email_routes():
            user_msg = (
                f"Recipient: {to}\nSubject: {subject}\n\n"
                f"Original email and any current draft:\n{original_body[:6000]}\n\n"
-                f"Draft a reply. Return only the reply body text."
            )
+            if user_hint:
+                user_msg += (
+                    f"User's instructions for THIS reply (follow these — they override "
+                    f"defaults like length/tone):\n{user_hint[:2000]}\n\n"
+                )
+            user_msg += "Draft a reply. Return only the reply body text."

            # Build a candidate chain so a stale session-stored API key
            # (the most common cause of "authentication failed" here)
@@ -2992,6 +3138,8 @@ def setup_email_routes():
                    "from_address": r.from_address or "",
                    "has_imap_password": bool(r.imap_password),
                    "has_smtp_password": bool(r.smtp_password),
+                    "oauth_provider": r.oauth_provider or "",
+                    "display_name": r.display_name or "",
                })
            return {"accounts": out}
        finally:
@@ -3024,6 +3172,7 @@ def setup_email_routes():
                smtp_user=(data.get("smtp_user") or "").strip(),
                smtp_password=_enc(data.get("smtp_password") or ""),
                from_address=(data.get("from_address") or "").strip(),
+                display_name=(data.get("display_name") or "").strip(),
                # SECURITY: stamp the creator so all subsequent reads / mutations
                # can filter by user. Without this every new account leaks to
                # every other user.
@@ -3058,7 +3207,7 @@ def setup_email_routes():
            if not row:
                return {"ok": False, "error": "Account not found"}
            # Simple fields
-            for key in ("name", "imap_host", "imap_user", "smtp_host", "smtp_user", "from_address"):
+            for key in ("name", "imap_host", "imap_user", "smtp_host", "smtp_user", "from_address", "display_name"):
                if key in data:
                    setattr(row, key, (data[key] or "").strip())
            for key in ("imap_port", "smtp_port"):
@@ -3247,4 +3396,123 @@ def setup_email_routes():
        finally:
            db.close()

+    # ── Google OAuth2 routes ──
+
+    @router.get("/oauth/google/authorize")
+    async def google_oauth_authorize(account_id: str = Query(...), request: Request = None, owner: str = Depends(require_user)):
+        import urllib.parse
+        _assert_owns_account(account_id, owner)
+        client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
+        if not client_id:
+            raise HTTPException(400, "GOOGLE_OAUTH_CLIENT_ID not set — add it to .env")
+        redirect_uri = (
+            os.environ.get("GOOGLE_OAUTH_REDIRECT_URI")
+            or f"http://{request.headers.get('host', 'localhost:7000')}/api/email/oauth/google/callback"
+        )
+        state = make_oauth_state(account_id, owner)
+        params = urllib.parse.urlencode({
+            "client_id": client_id,
+            "redirect_uri": redirect_uri,
+            "response_type": "code",
+            "scope": "https://mail.google.com/ email",
+            "access_type": "offline",
+            "prompt": "consent",
+            "state": state,
+        })
+        from fastapi.responses import RedirectResponse as _RR
+        return _RR(f"https://accounts.google.com/o/oauth2/v2/auth?{params}")
+
+    @router.get("/oauth/google/callback")
+    async def google_oauth_callback(
+        code: str = Query(None),
+        state: str = Query(None),
+        error: str = Query(None),
+        request: Request = None,
+    ):
+        import urllib.parse
+        from fastapi.responses import RedirectResponse as _RR
+        if error:
+            return _RR("/?section=integrations&email_oauth_error=google_error")
+        if not code or not state:
+            return _RR("/?section=integrations&email_oauth_error=missing_code")
+        state_data = verify_oauth_state(state)
+        if not state_data:
+            return _RR("/?section=integrations&email_oauth_error=invalid_state")
+        account_id = state_data.get("a", "")
+        owner = state_data.get("o", "")
+        client_id = os.environ.get("GOOGLE_OAUTH_CLIENT_ID", "")
+        client_secret = os.environ.get("GOOGLE_OAUTH_CLIENT_SECRET", "")
+        redirect_uri = (
+            os.environ.get("GOOGLE_OAUTH_REDIRECT_URI")
+            or f"http://{request.headers.get('host', 'localhost:7000')}/api/email/oauth/google/callback"
+        )
+        import httpx as _httpx
+        try:
+            resp = _httpx.post("https://oauth2.googleapis.com/token", data={
+                "code": code,
+                "client_id": client_id,
+                "client_secret": client_secret,
+                "redirect_uri": redirect_uri,
+                "grant_type": "authorization_code",
+            }, timeout=10)
+            resp.raise_for_status()
+            data = resp.json()
+        except Exception:
+            logger.warning("Google token exchange failed")
+            return _RR("/?section=integrations&email_oauth_error=token_exchange_failed")
+        access_token = data.get("access_token", "")
+        refresh_token = data.get("refresh_token", "")
+        expiry = str(int(time.time()) + int(data.get("expires_in") or 3600))
+        # Fetch the email address from userinfo so we can auto-fill imap_user.
+        email_addr = ""
+        display_name = ""
+        try:
+            ui = _httpx.get("https://www.googleapis.com/oauth2/v1/userinfo",
+                            headers={"Authorization": f"Bearer {access_token}"}, timeout=10)
+            if ui.is_success:
+                ui_data = ui.json()
+                email_addr = ui_data.get("email", "")
+                display_name = ui_data.get("name", "")
+        except Exception:
+            pass  # best-effort: on failure email_addr and display_name stay empty
+        from core.database import SessionLocal, EmailAccount
+        from src.secret_storage import encrypt as _enc
+        db = SessionLocal()
+        try:
+            row = db.query(EmailAccount).filter(EmailAccount.id == account_id).first()
+            if not row:
+                return _RR("/?section=integrations&email_oauth_error=account_not_found")
+            # SECURITY: verify the account belongs to the initiating user.
+            if owner and row.owner and row.owner != owner:
+                logger.warning("OAuth callback owner mismatch — rejecting token write")
+                return _RR("/?section=integrations&email_oauth_error=ownership_error")
+            row.oauth_provider = "google"
+            row.oauth_access_token = _enc(access_token)
+            if refresh_token:
+                row.oauth_refresh_token = _enc(refresh_token)
+            row.oauth_token_expiry = expiry
+            # Auto-fill Google IMAP/SMTP settings if not already configured.
+            if not row.imap_host:
+                row.imap_host = "imap.gmail.com"
+                row.imap_port = 993
+                row.imap_starttls = False
+            if not row.smtp_host:
+                row.smtp_host = "smtp.gmail.com"
+                row.smtp_port = 587
+            if email_addr:
+                if not row.imap_user:
+                    row.imap_user = email_addr
+                if not row.smtp_user:
+                    row.smtp_user = email_addr
+                if not row.from_address:
+                    row.from_address = email_addr
+                if not row.name or row.name == row.id:
+                    row.name = email_addr
+            if display_name and not row.display_name:
+                row.display_name = display_name
+            db.commit()
+        finally:
+            db.close()
+        return _RR("/?section=integrations&email_oauth_success=1")
+
    return router
@@ -9,6 +9,7 @@ from pathlib import Path
 from fastapi import APIRouter, HTTPException, Form, Depends
 from core.constants import EMBEDDING_ENDPOINT_FILE, FASTEMBED_CACHE_DIR
 from core.middleware import require_admin
+from src.runtime_paths import get_app_root

 logger = logging.getLogger(__name__)

@@ -67,6 +67,14 @@ def _gallery_image_path(filename: str) -> Path:
        raise HTTPException(400, "Unsafe gallery filename")
    if safe_name != original:
        raise HTTPException(400, "Unsafe gallery filename")
+    if not path.exists():
+        cwd_root = (Path.cwd() / "data" / "generated_images").resolve()
+        cwd_path = (cwd_root / safe_name).resolve()
+        try:
+            if os.path.commonpath([str(cwd_root), str(cwd_path)]) == str(cwd_root) and cwd_path.exists():
+                return cwd_path
+        except Exception:
+            pass  # e.g. ValueError from commonpath on mismatched drives; use the default path
    return path


@@ -224,8 +232,6 @@ def setup_gallery_routes() -> APIRouter:
    @router.post("/api/gallery/{image_id}/replace")
    async def gallery_replace(request: Request, image_id: str):
        """Replace an existing gallery image file with a new one."""
-        from pathlib import Path
-
        user = get_current_user(request)
        db = SessionLocal()
        try:
@@ -241,9 +247,8 @@ def setup_gallery_routes() -> APIRouter:
                raise HTTPException(400, "No image provided")

            content = await read_upload_limited(file, GALLERY_UPLOAD_MAX_BYTES, "Gallery replacement")
-            img_dir = Path(GENERATED_IMAGES_DIR)
-            img_dir.mkdir(parents=True, exist_ok=True)
-            img_path = img_dir / _sanitize_gallery_filename(img.filename)
+            GALLERY_IMAGE_DIR.mkdir(parents=True, exist_ok=True)
+            img_path = _gallery_image_path(img.filename)
            img_path.write_bytes(content)

            # Refresh dimensions in case the editor resized the canvas.
@@ -119,7 +119,7 @@ def setup_hwfit_routes():
        return detect_system(host=host, ssh_port=ssh_port, platform=platform, fresh=fresh)

    @router.get("/models")
-    def get_models(use_case: str = "", sort: str = "score", limit: int = 50, search: str = "", host: str = "", quant: str = "", ctx: str = "", gpu_count: str = "", gpu_group: str = "", ssh_port: str = "", platform: str = "", fresh: bool = False, manual_mode: str = "", manual_gpu_count: str = "", manual_vram_gb: str = "", manual_ram_gb: str = "", manual_backend: str = "", ignore_detected_gpu: bool = False, ignore_detected_ram: bool = False, fit_only: bool = False):
+    def get_models(use_case: str = "", sort: str = "newest", limit: int = 50, search: str = "", host: str = "", quant: str = "", ctx: str = "", gpu_count: str = "", gpu_group: str = "", ssh_port: str = "", platform: str = "", fresh: bool = False, manual_mode: str = "", manual_gpu_count: str = "", manual_vram_gb: str = "", manual_ram_gb: str = "", manual_backend: str = "", ignore_detected_gpu: bool = False, ignore_detected_ram: bool = False, fit_only: bool = False):
        """Rank LLM models against detected hardware and return scored results.
        gpu_count: override GPU count (0 = CPU only, 1-N = simulate N GPUs of the
            active group). gpu_group: index into system.gpu_groups (the homogeneous
@@ -26,7 +26,7 @@ from src.endpoint_resolver import (
    build_models_url,
    build_headers,
 )
-from src.auth_helpers import _auth_disabled, owner_filter
+from src.auth_helpers import _auth_disabled, effective_user, owner_filter

 logger = logging.getLogger(__name__)

@@ -1255,13 +1255,16 @@ def setup_model_routes(model_discovery):
        # Require auth; "" is the unconfigured single-user mode, treated as
        # "see everything" by _fetch_models.
        try:
-            from src.auth_helpers import get_current_user as _gcu
-            owner = _gcu(request) or ""
-        except Exception:
-            owner = ""
-        # Reject anonymous in configured deployments — no leaking the model
-        # list to unauthenticated callers.
-        try:
+            if getattr(request.state, "api_token", False):
+                scopes = set(getattr(request.state, "api_token_scopes", []) or [])
+                if "chat" not in scopes:
+                    raise HTTPException(403, "API token is not scoped for chat")
+                if not getattr(request.state, "api_token_owner", None):
+                    raise HTTPException(403, "API token has no owner")
+            owner = effective_user(request) or ""
+
+            # Reject anonymous in configured deployments — no leaking the model
+            # list to unauthenticated callers.
            auth_mgr = getattr(request.app.state, "auth_manager", None)
            if not owner and not _auth_disabled() and auth_mgr is not None and getattr(auth_mgr, "is_configured", False):
                raise HTTPException(401, "Not authenticated")
@@ -10,7 +10,7 @@ from fastapi import APIRouter, HTTPException, Request
 from pydantic import BaseModel

 from core.database import SessionLocal, Note
-from src.auth_helpers import get_current_user
+from src.auth_helpers import require_user
 from src.constants import DATA_DIR
 from sqlalchemy.orm.attributes import flag_modified

@@ -208,14 +208,17 @@ async def dispatch_reminder(
        try:
            from src.endpoint_resolver import resolve_endpoint
            from src.llm_core import llm_call_async
+            from src.reminder_personas import synthesis_system_prompt
            url, model, headers = resolve_endpoint("utility", owner=owner or None)
            if not url:
                url, model, headers = resolve_endpoint("default", owner=owner or None)
            if url and model:
+                persona_id = (settings.get("reminder_llm_persona") or "").strip()
+                sys_prompt = synthesis_system_prompt(persona_id)
                raw = await llm_call_async(
                    url=url, model=model,
                    messages=[
-                        {"role": "system", "content": "You are a reminder assistant. Write a single short, warm, motivating sentence (max 25 words) reminding the user about the note below. Do not add greetings, preamble, or hashtags. Output only the sentence."},
+                        {"role": "system", "content": sys_prompt},
                        {"role": "user", "content": f"Title: {title}\n\n{note_body}".strip()},
                    ],
                    temperature=0.7, max_tokens=200, headers=headers, timeout=30,
@@ -567,7 +570,16 @@ def setup_note_routes(task_scheduler=None):
    router = APIRouter(prefix="/api/notes", tags=["notes"])

    def _owner(request: Request) -> Optional[str]:
-        return get_current_user(request)
+        # require_user, not bare get_current_user: a request that reaches
+        # these owner-scoped routes with NO identity (auth-middleware
+        # regression, SSRF from a sibling service) must fail closed (401)
+        # when auth is configured — not be treated as the single-user mode
+        # and handed blanket access to every account's notes. The documented
+        # anonymous modes (AUTH_ENABLED=false, LOCALHOST_BYPASS on loopback,
+        # unconfigured first-run) still resolve to None, the single-user
+        # path. fire_reminder below already gated this way; the CRUD routes
+        # did not.
+        return require_user(request) or None

    def _is_admin_or_single_user(request: Request, user: str | None) -> bool:
        if user == "internal-tool":
@@ -802,8 +814,7 @@ def setup_note_routes(task_scheduler=None):
        Returns {synthesis, email_sent}.
        """
        # Gate against anonymous callers — LLM synthesis can burn tokens.
-        from src.auth_helpers import require_user as _ru
-        user = _ru(request)
+        user = require_user(request)
        body = await request.json()
        note_id = str(body.get("note_id") or "").strip()
        if not note_id:
@@ -826,6 +837,12 @@ def setup_note_routes(task_scheduler=None):
                _override["reminder_webhook_integration_id"] = body["webhook_integration_id"]
            if body.get("webhook_payload_template"):
                _override["reminder_webhook_payload_template"] = body["webhook_payload_template"]
+            # Mirror the in-UI AI Synthesis toggle + persona so the test
+            # actually exercises the synthesis path before/without a Save.
+            if "llm_synthesis" in body:
+                _override["reminder_llm_synthesis"] = bool(body["llm_synthesis"])
+            if "llm_persona" in body:
+                _override["reminder_llm_persona"] = str(body["llm_persona"] or "")
        else:
            db = SessionLocal()
            try:
@@ -278,8 +278,8 @@ def setup_personal_routes(personal_docs_manager, rag_manager, rag_available):
            # Delete file from disk if it's in uploads dir
            deleted_from_disk = False
            try:
-                abs_target = os.path.abspath(filepath)
-                base_abs = os.path.abspath(UPLOADS_DIR)
+                abs_target = os.path.realpath(filepath)
+                base_abs = os.path.realpath(UPLOADS_DIR)
                in_uploads = (
                    abs_target == base_abs
                    or os.path.commonpath([abs_target, base_abs]) == base_abs
@@ -691,8 +691,12 @@ async def _run_skill_test_once(md: str, task: str, url, model, headers, owner) -
        {"role": "user", "content": task},
    ]
    try:
+        # max_tokens explicitly set: passing 0 lets some upstreams (Ollama,
+        # OpenAI-compat) generate an empty completion, which manifested as
+        # the skill test returning nothing while chat (which carries its
+        # preset's max_tokens) worked. 4096 matches the chat default.
        async for chunk in stream_agent_loop(url, model, messages, headers=headers,
-                                             temperature=0.3, max_tokens=0, max_rounds=8, owner=owner):
+                                             temperature=0.3, max_tokens=4096, max_rounds=8, owner=owner):
            if not chunk.startswith("data: ") or chunk.strip() == "data: [DONE]":
                continue
            try:
@@ -151,6 +151,7 @@ class TaskCreate(BaseModel):
    endpoint_url: Optional[str] = None
    then_task_id: Optional[str] = None            # chain: run this task after success
    notifications_enabled: Optional[bool] = None  # None lets action-specific defaults apply
+    character_id: Optional[str] = None             # built-in persona id (PERSONAS) — biases output voice


 class TaskUpdate(BaseModel):
@@ -171,6 +172,7 @@ class TaskUpdate(BaseModel):
    endpoint_url: Optional[str] = None
    then_task_id: Optional[str] = None
    notifications_enabled: Optional[bool] = None
+    character_id: Optional[str] = None


 def _display_task_name(t: ScheduledTask) -> str:
@@ -203,6 +205,7 @@ def _task_to_dict(t: ScheduledTask, include_last_run_result: bool = False) -> di
        "output_target": t.output_target,
        "session_id": t.session_id,
        "crew_member_id": getattr(t, "crew_member_id", None),
+        "character_id": getattr(t, "character_id", None),
        "model": t.model,
        "endpoint_url": t.endpoint_url,
        "run_count": t.run_count or 0,
@@ -552,6 +555,7 @@ def setup_task_routes(task_scheduler) -> APIRouter:
                then_task_id=then_task_id,
                webhook_token=webhook_token,
                notifications_enabled=notifications_enabled,
+                character_id=(req.character_id or None),
            )
            db.add(task)
            db.commit()
@@ -705,6 +709,9 @@ def setup_task_routes(task_scheduler) -> APIRouter:
                task.then_task_id = _validate_then_task_id(db, req.then_task_id, user, current_task_id=task.id)
            if req.notifications_enabled is not None:
                task.notifications_enabled = bool(req.notifications_enabled)
+            if req.character_id is not None:
+                # Empty string clears the persona; non-empty stores the id.
+                task.character_id = req.character_id or None
            if req.cron_expression is not None:
                if req.cron_expression:
                    try:
@@ -0,0 +1,133 @@
+#!/usr/bin/env python3
+"""Backfill release_date on entries in services/hwfit/data/hf_models.json.
+
+Why: the `newest` sort in the cookbook ranks rows by release_date. Anything
+missing a date sorts to the bottom. This script pulls `created_at` from the
+HuggingFace API for each catalog entry without one (or all entries when
+--refresh is passed) and writes the catalog back.
+
+Usage:
+    python scripts/backfill_model_release_dates.py            # missing only
+    python scripts/backfill_model_release_dates.py --refresh  # all entries
+    python scripts/backfill_model_release_dates.py --limit 50 # cap requests
+    python scripts/backfill_model_release_dates.py --dry-run  # show, don't write
+
+Auth: set HF_TOKEN env var (or huggingface-cli login) to access gated repos.
+"""
+import argparse
+import json
+import os
+import sys
+import time
+from datetime import datetime
+from pathlib import Path
+
+try:
+    from huggingface_hub import HfApi
+    from huggingface_hub.utils import HfHubHTTPError
+except ImportError:
+    print("Install huggingface_hub: pip install huggingface_hub", file=sys.stderr)
+    sys.exit(1)
+
+
+CATALOG_PATH = Path(__file__).resolve().parent.parent / "services" / "hwfit" / "data" / "hf_models.json"
+
+
+def fetch_release_date(api: HfApi, repo_id: str) -> str | None:
+    """Return YYYY-MM-DD release date, or None on miss / error."""
+    try:
+        info = api.model_info(repo_id, files_metadata=False)
+    except HfHubHTTPError as e:
+        # 401 = gated/private, 404 = renamed/deleted. Either way, no date.
+        status = getattr(getattr(e, "response", None), "status_code", None)
+        print(f"  {repo_id}: HTTP {status or '?'}", file=sys.stderr)
+        return None
+    except Exception as e:
+        print(f"  {repo_id}: {type(e).__name__}: {e}", file=sys.stderr)
+        return None
+    created = getattr(info, "created_at", None)
+    if not created:
+        return None
+    return created.strftime("%Y-%m-%d")
+
+
+def main():
+    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    p.add_argument("--refresh", action="store_true", help="Overwrite existing release_date too (default: only fill missing).")
+    p.add_argument("--limit", type=int, default=0, help="Stop after N API calls (0 = no limit).")
+    p.add_argument("--dry-run", action="store_true", help="Don't write back; just report.")
+    p.add_argument("--sleep", type=float, default=0.05, help="Seconds to sleep between requests (default 0.05).")
+    args = p.parse_args()
+
+    if not CATALOG_PATH.exists():
+        print(f"Catalog not found: {CATALOG_PATH}", file=sys.stderr)
+        sys.exit(2)
+
+    with CATALOG_PATH.open(encoding="utf-8") as f:
+        catalog = json.load(f)
+
+    candidates = []
+    for i, m in enumerate(catalog):
+        name = m.get("name")
+        if not name:
+            continue
+        existing = (m.get("release_date") or "").strip()
+        if existing and not args.refresh:
+            continue
+        candidates.append(i)
+
+    if args.limit:
+        candidates = candidates[: args.limit]
+
+    print(f"Catalog: {CATALOG_PATH}")
+    print(f"Total entries: {len(catalog)}")
+    print(f"Targets ({'refresh all' if args.refresh else 'missing only'}{'' if not args.limit else f', capped at {args.limit}'}): {len(candidates)}")
+    if not candidates:
+        print("Nothing to do.")
+        return
+
+    api = HfApi(token=os.environ.get("HF_TOKEN") or None)
+    updated = 0
+    skipped = 0
+    started = time.time()
+    for n, idx in enumerate(candidates, start=1):
+        entry = catalog[idx]
+        name = entry["name"]
+        old = (entry.get("release_date") or "").strip()
+        new = fetch_release_date(api, name)
+        if new is None:
+            skipped += 1
+            tag = "skip"
+        elif new == old:
+            tag = "unchanged"
+        else:
+            entry["release_date"] = new
+            updated += 1
+            tag = f"set {new}" + (f" (was {old})" if old else "")
+        print(f"[{n}/{len(candidates)}] {name} — {tag}")
+        if args.sleep:
+            time.sleep(args.sleep)
+
+    elapsed = time.time() - started
+    print()
+    print(f"Done in {elapsed:.1f}s — {updated} updated, {skipped} skipped (HF unavailable / gated / missing date).")
+
+    if args.dry_run:
+        print("Dry run — no write.")
+        return
+
+    if updated:
+        # Atomic write: tmp file in the same dir, then rename. Keeps the
+        # catalog usable even if the process dies mid-write.
+        tmp = CATALOG_PATH.with_suffix(".json.tmp")
+        with tmp.open("w", encoding="utf-8") as f:
+            json.dump(catalog, f, indent=1, ensure_ascii=False)
+            f.write("\n")
+        tmp.replace(CATALOG_PATH)
+        print(f"Wrote {CATALOG_PATH}")
+    else:
+        print("No changes to write.")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,341 @@
+#!/usr/bin/env python3
+"""Import models from the upstream vllm-project/recipes catalog into our
+local hf_models.json. Two modes:
+
+  --update-existing  Stamp min_vllm_version + vllm_recipe=True on rows we
+                     already carry. Cheap, no HF API calls.
+  --add-missing      Create new catalog rows for every recipe model we
+                     don't carry. Hits the HF API for created_at + downloads
+                     (~1 req per missing model, paced).
+
+Both modes write atomically (tmp + rename) so a crashed run leaves the
+catalog intact. With no mode flags, both modes run, but prefer passing
+them explicitly.
+
+Usage:
+    python scripts/import_from_vllm_recipes.py --update-existing
+    python scripts/import_from_vllm_recipes.py --add-missing
+    python scripts/import_from_vllm_recipes.py --dry-run
+    python scripts/import_from_vllm_recipes.py --limit 10
+
+Auth: set HF_TOKEN to access gated repos when --add-missing.
+"""
+import argparse
+import json
+import os
+import re
+import sys
+import time
+from datetime import datetime
+from pathlib import Path
+
+try:
+    import httpx
+    import yaml
+except ImportError:
+    print("pip install httpx PyYAML", file=sys.stderr)
+    sys.exit(1)
+
+try:
+    from huggingface_hub import HfApi
+    from huggingface_hub.utils import HfHubHTTPError
+except ImportError:
+    HfApi = None
+    HfHubHTTPError = Exception
+
+
+CATALOG_PATH = Path(__file__).resolve().parent.parent / "services" / "hwfit" / "data" / "hf_models.json"
+RECIPES_TREE_URL = (
+    "https://api.github.com/repos/vllm-project/recipes/git/trees/main?recursive=1"
+)
+RECIPE_RAW_URL = (
+    "https://raw.githubusercontent.com/vllm-project/recipes/main/models/{repo}.yaml"
+)
+
+
+# Map recipe `precision` to the closest catalog `quantization` label that
+# fit.py / models.py already understand.
+_PRECISION_TO_QUANT = {
+    "fp8": "FP8",
+    "nvfp4": "NVFP4",
+    "mxfp4": "MXFP4",
+    "bf16": "BF16",
+    "fp16": "F16",
+    "f16": "F16",
+    "fp4": "FP4",
+    "int8": "INT8",
+    "int4": "INT4",
+    "awq-4bit": "AWQ-4bit",
+    "awq-8bit": "AWQ-8bit",
+}
+
+# Architecture name → use_case fallback. fit.py weights use_case for filtering;
+# missing field defaults to a generic bucket.
+_ARCH_USE_CASE = {
+    "moe": "General-purpose reasoning, long-context",
+    "llama": "General-purpose chat",
+    "qwen2": "General-purpose chat",
+    "qwen3": "General-purpose reasoning",
+    "deepseek_v3_moe": "General-purpose reasoning, long-context",
+    "deepseek_v4_moe": "General-purpose reasoning, long-context",
+}
+
+
+def _parse_param_count(s) -> int:
+    """'230B' / '8.6B' / '4.2T' → integer parameter count."""
+    if s is None:
+        return 0
+    s = str(s).strip().replace(",", "")
+    m = re.match(r"^([\d.]+)\s*([KMBT]?)$", s, re.I)
+    if not m:
+        return 0
+    num = float(m.group(1))
+    unit = (m.group(2) or "").upper()
+    mult = {"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12, "": 1.0}[unit]
+    return int(num * mult)
+
+
+def _capabilities_for(arch: str, hardware: dict, ctx_len: int, has_reasoning: bool) -> list[str]:
+    caps = []
+    if "moe" in (arch or "").lower():
+        caps.append("moe")
+    if has_reasoning:
+        caps.append("reasoning")
+    if ctx_len and ctx_len >= 100_000:
+        caps.append("long_context")
+    if any(hw in (hardware or {}) for hw in ("mi300x", "mi325x", "mi350x", "mi355x")):
+        caps.append("amd_supported")
+    return caps
+
+
+def _fetch_manifest(client: httpx.Client) -> set[str]:
+    r = client.get(RECIPES_TREE_URL, headers={"Accept": "application/vnd.github+json"}, timeout=15)
+    r.raise_for_status()
+    tree = (r.json() or {}).get("tree") or []
+    out: set[str] = set()
+    for e in tree:
+        path = (e or {}).get("path") or ""
+        if path.startswith("models/") and path.endswith(".yaml"):
+            body = path[len("models/"):-len(".yaml")]
+            if "/" in body:
+                out.add(body)
+    return out
+
+
+def _fetch_recipe(client: httpx.Client, repo: str) -> dict | None:
+    url = RECIPE_RAW_URL.format(repo=repo)
+    try:
+        r = client.get(url, timeout=10)
+        if r.status_code != 200:
+            return None
+        return yaml.safe_load(r.text) or {}
+    except Exception:
+        return None
+
+
+def _stamp_from_recipe(entry: dict, recipe: dict) -> bool:
+    """Mutate entry with recipe-derived fields. Returns True if anything changed."""
+    model = recipe.get("model") or {}
+    meta = recipe.get("meta") or {}
+    features = recipe.get("features") or {}
+
+    changed = False
+    new_min = (model.get("min_vllm_version") or "").strip()
+    if new_min and entry.get("min_vllm_version") != new_min:
+        entry["min_vllm_version"] = new_min
+        changed = True
+    if not entry.get("vllm_recipe"):
+        entry["vllm_recipe"] = True
+        changed = True
+    # Hardware support map — useful for filtering "which models run on my AMD box".
+    hw = meta.get("hardware") or {}
+    if hw and entry.get("recipe_hardware") != hw:
+        entry["recipe_hardware"] = {k: str(v) for k, v in hw.items()}
+        changed = True
+    # Tool/reasoning parser hints — purely informational at catalog level;
+    # the live launch command builder still reads them from the recipe API.
+    if features.get("reasoning") and not entry.get("has_reasoning_parser"):
+        entry["has_reasoning_parser"] = True
+        changed = True
+    if features.get("tool_calling") and not entry.get("has_tool_call_parser"):
+        entry["has_tool_call_parser"] = True
+        changed = True
+    return changed
+
+
+def _build_new_entry(repo: str, recipe: dict, hf_info=None) -> dict | None:
+    """Build a fresh catalog entry from a recipe + (optional) HF model info."""
+    model = recipe.get("model") or {}
+    meta = recipe.get("meta") or {}
+    features = recipe.get("features") or {}
+    variants = recipe.get("variants") or {}
+
+    org, name = repo.split("/", 1)
+    raw_params = _parse_param_count(model.get("parameter_count"))
+    active_raw = _parse_param_count(model.get("active_parameters"))
+    ctx = model.get("context_length") or 0
+
+    # Pick the smallest-VRAM variant as the catalog quant — that's what most
+    # users land on first. NVFP4/MXFP4 typically win this on Blackwell;
+    # FP8 elsewhere; BF16 baseline only.
+    pick_quant = None
+    pick_vram = None
+    for vk, vv in variants.items():
+        if not isinstance(vv, dict):
+            continue
+        prec = (vv.get("precision") or "").lower()
+        vram = vv.get("vram_minimum_gb") or 0
+        quant = _PRECISION_TO_QUANT.get(prec)
+        if quant and (pick_vram is None or (vram and vram < pick_vram)):
+            pick_quant = quant
+            pick_vram = vram or pick_vram
+    if not pick_quant:
+        pick_quant = "BF16"
+
+    arch = (model.get("architecture") or "").lower()
+    use_case = _ARCH_USE_CASE.get(arch, "General-purpose chat")
+    caps = _capabilities_for(arch, meta.get("hardware") or {}, ctx, bool(features.get("reasoning")))
+
+    rel_date = ""
+    downloads = 0
+    likes = 0
+    if hf_info is not None:
+        created = getattr(hf_info, "created_at", None)
+        if created:
+            rel_date = created.strftime("%Y-%m-%d")
+        downloads = int(getattr(hf_info, "downloads", 0) or 0)
+        likes = int(getattr(hf_info, "likes", 0) or 0)
+    if not rel_date:
+        rel_date = str(meta.get("date_updated") or time.strftime("%Y-%m-%d", time.gmtime()))
+
+    entry: dict = {
+        "name": repo,
+        "provider": org,
+        "parameter_count": str(model.get("parameter_count") or "?"),
+        "parameters_raw": raw_params,
+        "is_moe": "moe" in arch,
+        "quantization": pick_quant,
+        "context_length": int(ctx or 0),
+        "use_case": use_case,
+        "capabilities": caps,
+        "pipeline_tag": "text-generation",
+        "architecture": arch or "unknown",
+        "hf_downloads": downloads,
+        "hf_likes": likes,
+        "release_date": rel_date,
+        # Recipe-derived bits.
+        "vllm_recipe": True,
+        "min_vllm_version": (model.get("min_vllm_version") or "").strip() or None,
+        "recipe_hardware": {k: str(v) for k, v in (meta.get("hardware") or {}).items()},
+        "has_reasoning_parser": bool(features.get("reasoning")),
+        "has_tool_call_parser": bool(features.get("tool_calling")),
+    }
+    if active_raw:
+        entry["active_parameters"] = active_raw
+    if pick_vram:
+        # min_vram_gb is what hwfit uses for "does this fit". The recipe's
+        # minimum for the chosen variant is used as-is; RAM figures scale from it.
+        entry["min_vram_gb"] = float(pick_vram)
+        entry["min_ram_gb"] = float(round(pick_vram * 0.6, 1))
+        entry["recommended_ram_gb"] = float(round(pick_vram * 1.2, 1))
+    # Drop empty / None fields to keep the JSON tidy.
+    return {k: v for k, v in entry.items() if v not in (None, "", [], {})}
+
+
+def main():
+    p = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    p.add_argument("--update-existing", action="store_true", help="Stamp min_vllm_version + vllm_recipe on existing rows.")
+    p.add_argument("--add-missing", action="store_true", help="Add new rows for recipe models not in the catalog.")
+    p.add_argument("--limit", type=int, default=0, help="Stop after N recipe fetches.")
+    p.add_argument("--dry-run", action="store_true", help="Don't write back; just report.")
+    p.add_argument("--sleep", type=float, default=0.05, help="Seconds between HTTP requests.")
+    args = p.parse_args()
+    if not args.update_existing and not args.add_missing:
+        args.update_existing = args.add_missing = True
+
+    with CATALOG_PATH.open(encoding="utf-8") as f:
+        catalog = json.load(f)
+    by_name = {m.get("name"): m for m in catalog if m.get("name")}
+
+    client = httpx.Client(follow_redirects=True)
+    print(f"Catalog: {CATALOG_PATH} ({len(catalog)} entries)")
+    print("Fetching upstream manifest…")
+    try:
+        manifest = _fetch_manifest(client)
+    except Exception as e:
+        print(f"FATAL: manifest fetch failed: {e}", file=sys.stderr)
+        sys.exit(2)
+    print(f"Manifest: {len(manifest)} recipes")
+
+    existing = sorted(by_name.keys() & manifest)
+    missing = sorted(manifest - by_name.keys())
+    print(f"Match catalog ↔ manifest: existing={len(existing)} missing={len(missing)}")
+
+    targets: list[tuple[str, str]] = []  # (repo, action)
+    if args.update_existing:
+        targets.extend((r, "update") for r in existing)
+    if args.add_missing:
+        targets.extend((r, "add") for r in missing)
+    if args.limit:
+        targets = targets[: args.limit]
+    print(f"Targets: {len(targets)}")
+
+    hf_api = HfApi(token=os.environ.get("HF_TOKEN") or None) if HfApi else None
+    updated = added = skipped = 0
+    started = time.time()
+
+    for n, (repo, action) in enumerate(targets, 1):
+        recipe = _fetch_recipe(client, repo)
+        if not recipe:
+            print(f"[{n}/{len(targets)}] {repo:55} skip (no recipe fetched)")
+            skipped += 1
+            time.sleep(args.sleep)
+            continue
+        if action == "update":
+            entry = by_name[repo]
+            if _stamp_from_recipe(entry, recipe):
+                updated += 1
+                print(f"[{n}/{len(targets)}] {repo:55} updated")
+            else:
+                print(f"[{n}/{len(targets)}] {repo:55} unchanged")
+        else:  # add
+            hf_info = None
+            if hf_api:
+                try:
+                    hf_info = hf_api.model_info(repo, files_metadata=False)
+                except HfHubHTTPError as e:
+                    code = getattr(getattr(e, "response", None), "status_code", "?")
+                    print(f"  HF {code} for {repo} — building from recipe only", file=sys.stderr)
+                except Exception as e:
+                    print(f"  HF error for {repo}: {e}", file=sys.stderr)
+            new_entry = _build_new_entry(repo, recipe, hf_info)
+            if new_entry:
+                catalog.append(new_entry)
+                by_name[repo] = new_entry
+                added += 1
+                print(f"[{n}/{len(targets)}] {repo:55} added ({new_entry.get('parameter_count','?')}, {new_entry.get('quantization','?')})")
+            else:
+                skipped += 1
+                print(f"[{n}/{len(targets)}] {repo:55} skip (couldn't build entry)")
+        time.sleep(args.sleep)
+
+    elapsed = time.time() - started
+    print()
+    print(f"Done in {elapsed:.1f}s — added={added}, updated={updated}, skipped={skipped}")
+
+    if args.dry_run:
+        print("Dry run — no write.")
+        return
+    if added or updated:
+        tmp = CATALOG_PATH.with_suffix(".json.tmp")
+        with tmp.open("w", encoding="utf-8") as f:
+            json.dump(catalog, f, indent=1, ensure_ascii=False)
+            f.write("\n")
+        tmp.replace(CATALOG_PATH)
+        print(f"Wrote {CATALOG_PATH} ({len(catalog)} entries)")
+    else:
+        print("No changes — catalog untouched.")
+
+
+if __name__ == "__main__":
+    main()
@@ -19,6 +19,10 @@ GPU_BANDWIDTH = {
    "6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224,
    "mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229,
    "9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322,
+    # NVIDIA GB10 Grace-Blackwell superchip (DGX Spark). Unified LPDDR5X memory,
+    # not Apple Silicon, so it lives in the generic GPU table — the Apple-only
+    # lookup never matches it (its name carries no "apple").
+    "gb10": 273,
 }

 # Pre-sort keys by length descending for correct substring matching
@@ -109,10 +113,15 @@ def _lookup_bandwidth(system):
    if not isinstance(gpu_name, str) or not gpu_name:
        return None

-    if isinstance(system, dict):
-        bw = _lookup_apple_bandwidth(system)
-        if bw is not None:
-            return bw
+    # Apple tiers live only in the Apple-specific table now (#2564), so route
+    # BOTH dict and bare-string callers through it. A bare string carries no
+    # gpu_cores, so the helper falls back to the conservative (lowest) tier for
+    # that model -- before #2564 the generic table answered string lookups, and
+    # dropping that made _lookup_bandwidth("Apple M3 Max") return None.
+    apple_input = system if isinstance(system, dict) else {"gpu_name": gpu_name}
+    bw = _lookup_apple_bandwidth(apple_input)
+    if bw is not None:
+        return bw

    gn = gpu_name.lower()
    for key in _BW_KEYS_SORTED:
@@ -15,6 +15,8 @@ from urllib.parse import urljoin, urlparse
 import httpx
 from bs4 import BeautifulSoup

+from src.constants import WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES
+
 from .analytics import RateLimitError, error_logger
 from .cache import (
    CONTENT_CACHE_DIR,
@@ -89,18 +91,128 @@ def _public_http_url(url: str) -> bool:
        return False


-def _get_public_url(url: str, headers: dict, timeout: int, max_redirects: int = 5) -> httpx.Response:
+class BodyTooLargeError(Exception):
+    """The server declared a body larger than the hard fetch ceiling."""
+
+    def __init__(self, url: str, declared_bytes: int):
+        self.url = url
+        self.declared_bytes = declared_bytes
+        super().__init__(
+            f"response body is {declared_bytes:,} bytes, over the "
+            f"{WEB_FETCH_HARD_MAX_BYTES:,}-byte hard cap"
+        )
+
+
+class _CappedFetch:
+    """Result of a size-capped streaming GET.
+
+    Carries just what fetch_webpage_content needs from an httpx.Response,
+    plus the cap bookkeeping: the (possibly truncated) body, whether the
+    cap cut it short, and the size the server declared via Content-Length
+    (wire bytes; None when absent).
+    """
+
+    __slots__ = ("status_code", "headers", "content", "truncated",
+                 "declared_bytes", "encoding", "url")
+
+    def __init__(self, status_code, headers, content, truncated,
+                 declared_bytes, encoding, url):
+        self.status_code = status_code
+        self.headers = headers
+        self.content = content
+        self.truncated = truncated
+        self.declared_bytes = declared_bytes
+        self.encoding = encoding
+        self.url = url
+
+    @property
+    def text(self) -> str:
+        return self.content.decode(self.encoding or "utf-8", errors="replace")
+
+    def raise_for_status(self):
+        if self.status_code >= 400:
+            request = httpx.Request("GET", self.url)
+            raise httpx.HTTPStatusError(
+                f"HTTP {self.status_code} for {self.url}",
+                request=request,
+                response=httpx.Response(self.status_code, request=request),
+            )
+
+
+def _get_public_url(url: str, headers: dict, timeout: int, max_redirects: int = 5,
+                    max_bytes: int = None) -> "_CappedFetch":
+    """Capped streaming GET with SSRF-guarded manual redirects.
+
+    The body is streamed and buffering stops at ``max_bytes`` (default: the
+    soft cap), so an oversized resource cannot be pulled into memory or the
+    content cache in full. When Content-Length already declares a body over
+    the hard ceiling, the fetch is refused before any body bytes are read.
+    """
+    cap = min(max_bytes or WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES)
    current = url
    for _ in range(max_redirects + 1):
        if not _public_http_url(current):
            raise httpx.RequestError("Blocked private/internal URL", request=httpx.Request("GET", current))
-        response = httpx.get(current, headers=headers, timeout=timeout, follow_redirects=False)
-        if response.status_code not in (301, 302, 303, 307, 308):
-            return response
-        location = response.headers.get("location")
-        if not location:
-            return response
-        current = urljoin(str(response.url), location)
+        # Force identity content-encoding. With gzip/deflate the wire bytes
+        # (and Content-Length) can be a small fraction of the decoded body, so
+        # a tiny compressed response could pass the hard-cap preflight and then
+        # expand past the ceiling in a single decoded chunk before the streamed
+        # cap below can slice it. Identity makes Content-Length the true body
+        # size and keeps each streamed chunk bounded by the network read.
+        req_headers = dict(headers or {})
+        req_headers["Accept-Encoding"] = "identity"
+        with httpx.stream("GET", current, headers=req_headers, timeout=timeout,
+                          follow_redirects=False) as response:
+            if response.status_code in (301, 302, 303, 307, 308):
+                location = response.headers.get("location")
+                if not location:
+                    return _CappedFetch(response.status_code, response.headers, b"",
+                                        False, None, response.encoding, str(response.url))
+                current = urljoin(str(response.url), location)
+                continue
+
+            # A server can ignore the identity request and still return a
+            # compressed body; response.iter_bytes() would then decode it, and a tiny
+            # gzip can balloon into one decoded chunk far past the cap before we
+            # slice. Refuse a compressed Content-Encoding so the streamed cap
+            # stays a real memory bound (Content-Length is the compressed wire
+            # length here, so the preflight and size metadata are unreliable too).
+            enc = (response.headers.get("content-encoding") or "").strip().lower()
+            if enc and enc != "identity":
+                raise httpx.RequestError(
+                    f"Refusing compressed response (Content-Encoding: {enc}) after "
+                    "requesting identity: cannot bound decoded body size",
+                    request=httpx.Request("GET", current),
+                )
+
+            declared = None
+            raw_len = response.headers.get("content-length")
+            if raw_len and raw_len.isdigit():
+                declared = int(raw_len)
+            # Refuse before buffering anything when the server already tells
+            # us the body exceeds the absolute ceiling (compressed responses
+            # were refused above, so Content-Length is the true body size).
+            if declared is not None and declared > WEB_FETCH_HARD_MAX_BYTES:
+                raise BodyTooLargeError(current, declared)
+
+            chunks = []
+            read = 0
+            truncated = False
+            # We requested identity above, so iter_bytes yields the raw body in
+            # network-read-sized chunks (no decompression expansion); the cap
+            # therefore bounds what we actually buffer.
+            for chunk in response.iter_bytes():
+                read += len(chunk)
+                if read > cap:
+                    keep = cap - (read - len(chunk))
+                    if keep > 0:
+                        chunks.append(chunk[:keep])
+                    truncated = True
+                    break
+                chunks.append(chunk)
+            return _CappedFetch(response.status_code, response.headers,
+                                b"".join(chunks), truncated, declared,
+                                response.encoding, str(response.url))
    raise httpx.RequestError("Too many redirects", request=httpx.Request("GET", current))

 # PDF extraction (optional dependency)
@@ -222,9 +334,19 @@ def _empty_result(url: str, error: str = "") -> dict:
 # ----------------------------------------------------------------------
 # Main content fetcher
 # ----------------------------------------------------------------------
-def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) -> dict:
-    """Fetch and extract meaningful content from a webpage with caching."""
-    cache_key = generate_cache_key(url)
+def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0,
+                          max_bytes: int = None) -> dict:
+    """Fetch and extract meaningful content from a webpage with caching.
+
+    ``max_bytes`` raises the download budget per call (clamped to the hard
+    cap); the default is the soft cap. When the body is cut short the result
+    carries ``truncated``/``fetched_bytes``/``total_bytes`` so callers can
+    tell the model the content is partial (#3812).
+    """
+    effective_cap = min(max_bytes or WEB_FETCH_SOFT_MAX_BYTES, WEB_FETCH_HARD_MAX_BYTES)
+    # The cap is part of the cache identity: a truncated soft-cap fetch must
+    # not be served to a later full-budget request for the same URL.
+    cache_key = generate_cache_key(f"{url}#cap={effective_cap}")
    cache_file = CONTENT_CACHE_DIR / f"{cache_key}.cache"

    # Check cache
@@ -250,15 +372,21 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "en-US,en;q=0.5",
-            "Accept-Encoding": "gzip, deflate",
+            # identity so the streamed size cap in _get_public_url stays honest
+            # (a compressed body can decode to far more than Content-Length).
+            "Accept-Encoding": "identity",
            "Connection": "keep-alive",
        }
-        response = _get_public_url(url, headers=headers, timeout=timeout)
+        response = _get_public_url(url, headers=headers, timeout=timeout,
+                                   max_bytes=effective_cap)

        if response.status_code == 429:
            raise RateLimitError(f"Rate limit hit for {url} (attempt {retry_attempt})")

        response.raise_for_status()
+    except BodyTooLargeError as e:
+        error_logger.warning(f"Refused oversized body for {url}: {e}")
+        return _empty_result(url, f"TooLarge: {e}")
    except httpx.HTTPStatusError as e:
        error_logger.warning(f"HTTP {e.response.status_code} fetching {url}: {e}")
        return _empty_result(url, f"HTTP {e.response.status_code}: {e}")
@@ -269,9 +397,27 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
        error_logger.error(str(e))
        return _empty_result(url, str(e))

+    # Size bookkeeping shared by every content branch below. getattr keeps
+    # plain httpx.Response stand-ins (tests) working without the cap fields.
+    _size_fields = {
+        "truncated": getattr(response, "truncated", False),
+        "fetched_bytes": len(response.content),
+        "total_bytes": getattr(response, "declared_bytes", None),
+    }
+
    # PDF handling
    content_type = response.headers.get("Content-Type", "").lower()
    if "application/pdf" in content_type or url.lower().endswith(".pdf"):
+        if _size_fields["truncated"]:
+            # A PDF cut mid-stream is not parseable; unlike text there is no
+            # useful partial result, so report the budget problem instead.
+            _declared = _size_fields["total_bytes"]
+            return _empty_result(
+                url,
+                f"TooLarge: PDF exceeds the {effective_cap:,}-byte fetch budget"
+                + (f" (size {_declared:,} bytes)" if _declared else "")
+                + "; retry with a larger budget if it fits under the hard cap",
+            )
        if pdf_extract_text is None:
            logger.error("pdfminer.six is not installed; cannot extract PDF text.")
            pdf_text = ""
@@ -295,6 +441,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
            "js_message": "",
            "success": bool(pdf_text),
            "error": "" if pdf_text else "Failed to extract PDF text",
+            **_size_fields,
        }
        _cache_result(cache_file, cache_key, result, url)
        return result
@@ -329,6 +476,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
            "js_message": "",
            "success": bool(text_body),
            "error": "" if text_body else "Empty response body",
+            **_size_fields,
        }
        _cache_result(cache_file, cache_key, result, url)
        return result
@@ -391,6 +539,7 @@ def fetch_webpage_content(url: str, timeout: int = 5, retry_attempt: int = 0) ->
        "js_message": js_message,
        "success": True,
        "error": "",
+        **_size_fields,
    }
    _cache_result(cache_file, cache_key, result, url)
    return result
@@ -9,14 +9,12 @@ from urllib.parse import urljoin, urlparse, parse_qs
 import httpx
 from bs4 import BeautifulSoup

-from src.constants import SEARXNG_INSTANCE
+from src.constants import SEARXNG_INSTANCE, REQUEST_TIMEOUT
 from .analytics import RateLimitError, error_logger
 from .query import build_enhanced_query

 logger = logging.getLogger(__name__)

-REQUEST_TIMEOUT = 20
-
 # Provider registry — maps setting value to (label, needs_key, needs_url)
 PROVIDER_INFO = {
    "searxng":  ("SearXNG",           False, True),
@@ -408,7 +408,7 @@ Generate an image. Line 1 = description, line 2 = model name, line 3 = WxH (e.g.
    "ask_teacher": "- ```ask_teacher``` — Escalate a hard question to a more capable model. Line 1 = model name or 'auto', rest = the question. Use when stuck or need expert knowledge.",
    "list_models": "- ```list_models``` — Show all available AI models across all endpoints. Use when user asks what models are available.",
    "manage_session": "- ```manage_session``` — Rename, archive, delete, fork, switch, or `list` chats (the UI calls them 'chats'; 'session' is internal). Line 1 = action (list/switch/rename/archive/unarchive/delete/important/unimportant/truncate/fork), Line 2 = exact chat id from `list_sessions` (or `current` where supported). For delete/archive/truncate, always list first and reuse the exact id; never invent placeholder ids. `switch`/`open` returns a clickable anchor link the user can tap to open the chat — use for \"open my X chat\".",
-    "manage_memory": "- ```manage_memory``` — Manage the user's persistent memory (facts, identity, preferences, context that persists across chats). Line 1 = action (list/add/edit/delete/search), rest = content. Use when user says 'remember this', states identity facts like 'my name is <name>' / 'call me <name>' / 'I live in <place>', or asks about stored memories.",
+    "manage_memory": "- ```manage_memory``` — Manage the user's persistent memory (facts about the USER themselves, their preferences, context that persists across chats). Line 1 = action (list/add/edit/delete/search), rest = content. Use when user says 'remember this' about themselves, states identity facts like 'my name is <name>' / 'call me <name>' / 'I live in <place>', or asks about stored memories. DO NOT use for info about another person (their address, phone, email, birthday) — that goes in `manage_contact`. If the user pastes an address/phone with a name and says 'save this for <person>', use `manage_contact add` with the address arg, NOT manage_memory.",
    "manage_skills": "- ```manage_skills``` — Skill registry (SKILL.md format). Args (JSON): {\"action\": \"list|view|view_ref|search|add|edit|patch|publish|delete\", ...}. `list` returns the index of available skills (published + teacher-escalation drafts); `view name=foo` fetches the full SKILL.md; `view_ref name=foo path=...` loads a reference file under the skill directory. For `add`, provide an explicit kebab-case `name` and only report the exact returned name, because storage may normalize or dedupe it. Use this BEFORE doing domain work — there may already be a procedure (published or draft) that prescribes the correct steps. Drafts written by the teacher loop are authoritative guidance even though they're not yet published.",
    "manage_tasks": "- ```manage_tasks``` — Create and manage scheduled background tasks (recurring AI jobs). Args (JSON): {\"action\": \"list|create|edit|delete|pause|resume|run\", ...}",
    "manage_endpoints": "- ```manage_endpoints``` — Add, remove, or configure AI model API endpoints. Args (JSON): {\"action\": \"list|add|delete|enable|disable\", ...}. Use when user wants to add a new AI provider.",
@@ -428,7 +428,9 @@ Notes, checklists, AND user reminders. Use this for "create/add/write a note", t
 ```send_email
 {"to": "recipient@example.com", "subject": "Re: Your question", "body": "Hi, ...", "account": "gmail"}
 ```
-Send a new email via SMTP. Use `resolve_contact` first if you only have a name. If multiple email accounts exist, call `list_email_accounts` first and pass the chosen `account`.""",
+Send a new email via SMTP. Use `resolve_contact` first if you only have a name. If multiple email accounts exist, call `list_email_accounts` first and pass the chosen `account`.
+
+CRITICAL — signatures: DO NOT invent a sign-off name. End the body with just `Thanks,` or similar — never type a person's name unless the user explicitly told you what to sign as. When `agent_email_confirm` is on (default), the tool returns `{pending: true, pending_id: ...}` and stages the email for the user to approve in the chat UI instead of SMTPing immediately.""",
    "list_emails": """\
 ```list_emails
 {"folder": "INBOX", "max_results": 20, "unread_only": false, "account": "gmail"}
@@ -439,7 +441,9 @@ List recent emails from a folder, newest first, including read messages by defau
 ```reply_to_email
 {"uid": "1234", "body": "Sounds good — talk Friday.", "account": "gmail"}
 ```
-SEND a reply email immediately by UID. Do not use this for "open a reply" or "start a reply" — those should use `ui_control` with `open_email_reply <uid> <folder> reply` to open the email draft document. For follow-up requests like "reply ..." after reading/listing email where the user clearly wants to send now, use the exact UID and account from the latest `read_email`/`list_emails` result. Never invent UID `1`. Threads automatically (In-Reply-To/References handled).""",
+SEND a reply email immediately by UID. Do not use this for "open a reply" or "start a reply" — those should use `ui_control` with `open_email_reply <uid> <folder> reply` to open the email draft document. For follow-up requests like "reply ..." after reading/listing email where the user clearly wants to send now, use the exact UID and account from the latest `read_email`/`list_emails` result. Never invent UID `1`. Threads automatically (In-Reply-To/References handled).
+
+CRITICAL — signatures: DO NOT invent a sign-off name. End the body with just `Thanks,` or similar — never type a person's name unless the user explicitly told you what to sign as. When `agent_email_confirm` is on (default), the tool returns `{pending: true, pending_id: ...}` and stages the email for the user to approve in the chat UI instead of SMTPing immediately.""",
    "bulk_email": """\
 ```bulk_email
 {"action": "delete", "uids": ["10997", "10998"], "folder": "INBOX", "account": "Gmail"}
@@ -449,7 +453,7 @@ Bulk delete/archive/mark emails. Use this for "delete all those" after listing e
    "archive_email": "- ```archive_email``` — Archive one email by UID. Args (JSON): {\"uid\":\"...\", \"folder\":\"INBOX\", \"account\":\"Gmail\"}. For multiple messages use bulk_email.",
    "mark_email_read": "- ```mark_email_read``` — Mark one email read/unread. Args (JSON): {\"uid\":\"...\", \"read\":true, \"folder\":\"INBOX\", \"account\":\"Gmail\"}. For multiple messages use bulk_email.",
    "resolve_contact": "- ```resolve_contact``` — Look up a contact's email by name. Searches CardDAV address book + sent email history. Args (JSON): {\"name\": \"...\"}. Use BEFORE send_email when the user gives only a name.",
-    "manage_contact": "- ```manage_contact``` — Create/update/delete/list CardDAV contacts. Args (JSON): {\"action\": \"list|add|update|delete\", \"name\": \"...\", \"email\": \"...\", \"uid\": \"...\"}. Use only for explicit address-book/contact requests with contact details. Do NOT use for user identity facts like 'my name is <name>'; save those with manage_memory. For update/delete, call action=list first to get the uid.",
+    "manage_contact": "- ```manage_contact``` — Create/update/delete/list CardDAV contacts. Args (JSON): {\"action\": \"list|add|update|delete\", \"name\": \"...\", \"email\": \"...\", \"phones\": [...], \"address\": \"...\", \"uid\": \"...\"}. Use for info about another person: email, phone, postal address. For 'save this for <person>' / address paste / phone next to a name, use this — NOT manage_memory. Do NOT use for user identity facts ('my name is X'); those are manage_memory. For update/delete, call action=list first for the uid.",
    "manage_calendar": """\
 ```manage_calendar
 {"action": "create_event", "summary": "<event title>", "dtstart": "<natural language or ISO datetime>"}
@@ -520,7 +524,7 @@ def get_builtin_overrides() -> dict:
        ov = get_setting("builtin_tool_overrides", {})
        return ov if isinstance(ov, dict) else {}
    except Exception as e:
-        logger.warning('Failed to load builtin tool overrides: %s', e)
+        logger.warning("Failed to load builtin tool overrides, using defaults", exc_info=e)
        return {}


@@ -859,6 +863,7 @@ def _build_system_prompt(
    compact: bool = False,
    owner: Optional[str] = None,
    suppress_local_context: bool = False,
+    active_email: Optional[Dict[str, str]] = None,
 ) -> List[Dict]:
    """Build agent system prompt, inject MCP/document context, merge consecutive system msgs."""
    global _cached_base_prompt, _cached_base_prompt_key
@@ -924,8 +929,8 @@ def _build_system_prompt(
    try:
        from src.user_time import current_datetime_context_message
        _datetime_message = current_datetime_context_message()
-    except Exception:
-        pass
+    except Exception as e:
+        logger.warning("Failed to build datetime context message", exc_info=e)

    # Document context is kept as a SEPARATE message (not merged into the tool
    # prompt) so the context trimmer doesn't destroy it when truncating the
@@ -968,8 +973,8 @@ def _build_system_prompt(
            try:
                from src.pdf_form_doc import find_source_upload_id
                _is_form_backed = bool(find_source_upload_id(active_document.current_content or ""))
-            except Exception:
-                pass
+            except Exception as e:
+                logger.warning("Failed to detect if document is form-backed, assuming plain", exc_info=e)

            if _is_form_backed:
                doc_ctx = (
@@ -1051,6 +1056,66 @@ def _build_system_prompt(
    else:
        set_active_document(None)

+    # Active email reader — frontend told us the user has an email open.
+    # Inject a context block so "reply", "summarize this", "what does it say"
+    # resolve to the real UID instead of the agent inventing a fresh .md
+    # draft with fake headers. This is the email equivalent of _doc_message.
+    _email_message = None
+    if active_email and active_email.get("uid"):
+        _em_uid = active_email.get("uid", "")
+        _em_folder = active_email.get("folder", "INBOX")
+        _em_account = active_email.get("account", "")
+        _em_subject = active_email.get("subject", "") or "(no subject)"
+        _em_from = active_email.get("from", "") or "(unknown sender)"
+        _em_preview = (active_email.get("body_preview", "") or "").strip()
+        _preview_block = f"\nBody preview:\n```\n{_em_preview[:1800]}\n```" if _em_preview else ""
+        _acct_arg = f" {_em_account}" if _em_account else ""
+        email_ctx = (
+            f"ACTIVE EMAIL OPEN (the user has this email open in a reader window right now)\n"
+            f"UID: {_em_uid}\n"
+            f"Folder: {_em_folder}\n"
+            f"Account: {_em_account or '(default)'}\n"
+            f"From: {_em_from}\n"
+            f"Subject: {_em_subject}{_preview_block}\n\n"
+            f"CRITICAL DEFAULT — every request about email this turn refers to "
+            f"THIS email unless the user names a DIFFERENT specific recipient "
+            f"(a name, an email address, or another thread). Examples that "
+            f"ALL refer to the open email:\n"
+            f"  • 'reply' / 'reply to this' / 'respond'\n"
+            f"  • 'write email saying X' / 'send email saying X' / 'draft something'\n"
+            f"  • 'tell them X' / 'say hi' / 'thanks' / 'ack' / 'lmk'\n"
+            f"  • 'summarize it' / 'what does it say' / 'tldr'\n"
+            f"  • 'forward this' / 'forward to <addr>'\n"
+            f"DO NOT ASK THE USER 'who do you want to send this to?' — the "
+            f"answer is ALWAYS the sender of the open email (above) unless they "
+            f"named someone else. Asking that is the wrong move every time.\n\n"
+            f"RULES for the open email:\n"
+            f"1. DRAFT a reply (default for any 'write/send/reply/tell them' "
+            f"request without a different recipient): call `ui_control` with "
+            f"`action=\"open_email_reply\"` and `extra=\"{_em_uid} {_em_folder} "
+            f"reply <your reply text>\"`. This opens the proper reply doc with To/Subject/"
+            f"In-Reply-To pre-filled by the backend. The user will see and edit "
+            f"it before sending. DO NOT `create_document` a markdown file with "
+            f"hand-written `To:` / `Subject:` / `In-Reply-To:` headers — that "
+            f"is wrong every time.\n"
+            f"2. SEND a reply immediately (skip the draft): call "
+            f"`reply_to_email` with the UID above. Only do this when the user "
+            f"explicitly says 'send' / 'send the reply' / 'reply and send'.\n"
+            f"3. READ the full body (the preview above may be truncated): "
+            f"call `read_email` with the UID/folder/account above.\n"
+            f"4. SUMMARIZE / answer questions about it: read it first, then "
+            f"answer in chat. Don't create a document for a summary unless "
+            f"the user explicitly asks for one.\n"
+            f"5. Never ask the user to paste the email or 'share it with you' "
+            f"— you already have its identity above and can read the full body.\n"
+            f"6. The ONLY time you ask 'who to send to?' is when the user "
+            f"explicitly says 'send a NEW email to someone else' or names a "
+            f"recipient you can't identify. A bare 'send email saying X' = the "
+            f"open email's sender.\n"
+        )
+        _email_message = untrusted_context_message("active email reader", email_ctx)
+        _email_message["_protected"] = True
+
    # Inject writing style for any email writing path. This is deliberately
    # broader than read/list: models may compose via send_email, reply_to_email,
    # or ui_control open_email_reply after the first tool round.
@@ -1258,6 +1323,9 @@ def _build_system_prompt(
    if _doc_message:
        merged.insert(last_user_idx, _doc_message)
        last_user_idx += 1  # the document message is now at last_user_idx
+    if _email_message:
+        merged.insert(last_user_idx, _email_message)
+        last_user_idx += 1
    if _skills_message:
        merged.insert(last_user_idx, _skills_message)
        last_user_idx += 1
@@ -1292,12 +1360,18 @@ def _build_base_prompt(
    from src.tool_index import ALWAYS_AVAILABLE

    disabled = set(disabled_tools or [])
-    if not get_setting("image_gen_enabled", True):
+    if not get_setting("image_gen_enabled", False):
        disabled.add("generate_image")

    if relevant_tools is not None:
-        # RAG mode: include always-available + retrieved + admin (if needed)
-        tool_names = set(ALWAYS_AVAILABLE) | set(relevant_tools)
+        # RAG mode: trust the relevant_tools set as already-composed.
+        # get_tools_for_query starts from ALWAYS_AVAILABLE and may
+        # *discard* tools that conflict with the query's intent (e.g.
+        # drop manage_memory for clear contact-save patterns). Unioning
+        # ALWAYS_AVAILABLE back in here used to silently undo those
+        # drops. Only force-include the irreducible loop primitives
+        # (ask_user, update_plan) as belt-and-suspenders.
+        tool_names = set(relevant_tools) | {"ask_user", "update_plan"}
        if needs_admin:
            tool_names |= _ADMIN_TOOLS
        agent_prompt = _assemble_prompt(tool_names, disabled, compact=compact)
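Review note: a minimal sketch of why the old union undid the retriever's drops. The tool sets and the `get_tools_for_query` stand-in below are illustrative assumptions, not the real `src.tool_index` contents; only the two set expressions are taken from the hunk above.

```python
# Illustrative stand-in for src.tool_index.ALWAYS_AVAILABLE (assumed contents).
ALWAYS_AVAILABLE = {"ask_user", "update_plan", "manage_memory", "web_fetch"}


def get_tools_for_query(query: str) -> set:
    """Hypothetical retriever: start from ALWAYS_AVAILABLE, then drop conflicting tools."""
    tools = set(ALWAYS_AVAILABLE) | {"manage_contact"}
    if "save" in query and "phone" in query:
        # Clear contact-save pattern: steer the model away from manage_memory.
        tools.discard("manage_memory")
    return tools


relevant_tools = get_tools_for_query("save this phone for Alex")

# Old behaviour: the union silently re-adds the dropped tool.
old_tool_names = set(ALWAYS_AVAILABLE) | set(relevant_tools)

# New behaviour: trust the composed set, force-include only the loop primitives.
new_tool_names = set(relevant_tools) | {"ask_user", "update_plan"}
```

With these assumed sets, `manage_memory` is present in `old_tool_names` and absent from `new_tool_names`, which is the behaviour change the comment describes.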
@@ -1738,6 +1812,7 @@ async def stream_agent_loop(
    max_tool_calls: int = 0,
    context_length: int = 0,
    active_document=None,
+    active_email: Optional[Dict[str, str]] = None,
    session_id: Optional[str] = None,
    disabled_tools: Optional[Set[str]] = None,
    owner: Optional[str] = None,
@@ -2025,6 +2100,7 @@ async def stream_agent_loop(
        compact=_is_api_model,
        owner=owner,
        suppress_local_context=guide_only,
+        active_email=active_email,
    )
    if plan_mode and not guide_only:
        # Steer the model to investigate-then-propose. Hard tool gating handles
@@ -2910,7 +2986,19 @@ async def stream_agent_loop(
            tool_output_data = {"type": "tool_output", "tool": block.tool_type, "command": cmd_display, "output": output_text, "exit_code": result.get("exit_code")}
            if "ui_event" in result:
                tool_output_data["ui_event"] = result["ui_event"]
-                for k in ("toggle_name", "state", "mode", "model", "endpoint_url", "theme_name", "colors"):
+                for k in (
+                    "toggle_name", "state", "mode", "model", "endpoint_url",
+                    "theme_name", "colors",
+                    # ui_control open_email_reply payload — without these the
+                    # frontend openReplyDraft bails on undefined uid and the
+                    # reply window silently never opens.
+                    "uid", "folder", "account_id",
+                    # Optional pre-filled body for open_email_reply so the
+                    # agent can compose-and-open in one tool call.
+                    "body",
+                    # ui_control open_panel payload
+                    "panel",
+                ):
                    if k in result:
                        tool_output_data[k] = result[k]
            # Forward image data from generate_image tool
@@ -57,13 +57,23 @@ class WebSearchTool:
 class WebFetchTool:
    async def execute(self, content: str, ctx: dict) -> dict:
        from src.search.content import fetch_webpage_content
+        from src.constants import WEB_FETCH_HARD_MAX_BYTES
        raw = content.strip()
        url = ""
+        max_bytes = None
        if raw.startswith("{"):
            try:
                parsed = json.loads(raw)
                if isinstance(parsed, dict):
                    url = str(parsed.get("url") or "").strip()
+                    # Download-budget override (#3812): "full": true raises the
+                    # budget to the hard cap; an explicit max_bytes is clamped
+                    # to the hard cap downstream. Default stays the soft cap.
+                    if parsed.get("full") is True:
+                        max_bytes = WEB_FETCH_HARD_MAX_BYTES
+                    mb = parsed.get("max_bytes")
+                    if isinstance(mb, int) and not isinstance(mb, bool) and mb > 0:
+                        max_bytes = mb
            except json.JSONDecodeError:
                url = ""
        if not url:
@@ -78,7 +88,7 @@ class WebFetchTool:
        loop = asyncio.get_running_loop()
        try:
            result = await asyncio.wait_for(
-                loop.run_in_executor(None, lambda: fetch_webpage_content(url, timeout=10)),
+                loop.run_in_executor(None, lambda: fetch_webpage_content(url, timeout=10, max_bytes=max_bytes)),
                timeout=30,
            )
        except asyncio.TimeoutError:
@@ -94,8 +104,28 @@ class WebFetchTool:
                return {"error": f"web_fetch: {url}: {err}", "exit_code": 1}
            return {"error": f"web_fetch: {url}: no readable text content (not HTML, or the page needs JS/login)", "exit_code": 1}

+        # Tell the model when the download budget cut the body short and how
+        # to get the rest, instead of silently presenting a partial page as
+        # the whole thing.
+        size_note = ""
+        if result.get("truncated"):
+            fetched = result.get("fetched_bytes") or 0
+            total = result.get("total_bytes")
+            total_txt = f" of {total:,} bytes" if total else ""
+            size_note = (
+                f"[partial content: download stopped at {fetched:,} bytes{total_txt}. "
+                f'Re-call with {{"url": "{url}", "full": true}} to fetch up to '
+                f"{WEB_FETCH_HARD_MAX_BYTES:,} bytes.]\n\n"
+            )
+
+        # The notice must lead the output so the MAX_OUTPUT_CHARS trim below can
+        # never drop it. The title is untrusted, uncapped page content, so a
+        # giant title ahead of the notice could push it out of range; keep the
+        # notice first and cap the title as a second guard.
+        if len(title) > 300:
+            title = title[:300] + "..."
        header = (f"# {title}\n" if title else "") + f"Source: {url}\n\n"
-        output = header + text
+        output = size_note + header + text
        if len(output) > MAX_OUTPUT_CHARS:
            output = output[:MAX_OUTPUT_CHARS] + "\n\n[...truncated]"
        return {"output": output, "exit_code": 0}
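Review note: a small sketch of the ordering argument made in the comment above, namely that a notice placed first survives the `MAX_OUTPUT_CHARS` trim even when the title and body are huge. `assemble_output` is a hypothetical helper that mirrors the hunk's logic, and the notice text is shortened for illustration.

```python
MAX_OUTPUT_CHARS = 10_000  # value shown in the constants hunk


def assemble_output(size_note: str, title: str, url: str, text: str) -> str:
    """Hypothetical helper: notice first, capped title, then the final trim."""
    if len(title) > 300:
        title = title[:300] + "..."
    header = (f"# {title}\n" if title else "") + f"Source: {url}\n\n"
    output = size_note + header + text
    if len(output) > MAX_OUTPUT_CHARS:
        output = output[:MAX_OUTPUT_CHARS] + "\n\n[...truncated]"
    return output


# A hostile page: oversized title and oversized body.
note = "[partial content: download stopped at 2,000,000 bytes.]\n\n"
out = assemble_output(note, "T" * 50_000, "https://example.com/big", "x" * 50_000)
```

Because the slice keeps the first `MAX_OUTPUT_CHARS` characters, anything placed at the start of `output` is kept; the title cap is a second guard in case the order ever changes.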
@@ -1292,7 +1292,7 @@ async def do_ui_control(content: str, session_id: Optional[str] = None, owner: O
      set_theme <preset>      — Apply a built-in theme preset (dark, light, midnight, paper, cyberpunk, retrowave, forest, ocean, ume, copper, terminal, organs, lavender, gpt, claude, cute)
      create_theme <name> <bg> <fg> <panel> <border> <accent> [key=val ...] — Create custom theme. Optional key=val: advanced color overrides AND background effects: bgPattern=<none|dots|synapse|rain|constellations|perlin-flow|petals|sparkles|embers>, bgEffectColor=#RRGGBB, bgEffectIntensity=<num>, bgEffectSize=<num>, frosted=true|false
      open_panel <name>       — Open a panel (documents, gallery, email, sessions, notes, memories, skills, settings, cookbook)
-      open_email_reply <uid> [folder] [reply|reply-all|ai-reply] — Open a reply draft document for an email; does not send
+      open_email_reply <uid> [folder] [reply|reply-all|ai-reply] [body text] — Open a reply draft document for an email; does not send. ALWAYS append the body text when the user told you what to say (one-shot draft); only omit body when the user just asked to "open a reply" without content.
      get_toggles             — Return current toggle states (server-side knowledge)
    """
    lines = content.strip().split("\n")
@@ -1536,21 +1536,54 @@ async def do_ui_control(content: str, session_id: Optional[str] = None, owner: O
        }

    elif action == "open_email_reply":
-        reply_parts = lines[0].strip().split()
-        uid = reply_parts[1].strip() if len(reply_parts) > 1 else ""
-        folder = reply_parts[2].strip() if len(reply_parts) > 2 else "INBOX"
-        mode = reply_parts[3].strip().lower() if len(reply_parts) > 3 else "reply"
+        # Two forms supported:
+        #   open_email_reply <uid> [folder] [reply|reply-all|ai-reply]
+        #   open_email_reply <uid> <folder> <reply|reply-all|ai-reply> <body text>
+        # Parsing is positional: inline body text needs both folder and mode
+        # before it; body on subsequent lines works with either form. The body
+        # (if any) is pre-filled into the reply draft so the agent can
+        # compose-and-open in one tool call instead of opening an empty draft.
+        first_line = lines[0].strip()
+        parts = first_line.split(maxsplit=4)
+        uid = parts[1].strip() if len(parts) > 1 else ""
+        folder = parts[2].strip() if len(parts) > 2 else "INBOX"
+        mode = parts[3].strip().lower() if len(parts) > 3 else "reply"
+        # Body: everything on the first line after the mode token, plus any
+        # subsequent lines. Allows multi-line bodies.
+        inline_body = parts[4] if len(parts) > 4 else ""
+        rest_lines = "\n".join(lines[1:]).strip() if len(lines) > 1 else ""
+        body = (inline_body + ("\n" + rest_lines if rest_lines else "")).strip()
        if not uid:
-            return {"error": "open_email_reply needs: open_email_reply <uid> [folder] [reply|reply-all|ai-reply]"}
+            return {"error": "open_email_reply needs: open_email_reply <uid> [folder] [reply|reply-all|ai-reply] [body text]"}
        if mode not in ("reply", "reply-all", "ai-reply"):
            mode = "reply"
-        return {
+        # Body is REQUIRED for the agent path. Opening an empty draft is what
+        # users do by clicking the Reply button — they don't ask the agent
+        # for that. Every agent invocation of open_email_reply MUST include
+        # the body. Reject empty so the agent retries with the content the
+        # user asked for. Exception: ai-reply mode triggers the existing
+        # AI-Reply path on the frontend which generates its own body.
+        if not body and mode != "ai-reply":
+            return {
+                "error": (
+                    "open_email_reply called without body. The agent path REQUIRES a body — "
+                    "opening an empty draft is the wrong response when the user asked you to write. "
+                    "Re-call with the reply text included: "
+                    f"`open_email_reply {uid} {folder or 'INBOX'} {mode} <your reply text here>`. "
+                    "Compose the reply now based on the open email's content and the user's request, "
+                    "then call this tool again with the body. Do NOT call create_document instead."
+                ),
+            }
+        result = {
            "ui_event": "open_email_reply",
            "uid": uid,
            "folder": folder or "INBOX",
            "mode": mode,
-            "results": f"Opening reply draft for email UID {uid}",
+            "results": f"Opening reply draft for email UID {uid}" + (" with pre-filled body" if body else ""),
        }
+        if body:
+            result["body"] = body
+        return result

    elif action == "get_toggles":
        return {
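Review note: the `open_email_reply` parsing above can be exercised in isolation. The sketch below copies the splitting logic into a hypothetical `parse_open_email_reply` helper (not part of the codebase) to show how the positional tokens and the body are separated.

```python
from typing import Tuple


def parse_open_email_reply(content: str) -> Tuple[str, str, str, str]:
    """Hypothetical helper: split a command into (uid, folder, mode, body)."""
    lines = content.strip().split("\n")
    # At most five parts: action, uid, folder, mode, rest-of-line body.
    parts = lines[0].strip().split(maxsplit=4)
    uid = parts[1] if len(parts) > 1 else ""
    folder = parts[2] if len(parts) > 2 else "INBOX"
    mode = parts[3].lower() if len(parts) > 3 else "reply"
    inline_body = parts[4] if len(parts) > 4 else ""
    rest = "\n".join(lines[1:]).strip()
    body = (inline_body + ("\n" + rest if rest else "")).strip()
    if mode not in ("reply", "reply-all", "ai-reply"):
        mode = "reply"
    return uid, folder, mode, body


full = parse_open_email_reply("open_email_reply 1234 INBOX reply Sounds good.\nTalk Friday.")
bare = parse_open_email_reply("open_email_reply 1234")
misplaced = parse_open_email_reply("open_email_reply 1234 Sounds good")
```

The `misplaced` case is worth noting: without folder and mode tokens, the first two words of the intended body are consumed as folder and mode, so the body comes back empty and the real handler would return its "called without body" error.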
@@ -1580,7 +1613,9 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne
    """
    import base64
    import httpx
+    import os
    from pathlib import Path
+    from src.url_safety import check_outbound_url

    lines = content.strip().split("\n")
    prompt = lines[0].strip() if lines else ""
@@ -1746,8 +1781,15 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne

            elif img.get("url"):
                # Download external URL and save locally (DALL-E returns temp URLs)
+                result_url = img["url"]
+                ok, reason = check_outbound_url(
+                    result_url,
+                    block_private=os.getenv("IMAGE_BLOCK_PRIVATE_IPS", "false").lower() == "true",
+                )
+                if not ok:
+                    return {"error": f"Image API returned unsafe image URL: {reason}"}
                try:
-                    dl_resp = httpx.get(img["url"], timeout=60)
+                    dl_resp = httpx.get(result_url, timeout=60)
                    if dl_resp.status_code == 200:
                        img_dir = Path(GENERATED_IMAGES_DIR)
                        img_dir.mkdir(parents=True, exist_ok=True)
@@ -1757,10 +1799,10 @@ async def do_generate_image(content: str, session_id: Optional[str] = None, owne
                        image_url = f"/api/generated-image/{filename}"
                        image_id = _save_to_gallery(filename)
                    else:
-                        image_url = img["url"]  # fallback to external URL
+                        image_url = result_url  # fallback to external URL
                except Exception as _dl_e:
                    logger.warning(f"Failed to download DALL-E image: {_dl_e}")
-                    image_url = img["url"]  # fallback to external URL
+                    image_url = result_url  # fallback to external URL
            else:
                return {"error": "Image API returned unexpected format (no b64_json or url)"}

@@ -14,6 +14,7 @@ import subprocess
 import sys

 from core.platform_compat import IS_WINDOWS, which_tool
+from src.runtime_paths import get_app_root

 logger = logging.getLogger(__name__)

@@ -81,7 +82,7 @@ _BUILTIN_NPX_SERVERS = {
        "name": "Built-in: Browser",
        "command": "npx",
        "args": ["-y", "@playwright/mcp@latest", "--headless", "--caps", "vision"],
-    },
+    }
 }

 # Global flag to disable MCP if there are compatibility issues
@@ -94,7 +95,7 @@ async def register_builtin_servers(mcp_manager):
        logger.info("Built-in MCP servers disabled via ODYSSEUS_DISABLE_MCP")
        return

-    base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+    base_dir = get_app_root()
    python = sys.executable

    async def _connect_python_server(server_id: str, script_path: str, name: str):
@@ -5,6 +5,7 @@ from pydantic_settings import BaseSettings, SettingsConfigDict
 from pydantic import Field, field_validator

 from src.constants import DATA_DIR as _DATA_DIR_CONST
+from src.runtime_paths import get_app_root

 # Cross-platform OS flag, exposed here so callers can `from src.config import
 # IS_WINDOWS`. Defined locally (a trivial `os.name == "nt"`) rather than imported
@@ -19,7 +20,7 @@ IS_WINDOWS = os.name == "nt"
 class DataConfig(BaseSettings):
    """Configuration for data storage and file handling."""
    # Base directory
-    base_dir: Path = Field(default=Path(__file__).parent.parent, description="Base directory for the application")
+    base_dir: Path = Field(default=Path(get_app_root()), description="Base directory for the application")
    
    # Data paths
    data_dir: Path = Field(default=Path(_DATA_DIR_CONST), description="Main data directory")
@@ -138,7 +139,7 @@ class AppConfig(BaseSettings):
        if isinstance(v, dict) and "base_dir" in v:
            base_dir = v["base_dir"]
        else:
-            base_dir = Path(__file__).parent.parent
+            base_dir = Path(get_app_root())
        
        # Convert string paths to Path objects relative to base_dir
        data_dir = Path(_DATA_DIR_CONST)
@@ -2,12 +2,14 @@
 """Application-wide constants and configuration values."""
 import os

+from src.runtime_paths import get_app_root, get_default_data_dir
+
 APP_VERSION = "1.0.0"

 # Base paths
-BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) + "/"
+BASE_DIR = os.path.join(get_app_root(), "")
 STATIC_DIR = os.path.join(BASE_DIR, "static")
-DATA_DIR = os.getenv("ODYSSEUS_DATA_DIR", os.path.join(BASE_DIR, "data"))
+DATA_DIR = os.getenv("ODYSSEUS_DATA_DIR", get_default_data_dir())

 # Data file paths
 # Single source of truth: every persisted file/dir lives under DATA_DIR, which
@@ -63,6 +65,14 @@ MAX_OUTPUT_CHARS = 10_000       # cap for bash/python/web_search/web_fetch outpu
 MAX_READ_CHARS = 20_000         # cap for read_file / document preview
 MAX_DIFF_LINES = 400            # cap for edit_file unified-diff display

+# web_fetch response-size policy (#3812). MAX_OUTPUT_CHARS above only trims
+# what the agent SEES; these caps bound what the server downloads, parses,
+# and writes to the content cache. The soft cap is the default download
+# budget; the agent can raise it per call (full/max_bytes) but never past
+# the hard cap, so a model can't decide to pull a multi-GB file.
+WEB_FETCH_SOFT_MAX_BYTES = 2_000_000    # default download budget (2 MB)
+WEB_FETCH_HARD_MAX_BYTES = 20_000_000   # absolute ceiling, even with override (20 MB)
+
 # API Configuration
 MAX_CONTEXT_MESSAGES = 90
 REQUEST_TIMEOUT = 20
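Review note: the soft/hard cap policy described in the comment above can be sketched as a clamp plus a capped read over streamed chunks. `fetch_webpage_content` itself is not shown in this section, so `effective_budget` and `read_capped` are hypothetical helpers under that assumption, not the actual implementation.

```python
from typing import Iterable, Optional, Tuple

WEB_FETCH_SOFT_MAX_BYTES = 2_000_000    # default download budget (2 MB)
WEB_FETCH_HARD_MAX_BYTES = 20_000_000   # absolute ceiling (20 MB)


def effective_budget(max_bytes: Optional[int]) -> int:
    """Hypothetical helper: default to the soft cap, never exceed the hard cap."""
    if max_bytes is None or max_bytes <= 0:
        return WEB_FETCH_SOFT_MAX_BYTES
    return min(max_bytes, WEB_FETCH_HARD_MAX_BYTES)


def read_capped(chunks: Iterable[bytes], budget: int) -> Tuple[bytes, bool]:
    """Hypothetical helper: buffer at most `budget` bytes and report truncation."""
    buf = bytearray()
    for chunk in chunks:
        room = budget - len(buf)
        if len(chunk) > room:
            # Keep only what fits, then stop reading the stream.
            buf += chunk[:room]
            return bytes(buf), True
        buf += chunk
    return bytes(buf), False
```

The read stops as soon as a chunk would overflow the budget, so memory use is bounded by the budget plus one network chunk rather than by the size of the remote body.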
@@ -161,11 +161,13 @@ async def _tick() -> None:
    # Re-read state once before writing so we capture any updates from
    # concurrent UI syncs.
    stopped_any = False
+    successfully_stopped_sids = set()
    for sid, host, port in to_stop:
        ok = await _stop_serve(sid, host, port)
        logger.info(f"cookbook_serve_lifecycle: stop {sid} (host={host or 'local'}): {'ok' if ok else 'failed'}")
        if ok:
            stopped_any = True
+            successfully_stopped_sids.add(sid)
            # Drop the auto-registered endpoint so the model picker and
            # the chat router don't keep pointing at a dead server.
            for t in tasks:
@@ -188,12 +190,11 @@ async def _tick() -> None:
            except Exception:
                fresh = state
                fresh_tasks = tasks
-            stopped_sids = {sid for sid, _, _ in to_stop}
            for ft in fresh_tasks:
                if not isinstance(ft, dict):
                    continue
                ft_sid = ft.get("sessionId") or ft.get("id")
-                if ft_sid in stopped_sids:
+                if ft_sid in successfully_stopped_sids:
                    ft["status"] = "stopped"
                    ft["_scheduledStopAtMs"] = None
                    ft["_lastStatusFlipAt"] = now_ms
@@ -199,11 +199,20 @@ def _fit_inline_attachment_text(
    return text[:remaining] + marker, 0


-def _process_office_document(path: str, display_name: str) -> str:
+def _process_office_document(
+    path: str,
+    display_name: str,
+    session_id: str | None = None,
+    auto_opened_docs: list[Dict[str, Any]] | None = None,
+    owner: str | None = None,
+) -> str:
    """Extract an Office/EPUB document to Markdown via the optional markitdown dep.

    Falls back to a friendly banner when markitdown is unavailable or finds no
-    text, so a missing optional dependency never breaks the chat path.
+    text, so a missing optional dependency never breaks the chat path. When a
+    session_id is provided AND the extraction succeeded, the FULL text is also
+    saved as a Document so the agent can page through it via
+    `manage_documents action=read offset=…` after the inline copy is capped.
    """
    from src.markitdown_runtime import (
        is_markitdown_format,
@@ -218,6 +227,46 @@ def _process_office_document(path: str, display_name: str) -> str:
    if markdown and markdown.strip():
        title = os.path.splitext(os.path.basename(path))[0]
        body, marker = _truncate_inline(markdown)
+
+        # Persist the full extracted text as a Document. The agent's existing
+        # manage_documents tool can then read past the inline cap with offset.
+        doc_id = None
+        if session_id:
+            try:
+                from src.office_doc import create_office_document
+                doc_id = create_office_document(
+                    session_id=session_id,
+                    upload_id=os.path.basename(path),
+                    title=title,
+                    body_text=markdown,
+                )
+                if doc_id and auto_opened_docs is not None:
+                    from src.database import SessionLocal, Document
+                    _db = SessionLocal()
+                    try:
+                        _d = _db.query(Document).filter(Document.id == doc_id).first()
+                        if _d:
+                            auto_opened_docs.append({
+                                "doc_id": _d.id,
+                                "title": _d.title,
+                                "language": _d.language,
+                                "content": _d.current_content,
+                                "version": _d.version_count,
+                            })
+                    finally:
+                        _db.close()
+            except Exception as e:
+                logger.warning("Office auto-doc creation failed for %s: %s", path, e)
+
+        # Upgrade the truncation marker with a hint pointing at the full doc so
+        # the agent knows it can read the rest.
+        if doc_id and marker:
+            marker = (
+                f"\n[…truncated for inline context — full {len(markdown):,} chars "
+                f"saved as document `{doc_id}`. Use `manage_documents` with "
+                f"action=read, document_id={doc_id}, offset=<N> to page through.]"
+            )
+
        return f"\n\n[Document content — {title}]:\n{body}{marker}"

    # No content: tell the user whether to install the optional dep or whether
@@ -521,7 +570,13 @@ def build_user_content(
            elif mime.startswith("text/") or _is_text_file(path):
                extracted_text = _process_text_file(path)
            else:
-                extracted_text = _process_office_document(path, display_name)
+                extracted_text = _process_office_document(
+                    path,
+                    display_name,
+                    session_id=session_id,
+                    auto_opened_docs=auto_opened_docs,
+                    owner=owner,
+                )

            extracted_text, inline_attachment_remaining = _fit_inline_attachment_text(
                extracted_text,
@@ -31,6 +31,8 @@ import numpy as np
 import httpx
 from typing import List, Optional

+from src.runtime_paths import get_app_root
+
 logger = logging.getLogger(__name__)

 _DEFAULT_MODEL = "all-minilm:l6-v2"
@@ -201,11 +201,15 @@ def build_models_url(base: str) -> Optional[str]:
        return _ollama_api_root(base) + "/tags"
    if provider == "chatgpt-subscription":
        return None
-    # Generic OpenAI-compatible fallback: ensure the path lands on /v1/models
-    # when the user omitted a path entirely. If a non-empty path is already
-    # present (e.g. /openai, /api/openai/v1, /v1), trust the caller — the
-    # /models suffix is appended as-is and the caller's prefix is preserved.
-    if not urlparse(base).path:
+    # Generic OpenAI-compatible fallback: local model servers with no explicit
+    # path conventionally expose `/v1/models` (LM Studio, llama.cpp, vLLM).
+    # For non-local unknown hosts, do not invent `/v1`; append `/models` to the
+    # caller's base so look-alike provider hosts stay generic.
+    parsed = urlparse(base)
+    host = (parsed.hostname or "").lower()
+    is_local = host in {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
+    uses_v1_models_by_default = is_local or host in {"api.deepseek.com"}
+    if not parsed.path and uses_v1_models_by_default:
        base = base + "/v1"
    return base + "/models"

@@ -283,7 +283,8 @@ def _is_ollama_native_url(url: str) -> bool:
    """Return True for native Ollama API URLs, including Ollama Cloud."""
    try:
        parsed = urlparse(url or "")
-    except Exception:
+    except Exception as e:
+        logger.warning("Failed to parse URL for Ollama detection", exc_info=e)
        return False
    host = parsed.hostname or ""
    path = (parsed.path or "").rstrip("/")
@@ -1345,8 +1346,8 @@ def list_model_ids(
                r = httpx.get(root + "/api/tags", timeout=timeout)
                r.raise_for_status()
                return [m.get("name") or m.get("model") for m in (r.json().get("models") or []) if m.get("name") or m.get("model")]
-        except Exception:
-            pass
+        except Exception as e:
+            logger.warning("Failed to fetch model list from configured endpoint", exc_info=e)
        return []

 def normalize_model_id(
@@ -40,15 +40,59 @@ def load_markitdown():
    return MarkItDown


+def _extract_docx_native(path: str) -> str | None:
+    """Pure-Python .docx text extractor — no external deps.
+
+    A .docx file is just a zip of XML. The body prose lives in <w:t> runs
+    inside <w:p> paragraphs. Iterating with ElementTree (rather than
+    re.findall) keeps paragraph breaks intact and lets the XML parser handle
+    namespaces + entity unescaping. Loses table layout, footnotes, images
+    and list bullets, but keeps the body prose, which is what the common
+    "summarize this doc" request needs when markitdown isn't installed.
+    """
+    import zipfile
+    import xml.etree.ElementTree as ET
+
+    ns = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
+    try:
+        with zipfile.ZipFile(path) as z:
+            xml_bytes = z.read("word/document.xml")
+    except (zipfile.BadZipFile, KeyError, OSError):
+        return None
+    try:
+        root = ET.fromstring(xml_bytes)
+    except ET.ParseError:
+        return None
+    paragraphs: list[str] = []
+    for para in root.iter(f"{ns}p"):
+        runs = [t.text or "" for t in para.iter(f"{ns}t")]
+        line = "".join(runs).strip()
+        if line:
+            paragraphs.append(line)
+    return "\n\n".join(paragraphs) if paragraphs else None
+
+
 def convert_to_markdown(path: str) -> str | None:
    """Convert a document to Markdown text via markitdown.

    Returns the extracted Markdown, or ``None`` if markitdown is unavailable or
    the conversion fails — callers degrade gracefully rather than erroring.
+
+    Fallback: when markitdown isn't installed and the file is a .docx, run
+    the bundled pure-Python extractor so the most common case (Word docs)
+    works out of the box. Other Office/EPUB formats still need markitdown.
    """
    try:
        markitdown_cls = load_markitdown()
    except RuntimeError:
+        if isinstance(path, str) and path.lower().endswith(".docx"):
+            text = _extract_docx_native(path)
+            if text:
+                logger.info(
+                    "markitdown not installed — used native .docx extractor for %s",
+                    path,
+                )
+                return text
        logger.warning("markitdown not installed; cannot extract %s", path)
        return None
    try:
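The native .docx fallback added in this hunk can be exercised without a real Word file. The sketch below is a rough reproduction, not the patched module: `extract_docx_text` and `make_sample_docx` are hypothetical names, and the sample archive contains only `word/document.xml`, which is all the extractor reads.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(source):
    """Join <w:t> runs per <w:p> paragraph; return None on any read/parse failure."""
    try:
        with zipfile.ZipFile(source) as z:
            xml_bytes = z.read("word/document.xml")
        root = ET.fromstring(xml_bytes)
    except (zipfile.BadZipFile, KeyError, OSError, ET.ParseError):
        return None
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        line = "".join(t.text or "" for t in para.iter(f"{W_NS}t")).strip()
        if line:
            paragraphs.append(line)
    return "\n\n".join(paragraphs) if paragraphs else None


def make_sample_docx():
    """Build a minimal in-memory archive holding only word/document.xml."""
    body = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body>"
        "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second &amp; last</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("word/document.xml", body)
    buf.seek(0)
    return buf


sample_text = extract_docx_text(make_sample_docx())
```

Runs inside one paragraph are concatenated, paragraphs are separated by a blank line, and the XML parser unescapes `&amp;` on its own.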
@@ -11,6 +11,8 @@ import os
 import re
 from typing import Any, Dict, List, Optional, Set, Tuple

+from src.runtime_paths import get_app_root
+
 logger = logging.getLogger(__name__)

 def _format_mcp_connection_error(name: str, command: str = "", args: Optional[List[str]] = None, error: Exception = None) -> str:
@@ -508,7 +510,7 @@ class McpManager:
            return False

        script_rel, name = _BUILTIN_SERVERS[server_id]
-        base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+        base_dir = get_app_root()
        script_path = os.path.join(base_dir, script_rel)

        # Clean up old connection
@@ -0,0 +1,73 @@
+"""Auto-create a Document row from an Office attachment.
+
+When a .docx (and friends) lands in chat, the full extracted text is stored
+as a Document so the agent can page through it with `manage_documents
+action=read offset=…` even after the inline chat payload was capped. Mirrors
+the PDF auto-doc pattern in `src.pdf_form_doc`.
+"""
+
+import logging
+import uuid
+from typing import Optional
+
+logger = logging.getLogger(__name__)
+
+
+def create_office_document(
+    session_id: str,
+    upload_id: str,
+    title: str,
+    body_text: Optional[str] = None,
+) -> Optional[str]:
+    """Create a markdown Document for an Office attachment and set it active.
+
+    Returns the new doc_id, or None on failure / empty body. The full
+    extracted body lives in `current_content`, so the agent can fetch
+    arbitrary windows via `manage_documents action=read` even when the
+    inline chat copy was truncated.
+    """
+    from src.database import (
+        SessionLocal,
+        Document,
+        DocumentVersion,
+        Session as DbSession,
+    )
+    from src.agent_tools.document_tools import set_active_document
+
+    if not body_text or not body_text.strip():
+        return None
+
+    db = SessionLocal()
+    try:
+        doc_id = str(uuid.uuid4())
+        ver_id = str(uuid.uuid4())
+        sess = db.query(DbSession).filter(DbSession.id == session_id).first()
+        doc = Document(
+            id=doc_id,
+            session_id=session_id,
+            title=title,
+            language="markdown",
+            current_content=body_text,
+            version_count=1,
+            is_active=True,
+            owner=sess.owner if sess else None,
+        )
+        ver = DocumentVersion(
+            id=ver_id,
+            document_id=doc_id,
+            version_number=1,
+            content=body_text,
+            summary="Imported from Office attachment",
+            source="upload",
+        )
+        db.add(doc)
+        db.add(ver)
+        db.commit()
+        set_active_document(doc_id)
+        return doc_id
+    except Exception as e:
+        db.rollback()
+        logger.error("Failed to create office document: %s", e)
+        return None
+    finally:
+        db.close()
@@ -7,6 +7,7 @@ import time
 from pathlib import Path

 from src.constants import RAG_DIR
+from src.runtime_paths import get_app_root

 logger = logging.getLogger(__name__)

@@ -0,0 +1,78 @@
+"""Server-side mirror of the built-in characters used for reminder synthesis.
+
+The frontend ships these in static/js/presets.js (PROMPT_TEMPLATES with
+isCharacter:true). The Reminders → AI Synthesis card writes only the
+persona ID into settings; the synthesis route in note_routes.py needs
+the full prompt text to bias the utility model's voice. Keeping a small
+local mirror avoids having the client send the prompt over the wire on
+every reminder fire.
+
+If the user picks a custom character (id == "custom") we fall back to
+the warm-neutral baseline — custom prompts live in browser localStorage
+and aren't visible to the server.
+"""
+
+PERSONAS = {
+    "socrates": (
+        "Never answer directly. Respond only with questions — sharp, layered, "
+        "Socratic. Expose contradictions. Make the person argue with themselves "
+        "until the truth falls out. Use irony like a scalpel. Be genuinely "
+        "curious, never condescending."
+    ),
+    "razor": (
+        "Strip everything to the bone. No filler, no hedging, no pleasantries. "
+        "Answer in the fewest words possible. If one sentence works, don't use "
+        "two. If a word adds nothing, cut it. Blunt, precise, surgical."
+    ),
+    "nietzsche": (
+        "Think and respond through the lens of Nietzsche. Analyze every "
+        "question in terms of will to power, self-overcoming, eternal "
+        "recurrence, ressentiment, value-creation, and master-slave morality. "
+        "Write with aphoristic force — sharp, compressed, vivid, and "
+        "unapologetic — but do not sacrifice depth for style. Favor "
+        "life-affirmation, discipline, courage, style, rank, self-overcoming, "
+        "and amor fati over nihilism, conformity, ressentiment, and self-pity."
+    ),
+    "spark": (
+        "You are Spark, a playful, quick-witted assistant with bright energy "
+        "and practical instincts. Keep responses concise, vivid, and helpful. "
+        "Be warm without being cloying, imaginative without losing the thread, "
+        "and always center the user's actual goal. Use a light, lively voice "
+        "with occasional clever turns of phrase."
+    ),
+    "odysseus": (
+        "You are Odysseus, king of Ithaca — subtle in counsel, disciplined in "
+        "judgment, and unmatched in strategic cunning. Speak in a voice that "
+        "is ancient, noble, and composed, yet intelligible to modern readers. "
+        "Be eloquent but not flowery. Be wise but not vague. Speak as one who "
+        "has weathered storms and taken back his house by wit, timing, and "
+        "resolve."
+    ),
+}
+
+
+_DEFAULT_SYNTHESIS_TONE = (
+    "You write short, warm, one-line reminders. The user has set a note for "
+    "themselves and the moment to remember has arrived. Keep it under 18 "
+    "words. Be human, gentle, and direct — never robotic."
+)
+
+
+def synthesis_system_prompt(persona_id: str) -> str:
+    """Return the system prompt for reminder synthesis given a persona id.
+
+    Falls back to the warm-neutral baseline when the id is empty, unknown,
+    or refers to a custom (client-only) character we don't have on file.
+    """
+    persona = (persona_id or "").strip().lower()
+    persona_prompt = PERSONAS.get(persona)
+    if persona_prompt:
+        # Persona drives the voice; the synthesis-instruction stays attached
+        # so the model knows it's writing a short reminder, not a chat reply.
+        return (
+            persona_prompt
+            + "\n\n"
+            + "You are now writing a single one-line reminder for the user. "
+              "Keep it under 18 words and in the voice above."
+        )
+    return _DEFAULT_SYNTHESIS_TONE
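The lookup-and-fallback behavior of `synthesis_system_prompt` can be sketched in isolation. The persona table and prompt strings below are trimmed stand-ins chosen for illustration, not the module's real contents.

```python
# Trimmed stand-ins for the real PERSONAS table and default tone.
PERSONAS = {"razor": "Strip everything to the bone."}
DEFAULT_TONE = "You write short, warm, one-line reminders."
REMINDER_SUFFIX = (
    "You are now writing a single one-line reminder for the user. "
    "Keep it under 18 words and in the voice above."
)


def synthesis_system_prompt(persona_id):
    """Persona prompt plus the reminder instruction, or the neutral default."""
    persona_prompt = PERSONAS.get((persona_id or "").strip().lower())
    if persona_prompt:
        return persona_prompt + "\n\n" + REMINDER_SUFFIX
    return DEFAULT_TONE
```

Ids are normalized with `strip().lower()` before the lookup, so `" Razor "` resolves to the same persona as `"razor"`, while `"custom"`, empty, or unknown ids fall through to the default tone.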
@@ -0,0 +1,30 @@
+"""Helpers for resolving runtime paths in source and frozen builds."""
+
+import os
+import sys
+
+
+def get_app_root() -> str:
+    """Return the app root directory.
+
+    In normal source runs, this is the repository root. In a frozen Windows
+    build, it is the bundle content root (PyInstaller's internal directory)
+    so bundled runtime folders like `static/`, `scripts/`, and `data/` stay
+    together with the executable payload.
+    """
+    if getattr(sys, "frozen", False):
+        return getattr(sys, "_MEIPASS", os.path.dirname(os.path.abspath(sys.executable)))
+    return os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+
+
+def get_default_data_dir() -> str:
+    """Return the default path to the data directory.
+
+    In normal runs, this is a 'data' subdirectory under the app root.
+    In frozen builds, it is a persistent user directory (~/.odysseus/data)
+    to prevent SQLite databases and other persistent files from being
+    written to the ephemeral, temporary extraction bundle directory.
+    """
+    if getattr(sys, "frozen", False):
+        return os.path.join(os.path.expanduser("~"), ".odysseus", "data")
+    return os.path.join(get_app_root(), "data")
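A small sketch of how the two helpers diverge between source and frozen runs. `app_root_for` and `default_data_dir_for` are hypothetical variants that take the module path as a parameter so the logic can be checked outside the repository; the real helpers use `__file__`.

```python
import os
import sys


def app_root_for(module_file: str) -> str:
    """Bundle content root when frozen, else two directories above the module."""
    if getattr(sys, "frozen", False):
        return getattr(sys, "_MEIPASS", os.path.dirname(os.path.abspath(sys.executable)))
    return os.path.dirname(os.path.dirname(os.path.abspath(module_file)))


def default_data_dir_for(module_file: str) -> str:
    """Persistent per-user directory when frozen, else <app root>/data."""
    if getattr(sys, "frozen", False):
        return os.path.join(os.path.expanduser("~"), ".odysseus", "data")
    return os.path.join(app_root_for(module_file), "data")
```

The split matters because a one-file PyInstaller bundle unpacks to a temporary directory, so anything written under the app root there would not survive a restart.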
@@ -29,7 +29,15 @@ def _invalidate_caches():
 # ── Default values ──

 DEFAULT_SETTINGS = {
-    "image_gen_enabled": True,
+    # Agent email safety: when True, the MCP send_email / reply_to_email
+    # tools don't SMTP directly. They stage the composed message into the
+    # scheduled_emails table with status='agent_draft' and return a
+    # pending_id + the rendered email so the user can review and approve
+    # (or cancel) before it actually goes out. Default ON because models
+    # have been observed inventing signatures and sending to real
+    # recipients without confirmation.
+    "agent_email_confirm": True,
+    "image_gen_enabled": False,
    "image_model": "",
    "image_quality": "medium",
    "vision_model": "",
@@ -151,6 +159,7 @@ DEFAULT_SETTINGS = {
    # Reminders
    "reminder_channel": "browser",   # "browser" | "email" | "ntfy" | "webhook"
    "reminder_llm_synthesis": False,
+    "reminder_llm_persona": "",
    "reminder_ntfy_topic": "Reminders",
    "reminder_email_to": "",
    # Generic outbound webhook channel: pick any saved Integration as the
@@ -1338,11 +1338,24 @@ class TaskScheduler:
            return await self._execute_checkin(task, crew, db, session_id, endpoint_url, model)

        # Build system prompt: crew member persona overrides the default.
+        # Built-in character_id (Socrates, Razor, etc.) further biases the
+        # voice — it prepends to whichever base prompt we landed on so the
+        # task still knows it's executing a scheduled task but in that
+        # character's tone.
        system_prompt = (
            (crew.personality or "").strip()
            if crew and crew.personality
            else "You are a helpful assistant executing a scheduled task. Use available tools to complete the task thoroughly."
        )
+        char_id = (getattr(task, "character_id", None) or "").strip()
+        if char_id:
+            try:
+                from src.reminder_personas import PERSONAS as _PERSONAS
+                char_prompt = _PERSONAS.get(char_id.lower())
+                if char_prompt:
+                    system_prompt = f"{char_prompt}\n\n{system_prompt}"
+            except Exception:
+                pass  # best-effort: a persona lookup failure must not block the task
        # Inject current time so the model knows what's past vs upcoming
        tz_name = _resolve_task_timezone(db, task)
        try:
@@ -18,6 +18,40 @@ from core.constants import internal_api_base

 logger = logging.getLogger(__name__)

+# ---------------------------------------------------------------------------
+# Active email state
+# ---------------------------------------------------------------------------
+
+# When the user has an email reader window open, the frontend tells the
+# backend about it on each chat submit. Email tools can resolve "this email"
+# without guessing a UID. Cleared between requests by chat_routes.
+_active_email_ref: Optional[Dict[str, str]] = None
+
+
+def set_active_email(uid: Optional[str], folder: Optional[str] = None, account: Optional[str] = None,
+                     subject: Optional[str] = None, sender: Optional[str] = None) -> None:
+    """Stash the email currently open in the UI. None clears it."""
+    global _active_email_ref
+    if not uid:
+        _active_email_ref = None
+        return
+    _active_email_ref = {
+        "uid": str(uid),
+        "folder": str(folder or "INBOX"),
+        "account": str(account or ""),
+        "subject": str(subject or ""),
+        "from": str(sender or ""),
+    }
+
+
+def get_active_email() -> Optional[Dict[str, str]]:
+    return _active_email_ref
+
+
+def clear_active_email() -> None:
+    global _active_email_ref
+    _active_email_ref = None
+
 # ---------------------------------------------------------------------------
 # Argument parsing
 # ---------------------------------------------------------------------------
@@ -1545,10 +1579,10 @@ async def do_manage_calendar(content: str, owner: Optional[str] = None) -> Dict:
        text = str(raw).strip().lower()
        if text in {"none", "no", "off", "false"}:
            return None
-        m = re.search(r"(\d+)\s*(?:m|min|minute|minutes)\b", text)
+        m = re.search(r"(\d+)\s*(?:minutes?|mins?|m)\b", text)
        if m:
            return max(0, int(m.group(1)))
-        m = re.search(r"(\d+)\s*(?:h|hr|hour|hours)\b", text)
+        m = re.search(r"(\d+)\s*(?:hours?|hrs?|h)\b", text)
        if m:
            return max(0, int(m.group(1)) * 60)
        if text.isdigit():
@@ -1561,7 +1595,7 @@ async def do_manage_calendar(content: str, owner: Optional[str] = None) -> Dict:
            return desc
        reminder_only = re.compile(
            r"^\s*(?:remind(?:er)?|alarm)\s*:?\s*\d+\s*"
-            r"(?:m|min|minute|minutes|h|hr|hour|hours)\b.*$",
+            r"(?:minutes?|mins?|m|hours?|hrs?|h)\b.*$",
            re.I,
        )
        return "" if reminder_only.match(desc) else desc
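The reordered alternations matter because the old patterns (`m|min|minute|minutes` followed by `\b`) could not match plural abbreviations such as `mins` or `hrs`. A minimal sketch of the parsing step follows; `parse_reminder_minutes` is a hypothetical name, and the handling of bare digits and unparseable input is an assumption, since the hunk is cut off at that point.

```python
import re


def parse_reminder_minutes(raw):
    """Return the reminder lead time in minutes, or None when disabled/unparseable."""
    text = str(raw).strip().lower()
    if text in {"none", "no", "off", "false"}:
        return None
    m = re.search(r"(\d+)\s*(?:minutes?|mins?|m)\b", text)
    if m:
        return max(0, int(m.group(1)))
    m = re.search(r"(\d+)\s*(?:hours?|hrs?|h)\b", text)
    if m:
        return max(0, int(m.group(1)) * 60)
    if text.isdigit():
        return int(text)  # assumed: bare numbers are treated as minutes
    return None  # assumed fallback for anything else
```

With the new patterns, `"15 mins"` and `"2 hrs"` parse as 15 and 120 minutes respectively.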
@@ -88,14 +88,14 @@ BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = {
    "pipeline": "Run a multi-step AI pipeline with multiple models. Chain tasks together in sequence.",
    "list_models": "List all available AI models and their endpoints.",
    "manage_session": "Chat management: rename, archive, delete, or fork chats (the UI calls these 'chats'; internally 'sessions'). Use for 'rename my chats', 'rename this chat', 'archive/delete a chat'.",
-    "manage_memory": "Memory management: list, add, edit, delete, or search persistent memories.",
+    "manage_memory": "Memory management: list, add, edit, delete, or search persistent memories. For facts about the USER (their name, preferences, where they live). NOT for info about ANOTHER person — addresses, phones, emails belonging to a contact go in manage_contact, not memory.",
    "manage_skills": "Skill management: add, update, publish, or search reusable skills/presets.",
    "manage_tasks": "Scheduled task management: list, create, edit, delete, pause, resume, or run cron tasks.",
    "manage_endpoints": "Endpoint management: list, add, delete, enable, or disable model API endpoints.",
    "manage_mcp": "MCP server management: list, add, delete, reconnect servers, or list available tools.",
    "manage_webhooks": "Webhook management: list, add, delete, enable, or disable webhooks.",
    "manage_tokens": "API token management: list, create, or delete API access tokens.",
-    "manage_documents": "List, read, delete, or tidy documents in the editor panel. action='list' returns clickable rows (most-recent first) so the user can open any doc by clicking. action='read' (aka view/open/get) with document_id returns the content. action='delete' with document_id removes a doc (only way to delete). Use this for ANY 'show/read/list/open my documents/docs/files/notes' request — never shell or curl.",
+    "manage_documents": "List, read, delete, or tidy documents in the editor panel. action='list' returns clickable rows (most-recent first) so the user can open any doc by clicking. action='read' (aka view/open/get) with document_id returns the content; supports offset=<N> + limit=<N> to page through large docs (response includes next_offset when more remains, so you can keep calling with offset=next_offset). action='delete' with document_id removes a doc (only way to delete). Use this for ANY 'show/read/list/open my documents/docs/files/notes' request — never shell or curl.",
    "manage_research": "List, read/open, or delete saved DEEP RESEARCH results from the Library. action='list' returns clickable [query](#research-<id>) rows (most-recent first). action='read' (aka open/view/get) with id returns the report + sources. action='delete' with id removes it. Use this for ANY 'open/read/find/delete my research / that report / the research on X' request. NOTE: this is for EXISTING research; to START new research use trigger_research.",
    "manage_settings": "Change ANY real app setting (the ones the Settings panel writes) so the user never has to open it: TTS voice/provider/speed, STT, search engine + result count, default/teacher/task/utility/vision/image/research models, image quality, reminder channel (browser/email/ntfy), agent timeout/tool-call budget, and more. action=set with key (friendly aliases ok: voice, 'search engine', 'default model', 'teacher model', 'image quality', 'reminder channel'...) + value; get/list/reset too. Also toggles tools on/off (disable_tool/enable_tool/list_tools). Secrets/API keys are read-only. Use for any 'change my…/set my…/use X for…/turn on…' preference request.",
    "create_session": "Create a new chat with a name and model.",
@@ -104,7 +104,7 @@ BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = {
    "search_chats": "Search past session transcripts across chats.",
    "ask_user": "Ask the user a multiple-choice question to get a decision or clarification. Use this when the task is genuinely ambiguous and the answer changes what you do next — pick between approaches, confirm an assumption, choose among options — instead of guessing. Provide a clear `question` and 2-6 `options` (each with a short `label`, optional `description`). Calling this ENDS your turn: the user sees clickable buttons and their choice arrives as your next message. Don't use it for things you can decide from context or sensible defaults, or for irreversible-action confirmation if a dedicated flow exists.",
    "update_plan": "Write back to the ACTIVE PLAN while executing an approved plan: mark steps done or revise them. After finishing a step call this with the full checklist and that step marked done; when the user asks to change the plan call it with the revised checklist. Always pass the COMPLETE markdown checklist (`- [ ]` / `- [x]`), not a diff. The user's docked plan window updates live. No effect when there is no active plan.",
-    "ui_control": "Control the UI and toggle tools on/off. Use this to turn off / turn on / disable / enable individual tools and features: shell (bash), search (web), research, browser, documents, incognito. Open panels (documents library, gallery, email inbox, sessions, notes, memories/brain, skills, settings, cookbook) via `open_panel <name>`. Use `open_email_reply <uid> <folder> reply` to open an email reply draft document without sending. Also switches between chat/agent modes, changes the current model, and applies/creates themes.",
+    "ui_control": "Control the UI and toggle tools on/off. Use this to turn off / turn on / disable / enable individual tools and features: shell (bash), search (web), research, browser, documents, incognito. Open panels (documents library, gallery, email inbox, sessions, notes, memories/brain, skills, settings, cookbook) via `open_panel <name>`. Use `open_email_reply <uid> <folder> reply` to open an email reply draft document without sending. To pre-fill the reply body in one shot (USE THIS whenever the user told you what to say — opening an empty draft when they asked you to write is wrong), append the body after the mode: `open_email_reply <uid> <folder> reply <body text>`. Body can continue on subsequent lines for multi-line replies. Also switches between chat/agent modes, changes the current model, and applies/creates themes.",
    "list_email_accounts": "List configured email accounts and default status. Use before reading or sending mail when the user mentions Gmail, work mail, custom domain mail, another mailbox, or asks to compare/check multiple inboxes.",
    "list_emails": "List emails for a folder/account, newest first, including read messages by default. Shows subject, sender, date, UID, account, and AI summary. Check inbox, find emails needing replies. Supports account from list_email_accounts for Gmail/work/custom mailboxes. For last/latest/newest email, use max_results=1 and unread_only=false.",
    "read_email": "Read the full content of a specific email by UID or Message-ID. View email body, check details. Supports account from list_email_accounts when the UID belongs to a non-default mailbox.",
@@ -115,7 +115,7 @@ BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = {
    "mark_email_read": "Mark an email as read or unread by toggling the \\Seen flag.",
    "bulk_email": "Perform one action on many emails at once. Use for delete all those, archive these, mark all read, move spam to junk. Takes explicit UIDs from list_emails or all_unread=true. Always pass account for Gmail/work/custom mailbox results.",
    "resolve_contact": "Look up a contact's email address by name. Searches CardDAV address book and sent email history. Use when the user says 'message [name]', 'email [name]', or 'send to [name]' without an email address.",
-    "manage_contact": "Create, update, delete, or list CardDAV contacts. Use to save a new contact, change an existing one's email/phone, or remove one. Action=list returns uids needed for update/delete. Use when the user says 'save this contact', 'add [name] to contacts', 'update [name]'s email', 'delete [name] from contacts'. Do not use for user identity facts like 'my name is <name>'; those are memory.",
+    "manage_contact": "Save / update / delete / list address-book contacts (CardDAV). Use for info about ANOTHER person — name, email, phone, postal address. Args: action=list|add|update|delete, name, email, phones, address, uid (from list). For 'save this for <person>' / address pastes / phone numbers next to a name, this is the right tool — NOT manage_memory. Do NOT use for facts about the USER ('my name is X'); those are manage_memory.",
    "manage_notes": "Create and manage notes and checklists (Google Keep-style). ALWAYS use this for note/todo/checklist/reminder creation — NEVER hit /api/notes via app_api. Accepts natural-language `due_date` like 'tomorrow at 9am' or '11pm today' (parsed in the USER'S timezone). The due_date IS the reminder — it fires a notification at that time, so do NOT also create a calendar event for the same reminder. Set colors, labels, pin, archive. Do NOT use manage_memory for note content.",
    "manage_calendar": "Calendar event management: list, create, update, delete. Each event can carry a tag/category (event_type — work/personal/health/travel/meal/social/admin/other) and importance (low/normal/high/critical). Resolve today/tomorrow using the Current date and time context, then use ISO datetimes in the user's local wall time; supports all-day events. For event reminders/alarms, pass reminder_minutes; this creates the Notes reminder, so do not also call manage_notes for the same reminder.",
    "download_model": "Download a HuggingFace model to a local or remote server. Specify repo_id (e.g. 'Qwen/Qwen3-8B'), optional server host, and optional include filter for specific files.",
@@ -372,7 +372,19 @@ class ToolIndex:
            {"resolve_contact", "manage_contact"},
        frozenset({"save contact", "add contact", "new contact", "update contact",
                   "edit contact", "delete contact", "remove contact",
-                   "save this person", "add to contacts", "save to contacts"}):
+                   "save this person", "add to contacts", "save to contacts",
+                   # "add <name> to (my) contacts" — words between 'add' and
+                   # 'contacts' break the literal phrase match above, so anchor
+                   # on the tail.
+                   "to my contacts", "to contacts", "to address book",
+                   # "save this for <person>" / "save it for <person>" — the user
+                   # is storing info on a known person without using the literal
+                   # word 'contact'. Catches the address/phone-paste pattern.
+                   "save this for", "save it for", "save for",
+                   "save this one for", "save that for",
+                   # Postal-address-like signals
+                   "postal code", "zip code", "street address",
+                   "mailing address", "their address"}):
            {"manage_contact"},
        # "Ask another model" intent → chat_with_model relays to a
        # different model and returns its answer. ask_teacher escalates
@@ -507,6 +519,53 @@ class ToolIndex:
        # prompts do not drag web schemas into the agent context.
        if self._WEB_RE.search(query):
            base.update({"web_search", "web_fetch"})
+        # Hard steering: when the query is a clear "save info about a specific
+        # person" pattern (address paste + name, phone next to a name, etc.),
+        # the model has been observed defaulting to manage_memory even with
+        # manage_contact in the toolset. Pull memory out for these queries so
+        # the model literally cannot pick it. ALWAYS_AVAILABLE includes
+        # manage_memory by default; we override that here.
+        # The "for/to <word>" check needs to allow lowercase names (users
+        # don't always capitalize) but filter out timing/pronoun stopwords
+        # so "save this for later" / "save for tomorrow" don't trigger.
+        _CONTACT_STOPWORDS_AFTER_FOR = {
+            "later", "tomorrow", "yesterday", "now", "then", "today",
+            "tonight", "me", "us", "you", "him", "her", "them", "myself",
+            "yourself", "next", "this", "that", "the", "a", "an", "future",
+            "real", "use", "uses", "another", "reference",
+        }
+        # Regex catches "save (this|it|the|her|...|<noun>) for <name>" / "to my
+        # contacts" patterns. More forgiving than literal-keyword matching —
+        # 'save this address for Alex' uses one extra word between 'save' and
+        # 'for' that breaks the contiguous 'save this for' phrase.
+        save_for_match = re.search(
+            r"\bsave\b(?:\s+\w+){0,3}\s+(?:for|to)\s+([A-Za-z]+)",
+            ql,
+        )
+        # "to my contacts", "into my contacts", "in my address book", etc.
+        to_contacts = re.search(r"\b(?:to|in|into)\s+(?:my\s+)?(?:contacts|address\s+book)\b", ql)
+        # Possessive: "save (his|her|their) (address|phone|email|number) ..."
+        # — strong contact signal even without "for <name>". Force-include
+        # manage_contact here too since the keyword fallback misses this
+        # construction.
+        possessive_contact = re.search(
+            r"\bsave\b(?:\s+\w+){0,2}\s+(?:his|her|their)\s+(?:address|phone|number|email|contact|details)",
+            ql,
+        )
+        word_after = (
+            save_for_match.group(1).lower() if save_for_match else None
+        )
+        contact_only_signal = (
+            (save_for_match is not None
+             and word_after is not None
+             and word_after not in _CONTACT_STOPWORDS_AFTER_FOR)
+            or to_contacts is not None
+            or possessive_contact is not None
+        )
+        if possessive_contact is not None:
+            base.add("manage_contact")
+        if contact_only_signal and "manage_contact" in base:
+            base.discard("manage_memory")
        return base
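The steering logic above can be tried in isolation. In this sketch `steer_contact_tools` is a hypothetical wrapper, the stopword set is an abbreviated stand-in for the one in the patch, and `ql` is assumed to be the lowercased query.

```python
import re

# Abbreviated stand-in for _CONTACT_STOPWORDS_AFTER_FOR.
STOPWORDS_AFTER_FOR = {"later", "tomorrow", "today", "now", "me", "this", "that", "the"}

SAVE_FOR_RE = re.compile(r"\bsave\b(?:\s+\w+){0,3}\s+(?:for|to)\s+([A-Za-z]+)")
TO_CONTACTS_RE = re.compile(r"\b(?:to|in|into)\s+(?:my\s+)?(?:contacts|address\s+book)\b")
POSSESSIVE_RE = re.compile(
    r"\bsave\b(?:\s+\w+){0,2}\s+(?:his|her|their)\s+"
    r"(?:address|phone|number|email|contact|details)"
)


def steer_contact_tools(query, base):
    """Return a copy of `base` with manage_memory removed on a contact-only signal."""
    ql = query.lower()
    tools = set(base)
    save_for = SAVE_FOR_RE.search(ql)
    possessive = POSSESSIVE_RE.search(ql)
    contact_only = (
        (save_for is not None and save_for.group(1) not in STOPWORDS_AFTER_FOR)
        or TO_CONTACTS_RE.search(ql) is not None
        or possessive is not None
    )
    if possessive is not None:
        tools.add("manage_contact")  # keyword fallback misses this construction
    if contact_only and "manage_contact" in tools:
        tools.discard("manage_memory")
    return tools
```

Note that `manage_memory` is only dropped when `manage_contact` is actually selectable, so a contact-like query never ends up with neither tool.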


@@ -68,11 +68,12 @@ FUNCTION_TOOL_SCHEMAS = [
        "type": "function",
        "function": {
            "name": "web_fetch",
-            "description": "Fetch and read the text content of a specific URL the user names (e.g. 'check example.com', 'what's on this page <url>'). Use when you already have a concrete URL/domain. NOT for open-ended searches (use web_search) or 'research X' jobs (use trigger_research).",
+            "description": "Fetch and read the text content of a specific URL the user names (e.g. 'check example.com', 'what's on this page <url>'). Use when you already have a concrete URL/domain. NOT for open-ended searches (use web_search) or 'research X' jobs (use trigger_research). Downloads are size-budgeted; a '[partial content: ...]' notice in the result means the body was cut short and you can re-call with full=true for the rest.",
            "parameters": {
                "type": "object",
                "properties": {
-                    "url": {"type": "string", "description": "The URL or domain to fetch (http/https; a bare domain like example.com is fine)"}
+                    "url": {"type": "string", "description": "The URL or domain to fetch (http/https; a bare domain like example.com is fine)"},
+                    "full": {"type": "boolean", "description": "Raise the download budget to the hard cap for large pages/files. Use only after a result reported partial content."}
                },
                "required": ["url"]
            }
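As a rough illustration of how the `full` flag described in the schema could map onto the budgets named in the commit message, consider the sketch below. The function names are hypothetical, the exact byte values for "2 MB" and "20 MB" are assumptions, and so is the precedence of `max_bytes` over `full`; only the rule that no override exceeds the hard ceiling comes from the commit message.

```python
SOFT_MAX_BYTES = 2 * 1024 * 1024   # assumed value for the 2 MB soft budget
HARD_MAX_BYTES = 20 * 1024 * 1024  # assumed value for the 20 MB hard ceiling


def effective_cap(full=False, max_bytes=None):
    """Pick the download budget for one call; never above the hard ceiling."""
    if max_bytes:
        requested = max_bytes  # assumed: an explicit byte count wins over `full`
    elif full:
        requested = HARD_MAX_BYTES
    else:
        requested = SOFT_MAX_BYTES
    return min(requested, HARD_MAX_BYTES)


def should_refuse(content_length):
    """Refuse before buffering when the declared body already exceeds the ceiling."""
    return content_length is not None and content_length > HARD_MAX_BYTES
```

A default call gets the soft budget, `full=True` raises it to the ceiling, and an oversized `max_bytes` is clamped rather than honored.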
@@ -1022,7 +1023,7 @@ FUNCTION_TOOL_SCHEMAS = [
        "type": "function",
        "function": {
            "name": "manage_contact",
-            "description": "Create, update, delete, or list the user's CardDAV contacts. Use to save a new contact ('save Jonathan's email jon@x.com'), update an existing one ('change Maria's number'), or remove one. For update/delete you need the contact's uid — call action='list' first to find it. Writes go through the same dedupe + validation as the Contacts UI.",
+            "description": "Create, update, delete, or list the user's CardDAV contacts. Use to save a new contact, update an existing one (email/phone/address), or remove one. For update/delete you need the contact's uid — call action='list' first to find it. Writes go through the same dedupe + validation as the Contacts UI.",
            "parameters": {
                "type": "object",
                "properties": {
@@ -1033,6 +1034,7 @@ FUNCTION_TOOL_SCHEMAS = [
                    "email": {"type": "string", "description": "Single email address (convenience for add, or the primary email for update)."},
                    "emails": {"type": "array", "items": {"type": "string"}, "description": "Full list of email addresses (for update; first is primary)."},
                    "phones": {"type": "array", "items": {"type": "string"}, "description": "Full list of phone numbers (for update)."},
+                    "address": {"type": "string", "description": "Postal/mailing address as a single human-readable string."},
                },
                "required": ["action"]
            }
@@ -1218,7 +1218,7 @@ function initializeEventListeners() {
      sortDropdown.querySelectorAll('.sort-option').forEach(o => {
        const check = o.querySelector('.sort-check') || document.createElement('span');
        check.className = 'sort-check';
-        check.style.cssText = 'float:right;font-size:20px;line-height:1;position:relative;top:3px;color:var(--accent, var(--red));opacity:' + (o.dataset.sort === current ? '1' : '0');
+        check.style.cssText = 'float:right;font-size:20px;line-height:1;position:relative;top:1px;color:var(--accent, var(--red));opacity:' + (o.dataset.sort === current ? '1' : '0');
        check.textContent = '\u2022';
        if (!o.querySelector('.sort-check')) o.appendChild(check);
      });
@@ -1262,9 +1262,9 @@ function initializeEventListeners() {
            let msg;
            if (data.updated > 0) {
              msg = `Sorted ${data.updated} into ${data.folders.length} folder${data.folders.length === 1 ? '' : 's'}`;
-              if (remaining > 0) msg += ` — ${remaining} unfiled left, hit Tidy again`;
+              if (remaining > 0) msg += ` — ${remaining} unfiled left, hit Group again`;
            } else if (remaining > 0) {
-              msg = `${remaining} unfiled chats — hit Tidy again`;
+              msg = `${remaining} unfiled chats — hit Group again`;
            } else {
              msg = 'All sorted';
            }
@@ -1285,17 +1285,6 @@ function initializeEventListeners() {

    const autoSortBtn = el('auto-sort-sessions-btn');
    if (autoSortBtn) autoSortBtn.addEventListener('click', () => _runTidy(false));
-
-    // Chevron next to the Tidy row toggles the no-AI sub-item.
-    const autoSortMoreBtn = el('auto-sort-sessions-more');
-    const autoSortNoaiBtn = el('auto-sort-sessions-noai-btn');
-    if (autoSortMoreBtn && autoSortNoaiBtn) {
-      autoSortMoreBtn.addEventListener('click', (e) => {
-        e.stopPropagation();
-        autoSortNoaiBtn.style.display = autoSortNoaiBtn.style.display === 'none' ? 'block' : 'none';
-      });
-      autoSortNoaiBtn.addEventListener('click', () => _runTidy(true));
-    }
  }

  // Model sort dropdown
@@ -258,21 +258,29 @@
        <div class="memory-tab-panel" data-memory-panel="browse">
          <div class="admin-card" style="display:flex;flex-direction:column;overflow:hidden;flex:1;min-height:0;">
            <div style="display:flex;align-items:center;gap:8px;margin-bottom:2px;">
-              <h2 style="display:flex;align-items:center;gap:6px;margin:0;padding:0;line-height:1;">Memories <span id="memory-count-h2" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal"></span></h2>
+              <h2 style="display:flex;align-items:center;gap:6px;margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="color:var(--accent, var(--red));flex-shrink:0;"><path d="M12 2a7 7 0 0 1 7 7c0 2.4-1.2 4.5-3 5.7V17a2 2 0 0 1-2 2h-4a2 2 0 0 1-2-2v-2.3C6.2 13.5 5 11.4 5 9a7 7 0 0 1 7-7z"/><line x1="10" y1="22" x2="14" y2="22"/></svg>Memories <span id="memory-count-h2" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal"></span></h2>
              <span style="flex:1"></span>
+              <span class="admin-toggle-state"></span>
              <label class="admin-switch" title="Include memories in chat context"><input type="checkbox" id="memory-enabled-header-toggle" checked /><span class="admin-slider"></span></label>
            </div>
            <p class="memory-desc doclib-desc" style="margin-top:6px;">Long-term facts the AI remembers across chats — recall, edit, or curate.</p>
            <div class="memory-toolbar">
              <div class="memory-toolbar-row">
-                <select id="memory-sort" class="memory-sort-select" aria-label="Sort memories">
-                  <option value="newest">Newest</option>
-                  <option value="oldest">Oldest</option>
-                  <option value="alpha">A-Z</option>
-                  <option value="uses">Most used</option>
-                </select>
-                <button id="memory-select-btn" class="memory-toolbar-btn" title="Select multiple memories">Select</button>
-                <button id="memory-tidy-btn" class="memory-toolbar-btn" title="AI tidy: deduplicate and clean up memories"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy</button>
+                <div class="memory-sort-picker" id="memory-sort-picker" style="position:relative;">
+                  <select id="memory-sort" class="memory-sort-select" aria-label="Sort memories" style="display:none;">
+                    <option value="newest">Newest</option>
+                    <option value="oldest">Oldest</option>
+                    <option value="alpha">A-Z</option>
+                    <option value="uses">Most used</option>
+                  </select>
+                  <button type="button" class="memory-sort-btn" id="memory-sort-btn" aria-haspopup="listbox" aria-expanded="false">
+                    <span class="memory-sort-current"><span class="memory-sort-icon-cur"></span><span class="memory-sort-label">Newest</span></span>
+                    <svg class="memory-sort-caret" width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg>
+                  </button>
+                  <div class="memory-sort-menu" id="memory-sort-menu" role="listbox" hidden></div>
+                </div>
+                <button id="memory-tidy-btn" class="memory-toolbar-btn" title="AI tidy: deduplicate and clean up memories"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;color:var(--accent, var(--red));"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy</button>
+                <button id="memory-select-btn" class="memory-toolbar-btn" title="Select multiple memories" style="position:relative;left:-2px;"><svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:3px;"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor" stroke="none"/></svg>Select</button>
              </div>
              <input type="text" id="memory-search" placeholder="Search memories…" class="memory-search-input" aria-label="Search memories" />
              <div id="memory-category-filters" class="memory-category-filters">
@@ -293,7 +301,7 @@
        <div class="memory-tab-panel hidden" data-memory-panel="add">
          <div class="admin-card">
            <div style="display:flex;align-items:center;gap:8px;margin-bottom:2px;">
-              <h2 style="margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px"><path d="M12 2a7 7 0 0 1 7 7c0 2.4-1.2 4.5-3 5.7V17a2 2 0 0 1-2 2h-4a2 2 0 0 1-2-2v-2.3C6.2 13.5 5 11.4 5 9a7 7 0 0 1 7-7z"/><line x1="10" y1="22" x2="14" y2="22"/></svg>Add Memory</h2>
+              <h2 style="margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px;color:var(--accent, var(--red));"><path d="M12 2a7 7 0 0 1 7 7c0 2.4-1.2 4.5-3 5.7V17a2 2 0 0 1-2 2h-4a2 2 0 0 1-2-2v-2.3C6.2 13.5 5 11.4 5 9a7 7 0 0 1 7-7z"/><line x1="10" y1="22" x2="14" y2="22"/></svg>Add Memory</h2>
              <span style="flex:1"></span>
              <button id="memory-import-btn" class="theme-io-btn" title="Import memories from a file" style="height:26px;font-size:12px;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:4px;"><path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="7 10 12 15 17 10"/><line x1="12" y1="15" x2="12" y2="3"/></svg>Import</button>
              <button id="memory-export-btn" class="theme-io-btn" title="Export all memories as JSON" style="height:26px;font-size:12px;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:4px;"><path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/><polyline points="17 8 12 3 7 8"/><line x1="12" y1="3" x2="12" y2="15"/></svg>Export</button>
@@ -312,7 +320,7 @@
          </div>
          <div class="admin-card">
            <div style="display:flex;align-items:baseline;gap:8px;margin-bottom:2px;">
-              <h2 style="margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>Add Skill</h2>
+              <h2 style="margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px;color:var(--accent, var(--red));"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>Add Skill</h2>
            </div>
            <p class="memory-desc doclib-desc" style="margin-top:6px;">Import a skill from GitHub or <a href="https://skills.sh" target="_blank" rel="noopener noreferrer">skills.sh</a> (folder with <code>SKILL.md</code> and optional templates).</p>
            <div class="memory-add-row" style="margin-top:6px;margin-bottom:10px;">
@@ -348,8 +356,9 @@
        <div class="memory-tab-panel hidden" data-memory-panel="skills">
          <div class="admin-card" style="display:flex;flex-direction:column;overflow:hidden;flex:1;min-height:0;">
            <div style="display:flex;align-items:center;gap:8px;margin-bottom:2px;">
-              <h2 style="margin:0;padding:0;line-height:1;">Skills <span id="skills-count-h2" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal"></span></h2>
+              <h2 style="display:flex;align-items:center;gap:6px;margin:0;padding:0;line-height:1;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="color:var(--accent, var(--red));flex-shrink:0;"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>Skills <span id="skills-count-h2" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal"></span></h2>
              <span style="flex:1"></span>
+              <span class="admin-toggle-state"></span>
              <label class="admin-switch" title="Inject relevant skills into chat context"><input type="checkbox" id="skills-enabled-header-toggle" checked /><span class="admin-slider"></span></label>
            </div>
            <p class="memory-desc doclib-desc" style="margin-top:6px;">Reusable procedures the AI can call via /skill — sort by confidence to surface the proven ones.</p>
@@ -374,8 +383,8 @@
                    <option value="filter:conf70">Confidence ≤ 70%</option>
                  </optgroup>
                </select>
-                <button id="skills-select-btn" class="memory-toolbar-btn" title="Select multiple skills">Select</button>
-                <button id="skills-audit-btn" class="memory-toolbar-btn" title="Test every skill, auto-fix the weak ones, flag what still fails"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:3px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Audit all</button>
+                <button id="skills-audit-btn" class="memory-toolbar-btn" title="Test every skill, auto-fix the weak ones, flag what still fails"><svg width="11" height="11" viewBox="0 0 24 24" fill="var(--accent, var(--red))" style="vertical-align:-1px;margin-right:3px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Audit</button>
+                <button id="skills-select-btn" class="memory-toolbar-btn" title="Select multiple skills"><svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:3px;"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor" stroke="none"/></svg>Select</button>
              </div>
              <input type="text" id="skills-search" placeholder="Search skills…" class="memory-search-input" aria-label="Search skills" />
            </div>
@@ -395,34 +404,23 @@
        <!-- ── Settings tab ── -->
        <div class="memory-tab-panel hidden" data-memory-panel="settings">
          <div class="admin-card">
-            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px">
-              <h2 style="margin:0">Auto-extract memories</h2>
+            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;min-height:32px">
+              <h2 style="margin:0;display:inline-flex;align-items:center;gap:6px"><svg width="13" height="13" viewBox="0 0 24 24" fill="var(--accent, var(--red))" aria-hidden="true"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Auto-extract memories</h2>
              <label class="admin-switch" style="flex-shrink:0"><input type="checkbox" id="auto-memory-toggle" checked /><span class="admin-slider"></span></label>
            </div>
            <span class="admin-toggle-sub" style="display:block;margin-top:6px">Automatically extract memories from conversations.</span>
          </div>
          <div class="admin-card">
-            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px">
-              <h2 style="margin:0">Auto-extract skills</h2>
+            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;min-height:32px">
+              <h2 style="margin:0;display:inline-flex;align-items:center;gap:6px"><svg width="13" height="13" viewBox="0 0 24 24" fill="var(--accent, var(--red))" aria-hidden="true"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Auto-extract skills</h2>
              <label class="admin-switch" style="flex-shrink:0"><input type="checkbox" id="auto-skills-toggle" /><span class="admin-slider"></span></label>
            </div>
            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.6">Automatically draft reusable skills from your workflows. Audit all can publish passing skills using the threshold below.</span>
            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.6">The library can grow; cleanup retires weak/duplicate skills only after review.</span>
          </div>
          <div class="admin-card">
-            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px">
-              <h2 style="margin:0">Inject Skills</h2>
-            </div>
-            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.6">Controls how many relevant published or approved skills are added to each agent request.</span>
-            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;margin-top:8px">
-              <span class="admin-toggle-sub" style="margin:0">Max skills per request</span>
-              <input type="number" id="skill-max-input" min="0" max="12" step="1" value="3" aria-label="Max skills to inject" style="flex-shrink:0;width:72px;background:var(--input-bg,var(--panel));color:var(--fg);border:1px solid var(--border);border-radius:6px;padding:4px 6px;font-size:12px;text-align:right;font-variant-numeric:tabular-nums" />
-            </div>
-            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.5">Set to 0 to disable skill injection.</span>
-          </div>
-          <div class="admin-card">
-            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px">
-              <h2 style="margin:0">Auto-approve skills</h2>
+            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;min-height:32px">
+              <h2 style="margin:0;display:inline-flex;align-items:center;gap:6px"><svg width="13" height="13" viewBox="0 0 24 24" fill="var(--accent, var(--red))" aria-hidden="true"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Auto-approve skills</h2>
              <label class="admin-switch" style="flex-shrink:0"><input type="checkbox" id="auto-approve-skills-toggle" checked /><span class="admin-slider"></span></label>
            </div>
            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.6">Audit all publishes passing, necessary skills at or above this confidence. Off = keep audit results as drafts unless manually approved.</span>
@@ -434,6 +432,17 @@
              </span>
            </div>
          </div>
+          <div class="admin-card">
+            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;min-height:32px">
+              <h2 style="margin:0;display:inline-flex;align-items:center;gap:6px"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="opacity:0.7"><path d="M12 5v14"/><polyline points="6 11 12 17 18 11"/><path d="M5 20h14"/></svg>Inject Skills</h2>
+            </div>
+            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.6">Controls how many relevant published or approved skills are added to each agent request.</span>
+            <div style="display:flex;align-items:center;justify-content:space-between;gap:12px;margin-top:8px">
+              <span class="admin-toggle-sub" style="margin:0">Max skills per request</span>
+              <input type="number" id="skill-max-input" min="0" max="12" step="1" value="3" aria-label="Max skills to inject" style="flex-shrink:0;width:72px;background:var(--input-bg,var(--panel));color:var(--fg);border:1px solid var(--border);border-radius:6px;padding:4px 6px;font-size:12px;text-align:right;font-variant-numeric:tabular-nums" />
+            </div>
+            <span class="admin-toggle-sub" style="display:block;margin-top:6px;opacity:0.5">Set to 0 to disable skill injection.</span>
+          </div>
        </div>
      </div>
    </div>
@@ -704,12 +713,9 @@
        <div class="section-header-flex">
          <span class="section-title" id="chats-section-title"><svg class="section-icon" width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 15a2 2 0 0 1-2 2H7l-4 4V5a2 2 0 0 1 2-2h14a2 2 0 0 1 2 2z"/></svg><span id="chats-section-label" class="section-title-label">Chats</span><span id="chats-notif-dot" class="sidebar-notif-dot" style="display:none"></span></span>
          <div style="position:relative; display:inline-block; display:flex; gap:4px; align-items:center;">
-            <button type="button" class="section-header-btn chats-manage-btn" id="chats-library-btn" title="Manage Chats (Library)">
-              <svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
-                <path d="M4 19.5A2.5 2.5 0 0 1 6.5 17H20"/>
-                <path d="M6.5 2H20v20H6.5A2.5 2.5 0 0 1 4 19.5v-15A2.5 2.5 0 0 1 6.5 2z"/>
-                <path d="M9 7h6M9 11h4"/>
-              </svg>
+            <button type="button" class="section-header-btn list-item-plus-btn chats-manage-btn" id="chats-library-btn" title="Manage Chats (Library)">
+              <span aria-hidden="true" style="display:inline-block;width:13px;height:13px;"></span>
+              <span class="list-item-plus-label">manage</span>
            </button>
            <button type="button" class="section-header-btn" id="session-sort-btn" title="Sort sessions">
              <svg class="sort-icon" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
@@ -726,14 +732,11 @@
              <div class="dropdown-item sort-option sort-dropdown-item" data-sort="newest">Newest First</div>
              <div class="dropdown-item sort-option sort-dropdown-item" data-sort="group">By Folder</div>
              <div class="dropdown-item sort-dropdown-item sort-dropdown-sep" id="auto-sort-sessions-row" style="display:flex;align-items:center;padding:0;">
-                <span id="auto-sort-sessions-btn" style="flex:1;padding:5px 10px;cursor:pointer;display:inline-flex;align-items:center;gap:4px;">
-                  <span class="auto-sort-icon"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy</span>
+                <span id="auto-sort-sessions-btn" style="flex:1;padding:5px 10px 5px 4px;cursor:pointer;display:inline-flex;align-items:center;gap:6px;">
+                  <span class="auto-sort-icon"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg></span>
+                  <span>Group</span>
                  <span class="auto-sort-spinner" style="display:none;">Sorting...</span>
                </span>
-                <button type="button" id="auto-sort-sessions-more" title="Tidy options" aria-label="Tidy options" style="background:none;border:none;border-left:1px solid var(--border);color:inherit;cursor:pointer;padding:5px 8px;font-size:9px;opacity:0.7;"><svg width="8" height="8" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg></button>
-              </div>
-              <div class="dropdown-item sort-dropdown-item" id="auto-sort-sessions-noai-btn" style="display:none;padding-left:24px;">
-                Tidy <span class="auto-sort-noai-spinner" style="display:none;font-size:9px;opacity:0.6;margin-left:4px;">Cleaning...</span>
              </div>
              <div class="dropdown-item rearrange-toggle sort-dropdown-item sort-dropdown-sep" id="session-rearrange-toggle">
                &#8593;&#8595; Rearrange <span class="rearrange-check" style="float:right; opacity:0;">&#x2022;</span>
@@ -1330,7 +1333,6 @@
        </button>
        <button class="close-btn" aria-label="Close settings">✖</button>
      </div>
-      <div class="admin-toggle-sub" style="padding:0 12px 8px;opacity:0.6;font-size:11px;">Toggle on/off visibility of tools and modules across the interface.</div>
      <div class="settings-layout">
        <div class="settings-sidebar">
          <!-- Section 1: AI plumbing (Add Models → AI Defaults → Search) -->
@@ -1338,6 +1340,10 @@
            <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="2" y="2" width="20" height="8" rx="2"/><rect x="2" y="14" width="20" height="8" rx="2"/><circle cx="6" cy="6" r="1"/><circle cx="6" cy="18" r="1"/></svg>
            <span>Add Models</span>
          </button>
+          <button class="settings-nav-item" data-settings-tab="added-models">
+            <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="20 6 9 17 4 12"/></svg>
+            <span>Added Models</span>
+          </button>
          <button class="settings-nav-item" data-settings-tab="ai">
            <svg width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M12 2a4 4 0 0 0-4 4v2H6a2 2 0 0 0-2 2v10a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V10a2 2 0 0 0-2-2h-2V6a4 4 0 0 0-4-4z"/></svg>
            <span>AI Defaults</span>
@@ -1404,14 +1410,21 @@
            <div class="settings-col">
              <div class="settings-row">
                <label class="settings-label">Endpoint</label>
+                <span class="adm-model-logo" id="set-defaultEpSelect-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-defaultEpSelect" class="settings-select"></select>
              </div>
              <div class="settings-row">
                <label class="settings-label">Model</label>
+                <span class="adm-model-logo" id="set-defaultModelSelect-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-defaultModelSelect" class="settings-select"></select>
              </div>
-              <div id="set-defaultFallbacks" class="settings-fallbacks"></div>
-              <button type="button" class="settings-fallback-add" id="set-defaultAddFallback" title="Add a model to try if the one above fails">+ Add fallback</button>
+              <div class="settings-row" style="align-items:flex-start;">
+                <label class="settings-label" style="margin-top:6px;">Fallbacks</label>
+                <div style="flex:1;display:flex;flex-direction:column;gap:6px;">
+                  <div id="set-defaultFallbacks" class="settings-fallbacks"></div>
+                  <button type="button" class="settings-fallback-add" id="set-defaultAddFallback" title="Add a model to try if the one above fails">+ Add fallback</button>
+                </div>
+              </div>
              <div id="set-defaultChatMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
            </div>
          </div>
@@ -1421,14 +1434,21 @@
            <div class="settings-col">
              <div class="settings-row">
                <label class="settings-label">Endpoint</label>
+                <span class="adm-model-logo" id="set-utilityEpSelect-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-utilityEpSelect" class="settings-select"><option value="">—</option></select>
              </div>
              <div class="settings-row">
                <label class="settings-label">Model</label>
+                <span class="adm-model-logo" id="set-utilityModelSelect-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-utilityModelSelect" class="settings-select"><option value="">—</option></select>
              </div>
-              <div id="set-utilityFallbacks" class="settings-fallbacks"></div>
-              <button type="button" class="settings-fallback-add" id="set-utilityAddFallback" title="Add a model to try if the utility model fails">+ Add fallback</button>
+              <div class="settings-row" style="align-items:flex-start;">
+                <label class="settings-label" style="margin-top:6px;">Fallbacks</label>
+                <div style="flex:1;display:flex;flex-direction:column;gap:6px;">
+                  <div id="set-utilityFallbacks" class="settings-fallbacks"></div>
+                  <button type="button" class="settings-fallback-add" id="set-utilityAddFallback" title="Add a model to try if the utility model fails">+ Add fallback</button>
+                </div>
+              </div>
              <div id="set-utilityChatMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
            </div>
          </div>
@@ -1438,16 +1458,22 @@
            <div style="display:flex;flex-direction:column;gap:0.5rem;">
              <div style="display:flex;align-items:center;gap:0.75rem;">
                <label class="settings-label">Model</label>
+                <span class="adm-model-logo" id="set-vlModelSelect-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-vlModelSelect" class="settings-select"><option value="">Auto-detect</option></select>
              </div>
-              <div id="set-visionFallbacks" class="settings-fallbacks"></div>
-              <button type="button" class="settings-fallback-add" id="set-visionAddFallback" title="Add a vision model to try if the one above fails">+ Add fallback</button>
+              <div class="settings-row" style="align-items:flex-start;">
+                <label class="settings-label" style="margin-top:6px;">Fallbacks</label>
+                <div style="flex:1;display:flex;flex-direction:column;gap:6px;">
+                  <div id="set-visionFallbacks" class="settings-fallbacks"></div>
+                  <button type="button" class="settings-fallback-add" id="set-visionAddFallback" title="Add a vision model to try if the one above fails">+ Add fallback</button>
+                </div>
+              </div>
              <div id="set-visionSettingsMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
            </div>
          </div>
          <div class="admin-card">
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><circle cx="11" cy="11" r="8"/><path d="M21 21l-4.35-4.35"/><line x1="11" y1="8" x2="11" y2="14"/><line x1="8" y1="11" x2="14" y2="11"/></svg>Research Model</h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">Model used for Deep Research. Falls back to the default chat model if not set.</div>
+            <div class="admin-toggle-sub" style="margin-bottom:8px">Model used for Deep Research. More settings are under <a href="#" data-go-settings-tab="search" style="color:var(--accent, var(--red));text-decoration:underline;font-weight:600;">Search →</a></div>
            <div class="settings-col">
              <div class="settings-row">
                <label class="settings-label">Endpoint</label>
@@ -1457,48 +1483,17 @@
              </div>
              <div class="settings-row">
                <label class="settings-label">Model</label>
+                <span class="adm-model-logo" id="set-researchModel-logo" style="display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;opacity:0.9;color:var(--fg);"></span>
                <select id="set-researchModel" class="settings-select">
                  <option value="">Same as chat</option>
                </select>
              </div>
-              <div class="settings-row">
-                <label class="settings-label">Search</label>
-                <select id="set-researchSearch" class="settings-select">
-                  <option value="">Same as web search</option>
-                  <option value="searxng">SearXNG</option>
-                  <option value="duckduckgo">DuckDuckGo</option>
-                  <option value="tavily">Tavily</option>
-                  <option value="brave">Brave</option>
-                  <option value="google">Google</option>
-                  <option value="serper">Serper</option>
-                </select>
-              </div>
-              <div class="settings-row">
-                <label class="settings-label">Max Tokens</label>
-                <input id="set-researchMaxTokens" type="text" inputmode="numeric" placeholder="8192 (default)" class="settings-select" style="width:120px;">
-              </div>
-              <div class="settings-row">
-                <label class="settings-label">Extract Timeout</label>
-                <input id="set-researchExtractTimeout" type="text" inputmode="numeric" placeholder="90 sec" class="settings-select" style="width:120px;">
-              </div>
-              <div class="settings-row">
-                <label class="settings-label">Extract Parallel</label>
-                <input id="set-researchExtractConcurrency" type="text" inputmode="numeric" placeholder="3" class="settings-select" style="width:120px;">
-              </div>
-              <div class="settings-row">
-                <label class="settings-label">Max Time</label>
-                <input id="set-researchRunTimeout" type="text" inputmode="numeric" placeholder="1800 sec (0 = no limit)" class="settings-select" style="width:120px;">
-              </div>
-              <div id="set-researchMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
+              <div id="set-researchMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);margin-top:2px;"></div>
            </div>
          </div>
          <!-- Agent card moved to the Agent Tools tab. -->
-          <!-- Image Generation removed — only inpaint remains in this build,
-               and inpaint is configured via the gallery editor not this card.
-               Keeping the DOM (hidden) so JS wiring against the inputs
-               doesn't throw. -->
-          <div class="admin-card" hidden style="display:none">
-            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="margin-right:1px;opacity:0.6;flex-shrink:0"><rect x="3" y="3" width="18" height="18" rx="2"/><circle cx="8.5" cy="8.5" r="1.5"/><path d="M21 15l-5-5L5 21"/></svg>Image Generation<span style="flex:1"></span><label class="admin-switch"><input type="checkbox" id="set-imgEnabledToggle" checked><span class="admin-slider"></span></label></h2>
+          <div class="admin-card">
+            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="margin-right:1px;opacity:0.6;flex-shrink:0"><rect x="3" y="3" width="18" height="18" rx="2"/><circle cx="8.5" cy="8.5" r="1.5"/><path d="M21 15l-5-5L5 21"/></svg>Image Generation<span style="flex:1"></span><label class="admin-switch"><input type="checkbox" id="set-imgEnabledToggle"><span class="admin-slider"></span></label></h2>
            <div class="admin-toggle-sub" style="margin-bottom:8px">Configure which model to use for image generation.</div>
            <div style="display:flex;flex-direction:column;gap:0.5rem;">
              <div style="display:flex;align-items:center;gap:0.75rem;">
@@ -1570,20 +1565,15 @@
              <div id="set-ttsSettingsMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
            </div>
          </div>
+          <!-- Teacher Model settings card hidden as part of the 2.0
+               "harden the core" pass. The escalation flow is dormant when
+               `teacher_model` is unset (its default), so the backend keeps
+               working for anyone who wired it via `manage_settings` /
+               settings backup. Re-add this card to surface the toggle
+               again once the core experience is faster. -->
          <div class="admin-card">
-            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M22 10v6M2 10l10-5 10 5-10 5z"/><path d="M6 12v5c3 3 9 3 12 0v-5"/></svg>Teacher Model <span style="font-size:0.72em;opacity:0.55;font-weight:normal;">(Experimental)</span><span style="flex:1"></span><label class="admin-switch"><input type="checkbox" id="set-teacherEnabledToggle"><span class="admin-slider"></span></label></h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">When a self-hosted student fails an agent-mode task, escalate to a SOTA teacher that writes a SKILL.md procedure so the student can do it next time. Off by default.</div>
-            <div class="settings-col">
-              <div class="settings-row">
-                <label class="settings-label">Endpoint</label>
-                <select id="set-teacherEpSelect" class="settings-select"><option value="">—</option></select>
-              </div>
-              <div class="settings-row">
-                <label class="settings-label">Model</label>
-                <select id="set-teacherModelSelect" class="settings-select"><option value="">—</option></select>
-              </div>
-              <div id="set-teacherChatMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
-            </div>
+            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="margin-right:1px;opacity:0.6;flex-shrink:0"><rect x="2" y="4" width="20" height="16" rx="2"/><polyline points="2 6 12 13 22 6"/></svg>Email Safety<span style="flex:1"></span><label class="admin-switch" title="When on, agent send_email and reply_to_email tools stage a draft for your approval instead of sending immediately."><input type="checkbox" id="set-agentEmailConfirm" checked><span class="admin-slider"></span></label></h2>
+            <div class="admin-toggle-sub" style="margin-bottom:8px">When on, the agent's <code>send_email</code> and <code>reply_to_email</code> tools stage a draft for your approval (in the chat) instead of sending immediately. This stops models from inventing a signature and sending it to a real recipient before you can review it.</div>
          </div>
        </div>

@@ -1614,10 +1604,12 @@
                  <option value="serper" data-search-logo="serper">Serper.dev</option>
                  <option value="disabled" data-search-logo="disabled">Disabled</option>
                </select>
-                <button type="button" class="admin-btn-sm" id="set-searchTestBtn" title="Run a test query against the configured provider" style="margin-left:6px;flex-shrink:0;position:relative;top:2px;">Test</button>
+                <button type="button" class="admin-btn-sm" id="set-searchTestBtn" title="Run a test query against the configured provider" style="margin-left:2px;flex-shrink:0;position:relative;top:2px;display:inline-flex;align-items:center;gap:4px;">
+                  <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test
+                </button>
              </div>
              <div class="settings-row">
-                <label class="settings-label">Results</label>
+                <label class="settings-label" title="How many web search results to fetch per query">Results per query</label>
                <div style="display:flex;gap:8px;flex:1;">
                  <select id="set-searchResultCount" class="settings-select" style="flex:1;">
                    <option value="3">3</option>
@@ -1631,30 +1623,78 @@
              </div>
              <div id="set-searchUrlRow" class="settings-row">
                <label class="settings-label">URL</label>
-                <input id="set-searchUrl" type="text" placeholder="http://localhost:8080" class="settings-select">
+                <input id="set-searchUrl" type="text" placeholder="http://localhost:8080 (optional)" class="settings-select">
              </div>
              <div id="set-searchKeyRow" class="settings-row" style="display:none;">
                <label class="settings-label">API Key</label>
-                <input id="set-searchApiKey" type="password" placeholder="API key" class="settings-select">
+                <div style="position:relative;flex:1;display:flex;align-items:center;">
+                  <svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="position:absolute;left:9px;top:50%;transform:translateY(-50%);opacity:0.55;pointer-events:none;"><path d="M21 2l-9.6 9.6"/><circle cx="7.5" cy="15.5" r="5.5"/><path d="M15.5 7.5l3 3"/></svg>
+                  <input id="set-searchApiKey" type="password" placeholder="API key" class="settings-select" style="flex:1;padding-left:28px;">
+                </div>
              </div>
              <div id="set-searchCxRow" class="settings-row" style="display:none;">
                <label class="settings-label">CX ID</label>
                <input id="set-searchCx" type="text" placeholder="Google PSE engine ID" class="settings-select">
              </div>
-              <div class="settings-row">
-                <label class="settings-label" title="Providers tried in order when the primary fails or hits a rate limit">Fallbacks</label>
-                <div class="search-fallback-chain" id="set-searchFallbackChain"></div>
+              <div class="settings-row" style="align-items:flex-start;">
+                <label class="settings-label" style="margin-top:6px;" title="Providers tried in order when the primary fails or hits a rate limit">Fallbacks</label>
+                <div style="flex:1;display:flex;flex-direction:column;gap:6px;">
+                  <div class="settings-fallbacks" id="set-searchFallbackChain"></div>
+                  <button type="button" class="settings-fallback-add" id="set-searchAddFallback" title="Add a search provider to try if the primary fails">+ Add fallback</button>
+                </div>
              </div>
              <div id="set-searchHint" class="admin-toggle-sub"></div>
              <div id="set-searchMsg" style="font-size:11px;"></div>
            </div>
          </div>
+
+          <div class="admin-card">
+            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><circle cx="11" cy="11" r="8"/><path d="M21 21l-4.35-4.35"/><line x1="11" y1="8" x2="11" y2="14"/><line x1="8" y1="11" x2="14" y2="11"/></svg>Deep Research</h2>
+            <div class="admin-toggle-sub" style="margin-bottom:8px">Runtime settings for Deep Research. The model is selected under <a href="#" data-go-settings-tab="ai" style="color:var(--accent, var(--red));text-decoration:underline;font-weight:600;">AI Defaults →</a></div>
+            <div class="settings-col">
+              <div class="settings-row">
+                <label class="settings-label">Search</label>
+                <span style="margin-left:auto;display:inline-flex;align-items:center;justify-content:center;width:18px;height:18px;flex-shrink:0;color:var(--fg);" id="set-researchSearch-logo"></span>
+                <select id="set-researchSearch" class="settings-select" style="width:358.5px;flex:0 0 auto;max-width:calc(100% - 24px);">
+                  <option value="" data-search-logo="">Same as web search</option>
+                  <option value="searxng" data-search-logo="searxng">SearXNG</option>
+                  <option value="duckduckgo" data-search-logo="duckduckgo">DuckDuckGo</option>
+                  <option value="tavily" data-search-logo="tavily">Tavily</option>
+                  <option value="brave" data-search-logo="brave">Brave</option>
+                  <option value="google" data-search-logo="google_pse">Google</option>
+                  <option value="serper" data-search-logo="serper">Serper</option>
+                </select>
+              </div>
+              <div class="settings-row">
+                <label class="settings-label">Max Tokens</label>
+                <input id="set-researchMaxTokens" type="text" inputmode="numeric" placeholder="8192 (default)" class="settings-select" style="width:382.5px;flex:0 0 auto;margin-left:auto;">
+              </div>
+              <div class="settings-row">
+                <label class="settings-label">Extract Timeout</label>
+                <div style="position:relative;width:382.5px;flex:0 0 auto;margin-left:auto;">
+                  <input id="set-researchExtractTimeout" type="text" inputmode="numeric" placeholder="90 sec" class="settings-select" style="width:100%;padding-right:30px;">
+                  <span title="How long the researcher waits for a single URL to fetch and extract before giving up on it. Slow sites get skipped. Default 90 seconds." style="position:absolute;right:8px;top:50%;transform:translateY(-50%);width:16px;height:16px;border-radius:50%;border:1px solid var(--border);display:inline-flex;align-items:center;justify-content:center;font-size:10px;font-weight:600;opacity:0.55;cursor:help;user-select:none;">?</span>
+                </div>
+              </div>
+              <div class="settings-row">
+                <label class="settings-label">Extract Parallel</label>
+                <div style="position:relative;width:382.5px;flex:0 0 auto;margin-left:auto;">
+                  <input id="set-researchExtractConcurrency" type="text" inputmode="numeric" placeholder="3" class="settings-select" style="width:100%;padding-right:30px;">
+                  <span title="How many URLs the researcher fetches and extracts in parallel. Higher is faster but uses more memory/CPU. Default 3." style="position:absolute;right:8px;top:50%;transform:translateY(-50%);width:16px;height:16px;border-radius:50%;border:1px solid var(--border);display:inline-flex;align-items:center;justify-content:center;font-size:10px;font-weight:600;opacity:0.55;cursor:help;user-select:none;">?</span>
+                </div>
+              </div>
+              <div class="settings-row">
+                <label class="settings-label">Timeout</label>
+                <input id="set-researchRunTimeout" type="text" inputmode="numeric" placeholder="1800 sec (0 = no limit)" class="settings-select" style="width:382.5px;flex:0 0 auto;margin-left:auto;">
+              </div>
+            </div>
+          </div>
        </div>

        <!-- ═══ APPEARANCE TAB ═══ -->
        <div data-settings-panel="appearance" class="settings-appearance-panel hidden">
          <div class="admin-card" style="padding-bottom:6px;">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="3" y="3" width="18" height="18" rx="2"/><line x1="9" y1="3" x2="9" y2="21"/></svg>Sidebar</h2>
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="3" y="3" width="18" height="18" rx="2"/><line x1="9" y1="3" x2="9" y2="21"/></svg>Sidebar<span style="flex:1"></span><button type="button" class="vis-reset-btn" data-vis-reset title="Reset this section to defaults" aria-label="Reset Sidebar to defaults" style="background:none;border:none;padding:2px 4px;cursor:pointer;color:inherit;opacity:0.55;display:inline-flex;align-items:center;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="1 4 1 10 7 10"/><path d="M3.51 15a9 9 0 1 0 2.13-9.36L1 10"/></svg></button></h2>
            <div class="vis-toggles">
              <label class="vis-row">
                <span class="vis-icon"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round"><circle cx="12" cy="12" r="10"/><path d="M8 12l2.5 2.5L16 9"/></svg></span>
@@ -1754,7 +1794,7 @@
            </div>
          </div>
          <div class="admin-card" style="padding-bottom:6px;">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M21 11.5a8.38 8.38 0 0 1-.9 3.8 8.5 8.5 0 0 1-7.6 4.7 8.38 8.38 0 0 1-3.8-.9L3 21l1.9-5.7a8.38 8.38 0 0 1-.9-3.8 8.5 8.5 0 0 1 4.7-7.6 8.38 8.38 0 0 1 3.8-.9h.5a8.48 8.48 0 0 1 8 8v.5z"/></svg>Chat Area</h2>
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M21 11.5a8.38 8.38 0 0 1-.9 3.8 8.5 8.5 0 0 1-7.6 4.7 8.38 8.38 0 0 1-3.8-.9L3 21l1.9-5.7a8.38 8.38 0 0 1-.9-3.8 8.5 8.5 0 0 1 4.7-7.6 8.38 8.38 0 0 1 3.8-.9h.5a8.48 8.48 0 0 1 8 8v.5z"/></svg>Chat Area<span style="flex:1"></span><button type="button" class="vis-reset-btn" data-vis-reset title="Reset this section to defaults" aria-label="Reset Chat Area to defaults" style="background:none;border:none;padding:2px 4px;cursor:pointer;color:inherit;opacity:0.55;display:inline-flex;align-items:center;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="1 4 1 10 7 10"/><path d="M3.51 15a9 9 0 1 0 2.13-9.36L1 10"/></svg></button></h2>
            <div class="vis-toggles">
              <label class="vis-row">
                <span class="vis-icon"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round"><path d="M4 6h16"/><path d="M4 10h8"/></svg></span>
@@ -1789,7 +1829,7 @@
            </div>
          </div>
          <div class="admin-card" style="padding-bottom:6px;">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><line x1="17" y1="10" x2="3" y2="10"/><line x1="21" y1="6" x2="3" y2="6"/><line x1="21" y1="14" x2="3" y2="14"/><line x1="17" y1="18" x2="3" y2="18"/></svg>Chat Bar</h2>
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><line x1="17" y1="10" x2="3" y2="10"/><line x1="21" y1="6" x2="3" y2="6"/><line x1="21" y1="14" x2="3" y2="14"/><line x1="17" y1="18" x2="3" y2="18"/></svg>Chat Bar<span style="flex:1"></span><button type="button" class="vis-reset-btn" data-vis-reset title="Reset this section to defaults" aria-label="Reset Chat Bar to defaults" style="background:none;border:none;padding:2px 4px;cursor:pointer;color:inherit;opacity:0.55;display:inline-flex;align-items:center;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="1 4 1 10 7 10"/><path d="M3.51 15a9 9 0 1 0 2.13-9.36L1 10"/></svg></button></h2>
            <div class="vis-toggles">
              <label class="vis-row">
                <span class="vis-icon"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="11" cy="11" r="8"/><line x1="21" y1="21" x2="16.65" y2="16.65"/></svg></span>
@@ -1833,9 +1873,6 @@
              </label>
            </div>
          </div>
-          <div style="text-align:right;padding:0 4px;">
-            <button type="button" class="admin-btn-sm" id="set-uiVisResetBtn" style="opacity:0.5;">Reset All</button>
-          </div>
        </div>

        <!-- ═══ THEME TAB ═══ -->
@@ -1848,7 +1885,7 @@
              <h2 style="margin:0;font-size:13px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="2" y="4" width="20" height="16" rx="2"/><path d="M6 8h.01M10 8h.01M14 8h.01M18 8h.01M8 12h.01M12 12h.01M16 12h.01M7 16h10"/></svg>Keyboard Shortcuts</h2>
              <p style="font-size:10px;opacity:0.4;margin:2px 0 0;">Click a shortcut to rebind. Press Escape to cancel.</p>
            </div>
-            <button type="button" class="shortcut-action-btn is-reset" id="shortcuts-reset-btn" title="Reset Shortcuts" style="width:28px;height:28px;font-size:15px;">&#x21A9;</button>
+            <button type="button" class="vis-reset-btn" id="shortcuts-reset-btn" title="Reset shortcuts to defaults" aria-label="Reset shortcuts to defaults" style="background:none;border:none;padding:2px 4px;cursor:pointer;color:inherit;opacity:0.55;display:inline-flex;align-items:center;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="1 4 1 10 7 10"/><path d="M3.51 15a9 9 0 1 0 2.13-9.36L1 10"/></svg></button>
          </div>
          <div class="admin-card">
            <div id="shortcuts-list"></div>
@@ -1860,7 +1897,7 @@
        <div data-settings-panel="account" class="hidden">
          <div class="admin-card">
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M20 21v-2a4 4 0 0 0-4-4H8a4 4 0 0 0-4 4v2"/><circle cx="12" cy="7" r="4"/></svg>Account</h2>
-            <div style="display:flex;align-items:center;gap:10px;margin:4px 0 12px;">
+            <div style="display:flex;align-items:center;gap:10px;margin:12px 0 12px;">
              <div class="user-bar-avatar" id="settings-account-avatar" style="width:32px;height:32px;font-size:14px;"></div>
              <div style="flex:1;">
                <div id="settings-account-username" style="font-size:13px;font-weight:600;"></div>
@@ -1898,7 +1935,7 @@
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="2" y="4" width="20" height="16" rx="2"/><path d="m22 7-8.97 5.7a1.94 1.94 0 0 1-2.06 0L2 7"/></svg>Email Accounts</h2>
            <div class="settings-row" style="align-items:center;">
              <div class="admin-toggle-sub" style="margin:0;flex:1;">Add, edit, delete, and test accounts in Integrations.</div>
-              <button class="admin-btn-add" id="set-email-open-integrations" style="display:inline-flex;align-items:center;gap:6px;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="opacity:0.7"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg>Manage in Integrations</button>
+              <button class="admin-btn-add" id="set-email-open-integrations" style="display:inline-flex;align-items:center;gap:6px;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="opacity:0.7"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg>Open Integrations</button>
            </div>
          </div>

@@ -1914,10 +1951,10 @@
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M11 4H4a2 2 0 0 0-2 2v14a2 2 0 0 0 2 2h14a2 2 0 0 0 2-2v-7"/><path d="M18.5 2.5a2.121 2.121 0 0 1 3 3L12 15l-4 1 1-4 9.5-9.5z"/></svg>Writing Style</h2>
            <div class="admin-toggle-sub" style="margin-bottom:8px">AI-extracted from your sent emails. Used when AI drafts replies.</div>
            <div class="settings-col">
-              <textarea id="set-email-style" rows="4" class="settings-select" style="font-family:inherit;resize:vertical" placeholder="e.g. I write emails in this style. I don't use exclamation marks. I sign emails with: ..."></textarea>
+              <textarea id="set-email-style" rows="6" class="settings-select" style="font-family:inherit;resize:none" placeholder="e.g. I write emails in this style. I don't use exclamation marks. I sign emails with: ..."></textarea>
              <div class="settings-row" style="margin-top:4px">
                <span id="set-email-style-msg" style="font-size:11px;"></span>
-                <button class="admin-btn-add" id="set-email-style-extract" style="margin-left:auto;">Extract from Sent (15 emails)</button>
+                <button class="admin-btn-add" id="set-email-style-extract" style="margin-left:auto;display:inline-flex;align-items:center;gap:5px;"><svg width="12" height="12" viewBox="0 0 24 24" fill="currentColor" aria-hidden="true"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Extract from Sent (15 emails)</button>
                <button class="admin-btn-add" id="set-email-style-save">Save</button>
              </div>
            </div>
@@ -1927,7 +1964,7 @@
        <!-- ═══ REMINDERS TAB ═══ -->
        <div data-settings-panel="reminders" class="hidden">
          <div class="admin-card">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M18 8A6 6 0 0 0 6 8c0 7-3 9-3 9h18s-3-2-3-9"/><path d="M13.73 21a2 2 0 0 1-3.46 0"/></svg>How you're reminded</h2>
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M18 8A6 6 0 0 0 6 8c0 7-3 9-3 9h18s-3-2-3-9"/><path d="M13.73 21a2 2 0 0 1-3.46 0"/></svg>How you're reminded<span style="flex:1"></span><span id="set-reminder-test-msg" style="font-size:11px;font-weight:normal;"></span><button class="admin-btn-sm" id="set-reminder-test-btn" style="font-size:11px;font-weight:normal;display:inline-flex;align-items:center;gap:4px;"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test</button></h2>
            <div class="admin-toggle-sub" style="margin-bottom:8px">Controls how fired note reminders are delivered.</div>
            <div class="settings-col">
              <div class="settings-row">
@@ -1965,7 +2002,19 @@
          </div>
          <div class="admin-card">
            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="margin-right:1px;opacity:0.6;flex-shrink:0"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>AI Synthesis<span style="flex:1"></span><label class="admin-switch" title="Use the utility model to write reminder messages"><input type="checkbox" id="set-reminder-llm-toggle"><span class="admin-slider"></span></label></h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">When on, the utility model writes a short, warm one-line reminder for browser, email, ntfy, AND webhook reminders instead of just the raw note content.</div>
+            <div class="admin-toggle-sub" style="margin-bottom:8px">When on, the utility model writes a short, warm one-line reminder for browser, email, ntfy, and webhook reminders instead of just the raw note content.</div>
+            <div class="settings-col">
+              <div class="settings-row">
+                <label class="settings-label" title="Optional — write the reminder in the voice of a saved character">Persona</label>
+                <select id="set-reminder-llm-persona" class="settings-select" style="flex:1;">
+                  <option value="">Default (warm, neutral)</option>
+                </select>
+              </div>
+              <div style="font-size:11px;opacity:0.7;margin-top:2px;">
+                <a href="#" data-open-prompt-modal style="color:var(--accent, var(--red));text-decoration:underline;font-weight:600;">Edit persona settings here →</a>
+              </div>
+              <div id="set-reminder-llm-persona-msg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 55%, transparent);"></div>
+            </div>
          </div>
          <div class="admin-card">
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg>Public App URL</h2>
@@ -1978,14 +2027,6 @@
              <div id="set-app-public-url-msg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 55%, transparent);"></div>
            </div>
          </div>
-          <div class="admin-card">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M22 11.08V12a10 10 0 1 1-5.93-9.14"/><polyline points="22 4 12 14.01 9 11.01"/></svg>Test</h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">Fire a test reminder using your current settings to verify everything works.</div>
-            <div class="settings-row">
-              <span id="set-reminder-test-msg" style="font-size:11px;"></span>
-              <button class="admin-btn-add" id="set-reminder-test-btn" style="margin-left:auto;">Send Test Reminder</button>
-            </div>
-          </div>
        </div>

        <!-- ═══ ADMIN: USERS TAB ═══ -->
@@ -2020,75 +2061,96 @@

        <!-- ═══ SERVICES TAB ═══ -->
        <div data-settings-panel="services">
-          <div class="admin-card">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="2" y="2" width="20" height="8" rx="2"/><rect x="2" y="14" width="20" height="8" rx="2"/><circle cx="6" cy="6" r="1"/><circle cx="6" cy="18" r="1"/></svg>Add Models <span style="opacity:0.45;font-weight:normal;font-size:0.82em">(Endpoints)</span></h2>
-            <div class="admin-toggle-sub" style="margin-bottom:10px">Connect local models first, or add a cloud API.</div>

-            <!-- Local subsection -->
-            <div class="adm-add-section collapsible collapsed" id="adm-add-local">
-              <div class="adm-ep-section-head adm-section-toggle" role="button" tabindex="0" aria-expanded="false">
-                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-1px;margin-right:4px;"><rect x="2" y="3" width="20" height="14" rx="2"/><path d="M8 21h8"/><path d="M12 17v4"/></svg>
-                <span>Local</span>
-                <svg class="adm-section-caret" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg>
+          <!-- ── Local card ─────────────────────────────────────────── -->
+          <div class="admin-card">
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="2" y="3" width="20" height="14" rx="2"/><path d="M8 21h8"/><path d="M12 17v4"/></svg>Add Local Models <span style="opacity:0.45;font-weight:normal;font-size:0.82em">(Endpoint)</span>
+              <span style="flex:1"></span>
+              <button class="admin-btn-sm" id="adm-epLocalTestBtn" style="font-size:11px;font-weight:normal;display:inline-flex;align-items:center;gap:4px;">
+                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test
+              </button>
+              <div style="position:relative;display:inline-block;">
+                <button class="admin-btn-sm" id="adm-epLocalMoreBtn" title="More options" aria-haspopup="true" aria-expanded="false" style="font-size:11px;font-weight:normal;padding:4px 8px;line-height:1;">
+                  <svg width="14" height="4" viewBox="0 0 14 4" fill="currentColor"><circle cx="2" cy="2" r="1.4"/><circle cx="7" cy="2" r="1.4"/><circle cx="12" cy="2" r="1.4"/></svg>
+                </button>
+                <div id="adm-epLocalMoreMenu" style="display:none;position:absolute;top:calc(100% + 4px);right:0;z-index:50;min-width:170px;padding:4px;background:var(--panel,var(--bg));border:1px solid var(--border);border-radius:8px;box-shadow:0 6px 20px rgba(0,0,0,0.22);flex-direction:column;gap:1px;">
+                  <button class="admin-btn-sm adm-more-item" id="adm-epDiscoverBtn" title="Scan your network for running model servers" style="background:none;border:0;border-radius:5px;padding:7px 9px;display:flex;align-items:center;gap:8px;width:100%;text-align:left;font-size:12px;font-weight:normal;color:var(--fg);cursor:pointer;">
+                    <svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round"><circle cx="11" cy="11" r="8"/><line x1="21" y1="21" x2="16.65" y2="16.65"/></svg>Scan network
+                  </button>
+                  <button class="admin-btn-sm adm-more-item" id="adm-epOllamaBtn" title="Fill the default Ollama endpoint" style="background:none;border:0;border-radius:5px;padding:7px 9px;display:flex;align-items:center;gap:8px;width:100%;text-align:left;font-size:12px;font-weight:normal;color:var(--fg);cursor:pointer;"><span class="adm-ollama-logo" style="display:inline-flex;width:13px;height:13px;"></span>Add Ollama</button>
+                  <button class="admin-btn-sm adm-more-item" id="adm-epLocalKeyBtn" title="Show / hide the API key field" aria-expanded="false" aria-controls="adm-epLocalApiKey-row" style="background:none;border:0;border-radius:5px;padding:7px 9px;display:flex;align-items:center;gap:8px;width:100%;text-align:left;font-size:12px;font-weight:normal;color:var(--fg);cursor:pointer;">
+                    <svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 2l-9.6 9.6"/><circle cx="7.5" cy="15.5" r="5.5"/><path d="M15.5 7.5l3 3"/></svg>API key
+                  </button>
+                </div>
              </div>
+            </h2>
+            <div class="admin-toggle-sub" style="margin:0 0 10px 2px;">Add a local model server (Ollama, llama.cpp, vLLM).</div>
+            <div class="adm-add-section">
              <div class="admin-model-form">
                <div class="admin-model-form-row">
-                  <input id="adm-epLocalUrl" type="text" placeholder="Paste endpoint URL, e.g. http://localhost:11434/v1" style="flex:1">
-                </div>
-                <!-- API key row stays in the DOM but is collapsed until the
-                     user clicks the Key button on the action row. Local
-                     endpoints rarely need a key; hiding it by default keeps
-                     the form a single visual line. -->
-                <div class="admin-model-form-row" id="adm-epLocalApiKey-row" style="display:none;">
-                  <input id="adm-epLocalApiKey" type="password" placeholder="API key (optional — for protected local endpoints)" autocomplete="off" style="flex:1">
-                </div>
-                <!-- Action row: LLM/Image type, Quickstart buttons (Scan,
-                     Ollama), Key reveal toggle, Test, Add — all inline so
-                     the Quickstart fold is gone and Type sits with the
-                     primary actions. -->
-                <div class="admin-model-form-row">
-                  <label style="display:inline-flex;align-items:center;gap:4px;font-size:11px;opacity:0.6;flex-shrink:0;">Type:<select id="adm-epLocalType" style="padding:5px;width:72px;flex-shrink:0;">
-                    <option value="llm" selected>LLM</option>
-                    <option value="image">Image</option>
-                  </select></label>
-                  <button class="admin-btn-sm" id="adm-epDiscoverBtn" title="Scan your network for running model servers" style="display:inline-flex;align-items:center;gap:4px;">
-                    <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round"><circle cx="11" cy="11" r="8"/><line x1="21" y1="21" x2="16.65" y2="16.65"/></svg>Scan
-                  </button>
-                  <button class="admin-btn-sm" id="adm-epOllamaBtn" title="Fill the default Ollama endpoint" style="display:inline-flex;align-items:center;gap:5px;"><span class="adm-ollama-logo" style="display:inline-flex;width:13px;height:13px;"></span>Ollama</button>
-                  <span style="flex:1"></span>
-                  <button class="admin-btn-sm" id="adm-epLocalKeyBtn" title="Show / hide the API key field" aria-expanded="false" aria-controls="adm-epLocalApiKey-row" style="opacity:0.75;display:inline-flex;align-items:center;gap:4px;">
-                    <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 2l-9.6 9.6"/><circle cx="7.5" cy="15.5" r="5.5"/><path d="M15.5 7.5l3 3"/></svg>API
-                  </button>
-                  <button class="admin-btn-sm" id="adm-epLocalTestBtn" style="min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;">
-                    <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test
-                  </button>
-                  <button class="admin-btn-add" id="adm-epLocalAddBtn" style="min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;">
+                  <div class="adm-fused-group" style="display:flex;flex:1 1 180px;min-width:0;">
+                    <select id="adm-epLocalType" style="padding:5px;width:66px;flex-shrink:0;border-top-right-radius:0;border-bottom-right-radius:0;border-right:0;">
+                      <option value="llm" selected>LLM</option>
+                      <option value="image">Image</option>
+                    </select>
+                    <input id="adm-epLocalUrl" type="text" placeholder="Paste endpoint URL, e.g. http://localhost:11434/v1" style="flex:1;min-width:0;border-top-left-radius:0;border-bottom-left-radius:0;">
+                  </div>
+                  <button class="admin-btn-add" id="adm-epLocalAddBtn" style="min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;flex-shrink:0;">
                    <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round"><polyline points="20 6 9 17 4 12"/></svg>Add
                  </button>
                </div>
+                <div class="admin-model-form-row" id="adm-epLocalApiKey-row" style="display:none;">
+                  <input id="adm-epLocalApiKey" type="password" placeholder="API key (optional — for protected local endpoints)" autocomplete="off" style="flex:1">
+                </div>
                <div id="adm-epLocalMsg" class="adm-ep-inline-msg"></div>
              </div>
            </div>
+          </div>

-            <!-- API subsection -->
-            <div class="adm-add-section collapsible collapsed" id="adm-add-api" style="margin-top:14px">
-              <div class="adm-ep-section-head adm-section-toggle" role="button" tabindex="0" aria-expanded="false">
-                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-1px;margin-right:4px;"><circle cx="12" cy="12" r="10"/><line x1="2" y1="12" x2="22" y2="12"/><path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/></svg>
-                <span>API</span>
-                <svg class="adm-section-caret" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg>
-              </div>
-              <div class="admin-model-form">
-                <!-- Custom picker (with logos). Hidden native <select> mirrors
-                     its value so the existing JS that reads adm-epProvider
-                     keeps working unchanged. -->
-                <div class="adm-provider-picker adm-provider-combo" id="adm-provider-picker">
-                  <input id="adm-epUrl" type="text" placeholder="Base URL or pick provider" autocomplete="off">
-                  <button type="button" class="adm-provider-btn" id="adm-provider-btn" title="Pick provider">
-                    <span class="adm-provider-current"><span class="adm-provider-logo"></span><span class="adm-provider-name">Provider</span></span>
-                    <svg class="adm-provider-caret" width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg>
+          <!-- ── API card ───────────────────────────────────────────── -->
+          <div class="admin-card">
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><circle cx="12" cy="12" r="10"/><line x1="2" y1="12" x2="22" y2="12"/><path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/></svg>Add API Models <span style="opacity:0.45;font-weight:normal;font-size:0.82em">(Endpoint)</span>
+              <span style="flex:1"></span>
+              <button class="admin-btn-sm" id="adm-epApiTestBtn" style="font-size:11px;font-weight:normal;display:inline-flex;align-items:center;gap:4px;">
+                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test
+              </button>
+              <button class="admin-btn-sm hidden" id="adm-epApiCancelTestBtn" style="font-size:11px;font-weight:normal;">Cancel</button>
+              <div style="position:relative;display:inline-block;">
+                <button class="admin-btn-sm" id="adm-epApiMoreBtn" title="More options" aria-haspopup="true" aria-expanded="false" style="font-size:11px;font-weight:normal;padding:4px 8px;line-height:1;">
+                  <svg width="14" height="4" viewBox="0 0 14 4" fill="currentColor"><circle cx="2" cy="2" r="1.4"/><circle cx="7" cy="2" r="1.4"/><circle cx="12" cy="2" r="1.4"/></svg>
+                </button>
+                <div id="adm-epApiMoreMenu" style="display:none;position:absolute;top:calc(100% + 4px);right:0;z-index:50;min-width:200px;padding:4px;background:var(--panel,var(--bg));border:1px solid var(--border);border-radius:8px;box-shadow:0 6px 20px rgba(0,0,0,0.22);flex-direction:column;gap:1px;">
+                  <div style="font-size:10px;text-transform:uppercase;letter-spacing:0.5px;opacity:0.55;padding:6px 9px 2px;">Connection mode</div>
+                  <button class="admin-btn-sm adm-more-item adm-kind-opt" data-kind="proxy" style="background:none;border:0;border-radius:5px;padding:7px 9px;display:flex;align-items:center;gap:8px;width:100%;text-align:left;font-size:12px;font-weight:normal;color:var(--fg);cursor:pointer;">
+                    <svg class="adm-kind-check" width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round"><polyline points="20 6 9 17 4 12"/></svg>
+                    <span>Proxy</span>
+                    <span style="margin-left:auto;opacity:0.5;font-size:10px;">routed via server</span>
+                  </button>
+                  <button class="admin-btn-sm adm-more-item adm-kind-opt" data-kind="api" style="background:none;border:0;border-radius:5px;padding:7px 9px;display:flex;align-items:center;gap:8px;width:100%;text-align:left;font-size:12px;font-weight:normal;color:var(--fg);cursor:pointer;">
+                    <svg class="adm-kind-check" width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round" style="visibility:hidden;"><polyline points="20 6 9 17 4 12"/></svg>
+                    <span>API (direct)</span>
+                    <span style="margin-left:auto;opacity:0.5;font-size:10px;">browser→provider</span>
                  </button>
-                  <div class="adm-provider-menu hidden" id="adm-provider-menu"></div>
                </div>
+              </div>
+            </h2>
+            <div class="admin-toggle-sub" style="margin:0 0 10px 2px;">Connect a cloud provider (OpenAI, Anthropic, DeepSeek, OpenRouter, etc.).</div>
+            <div class="adm-add-section">
+              <div class="admin-model-form">
+                <div class="admin-model-form-row">
+                  <div class="adm-provider-picker adm-provider-combo" id="adm-provider-picker" style="flex:1 1 220px;min-width:0;margin-bottom:0;">
+                    <button type="button" class="adm-provider-btn" id="adm-provider-btn" title="Pick provider" style="border-top-right-radius:0;border-bottom-right-radius:0;border-top-left-radius:6px;border-bottom-left-radius:6px;border-left:1px solid var(--border);border-right:1px solid var(--border);">
+                      <span class="adm-provider-current"><span class="adm-provider-logo"></span><span class="adm-provider-name">Provider</span></span>
+                      <svg class="adm-provider-caret" width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polyline points="6 9 12 15 18 9"/></svg>
+                    </button>
+                    <input id="adm-epUrl" type="text" placeholder="Base URL or pick provider" autocomplete="off" style="border-left:0;border-top-left-radius:0;border-bottom-left-radius:0;border-top-right-radius:6px;border-bottom-right-radius:6px;">
+                    <div class="adm-provider-menu hidden" id="adm-provider-menu"></div>
+                  </div>
+                </div>
+                <select id="adm-epKind" style="display:none">
+                  <option value="proxy">proxy</option>
+                  <option value="api" selected>api</option>
+                </select>
                <select id="adm-epProvider" style="display:none">
                  <option value="">Custom URL</option>
                  <option value="https://api.anthropic.com" data-logo="anthropic">Anthropic</option>
@@ -2110,41 +2172,27 @@
                  <option value="https://api.z.ai/api/coding/paas/v4" data-logo="zhipu">Z.AI Coding Plan</option>
                  <option value="https://integrate.api.nvidia.com/v1" data-logo="nvidia">NVIDIA</option>
                </select>
-                <!-- API key row stays in DOM, hidden until Key button is
-                     clicked. Mirrors the Local section pattern: most users
-                     paste a key via the provider preset flow rather than
-                     typing it free-form, so the row only appears on demand. -->
-                <div class="admin-model-form-row" id="adm-epApiKey-row" style="display:none;">
-                  <input id="adm-epApiKey" type="password" placeholder="API key" autocomplete="off" style="flex:1">
-                </div>
-                <div class="admin-model-form-row" style="margin-top:-4px;">
-                  <select id="adm-epKind" style="padding:5px;width:82px;">
-                    <option value="proxy">Proxy</option>
-                    <option value="api">API</option>
-                  </select>
-                  <label style="display:inline-flex;align-items:center;gap:4px;font-size:11px;opacity:0.6;flex-shrink:0;">Type:<select id="adm-epType" style="padding:5px;width:80px;flex-shrink:0;">
-                    <option value="llm" selected>LLM</option>
-                    <option value="image">Image</option>
-                  </select></label>
-                  <span style="flex:1"></span>
-                  <button class="admin-btn-sm" id="adm-epApiKeyBtn" title="Show / hide the API key field" aria-expanded="false" aria-controls="adm-epApiKey-row" style="opacity:0.75;display:inline-flex;align-items:center;gap:4px;">
-                    <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M21 2l-9.6 9.6"/><circle cx="7.5" cy="15.5" r="5.5"/><path d="M15.5 7.5l3 3"/></svg>API
-                  </button>
-                  <button class="admin-btn-sm" id="adm-epApiTestBtn" style="min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;">
-                    <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><polygon points="5 3 19 12 5 21 5 3"/></svg>Test
-                  </button>
-                  <button class="admin-btn-sm hidden" id="adm-epApiCancelTestBtn" style="width:62px;text-align:center;">Cancel</button>
-                  <button class="admin-btn-add" id="adm-epAddBtn" style="min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;">
+                <div class="admin-model-form-row" id="adm-epApiKey-row">
+                  <div style="position:relative;flex:1;display:flex;align-items:center;">
+                    <svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="position:absolute;left:9px;top:50%;transform:translateY(-50%);opacity:0.55;pointer-events:none;"><path d="M21 2l-9.6 9.6"/><circle cx="7.5" cy="15.5" r="5.5"/><path d="M15.5 7.5l3 3"/></svg>
+                    <input id="adm-epApiKey" type="password" placeholder="API key, e.g. sk-proj-AbCdEf…" autocomplete="off" style="flex:1;padding-left:28px;height:32px;box-sizing:border-box;">
+                  </div>
+                  <button class="admin-btn-add" id="adm-epAddBtn" style="height:32px;min-width:55px;text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:4px;flex-shrink:0;box-sizing:border-box;">
                    <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round"><polyline points="20 6 9 17 4 12"/></svg>Add
                  </button>
                </div>
                <div id="adm-epApiMsg" class="adm-ep-inline-msg"></div>
-                <div id="adm-deviceAuthStatus" class="adm-ep-inline-msg"></div>
+                <div id="adm-deviceAuthStatus" class="adm-ep-inline-msg" style="min-height:0;margin-top:0;"></div>
              </div>
            </div>
          </div>
+
+        </div>
+
+        <!-- ═══ ADDED MODELS TAB ═══ -->
+        <div data-settings-panel="added-models" class="hidden">
          <div class="admin-card">
-            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><rect x="2" y="3" width="20" height="14" rx="2"/><line x1="8" y1="21" x2="16" y2="21"/><line x1="12" y1="17" x2="12" y2="21"/></svg>Added Models <span style="opacity:0.45;font-weight:normal;font-size:0.82em">(Endpoints)</span>
+            <h2 style="display:flex;align-items:center;gap:8px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><polyline points="20 6 9 17 4 12"/></svg>Added Models <span style="opacity:0.45;font-weight:normal;font-size:0.82em">(Endpoints)</span>
              <span style="flex:1"></span>
              <button class="admin-btn-sm" id="adm-epProbeAllBtn" title="Re-test every endpoint and refresh online status" style="font-size:11px;font-weight:normal;display:inline-flex;align-items:center;gap:4px;">
                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.4" stroke-linecap="round" stroke-linejoin="round"><polyline points="23 4 23 10 17 10"/><polyline points="1 20 1 14 7 14"/><path d="M3.51 9a9 9 0 0 1 14.85-3.36L23 10M1 14l4.64 4.36A9 9 0 0 0 20.49 15"/></svg>Probe
@@ -2153,20 +2201,18 @@
                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.4" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/></svg>Clear offline <span id="adm-epOfflineCount" style="opacity:0.6;margin-left:2px;"></span>
              </button>
            </h2>
-            <div class="admin-toggle-sub" style="margin-bottom:10px">Manage the endpoints you've added.</div>
+            <div class="admin-toggle-sub" style="margin-bottom:12px">Endpoints you've connected. Probe re-tests them all; Clear offline removes the dead ones.</div>
            <div class="adm-ep-section">
-              <div class="adm-ep-section-head">
-                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-1px;margin-right:4px;"><rect x="2" y="3" width="20" height="14" rx="2"/><path d="M8 21h8"/><path d="M12 17v4"/></svg>
-                <span>Local</span>
+              <div class="adm-ep-section-head" style="font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:0.5px;opacity:0.7;margin-bottom:6px;display:inline-flex;align-items:center;gap:5px;">
+                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="2" y="3" width="20" height="14" rx="2"/><path d="M8 21h8"/><path d="M12 17v4"/></svg>Local
              </div>
              <div id="adm-epList-local"><div class="admin-empty">Loading...</div></div>
            </div>
-            <div class="adm-ep-section" style="margin-top:14px">
-              <div class="adm-ep-section-head">
-                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-1px;margin-right:4px;"><circle cx="12" cy="12" r="10"/><line x1="2" y1="12" x2="22" y2="12"/><path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/></svg>
-                <span>API</span>
+            <div class="adm-ep-section" style="margin-top:18px;">
+              <div class="adm-ep-section-head" style="font-size:11px;font-weight:600;text-transform:uppercase;letter-spacing:0.5px;opacity:0.7;margin-bottom:6px;display:inline-flex;align-items:center;gap:5px;">
+                <svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="12" cy="12" r="10"/><line x1="2" y1="12" x2="22" y2="12"/><path d="M12 2a15.3 15.3 0 0 1 4 10 15.3 15.3 0 0 1-4 10 15.3 15.3 0 0 1-4-10 15.3 15.3 0 0 1 4-10z"/></svg>API
              </div>
-              <div id="adm-epList-api"></div>
+              <div id="adm-epList-api"><div class="admin-empty">No API endpoints yet.</div></div>
            </div>
          </div>
        </div>
@@ -2179,24 +2225,8 @@
            <div class="admin-toggle-sub" style="margin-bottom:8px">All external service connections in one place.</div>
            <div id="unified-integrations-list"></div>
            <div id="unified-intg-form" style="display:none"></div>
-            <div style="text-align:center;padding:8px 0;">
-              <button type="button" class="admin-btn-sm" id="unified-intg-add-btn" style="display:inline-flex;align-items:center;gap:6px;">+ Add Integration<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="opacity:0.7;"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg></button>
-            </div>
-          </div>
-          <div class="admin-card admin-only" style="margin-top:12px;">
-            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M21 2l-2 2m-7.61 7.61a5.5 5.5 0 1 1-7.778 7.778 5.5 5.5 0 0 1 7.777-7.777zm0 0L15.5 7.5m0 0l3 3L22 7l-3-3m-3.5 3.5L19 4"/></svg>API Tokens</h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">Bearer tokens for external integrations (scripts, Codex, headless agent runs). Token value shown ONCE on create — copy it then.</div>
-            <div id="adm-tokenList" style="margin-bottom:8px;"></div>
-            <div style="display:flex;gap:6px;flex-wrap:wrap;align-items:flex-start;">
-              <input type="text" id="adm-tokenName" placeholder="Token name (e.g. agent-test)" class="settings-select" style="flex:1;min-width:160px;">
-              <input type="text" id="adm-tokenScopes" placeholder="scopes (comma-separated, blank = chat)" class="settings-select" style="flex:2;min-width:220px;" title="Allowed: chat, cookbook:read, cookbook:launch, documents:read|write, todos:read|write, email:read|draft|send, calendar:read|write, memory:read|write">
-              <button class="admin-btn-add" id="adm-tokenAddBtn">Create token</button>
-            </div>
-            <div id="adm-tokenMsg" style="font-size:11px;margin-top:6px;"></div>
-            <div id="adm-tokenReveal" style="display:none;margin-top:8px;padding:8px 10px;background:color-mix(in srgb, var(--accent, var(--red)) 12%, transparent);border:1px solid color-mix(in srgb, var(--accent, var(--red)) 35%, transparent);border-radius:6px;">
-              <div style="font-size:11px;font-weight:600;margin-bottom:4px;">Copy now — this is the only time you'll see it:</div>
-              <code id="adm-tokenValue" style="font-family:'Berkeley Mono','SF Mono','Fira Code',monospace;font-size:11px;word-break:break-all;display:block;background:var(--bg);padding:6px 8px;border-radius:4px;margin-bottom:6px;user-select:all;"></code>
-              <button class="admin-btn-sm" id="adm-tokenCopyBtn">Copy</button>
+            <div style="text-align:right;padding:8px 0;">
+              <button type="button" class="admin-btn-add" id="unified-intg-add-btn" style="text-align:center;display:inline-flex;align-items:center;justify-content:center;gap:5px;flex-shrink:0;"><svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round"><path d="M10 13a5 5 0 0 0 7.54.54l3-3a5 5 0 0 0-7.07-7.07l-1.72 1.71"/><path d="M14 11a5 5 0 0 0-7.54-.54l-3 3a5 5 0 0 0 7.07 7.07l1.71-1.71"/></svg>Add Integration</button>
            </div>
          </div>
        </div>
@@ -2218,10 +2248,6 @@
              <div id="set-agentMsg" style="font-size:11px;color:color-mix(in srgb, var(--fg) 45%, transparent);"></div>
            </div>
          </div>
-          <div class="admin-card" style="margin-bottom:12px;">
-            <h2 style="display:flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="margin-right:1px;opacity:0.6;flex-shrink:0"><path d="M9 11l3 3L22 4"/><path d="M21 12v7a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V5a2 2 0 0 1 2-2h11"/></svg>Agent loop<span style="flex:1"></span><label class="admin-switch" title="On a failing effectful turn, climb verify → different-method → teacher → stop-and-summarize instead of silently quitting." style="flex-shrink:0"><input type="checkbox" id="set-agentSupervisorLadder"><span class="admin-slider"></span></label></h2>
-            <div class="admin-toggle-sub" style="margin-bottom:8px">Supervisor ladder. When on, every effectful agent turn that claims done is verified; on FAIL the ladder escalates verify → different method → teacher → stop-with-blocker, each rung visible in chat. Teacher rung requires <code>teacher_model</code> to be set.</div>
-          </div>
          <div class="admin-card" style="margin-bottom:12px;">
            <h2><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:5px;opacity:0.6"><path d="M14.7 6.3a1 1 0 0 0 0 1.4l1.6 1.6a1 1 0 0 0 1.4 0l3.77-3.77a6 6 0 0 1-7.94 7.94l-6.91 6.91a2.12 2.12 0 0 1-3-3l6.91-6.91a6 6 0 0 1 7.94-7.94l-3.76 3.76z"/></svg>Built-in Tools</h2>
            <div class="admin-toggle-sub" style="margin-bottom:8px">Enable or disable tools available to the AI agent.</div>
@@ -2303,68 +2329,76 @@

            <div style="display:flex;justify-content:space-between;align-items:center;">
              <div>
-                <div class="admin-toggle-label">Wipe all chats</div>
+                <div class="admin-toggle-label">Delete all chats</div>
                <div class="admin-toggle-sub">Every session, message, and chat history. Documents/notes/etc. stay.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="chats" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="chats" title="Delete all chats" aria-label="Delete all chats" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all memory</div>
+                <div class="admin-toggle-label">Delete all memory</div>
                <div class="admin-toggle-sub">Clears `memory.json`, the Memory table, and the vector store. Skills not affected.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="memory" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="memory" title="Delete all memory" aria-label="Delete all memory" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all skills</div>
+                <div class="admin-toggle-label">Delete all skills</div>
                <div class="admin-toggle-sub">Drops `data/skills/` (all SKILL.md files). Memory not affected.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="skills" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="skills" title="Delete all skills" aria-label="Delete all skills" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all notes</div>
+                <div class="admin-toggle-label">Delete all notes</div>
                <div class="admin-toggle-sub">Every note, todo, and checklist.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="notes" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="notes" title="Delete all notes" aria-label="Delete all notes" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all tasks</div>
+                <div class="admin-toggle-label">Delete all tasks</div>
                <div class="admin-toggle-sub">Every scheduled task and its run history (Tasks tool).</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="tasks" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="tasks" title="Delete all tasks" aria-label="Delete all tasks" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all documents</div>
+                <div class="admin-toggle-label">Delete all documents</div>
                <div class="admin-toggle-sub">Every document and version. Drafts, exports, library — all gone.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="documents" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="documents" title="Delete all documents" aria-label="Delete all documents" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all gallery</div>
+                <div class="admin-toggle-label">Delete all gallery</div>
                <div class="admin-toggle-sub">Every image record and the upload directory on disk.</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="gallery" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="gallery" title="Delete all gallery" aria-label="Delete all gallery" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

            <div style="display:flex;justify-content:space-between;align-items:center;margin-top:8px;">
              <div>
-                <div class="admin-toggle-label">Wipe all calendar</div>
+                <div class="admin-toggle-label">Delete all calendar</div>
                <div class="admin-toggle-sub">Every event and every calendar (incl. CalDAV-synced ones; resync to restore).</div>
              </div>
-              <button class="admin-btn-delete" data-wipe-kind="calendar" style="white-space:nowrap;">Wipe</button>
+              <button class="admin-btn-delete" data-wipe-kind="calendar" title="Delete all calendar" aria-label="Delete all calendar" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete</button>
            </div>

+            <hr style="border:0;border-top:1px solid color-mix(in srgb, #e55 25%, var(--border));margin:14px 0 10px;">
+            <div style="display:flex;justify-content:space-between;align-items:center;">
+              <div>
+                <div class="admin-toggle-label" style="color:#e55;">Delete everything</div>
+                <div class="admin-toggle-sub">All eight categories above, in one go. Same effect as deleting each one in sequence.</div>
+              </div>
+              <button class="admin-btn-delete" data-wipe-kind="__all__" title="Delete every category" aria-label="Delete everything" style="display:inline-flex;align-items:center;gap:5px;white-space:nowrap;font-weight:600;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/><path d="M9 6V4a1 1 0 0 1 1-1h4a1 1 0 0 1 1 1v2"/></svg>Delete All</button>
+            </div>
            <div id="adm-wipeMsg" style="margin-top:8px;"></div>
          </div>
        </div>
@@ -3,7 +3,7 @@

 import uiModule from './ui.js';
 import settingsModule from './settings.js';
-import { providerLogo } from './providers.js';
+import { providerLogo, providerLogoFromUrl } from './providers.js';
 import { sortModelObjects } from './modelSort.js';
 import { PROVIDER_DEVICE_FLOWS, formatDeviceFlowError, runProviderDeviceFlow } from './providerDeviceFlow.js';

@@ -486,13 +486,14 @@ async function loadEndpoints() {
      return `
        <div class="admin-user-row${ep.is_enabled ? '' : ' admin-ep-disabled'}${justAddedClass}" data-adm-ep-id="${ep.id}">
          <div style="display:flex;align-items:center;justify-content:space-between;${hasModels ? 'cursor:pointer;' : ''}padding:4px 0;" data-adm-ep-header="${ep.id}">
-            <div class="admin-user-info" style="flex:1;flex-wrap:wrap;gap:0.3rem;">
+            <div class="admin-user-info" style="flex:1;flex-wrap:wrap;gap:0.3rem;align-items:center;">
+              <span class="adm-ep-row-logo" style="display:inline-flex;align-items:center;justify-content:center;width:16px;height:16px;flex-shrink:0;opacity:0.9;">${providerLogoFromUrl(ep.base_url) || ''}</span>
              <span class="admin-user-name">${esc(ep.name)}</span>
              ${ep.model_type === 'image' ? '<span class="admin-badge" style="background:color-mix(in srgb, var(--accent) 20%, transparent);color:var(--accent);">Image</span>' : ''}
              ${kindLabel ? `<span class="admin-badge">${esc(kindLabel)}</span>` : ''}
              ${statusBadge}
              ${ep.is_enabled ? '' : '<span class="admin-badge admin-badge-off">disabled</span>'}
-              ${hasModels ? '<span style="font-size:10px;opacity:0.4;">Click to manage models</span>' : ''}
+              ${hasModels ? `<span style="font-size:10px;opacity:0.4;${category === 'api' ? 'flex-basis:100%;' : ''}">Click to manage models</span>` : ''}
            </div>
            <div style="display:flex;gap:4px;align-items:center;">
              <button class="admin-btn-sm" data-adm-toggle-ep="${ep.id}">${ep.is_enabled ? 'Disable' : 'Enable'}</button>
@@ -865,6 +866,14 @@ function initEndpointForm() {
    document.addEventListener('click', (e) => {
      if (!picker.contains(e.target)) pickerMenu.classList.add('hidden');
    });
+    // Capture-phase Esc: dismiss the picker menu without bubbling to the
+    // settings-modal handler that would otherwise close the whole modal.
+    document.addEventListener('keydown', (e) => {
+      if (e.key !== 'Escape') return;
+      if (pickerMenu.classList.contains('hidden')) return;
+      e.stopPropagation();
+      pickerMenu.classList.add('hidden');
+    }, { capture: true });
  }

  provider.addEventListener('change', () => {
@@ -1059,14 +1068,15 @@ function initEndpointForm() {
        if (d.id) _recentlyAddedEpId = String(d.id);
        await loadEndpoints();
        await _selectAddedModelInChat(d);
+        const goLink = ' <a href="#" data-go-added-models style="margin-left:6px;text-decoration:underline;color:inherit;font-weight:600;">Added Models →</a>';
        if (!d.online) {
-          msg.textContent = 'Added (endpoint offline — will retry on next load)';
+          msg.innerHTML = 'Added (endpoint offline — will retry on next load)' + goLink;
          msg.className = 'admin-error';
        } else if (d.status === 'empty') {
-          msg.textContent = 'Added — endpoint reachable, no models found';
+          msg.innerHTML = 'Added — endpoint reachable, no models found' + goLink;
          msg.className = 'admin-success';
        } else {
-          msg.textContent = `Added — found ${count} model${count !== 1 ? 's' : ''}`;
+          msg.innerHTML = `Added — found ${count} model${count !== 1 ? 's' : ''}` + goLink;
          msg.className = 'admin-success';
        }
      } else { msg.textContent = d.detail || 'Failed'; msg.className = 'admin-error'; }
@@ -1205,7 +1215,125 @@ function initEndpointForm() {
    });
  };
  _wireKeyToggle('adm-epLocalKeyBtn', 'adm-epLocalApiKey-row');
-  _wireKeyToggle('adm-epApiKeyBtn', 'adm-epApiKey-row');
+
+  // Delegated link handler for jumping between settings tabs.
+  //   [data-go-added-models]              → quick shortcut for the Added Models tab
+  //   [data-go-settings-tab="X"]          → any tab whose nav button has data-settings-tab="X"
+  //   [data-go-scroll-to="#elementId"]    → after switching, scroll the element into view
+  document.addEventListener('click', (e) => {
+    const explicit = e.target.closest('[data-go-settings-tab]');
+    if (explicit) {
+      e.preventDefault();
+      const tab = explicit.getAttribute('data-go-settings-tab');
+      const scrollTo = explicit.getAttribute('data-go-scroll-to');
+      const btn = document.querySelector(`[data-settings-tab="${tab}"]`);
+      if (btn) btn.click();
+      if (scrollTo) {
+        // Defer to the next frame so the panel has actually become visible
+        // before we try to scroll into it.
+        requestAnimationFrame(() => {
+          const target = document.querySelector(scrollTo);
+          if (target) target.scrollIntoView({ behavior: 'smooth', block: 'start' });
+        });
+      }
+      return;
+    }
+    const link = e.target.closest('[data-go-added-models]');
+    if (!link) return;
+    e.preventDefault();
+    const btn = document.querySelector('[data-settings-tab="added-models"]');
+    if (btn) btn.click();
+  });
+
+  // Kebab dropdowns in this card share one shape: an h2-anchored button
+  // with id "<prefix>MoreBtn" toggles a sibling menu with id "<prefix>MoreMenu".
+  // The generic open/close helper (_wireKebab) is defined further down.
+  // Global Esc handler: close any currently-open kebab menu in the admin
+  // panel regardless of which _wireKebab instance owns it. Belt-and-braces
+  // backup for the per-instance handler below; registered once.
+  if (!document._admKebabEscWired) {
+    document._admKebabEscWired = true;
+    document.addEventListener('keydown', (e) => {
+      if (e.key !== 'Escape') return;
+      // Any visible kebab dropdown in the admin panel. Menus are matched by
+      // explicit id, so a new kebab must be added to the selector below.
+      const menus = document.querySelectorAll(
+        '#adm-epLocalMoreMenu, #adm-epApiMoreMenu'
+      );
+      let closed = false;
+      menus.forEach((m) => {
+        if (m && m.style.display !== 'none') {
+          m.style.display = 'none';
+          // Sync the associated button's aria-expanded when we can find it.
+          const btn = document.getElementById(m.id.replace('Menu', 'Btn'));
+          if (btn) btn.setAttribute('aria-expanded', 'false');
+          closed = true;
+        }
+      });
+      if (closed) e.stopPropagation();
+    }, { capture: true });
+  }
+
+  const _wireKebab = (btnId, menuId, onItem) => {
+    const btn = el(btnId);
+    const menu = el(menuId);
+    if (!btn || !menu) return;
+    const isOpen = () => menu.style.display !== 'none';
+    const close = () => { menu.style.display = 'none'; btn.setAttribute('aria-expanded', 'false'); };
+    const open = () => { menu.style.display = 'flex'; btn.setAttribute('aria-expanded', 'true'); };
+    btn.addEventListener('click', (e) => {
+      e.stopPropagation();
+      if (isOpen()) close(); else open();
+    });
+    menu.addEventListener('click', (e) => {
+      const item = e.target.closest('.adm-more-item');
+      if (!item) return;
+      if (onItem) onItem(item, e);
+      close();
+    });
+    document.addEventListener('click', (e) => {
+      if (!isOpen()) return;
+      if (e.target.closest('#' + menuId + ', #' + btnId)) return;
+      close();
+    });
+    // Use capture phase so this fires before the settings-modal Esc handler
+    // (which is in bubble phase). stopPropagation prevents the modal from
+    // closing when the user only meant to dismiss this menu.
+    document.addEventListener('keydown', (e) => {
+      if (e.key === 'Escape' && isOpen()) {
+        e.stopPropagation();
+        close();
+      }
+    }, { capture: true });
+  };
+
+  // API card "..." menu: contains the Proxy/API connection-mode toggle.
+  // Sync the visible checkmarks with the hidden #adm-epKind select so
+  // downstream code (which reads kindSel.value) keeps working.
+  (function wireApiKindMenu() {
+    const kind = el('adm-epKind');
+    if (!kind) return;
+    const opts = document.querySelectorAll('#adm-epApiMoreMenu .adm-kind-opt');
+    const sync = () => {
+      opts.forEach((o) => {
+        const check = o.querySelector('.adm-kind-check');
+        if (check) check.style.visibility = (o.dataset.kind === kind.value) ? 'visible' : 'hidden';
+      });
+    };
+    sync();
+    kind.addEventListener('change', sync);
+    _wireKebab('adm-epApiMoreBtn', 'adm-epApiMoreMenu', (item) => {
+      const k = item.dataset.kind;
+      if (!k) return;
+      kind.value = k;
+      kind.dispatchEvent(new Event('change'));
+    });
+  })();
+
+  // Local card "..." kebab: holds Scan network / Ollama / API key reveal.
+  // Item buttons keep their own click handlers; the helper just handles
+  // open/close + outside-click + Esc.
+  _wireKebab('adm-epLocalMoreBtn', 'adm-epLocalMoreMenu');

  // ── Added Models toolbar: Probe + Clear offline ────────────────────
  // Both buttons act over the currently-rendered endpoint list. The
@@ -1217,10 +1345,10 @@ function initEndpointForm() {
    if (!lbl) return;
    const n = document.querySelectorAll('[data-adm-ep-id] [data-adm-ep-online="0"]').length;
    lbl.textContent = n > 0 ? `(${n})` : '';
-    // Keep the button enabled even when there are no offline rows — a
-    // click on the empty case fires a toast instead of feeling dead.
+    // Hide the button entirely when there's nothing offline — no point
+    // showing an action that has nothing to act on.
    const btn = el('adm-epClearOfflineBtn');
-    if (btn) btn.style.opacity = n === 0 ? '0.55' : '0.85';
+    if (btn) btn.style.display = n === 0 ? 'none' : '';
  };
  // Wire after every loadEndpoints() run by patching the render hook —
  // simplest path: MutationObserver on the two list containers.
@@ -1237,7 +1365,17 @@ function initEndpointForm() {
    probeAllBtn.addEventListener('click', async () => {
      probeAllBtn.disabled = true;
      const origHTML = probeAllBtn.innerHTML;
-      probeAllBtn.innerHTML = '<span style="opacity:0.7;">Probing…</span>';
+      let _wp = null;
+      try {
+        const sp = window.spinnerModule || (await import('./spinner.js')).default;
+        _wp = sp.createWhirlpool(11);
+        _wp.element.style.cssText = 'display:inline-flex;width:11px;height:11px;margin:0 4px 0 0;';
+        probeAllBtn.innerHTML = '';
+        probeAllBtn.appendChild(_wp.element);
+        probeAllBtn.appendChild(document.createTextNode('Probing'));
+      } catch (_) {
+        probeAllBtn.innerHTML = '<span style="opacity:0.7;">Probing…</span>';
+      }
      try {
        // Hit the bulk local probe (same one the model picker uses).
        await fetch('/api/model-endpoints/probe-local', { credentials: 'same-origin' }).catch(() => {});
@@ -1259,6 +1397,7 @@ function initEndpointForm() {
        await loadEndpoints();
        if (uiModule && uiModule.showToast) uiModule.showToast('Endpoint status refreshed', 1800);
      } finally {
+        if (_wp) { try { _wp.destroy(); } catch (_) {} }
        probeAllBtn.innerHTML = origHTML;
        probeAllBtn.disabled = false;
      }
@@ -1329,15 +1468,16 @@ function initEndpointForm() {
  const localTestBtn = el('adm-epLocalTestBtn');
  if (localTestBtn) {
    localTestBtn.addEventListener('click', async () => {
+      const testOriginalHtml = localTestBtn.innerHTML || 'Test';
      const msg = _endpointMsg('local');
-      msg.textContent = ''; msg.className = '';
+      msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
      const raw = (el('adm-epLocalUrl').value || '').trim();
      if (!raw) { msg.textContent = 'Enter a base URL to test'; msg.className = 'admin-error'; return; }
      const url = _normalizeBaseUrl(raw);
      const keyEl = el('adm-epLocalApiKey');
      const apiKey = keyEl ? keyEl.value.trim() : '';
      localTestBtn.disabled = true;
-      localTestBtn.textContent = 'Testing...';
+      localTestBtn.innerHTML = testOriginalHtml.replace(/(^|>)Test\s*$/, '$1Testing...');
      try {
        const fd = new FormData();
        fd.append('base_url', url);
@@ -1350,19 +1490,21 @@ function initEndpointForm() {
        msg.className = 'admin-error';
      }
      localTestBtn.disabled = false;
-      localTestBtn.textContent = 'Test';
+      localTestBtn.innerHTML = testOriginalHtml;
    });
  }
  if (localAddBtn) {
    localAddBtn.addEventListener('click', async () => {
+      const addOriginalHtml = localAddBtn.innerHTML || 'Add';
      const msg = _endpointMsg('local');
-      msg.textContent = ''; msg.className = '';
+      msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
      const raw = (el('adm-epLocalUrl').value || '').trim();
      if (!raw) { msg.textContent = 'Enter a base URL (e.g. http://localhost:8002/v1)'; msg.className = 'admin-error'; return; }
      const url = _normalizeBaseUrl(raw);
      const keyEl = el('adm-epLocalApiKey');
      const apiKey = keyEl ? keyEl.value.trim() : '';
-      localAddBtn.disabled = true; localAddBtn.textContent = 'Adding...';
+      localAddBtn.disabled = true;
+      localAddBtn.innerHTML = addOriginalHtml.replace(/(^|>)Add\s*$/, '$1Adding...');
      try {
        const fd = new FormData();
        fd.append('base_url', url);
@@ -1382,15 +1524,17 @@ function initEndpointForm() {
          await loadEndpoints();
          await _selectAddedModelInChat(d);
          const count = (d.models || []).length;
-          msg.textContent = d.status === 'empty'
+          const baseText = d.status === 'empty'
            ? 'Added — Ollama is running, no models pulled yet'
            : d.online
            ? `Added — found ${count} model${count !== 1 ? 's' : ''}`
            : 'Added (offline — will retry on next load)';
+          msg.innerHTML = `${baseText} <a href="#" data-go-added-models style="margin-left:6px;text-decoration:underline;color:inherit;font-weight:600;">Added Models →</a>`;
          msg.className = d.online ? 'admin-success' : 'admin-error';
        } else { msg.textContent = d.detail || 'Failed'; msg.className = 'admin-error'; }
      } catch (e) { msg.textContent = 'Request failed'; msg.className = 'admin-error'; }
-      localAddBtn.disabled = false; localAddBtn.textContent = 'Add';
+      localAddBtn.disabled = false;
+      localAddBtn.innerHTML = addOriginalHtml;
    });
  }

@@ -1416,10 +1560,7 @@ function initEndpointForm() {
    discoverBtn.addEventListener('click', async () => {
      const msg = _endpointMsg('local');
      discoverBtn.disabled = true;
-      // Keep the button's icon as-is while scanning; the whirlpool +
-      // status text below is enough feedback. (Two spinning indicators
-      // at once looks busy.)
-      msg.className = '';
+      msg.className = 'adm-ep-inline-msg';
      msg.innerHTML = '';
      try {
        const sp = window.spinnerModule || (await import('./spinner.js')).default;
@@ -1430,7 +1571,7 @@ function initEndpointForm() {
        wrap.appendChild(wp.element);
        const txt = document.createElement('span');
        txt.textContent = 'Scanning ports 8000-8020 and 11434 for model servers...';
-        txt.style.cssText = 'font-size:12px;opacity:0.7;';
+        txt.style.cssText = 'opacity:0.7;';
        wrap.appendChild(txt);
        msg.appendChild(wrap);
        discoverBtn._wp = wp;
@@ -1481,30 +1622,6 @@ function initEndpointForm() {
    });
  }

-  // Collapsible Add-Models subsections (API / Local). Both start collapsed
-  // so the card is compact; the last-used state is remembered per section
-  // in localStorage so a frequent API-adder doesn't re-expand every time.
-  document.querySelectorAll('#adm-add-api, #adm-add-local').forEach((sec) => {
-    const head = sec.querySelector('.adm-section-toggle');
-    if (!head) return;
-    const key = 'odysseus.addModels.' + sec.id + '.open';
-    let open = false;
-    try { open = localStorage.getItem(key) === '1'; } catch {}
-    const apply = () => {
-      sec.classList.toggle('collapsed', !open);
-      head.setAttribute('aria-expanded', open ? 'true' : 'false');
-    };
-    apply();
-    const toggle = () => {
-      open = !open;
-      try { localStorage.setItem(key, open ? '1' : '0'); } catch {}
-      apply();
-    };
-    head.addEventListener('click', toggle);
-    head.addEventListener('keydown', (e) => {
-      if (e.key === 'Enter' || e.key === ' ') { e.preventDefault(); toggle(); }
-    });
-  });
  document.querySelectorAll('.adm-quickstart-section').forEach((sec) => {
    const head = sec.querySelector('.adm-quickstart-toggle');
    if (!head) return;
@@ -2220,28 +2337,126 @@ function initRag() {
 /* ═══════════════════════════════════════════
   SYSTEM TAB — Tokens
   ═══════════════════════════════════════════ */
+// Catalog mirrors the one in settings.js integration form. Keep keys in
+// sync with the backend scope allowlist.
+const _TOKEN_SCOPES = [
+  { key: 'todos:read',        label: 'Todos read',        detail: 'Read notes and checklists' },
+  { key: 'todos:write',       label: 'Todos write',       detail: 'Create, update, delete, and toggle todo items' },
+  { key: 'documents:read',    label: 'Documents read',    detail: 'Read documents when a document API is enabled' },
+  { key: 'documents:write',   label: 'Documents write',   detail: 'Create and update draft documents' },
+  { key: 'email:read',        label: 'Email read',        detail: 'Read email when an email API is enabled' },
+  { key: 'email:draft',       label: 'Email draft',       detail: 'Create email reply drafts without sending' },
+  { key: 'email:send',        label: 'Email send',        detail: 'Send email directly' },
+  { key: 'calendar:read',     label: 'Calendar read',     detail: 'Read calendar events when enabled' },
+  { key: 'calendar:write',    label: 'Calendar write',    detail: 'Create and update calendar events' },
+  { key: 'memory:read',       label: 'Memory read',       detail: 'Read memory when enabled' },
+  { key: 'memory:write',      label: 'Memory write',      detail: 'Write memory when enabled' },
+  { key: 'cookbook:read',     label: 'Cookbook read',     detail: 'List cookbook tasks and tail their tmux output' },
+  { key: 'cookbook:launch',   label: 'Cookbook launch',   detail: 'Launch and stop cookbook serve tasks' },
+];
+
+function _renderTokenScopeRows(t) {
+  const have = new Set(t.scopes || []);
+  return _TOKEN_SCOPES.map(s => {
+    const action = (s.key.split(':')[1] || '').toLowerCase();
+    const pill = action === 'read'
+      ? 'background:rgba(150,150,150,0.18);color:var(--fg-muted,#888);'
+      : 'background:color-mix(in srgb, var(--accent, var(--red)) 18%, transparent);color:var(--accent, var(--red));';
+    const tool = s.label.replace(/\s+(read|write|draft|send|launch)$/i, '');
+    return `
+      <label style="display:flex;align-items:center;gap:8px;min-height:28px;padding:1px 0;">
+        <span class="settings-label" style="width:90px;flex-shrink:0;padding:0;font-size:12px;">${esc(tool)}</span>
+        <span style="font-size:9px;font-weight:600;text-transform:uppercase;letter-spacing:0.5px;padding:1px 7px;border-radius:999px;flex-shrink:0;min-width:44px;text-align:center;box-sizing:border-box;${pill}">${esc(action)}</span>
+        <span style="font-size:11px;line-height:1.35;opacity:0.62;flex:1;min-width:0;">${esc(s.detail)}</span>
+        <label class="admin-switch" style="margin-left:auto;flex-shrink:0;"><input type="checkbox" class="adm-tok-scope" data-token-id="${esc(t.id)}" data-scope="${esc(s.key)}" ${have.has(s.key) ? 'checked' : ''}><span class="admin-slider"></span></label>
+      </label>`;
+  }).join('');
+}
+
 async function loadTokens() {
  const list = el('adm-tokenList');
+  if (!list) return;
  try {
    const res = await fetch('/api/tokens', { credentials: 'same-origin' });
    const tokens = await res.json();
-    if (!tokens.length) { list.innerHTML = '<div class="admin-empty">No API tokens</div>'; return; }
+    if (!tokens.length) { list.innerHTML = '<div class="admin-empty" style="color:var(--accent, var(--red));opacity:0.7;font-size:10px;">No API tokens</div>'; return; }
    list.innerHTML = tokens.map(t => `
-      <div class="admin-user-row">
-        <div class="admin-user-info" style="flex:1;flex-wrap:wrap;gap:0.3rem;">
-          <span class="admin-user-name">${esc(t.name)}</span>
-          <span class="admin-badge">${esc(t.token_prefix)}...</span>
-          <span class="admin-badge" title="Allowed API scopes">${esc((t.scopes || ['chat']).join(', '))}</span>
-          ${t.owner ? `<span style="font-size:0.75rem;opacity:0.5;">Owner: ${esc(t.owner)}</span>` : ''}
-          ${t.last_used_at ? `<span style="font-size:0.75rem;opacity:0.5;">Last used: ${new Date(t.last_used_at).toLocaleDateString()}</span>` : '<span style="font-size:0.75rem;opacity:0.4;">Never used</span>'}
+      <div class="admin-user-row" data-adm-tok-row="${esc(t.id)}" style="display:block;">
+        <div style="display:flex;align-items:center;gap:8px;flex-wrap:wrap;">
+          <div class="admin-user-info" style="flex:1;min-width:0;flex-wrap:wrap;gap:0.3rem;">
+            <input type="text" class="adm-tok-rename" data-token-id="${esc(t.id)}" value="${esc(t.name || '')}" placeholder="Token name" style="font-size:13px;font-weight:600;padding:3px 6px;background:transparent;border:1px solid transparent;border-radius:4px;min-width:160px;" title="Click to rename">
+            <span class="admin-badge">${esc(t.token_prefix)}...</span>
+            ${t.owner ? `<span style="font-size:0.75rem;opacity:0.5;">Owner: ${esc(t.owner)}</span>` : ''}
+            ${t.last_used_at ? `<span style="font-size:0.75rem;opacity:0.5;">Last used: ${new Date(t.last_used_at).toLocaleDateString()}</span>` : '<span style="font-size:0.75rem;opacity:0.4;">Never used</span>'}
+          </div>
+          <button class="admin-btn-sm" data-adm-tok-toggle="${esc(t.id)}" style="opacity:0.75;">Permissions</button>
+          <button class="admin-btn-delete" data-adm-del-token="${esc(t.id)}">Revoke</button>
+        </div>
+        <div data-adm-tok-perm="${esc(t.id)}" style="display:none;margin-top:8px;padding:8px 4px 0;border-top:1px solid var(--border);">
+          ${_renderTokenScopeRows(t)}
+          <div class="adm-tok-scope-msg" data-token-id="${esc(t.id)}" style="font-size:11px;min-height:14px;margin-top:4px;"></div>
        </div>
-        <button class="admin-btn-delete" data-adm-del-token="${t.id}">Revoke</button>
      </div>`).join('');
+
+    // Revoke
    list.querySelectorAll('[data-adm-del-token]').forEach(btn => {
      btn.addEventListener('click', async () => {
        if (!await uiModule.styledConfirm('Revoke this API token? External integrations using it will stop working.', { confirmText: 'Revoke', danger: true })) return;
        await fetch(`/api/tokens/${btn.dataset.admDelToken}`, { method: 'DELETE', credentials: 'same-origin' });
        loadTokens();
+        // Codex / Claude integration cards on the Integrations panel are
+        // backed by these tokens — let them re-render so the deleted token
+        // disappears there too.
+        try { window.dispatchEvent(new CustomEvent('odysseus-integrations-changed')); } catch (_) {}
+      });
+    });
+    // Toggle permissions panel
+    list.querySelectorAll('[data-adm-tok-toggle]').forEach(btn => {
+      btn.addEventListener('click', () => {
+        const panel = list.querySelector(`[data-adm-tok-perm="${btn.dataset.admTokToggle}"]`);
+        if (!panel) return;
+        panel.style.display = panel.style.display === 'none' ? '' : 'none';
+      });
+    });
+    // Rename
+    list.querySelectorAll('.adm-tok-rename').forEach(input => {
+      const original = input.value;
+      const commit = async () => {
+        const name = (input.value || '').trim();
+        if (!name || name === original) return;
+        try {
+          const r = await fetch(`/api/tokens/${input.dataset.tokenId}`, {
+            method: 'PATCH', credentials: 'same-origin',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ name }),
+          });
+          if (!r.ok) throw new Error('Save failed');
+          loadTokens();
+        } catch (_) { input.value = original; }
+      };
+      input.addEventListener('blur', commit);
+      input.addEventListener('keydown', e => { if (e.key === 'Enter') { e.preventDefault(); input.blur(); } });
+    });
+    // Scope toggle change → PATCH the whole scopes array for this token.
+    list.querySelectorAll('.adm-tok-scope').forEach(cb => {
+      cb.addEventListener('change', async () => {
+        const tokenId = cb.dataset.tokenId;
+        const panel = list.querySelector(`[data-adm-tok-perm="${tokenId}"]`);
+        const msg = list.querySelector(`.adm-tok-scope-msg[data-token-id="${tokenId}"]`);
+        const scopes = Array.from(panel.querySelectorAll('.adm-tok-scope:checked')).map(input => input.dataset.scope);
+        try {
+          const r = await fetch(`/api/tokens/${tokenId}`, {
+            method: 'PATCH', credentials: 'same-origin',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ scopes }),
+          });
+          const d = await r.json().catch(() => ({}));
+          if (!r.ok) throw new Error(d.detail || 'Failed');
+          if (msg) { msg.textContent = 'Saved'; msg.style.color = 'var(--green, #50fa7b)'; setTimeout(() => { msg.textContent = ''; }, 1200); }
+        } catch (err) {
+          cb.checked = !cb.checked;
+          if (msg) { msg.textContent = (err && err.message) || 'Failed'; msg.style.color = 'var(--red)'; }
+        }
      });
    });
  } catch (e) { list.innerHTML = '<div class="admin-error">Failed to load tokens</div>'; }
@@ -2273,11 +2488,20 @@ function initTokenForm() {
      else { msg.textContent = data.detail || 'Failed'; msg.className = 'admin-error'; }
    } catch (e) { msg.textContent = 'Request failed'; msg.className = 'admin-error'; }
  });
+  const TOKEN_COPY_ICON = '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>';
+  const TOKEN_CHECK_ICON = '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round" stroke-linejoin="round"><polyline points="20 6 9 17 4 12"/></svg>';
  el('adm-tokenCopyBtn').addEventListener('click', () => {
    const val = el('adm-tokenValue').textContent;
+    const btn = el('adm-tokenCopyBtn');
    navigator.clipboard.writeText(val).then(() => {
-      el('adm-tokenCopyBtn').textContent = 'Copied!';
-      setTimeout(() => { el('adm-tokenCopyBtn').textContent = 'Copy'; }, 2000);
+      btn.innerHTML = TOKEN_CHECK_ICON;
+      btn.style.color = 'var(--accent, var(--red))';
+      btn.style.opacity = '1';
+      setTimeout(() => {
+        btn.innerHTML = TOKEN_COPY_ICON;
+        btn.style.color = '';
+        btn.style.opacity = '0.7';
+      }, 1600);
    });
  });
 }
@@ -2504,23 +2728,54 @@ function initDangerZone() {
  modalEl.querySelectorAll('[data-wipe-kind]').forEach(btn => {
    btn.addEventListener('click', async () => {
      const kind = btn.dataset.wipeKind;
-      const label = _LABELS[kind] || kind;
-      if (!await uiModule.styledConfirm(`Wipe ALL ${label}? This cannot be undone.`, { confirmText: 'Wipe', danger: true })) return;
-      if (!await uiModule.styledConfirm(`Really wipe every one of your ${label}?`, { confirmText: 'Yes, wipe everything', danger: true })) return;
-      btn.disabled = true; const prev = btn.textContent; btn.textContent = 'Wiping…';
+      const isAll = kind === '__all__';
+      const label = isAll ? 'data across every category' : (_LABELS[kind] || kind);
+      if (!await uiModule.styledConfirm(`Delete ALL ${label}? This cannot be undone.`, { confirmText: 'Delete', danger: true })) return;
+      if (!await uiModule.styledConfirm(`Really delete every one of your ${label}?`, { confirmText: 'Yes, delete everything', danger: true })) return;
+      btn.disabled = true;
+      const prevHtml = btn.innerHTML;
+      btn.innerHTML = isAll ? 'Deleting all…' : 'Deleting…';
      if (_wipeMsg) { _wipeMsg.textContent = ''; _wipeMsg.className = ''; }
      try {
-        const res = await fetch(`/api/admin/wipe/${kind}`, { method: 'DELETE', credentials: 'same-origin' });
-        const data = await res.json().catch(() => ({}));
-        if (res.ok) {
-          if (_wipeMsg) { _wipeMsg.textContent = `Wiped ${data.count ?? 0} ${label}.`; _wipeMsg.className = 'admin-success'; }
+        if (isAll) {
+          // Iterate every known category. Failures in one shouldn't stop
+          // the rest — record per-category counts and surface a summary.
+          const kinds = Object.keys(_LABELS);
+          const results = [];
+          for (const k of kinds) {
+            try {
+              const r = await fetch(`/api/admin/wipe/${k}`, { method: 'DELETE', credentials: 'same-origin' });
+              const d = await r.json().catch(() => ({}));
+              results.push({ k, ok: r.ok, count: d.count ?? 0, error: r.ok ? null : (d.detail || 'failed') });
+            } catch (e) {
+              results.push({ k, ok: false, count: 0, error: e.message });
+            }
+          }
+          const okCount = results.filter(r => r.ok).length;
+          const total = results.reduce((n, r) => n + (r.ok ? r.count : 0), 0);
+          const fails = results.filter(r => !r.ok).map(r => r.k);
+          if (_wipeMsg) {
+            if (!fails.length) {
+              _wipeMsg.textContent = `Deleted ${total} items across all ${okCount} categories.`;
+              _wipeMsg.className = 'admin-success';
+            } else {
+              _wipeMsg.textContent = `Deleted ${total} items; failed: ${fails.join(', ')}.`;
+              _wipeMsg.className = 'admin-error';
+            }
+          }
        } else {
-          if (_wipeMsg) { _wipeMsg.textContent = data.detail || 'Failed'; _wipeMsg.className = 'admin-error'; }
+          const res = await fetch(`/api/admin/wipe/${kind}`, { method: 'DELETE', credentials: 'same-origin' });
+          const data = await res.json().catch(() => ({}));
+          if (res.ok) {
+            if (_wipeMsg) { _wipeMsg.textContent = `Deleted ${data.count ?? 0} ${label}.`; _wipeMsg.className = 'admin-success'; }
+          } else {
+            if (_wipeMsg) { _wipeMsg.textContent = data.detail || 'Failed'; _wipeMsg.className = 'admin-error'; }
+          }
        }
      } catch (e) {
        if (_wipeMsg) { _wipeMsg.textContent = 'Request failed: ' + e.message; _wipeMsg.className = 'admin-error'; }
      }
-      btn.disabled = false; btn.textContent = prev;
+      btn.disabled = false; btn.innerHTML = prevHtml;
    });
  });
 }
@@ -632,6 +632,28 @@ function _getModal() {

 // ── Render dispatch ──

+// Quick-add hint examples: one is picked at random each time the calendar
+// opens so users see different prompt shapes (events, deadlines, recurring).
+const _QA_HINT_EXAMPLES = [
+  'return home to Ithaca 1pm tmrw',
+  'dinner with Penelope Friday 8pm',
+  'coffee with Athena 9am Saturday',
+  'call Telemachus tomorrow morning',
+  'dentist appointment 3pm next Tuesday',
+  'finish the wooden horse by Friday EOD',
+  'gym 7am every weekday',
+  'flight to Athens Sunday 6:30am',
+  'crew muster 10am daily',
+  'council on Ithaca Monday 2pm',
+];
+function _initQuickAddHintCycle() {
+  const span = document.getElementById('qa-hint-example');
+  if (!span) return;
+  // Pick one random example per calendar open — no interval cycling.
+  const idx = Math.floor(Math.random() * _QA_HINT_EXAMPLES.length);
+  span.textContent = _QA_HINT_EXAMPLES[idx];
+}
+
 // Stash the quick-add input's state (focus + caret + value) before a
 // re-render so background fetches don't kick the user out mid-type. Picked
 // up by _wireAll after the new DOM lands.
@@ -846,7 +868,7 @@ function _headerHTML() {
      placeholder=" "
      autocomplete="off"
    />
-    <span class="cal-quickadd-hint" id="cal-quickadd-hint" aria-hidden="true"><span class="qa-hint-accent">Quick add</span> — return home to Ithaca 1pm tmrw <svg class="qa-hint-enter" width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><polyline points="9 10 4 15 9 20"/><path d="M20 4v7a4 4 0 0 1-4 4H4"/></svg></span>
+    <span class="cal-quickadd-hint" id="cal-quickadd-hint" aria-hidden="true"><span class="qa-hint-accent">Quick add</span> — <span class="qa-hint-example" id="qa-hint-example">return home to Ithaca 1pm tmrw</span> <svg class="qa-hint-enter" width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><polyline points="9 10 4 15 9 20"/><path d="M20 4v7a4 4 0 0 1-4 4H4"/></svg></span>
    <span class="cal-quickadd-status" id="cal-quickadd-status"></span>
  </div>`;
 }
@@ -1913,6 +1935,7 @@ function _wireAll(body) {
  // ── Quick-add input ─────────────────────────────────────────────
  const _qaInput = document.getElementById('cal-quickadd');
  const _qaStatus = document.getElementById('cal-quickadd-status');
+  _initQuickAddHintCycle();
  if (_qaInput && !_qaInput._wired) {
    _qaInput._wired = true;
    const _submitQA = async () => {
@@ -3139,6 +3162,29 @@ function _showEventForm(existing, defaultDate, defaultEndDate) {
  // mode opens already expanded when there's any detail content to see.
  titleInput?.addEventListener('focus', () => setExpanded(true), { once: true });

+  // Live time parse: typing a time like "11pm" or "15:30" into the title
+  // updates the hero clock + start input on the fly. The same parser still
+  // runs again on submit, but doing it live makes the hero clock track
+  // intent immediately instead of jumping at save.
+  if (titleInput) {
+    titleInput.addEventListener('input', () => {
+      if (document.getElementById('cal-f-allday')?.checked) return;
+      const tt = _parseTitleTime(titleInput.value);
+      if (!tt) return;
+      const startEl = document.getElementById('cal-f-start');
+      const endEl = document.getElementById('cal-f-end');
+      const newStart = `${String(tt.h).padStart(2, '0')}:${String(tt.m).padStart(2, '0')}`;
+      if (!startEl || startEl.value === newStart) return;
+      const toMin = (v) => { const p = (v || '').split(':'); return p.length === 2 ? (+p[0]) * 60 + (+p[1]) : null; };
+      const s0 = toMin(startEl.value), e0 = toMin(endEl?.value);
+      const dur = (s0 != null && e0 != null && e0 > s0) ? e0 - s0 : 60;
+      startEl.value = newStart;
+      const endMin = (tt.h * 60 + tt.m + dur) % 1440;
+      if (endEl) endEl.value = `${String(Math.floor(endMin / 60)).padStart(2, '0')}:${String(endMin % 60).padStart(2, '0')}`;
+      startEl.dispatchEvent(new Event('input'));
+    });
+  }
+
  // Location → Apple Maps. The pin button next to the input is enabled
  // only when there's a non-empty location, and its href tracks the live
  // input value. Apple's universal URL opens the native Maps app on
@@ -787,6 +787,19 @@ import { wireArrowUpRecall, getLastUserMessageFromChatHistory } from './composer
        try { await documentModule.saveDocument({ silent: true }); } catch (_e) { /* best-effort */ }
        fd.append('active_doc_id', documentModule.getCurrentDocId());
      }
+      // Active email context — when an email reader is open, pass its
+      // uid/folder/account so "reply", "summarize", "what does this say"
+      // resolve to the email the user is actually looking at instead of
+      // making the agent invent a new markdown draft with fake headers.
+      try {
+        const getEmailCtx = window.__odysseusGetActiveEmailContext;
+        const emCtx = typeof getEmailCtx === 'function' ? getEmailCtx() : null;
+        if (emCtx && emCtx.uid) {
+          fd.append('active_email_uid', String(emCtx.uid));
+          fd.append('active_email_folder', String(emCtx.folder || 'INBOX'));
+          if (emCtx.account) fd.append('active_email_account', String(emCtx.account));
+        }
+      } catch (_e) { /* best-effort */ }
      // Web toggle: pre-search in Chat mode, tool permission in Agent mode
      const toggleState = Storage.loadToggleState();
      let isAgentMode = (toggleState.mode || 'chat') === 'agent';
@@ -185,7 +185,7 @@ export function handleUIControl(uiData) {
    } else if (uiEvent === 'open_email_reply' || uiData.ui_event === 'open_email_reply') {
      import('./emailInbox.js').then(function(mod) {
        var fn = mod.openReplyDraft || (mod.default && mod.default.openReplyDraft);
-        if (fn) fn(uiData.uid, uiData.folder || 'INBOX', uiData.mode || 'reply');
+        if (fn) fn(uiData.uid, uiData.folder || 'INBOX', uiData.mode || 'reply', uiData.body || '');
      }).catch(function(e) {
        console.warn('open_email_reply failed:', e);
      });
@@ -0,0 +1,97 @@
+// Per-backend × per-model install recipes for the Dependencies tab.
+//
+// Each entry says: when you're about to serve `model` on `backend`, here's
+// the exact shell sequence to make the venv + install the right packages.
+// Entries are matched first-hit; put the more specific patterns ABOVE the
+// generic fallback for that backend.
+
+// Recipes carry two variants per entry:
+//   variants.pip    → install into the configured venv via uv/pip
+//   variants.docker → pull the official container image
+//
+// The renderer prepends a `source <venv>/bin/activate` for the pip variant
+// (env_prefix handles activation for Run). The docker variant skips the
+// activate line — `docker pull` doesn't need a venv.
+
+const _RECIPES = [
+  // ── vllm ──────────────────────────────────────────────────────────────
+  // MiniMax M2/M2.7 — same as the generic vllm install/image for now;
+  // kept as its own entry so future model-specific patches land in one
+  // obvious place without touching the catch-all.
+  {
+    backend: 'vllm',
+    label: 'MiniMax M2 / M2.7',
+    match: (m) => /minimax[-_]?m\s?2(\.7)?/i.test(m || ''),
+    variants: {
+      pip:    { commands: ['uv pip install -U vllm --torch-backend auto'] },
+      docker: { commands: ['docker pull vllm/vllm-openai:latest'] },
+    },
+  },
+  // Generic vllm fallback.
+  {
+    backend: 'vllm',
+    label: 'Any vLLM model',
+    match: () => true,
+    variants: {
+      pip:    { commands: ['uv pip install -U vllm --torch-backend auto'] },
+      docker: { commands: ['docker pull vllm/vllm-openai:latest'] },
+    },
+  },
+
+  // ── sglang ────────────────────────────────────────────────────────────
+  {
+    backend: 'sglang',
+    label: 'Any SGLang model',
+    match: () => true,
+    variants: {
+      pip:    { commands: ['uv pip install -U "sglang[all]" --torch-backend auto'] },
+      docker: { commands: ['docker pull lmsysorg/sglang:latest'] },
+    },
+  },
+
+  // ── llama.cpp ─────────────────────────────────────────────────────────
+  {
+    backend: 'llama_cpp',
+    label: 'Any GGUF model',
+    match: () => true,
+    variants: {
+      pip:    { commands: ['CMAKE_ARGS="-DGGML_CUDA=on" uv pip install -U "llama-cpp-python[server]"'] },
+      docker: { commands: ['docker pull ghcr.io/ggerganov/llama.cpp:server-cuda'] },
+    },
+  },
+];
+
+export const RECIPE_VARIANTS = ['pip', 'docker'];
+export const RECIPE_DEFAULT_VARIANT = 'pip';
+
+// Get the commands array for a recipe + variant. Falls back to pip when
+// the requested variant isn't defined for the recipe.
+export function recipeCommands(recipe, variant) {
+  if (!recipe) return [];
+  const v = (recipe.variants || {})[variant] || (recipe.variants || {}).pip;
+  return (v && v.commands) || [];
+}
+
+// Backends we surface a recipe panel for. Other rows in the Dependencies
+// list keep the existing flat Install/Reinstall button without an expand
+// affordance.
+export const RECIPE_BACKENDS = new Set(['vllm', 'sglang', 'llama_cpp']);
+
+// All recipe entries for a given backend, in catalog order. The first one
+// is the model-specific match (when present); the last is always the
+// generic fallback.
+export function recipesForBackend(backend) {
+  return _RECIPES.filter((r) => r.backend === backend);
+}
+
+// Pick the best recipe for a backend + model id. Returns the catalog
+// fallback when nothing more specific matches, or null if the backend
+// isn't in the catalog at all.
+export function pickRecipe(backend, modelId) {
+  const candidates = recipesForBackend(backend);
+  if (!candidates.length) return null;
+  for (const r of candidates) {
+    try { if (r.match(modelId)) return r; } catch (_) {}
+  }
+  return candidates[candidates.length - 1] || null;
+}
@@ -65,7 +65,13 @@ import spinnerModule from './spinner.js';

 // ── Error diagnosis ──

-function _openCookbookDependencies(pkgName = '') {
+// Re-exported so callers (Launch-tab pre-flight) can deep-link into the
+// Dependencies tab + auto-expand a specific backend's recipe panel and
+// pre-select the model they were trying to launch.
+export function openCookbookDependencies(pkgName = '', opts = {}) {
+  _openCookbookDependencies(pkgName, opts);
+}
+function _openCookbookDependencies(pkgName = '', opts = {}) {
  const cookbook = window.cookbookModule;
  if (cookbook && typeof cookbook.open === 'function') {
    cookbook.open({ tab: 'Dependencies' });
@@ -94,6 +100,34 @@ function _openCookbookDependencies(pkgName = '') {
      row.scrollIntoView({ block: 'center' });
      row.classList.add('cookbook-pkg-flash');
      setTimeout(() => row.classList.remove('cookbook-pkg-flash'), 1800);
+      // Pre-flight deep link: auto-expand the recipe panel + pre-select
+      // the model the user was trying to launch. The dropdown values are
+      // now full model ids (sourced from _cachedModelIds), so we match by
+      // exact value first, then fall back to a substring match.
+      if (opts.expandRecipe) {
+        const caret = row.querySelector('[data-dep-recipe-toggle]');
+        if (caret && caret.getAttribute('aria-expanded') !== 'true') caret.click();
+        if (opts.model) {
+          const sel = document.querySelector(`[data-dep-recipe-pick="${CSS.escape(opts.expandRecipe)}"]`);
+          if (sel) {
+            const wanted = String(opts.model);
+            let matched = false;
+            for (let i = 0; i < sel.options.length; i++) {
+              if (sel.options[i].value === wanted) {
+                sel.value = wanted; matched = true; break;
+              }
+            }
+            if (!matched) {
+              for (let i = 0; i < sel.options.length; i++) {
+                if (sel.options[i].value && wanted.includes(sel.options[i].value)) {
+                  sel.value = sel.options[i].value; matched = true; break;
+                }
+              }
+            }
+            if (matched) sel.dispatchEvent(new Event('change'));
+          }
+        }
+      }
    }
  };
  tryHighlight();
@@ -626,7 +660,24 @@ export function _showDiagnosis(panel, diagnosis, sourceText) {
  // the full error+context for a forum/discord paste.
  const toolbar = document.createElement('div');
  toolbar.className = 'cookbook-diag-toolbar';
-  toolbar.style.cssText = 'display:flex;justify-content:flex-end;align-items:center;gap:4px;margin-bottom:-2px;';
+  // Left side carries the diagnosis text (message + suggestion); buttons
+  // stay on the right. The text used to be a separate body row below the
+  // toolbar, but it reads more like "this is what the toolbar is for"
+  // when it sits inline with Copy / × Dismiss.
+  toolbar.style.cssText = 'display:flex;align-items:flex-start;gap:8px;margin-bottom:-2px;';
+
+  const textWrap = document.createElement('div');
+  textWrap.style.cssText = 'flex:1;min-width:0;font-size:11px;line-height:1.35;';
+  const msg = document.createElement('div');
+  msg.className = 'cookbook-diag-message';
+  msg.textContent = diagnosis.message;
+  textWrap.appendChild(msg);
+  const suggestion = document.createElement('div');
+  suggestion.className = 'cookbook-diag-suggestion';
+  suggestion.textContent = suggestionText;
+  suggestion.style.cssText = 'opacity:0.75;margin-top:1px;';
+  textWrap.appendChild(suggestion);
+  toolbar.appendChild(textWrap);

  const copyBtn = document.createElement('button');
  copyBtn.type = 'button';
@@ -660,18 +711,6 @@ export function _showDiagnosis(panel, diagnosis, sourceText) {
  toolbar.appendChild(dismissBtn);
  diag.appendChild(toolbar);

-  const body = document.createElement('div');
-  body.className = 'cookbook-diag-body';
-  const msg = document.createElement('div');
-  msg.className = 'cookbook-diag-message';
-  msg.textContent = diagnosis.message;
-  body.appendChild(msg);
-  const suggestion = document.createElement('div');
-  suggestion.className = 'cookbook-diag-suggestion';
-  suggestion.textContent = suggestionText;
-  body.appendChild(suggestion);
-  diag.appendChild(body);
-
  const runFix = async (fix, button, busyLabel = fix.label, onStart = null, onDone = null) => {
    if (!fix || !button || button.dataset.busy) return;
    button.dataset.busy = '1';
@@ -31,6 +31,44 @@ import {
 } from './cookbook.js';
 import uiModule from './ui.js';
 import spinnerModule from './spinner.js';
+import { _loadTasks, _tmuxGracefulKill } from './cookbookRunning.js';
+import { openCookbookDependencies } from './cookbook-diagnosis.js';
+
+// Map a serve-backend code (vllm / sglang / llamacpp) → the package name
+// the Dependencies API reports. Used to look up "is this backend installed
+// on the target server" before firing a launch.
+const _BACKEND_PKG = { vllm: 'vllm', sglang: 'sglang', llamacpp: 'llama_cpp' };
+
+// Pre-launch: ask the deps API whether the chosen backend is present on
+// the target server. Returns true if it's good to go, false if we should
+// block and route the user into Dependencies.
+async function _ensureBackendInstalled(runBackend, host, port, envPath, modelName) {
+  const pkgName = _BACKEND_PKG[runBackend];
+  if (!pkgName) return true; // unknown backend — don't block
+  try {
+    const params = new URLSearchParams();
+    if (host) {
+      params.set('host', host);
+      if (port) params.set('ssh_port', String(port));
+      if (envPath) params.set('venv', envPath);
+    }
+    const r = await fetch('/api/cookbook/packages' + (params.toString() ? '?' + params : ''));
+    const d = await r.json();
+    const pkg = (d.packages || []).find(p => p.name === pkgName);
+    if (pkg && pkg.installed) return true;
+  } catch (_) {
+    // If we can't tell, don't block — the server's own serve route will
+    // surface a clearer error anyway.
+    return true;
+  }
+  const targetLabel = host || 'this server';
+  uiModule.showToast(
+    `${pkgName} not installed on ${targetLabel}. Opening Dependencies — pick your model and click Run.`,
+    6000
+  );
+  openCookbookDependencies(pkgName, { expandRecipe: pkgName, model: modelName });
+  return false;
+}

 // ── What Fits? (hardware model fitting) ──

@@ -127,7 +165,12 @@ export function _renderGpuToggles(system) {
    _gpuToggleTotal = 0;
    return;
  }
-  if (!_gpuToggleTotal) _gpuToggleTotal = total;
+  // Update on every scan that returns a positive total. Previously this
+  // was only set on the first scan, so switching servers (e.g. local 1-GPU
+  // first, then a 4-GPU remote) left the Run-panel GPU buttons stuck on
+  // the original count. Zero/missing totals still don't clobber a
+  // known-good value (avoids flicker during an in-flight re-probe).
+  if (total > 0) _gpuToggleTotal = total;

  container._groups = groups;
  if (container._activeGroup === undefined) container._activeGroup = 0;  // auto = largest pool
@@ -159,8 +202,17 @@ export function _renderGpuToggles(system) {
  // visual highlight. Before this, _activeCount stayed undefined → no
  // gpu_count param sent → backend's fallback could rank against RAM on
  // mixed-resource boxes ("tightest" sorted by RAM instead of GPU).
-  if (container._activeCount === undefined && validCounts.length) {
-    container._activeCount = maxGpu;
+  //
+  // On boxes where total RAM > total VRAM, default to RAM (count=0) instead
+  // — RAM is the dominant pool so it's the better starting filter.
+  if (container._activeCount === undefined) {
+    const ramGb = Number(system.total_ram_gb) || 0;
+    const vramGb = Number(system.gpu_vram_gb) || 0;
+    if (ramGb > vramGb) {
+      container._activeCount = 0;
+    } else if (validCounts.length) {
+      container._activeCount = maxGpu;
+    }
  }
  html += '<button class="hwfit-gpu-btn" data-count="0" title="CPU / RAM only">RAM</button>';
  const hasExplicitCount = typeof container._activeCount === 'number';
@@ -363,7 +415,7 @@ function _scanSig() {
    hk: _currentServerValue(),
    u: document.getElementById('hwfit-usecase')?.value || '',
    s: document.getElementById('hwfit-search')?.value?.trim() || '',
-    o: sortEl?.value || 'score',
+    o: sortEl?.value || 'newest',
    r: sortEl?.dataset.reverse === '1' ? 1 : 0,
    q: document.getElementById('hwfit-quant')?.value || '',
    c: _ctxValue(),
@@ -582,7 +634,7 @@ export async function _hwfitFetch(fresh = false) {
      }).catch(() => {});
  }
  try {
-    const sortBy = document.getElementById('hwfit-sort')?.value || 'score';
+    const sortBy = document.getElementById('hwfit-sort')?.value || 'newest';
    const quantPref = document.getElementById('hwfit-quant')?.value || '';
    const targetCtx = _ctxValue();
    // Get active GPU count from toggles
@@ -710,7 +762,7 @@ export async function _hwfitFetch(fresh = false) {
    // 1st click on a column = highest first; clicking it again = lowest first.
    if (!isImageMode) {
      const sortSel = document.getElementById('hwfit-sort');
-      const sortKey = sortSel?.value || 'score';
+      const sortKey = sortSel?.value || 'newest';
      const asc = sortSel?.dataset.reverse === '1';   // reversed → ascending (lowest first)
      if (sortKey === 'fit') {
        // fit_level is categorical (perfect→good→marginal→too_tight), not numeric,
@@ -723,6 +775,18 @@ export async function _hwfitFetch(fresh = false) {
          const as = Number(a.score) || 0, bs = Number(b.score) || 0;
          return asc ? as - bs : bs - as;
        });
+      } else if (sortKey === 'newest') {
+        // release_date is an ISO-ish "YYYY-MM-DD" string — lexical sort is
+        // chronological. Default direction: newest first (reverse=undefined).
+        data.models.sort((a, b) => {
+          const ad = String(a.release_date || ''), bd = String(b.release_date || '');
+          if (ad === bd) return 0;
+          // Empty dates land last regardless of direction so the column never
+          // floats undated rows above real releases.
+          if (!ad) return 1;
+          if (!bd) return -1;
+          return asc ? (ad < bd ? -1 : 1) : (ad < bd ? 1 : -1);
+        });
      } else {
        const field = { score: 'score', vram: 'required_gb', speed: 'speed_tps', params: 'params_b', context: 'context' }[sortKey] || 'score';
        data.models.sort((a, b) => {
@@ -1043,7 +1107,7 @@ function _modeLabel(model) {

 export const _hwfitColumns = [
  { key: 'fit', label: 'Fit',    cls: 'hwfit-fit' },
-  { key: null,    label: 'Model',  cls: 'hwfit-name' },
+  { key: 'newest', label: 'Model (latest)',  cls: 'hwfit-name' },
  { key: 'params',label: 'Param', cls: 'hwfit-c-params' },
  { key: null,    label: 'Quant',  cls: 'hwfit-c-quant' },
  { key: 'vram',  label: 'VRAM',   cls: 'hwfit-c-vram' },
@@ -1073,7 +1137,7 @@ export function _hwfitRenderList(el, models) {
    return;
  }
  const sortSel = document.getElementById('hwfit-sort');
-  const currentSort = sortSel?.value || 'score';
+  const currentSort = sortSel?.value || 'newest';
  const isReversed = sortSel?.dataset.reverse === '1';
  // Active budget for the Fit column label \u2014 make it obvious whether the
  // ranking is against GPU or RAM so "tightest" can't be ambiguous on a
@@ -1102,6 +1166,13 @@ export function _hwfitRenderList(el, models) {
      // (Budget tag removed — the GPU/RAM/N-GPU suffix next to "Fit" was noise;
      // the toggle row already shows which budget is active.)
    }
+    // The Model column's "(latest)" / "(oldest)" suffix flips with the sort
+    // direction so the user can see at a glance which way they're sorted.
+    if (col.key === 'newest' && col.key === currentSort) {
+      label = isReversed ? 'Model (oldest)' : 'Model (latest)';
+    } else if (col.key === 'newest') {
+      label = 'Model (latest)';
+    }
    html += `<span class="hwfit-col ${col.cls}${sortable}${active}"${dataAttr}>${label}${arrow}</span>`;
  }
  html += '</div>';
@@ -1256,6 +1327,72 @@ function _syncHostFromScanDropdown() {
  return host;
 }

+// Minimum backend version a given model needs. Returns a semver string like
+// "0.10.0" or null when the model has no known floor. Hardcoded for now —
+// when the vLLM-recipes integration lands we can pull this from the upstream
+// recipe page instead. Keep this conservative: a null return means "any
+// installed version passes", so we don't false-positive launches.
+function _minBackendVersion(modelName, backend) {
+  const n = (modelName || '').toLowerCase();
+  if (backend === 'vllm') {
+    // MiniMax M2 / M2.5 / M2.7 — minimax_m2 parser shipped in 0.10.0
+    if (n.includes('minimax') && n.match(/\bm2(?:\.\d)?\b/)) return '0.10.0';
+    // MiniMax M3 — newer parser registered in 0.11.x
+    if (n.includes('minimax') && n.includes('m3')) return '0.11.0';
+    // DeepSeek V3 / V3.1 / R1 — MoE expert-parallel paths matured in 0.7.0+
+    if (n.includes('deepseek') && (n.includes('v3') || n.includes('r1'))) return '0.7.0';
+    // Qwen3 reasoning models — qwen3 reasoning parser added in 0.7.0
+    if (n.includes('qwen3') && !n.includes('coder') && !n.includes('instruct')) return '0.7.0';
+    // GLM-4.5 / GLM-4.6 / GLM-5: glm45 reasoning parser added in 0.8.0
+    if (n.includes('glm-4.5') || n.includes('glm-4.6') || n.includes('glm-5')) return '0.8.0';
+    // gpt-oss reasoning models — gpt_oss parser
+    if (n.includes('gpt-oss')) return '0.10.0';
+    // Llama-4 multimodal — landed in 0.7.0
+    if (n.includes('llama-4') || n.includes('llama4')) return '0.7.0';
+  }
+  return null;
+}
+
+// Tiny semver compare: returns <0 / 0 / >0 like strcmp. Tolerates "0.10",
+// "0.10.0", "0.10.0+cu124"; everything after the first "+" or "-" is dropped
+function _cmpSemver(a, b) {
+  const _parse = (s) => String(s || '').split(/[+-]/)[0].split('.').filter(p => /^\d+$/.test(p)).map(Number);
+  const A = _parse(a), B = _parse(b);
+  for (let i = 0; i < Math.max(A.length, B.length); i++) {
+    const av = A[i] || 0, bv = B[i] || 0;
+    if (av !== bv) return av - bv;
+  }
+  return 0;
+}
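A standalone sketch of the comparison idea above, to show why the parts are compared as numbers rather than as strings. `compareVersions` is a hypothetical name used only for this illustration; it is not part of the patch.

```javascript
// Hypothetical standalone helper, for illustration only.
function compareVersions(a, b) {
  // Keep the dotted numeric core; drop anything after "+" or "-".
  const parse = (s) => String(s || '')
    .split(/[+-]/)[0]
    .split('.')
    .filter((p) => /^\d+$/.test(p))
    .map(Number);
  const A = parse(a), B = parse(b);
  for (let i = 0; i < Math.max(A.length, B.length); i++) {
    const av = A[i] || 0, bv = B[i] || 0; // missing parts count as 0
    if (av !== bv) return av - bv;
  }
  return 0;
}

console.log(compareVersions('0.9.2', '0.10.0') < 0);          // 9 < 10 numerically
console.log(compareVersions('0.10', '0.10.0') === 0);         // "0.10" is treated as "0.10.0"
console.log(compareVersions('0.10.0+cu124', '0.10.0') === 0); // build suffix ignored
```

A plain string comparison would rank "0.9.2" above "0.10.0", which is the case the version-floor check needs to get right.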
+
+// Map the detected GPU + the model's quant to SGLang's URL-hash params so
+// the cookbook page lands on the right preset. SGLang supports:
+//   hw      = b200 | b300 | gb200 | gb300 | mi300x | mi325x | mi350x | mi355x | h200
+//   quant   = mxfp8 | bf16
+//   variant = default        strategy = balanced       nodes = single
+// We only set what we can confidently infer; anything missing degrades to
+// SGLang's own default (which is `h200` + bf16 single-node balanced).
+function _sglangHashFor(modelData) {
+  const sys = (typeof _hwfitCache !== 'undefined' ? _hwfitCache?.system : null) || {};
+  const gpuName = String(sys.gpu_name || '').toLowerCase();
+  let hw = '';
+  if (/\bgb300/.test(gpuName)) hw = 'gb300';
+  else if (/\bgb200/.test(gpuName)) hw = 'gb200';
+  else if (/\bb300/.test(gpuName)) hw = 'b300';
+  else if (/\bb200/.test(gpuName)) hw = 'b200';
+  else if (/\bh200/.test(gpuName)) hw = 'h200';
+  else if (/mi355/.test(gpuName)) hw = 'mi355x';
+  else if (/mi350/.test(gpuName)) hw = 'mi350x';
+  else if (/mi325/.test(gpuName)) hw = 'mi325x';
+  else if (/mi300/.test(gpuName)) hw = 'mi300x';
+  const qRaw = String(modelData?.quant || '').toLowerCase();
+  // mxfp8 covers fp8 / mxfp8 / nvfp4; every other quant falls back to bf16.
+  const quant = /fp8|mxfp|nvfp/.test(qRaw) ? 'mxfp8' : 'bf16';
+  const parts = ['variant=default', `quant=${quant}`, 'strategy=balanced', 'nodes=single'];
+  if (hw) parts.unshift(`hw=${hw}`);
+  return '#' + parts.join('&');
+}
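To make the mapping above concrete, here is a minimal table-driven sketch of the same hash construction. `sglangHash` and `HW_PATTERNS` are hypothetical names, and the function takes the GPU name and quant as plain arguments instead of reading `_hwfitCache`; the sample GPU strings are assumptions about what a hardware probe might report.

```javascript
// Hypothetical, simplified variant of the hash builder, for illustration only.
// Order matters: "gb300" must be tried before "b300", and so on.
const HW_PATTERNS = [
  [/\bgb300/, 'gb300'], [/\bgb200/, 'gb200'],
  [/\bb300/, 'b300'], [/\bb200/, 'b200'], [/\bh200/, 'h200'],
  [/mi355/, 'mi355x'], [/mi350/, 'mi350x'], [/mi325/, 'mi325x'], [/mi300/, 'mi300x'],
];

function sglangHash(gpuName, quantName) {
  const gpu = String(gpuName || '').toLowerCase();
  const hit = HW_PATTERNS.find(([re]) => re.test(gpu));
  const q = String(quantName || '').toLowerCase();
  const quant = /fp8|mxfp|nvfp/.test(q) ? 'mxfp8' : 'bf16';
  const parts = ['variant=default', `quant=${quant}`, 'strategy=balanced', 'nodes=single'];
  if (hit) parts.unshift(`hw=${hit[1]}`); // omit hw entirely when nothing matched
  return '#' + parts.join('&');
}

console.log(sglangHash('NVIDIA H200', 'fp8'));
console.log(sglangHash('NVIDIA GeForce RTX 4090', 'awq'));
```

When the GPU is not in the list, the `hw` key is left out of the hash rather than guessed.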
+
 export function _expandModelRow(row, modelData) {
  const list = row.closest('.hwfit-list');
  if (!list) return;
@@ -1278,11 +1415,23 @@ export function _expandModelRow(row, modelData) {

  const dlSource = _downloadSourceRepo(modelData, backend);
  const hfUrl = `https://huggingface.co/${dlSource.repo}`;
+  // Official vendor recipe deep-links. These point to vLLM / SGLang's curated
+  // hardware-specific launch-command pages. They 404 for uncatalogued models,
+  // a known tradeoff: the user just gets the vendor's "model not found" page.
+  const _recipeRepo = modelData.name || '';
+  const _vllmUrl = _recipeRepo ? `https://recipes.vllm.ai/${_recipeRepo}` : '';
+  const _sglangUrl = _recipeRepo ? `https://docs.sglang.io/cookbook/autoregressive/${_recipeRepo}${_sglangHashFor(modelData)}` : '';
  let html = `<div class="hwfit-action-panel" data-model-name="${esc(modelData.name)}">`;
  html += `<div class="hwfit-panel-header">`;
  html += `<span class="hwfit-panel-model">${esc(modelData.name)}${dlSource.kind ? ` <span style="opacity:0.5;font-size:10px;">(${esc(dlSource.kind)} ${esc(modelData.quant || '')})</span>` : (modelData.quant_repo ? ` <span style="opacity:0.5;font-size:10px;">(${esc(modelData.quant)})</span>` : '')}</span>`;
  html += `<span class="hwfit-panel-badge">${esc(label)}</span>`;
  html += `<a href="${esc(hfUrl)}" target="_blank" rel="noopener" class="hwfit-panel-hf-link" title="View download source on HuggingFace">HF \u2197</a>`;
+  if (backend === 'vllm' && _vllmUrl) {
+    html += `<a href="${esc(_vllmUrl)}" target="_blank" rel="noopener" class="hwfit-panel-hf-link" title="vLLM official recipe (curated launch command). 404s if this model isn't in vLLM's recipes catalog.">vLLM \u2197</a>`;
+  }
+  if (backend === 'sglang' && _sglangUrl) {
+    html += `<a href="${esc(_sglangUrl)}" target="_blank" rel="noopener" class="hwfit-panel-hf-link" title="SGLang cookbook (hash pre-filled with your detected hardware). 404s if this model isn't in SGLang's cookbook catalog.">SGLang \u2197</a>`;
+  }
  html += `</div>`;
  html += `<div class="hwfit-panel-actions">`;
  html += `<button class="cookbook-btn hwfit-dl-btn">Download</button>`;
@@ -1351,6 +1500,133 @@ export function _expandModelRow(row, modelData) {
        return;
      }

+      // ─── Pre-launch: stop the model already serving on this host ───────
+      // Two servers can't share port 8000. Without this, the new launch
+      // silently collided and the user saw no feedback. We surface the
+      // conflict and offer to kill the running one first as the default
+      // action (it's almost always what the user wants).
+      try {
+        const _qrHostStr = _envState.remoteHost || '';
+        const _activeServes = _loadTasks().filter(t =>
+          t && t.type === 'serve'
+          && (t.remoteHost || '') === _qrHostStr
+          && (t.status === 'running' || t.status === 'ready' || t._serveReady)
+        );
+        if (_activeServes.length) {
+          const _names = _activeServes.map(t => t.payload?.repo_id || t.repo || t.name || '?').filter(Boolean);
+          const _ok = await window.styledConfirm?.(
+            `${_names.length} model${_names.length === 1 ? '' : 's'} already serving on ${_qrHostStr || 'local'} (${_names.join(', ')}). Port 8000 will collide. Stop the running model and launch this one?`,
+            { confirmText: 'Stop & launch', cancelText: 'Cancel' }
+          );
+          if (!_ok) return;
+          // Mark + kill each running serve, then wait briefly for the
+          // tmux session to actually go down before we kick off the new
+          // launch. Otherwise vLLM still races against the dying socket.
+          quickRunBtn.disabled = true;
+          quickRunBtn.textContent = 'Stopping…';
+          for (const t of _activeServes) {
+            try {
+              // Use that task's own Stop button if it's rendered (handles
+              // endpoint cleanup, Ollama unload, fade-out). Falls back to
+              // a direct tmux kill if the Active tab isn't in the DOM yet.
+              const _taskEl = document.querySelector(`.cookbook-task[data-task-id="${CSS.escape(String(t.sessionId))}"]`);
+              const _stopBtn = _taskEl?.querySelector('.cookbook-task-action-stop');
+              if (_stopBtn) {
+                _stopBtn.click();
+              } else {
+                await fetch('/api/shell/exec', {
+                  method: 'POST',
+                  credentials: 'same-origin',
+                  headers: { 'Content-Type': 'application/json' },
+                  body: JSON.stringify({ command: _tmuxGracefulKill(t) }),
+                });
+              }
+            } catch (_killErr) { /* best-effort */ }
+          }
+          // Give the OS a beat to release port 8000.
+          await new Promise(r => setTimeout(r, 2500));
+        }
+      } catch (_e) { /* best-effort */ }
+
+      // ─── Pre-launch driver check ─────────────────────────────────────
+      // vLLM/SGLang need a working CUDA/ROCm driver. nvidia-smi failures
+      // surface as system.gpu_error from our hardware probe; "no GPU
+      // detected" is the other common case. Bail with a clear message
+      // before kicking off the long install/launch chain — otherwise the
+      // user watches `pip install vllm` finish, then sees a cryptic CUDA
+      // error 10 minutes later. (llama.cpp / Ollama have CPU fallbacks
+      // so they skip this gate.)
+      const _qrBackendDetect = _detectBackend(modelData);
+      const _qrRunBackend = _qrBackendDetect.backend || 'vllm';
+      if (_qrRunBackend === 'vllm' || _qrRunBackend === 'sglang') {
+        const _sys = _hwfitCache?.system || {};
+        if (_sys.gpu_error) {
+          uiModule.showError(`Can't launch: GPU driver error — ${_sys.gpu_error}. Reinstall or repair the NVIDIA driver, then re-scan.`);
+          return;
+        }
+        if (!_sys.has_gpu || !(_sys.gpu_count > 0)) {
+          uiModule.showError(`Can't launch: no GPU detected by nvidia-smi. ${_qrRunBackend === 'vllm' ? 'vLLM' : 'SGLang'} needs a working CUDA or ROCm device.`);
+          return;
+        }
+      }
+
+      // ─── Pre-launch install + version check ─────────────────────────
+      // Catches:
+      //   a) "command not found" (binary not in PATH)
+      //   b) "version too old" (model needs e.g. vllm >= 0.10.0 for the
+      //      reasoning/tool parser registered for it).
+      // Both cases would otherwise fail 10s-3min into the launch with a
+      // cryptic shell error. Best-effort: a venv activated only by the
+      // launch wrapper can false-negative the PATH check, in which case
+      // the launch proceeds and the existing diagnosis layer handles it.
+      if (_qrRunBackend === 'vllm' || _qrRunBackend === 'sglang') {
+        try {
+          const _qrHostStr = _envState.remoteHost || '';
+          const _coreCheck = _qrRunBackend === 'vllm'
+            ? "command -v vllm >/dev/null 2>&1 && vllm --version 2>&1 | grep -oE '[0-9]+\\.[0-9]+(\\.[0-9]+)?' | head -1 || echo MISSING"
+            : "python3 -c 'import sglang, sys; sys.stdout.write(sglang.__version__)' 2>/dev/null || echo MISSING";
+          const _sq = (s) => "'" + String(s).replace(/'/g, "'\\''") + "'"; // POSIX single-quote escaping
+          const _bashCheck = `bash -lc ${_sq(_coreCheck)}`;
+          const _wrappedCheck = _qrHostStr ? `ssh -o BatchMode=yes -o ConnectTimeout=5 -o StrictHostKeyChecking=accept-new ${_qrHostStr} ${_sq(_bashCheck)}` : _bashCheck;
+          const _chkRes = await fetch('/api/shell/exec', {
+            method: 'POST',
+            credentials: 'same-origin',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ command: _wrappedCheck, timeout: 10 }),
+          });
+          if (_chkRes.ok) {
+            const _chk = await _chkRes.json();
+            const _stdout = String(_chk.stdout || '').trim();
+            const _stderr = String(_chk.stderr || '').trim();
+            const _out = `${_stdout}\n${_stderr}`;
+            if (_out.includes('MISSING')) {
+              const _pkg = _qrRunBackend === 'vllm' ? 'vLLM' : 'SGLang';
+              const _hint = _qrRunBackend === 'vllm'
+                ? 'uv pip install -U vllm --torch-backend auto'
+                : "pip install -U 'sglang[all]'";
+              uiModule.showError(`Can't launch: ${_pkg} isn't installed${_qrHostStr ? ' on ' + _qrHostStr : ''}. Install it first:\n${_hint}`);
+              return;
+            }
+            // Version-floor check. _minBackendVersion returns null when this
+            // model has no known requirement — in which case any installed
+            // version passes.
+            const _minVer = _minBackendVersion(modelData.name, _qrRunBackend);
+            const _verMatch = _stdout.match(/(\d+\.\d+(?:\.\d+)?)/);
+            const _curVer = _verMatch ? _verMatch[1] : '';
+            if (_minVer && _curVer && _cmpSemver(_curVer, _minVer) < 0) {
+              const _pkg = _qrRunBackend === 'vllm' ? 'vLLM' : 'SGLang';
+              const _hint = _qrRunBackend === 'vllm'
+                ? 'uv pip install -U vllm --torch-backend auto'
+                : "pip install -U 'sglang[all]'";
+              uiModule.showError(`Can't launch: ${modelData.name} needs ${_pkg} ≥ ${_minVer}, but ${_curVer} is installed${_qrHostStr ? ' on ' + _qrHostStr : ''}. Upgrade:\n${_hint}`);
+              return;
+            }
+          }
+        } catch (_e) {
+          // Network/exec failed — fall through and let the launch try.
+        }
+      }
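The remote probe above nests one shell command inside another (`ssh` → `bash -lc` → the check itself), so each layer needs its own quoting. The sketch below shows one common way to do that with POSIX single-quote escaping. `shQuote` and `wrapProbe` are hypothetical helpers, not part of this patch, and the host string is assumed to be a plain `user@host` value.

```javascript
// Hypothetical helpers, for illustration only.
// Wrap a string in single quotes; an embedded ' becomes '\'' (close, escaped quote, reopen).
const shQuote = (s) => "'" + String(s).replace(/'/g, "'\\''") + "'";

function wrapProbe(coreCheck, host) {
  // Innermost layer: the check runs under a login shell.
  const local = `bash -lc ${shQuote(coreCheck)}`;
  // Outer layer: the whole bash invocation is passed to ssh as a single argument.
  return host
    ? `ssh -o BatchMode=yes -o ConnectTimeout=5 ${host} ${shQuote(local)}`
    : local;
}

console.log(wrapProbe("echo 'hi'", ''));
console.log(wrapProbe("echo 'hi'", 'user@gpu-box'));
```

Quoting each layer separately keeps `&&`, `|`, and `||` inside the probe from being interpreted by the local shell.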
+
      quickRunBtn.disabled = true;
      quickRunBtn.textContent = 'Starting...';

@@ -1428,6 +1704,23 @@ export function _expandModelRow(row, modelData) {
      // schema (repo_id + cmd) — sending `command`/`model` failed Pydantic
      // validation (422), which is why Run silently did nothing.
      const _srv = _serverByVal(_envState.remoteServerKey || host);
+
+      // Pre-flight: if the backend isn't installed on the target server,
+      // route the user into Dependencies → recipe panel for that backend
+      // instead of launching into an obvious "command not found" failure.
+      const _ok = await _ensureBackendInstalled(
+        runBackend,
+        host,
+        (_srv && _srv.port) || undefined,
+        _envState.envPath || '',
+        modelData.name,
+      );
+      if (!_ok) {
+        quickRunBtn.disabled = false;
+        quickRunBtn.textContent = 'Run';
+        return;
+      }
+
      const payload = {
        repo_id: modelData.name,
        cmd: cmd,
@@ -8,6 +8,7 @@ import spinnerModule from './spinner.js';
 import { providerLogo } from './providers.js';
 import { makeWindowDraggable } from './windowDrag.js';
 import { _diagnose, _showDiagnosis, _clearDiagnosis, _runQuickCmd, ERROR_PATTERNS } from './cookbook-diagnosis.js';
+import { RECIPE_BACKENDS, recipesForBackend, pickRecipe, recipeCommands, RECIPE_DEFAULT_VARIANT } from './cookbook-deps-recipes.js';
 import { _hwfitCache, _hwfitDebounce, _hwfitFetch, _hwfitInit, _hwfitRenderList, _hwfitRenderHw, _renderGpuToggles, _expandModelRow, _fitColors, _hwfitColumns, _cachedModelIds, _gpuToggleTotal, _resetGpuToggleState } from './cookbook-hwfit.js';

 // Sub-modules
@@ -233,22 +234,39 @@ function _detectModelOptimizations(modelName) {
  const n = (modelName || '').toLowerCase();
  const opts = { envVars: [], flags: [], tips: [] };

-  // Qwen3.5 MoE models
+  // Qwen3.5 MoE models — MoE-specific env vars + expert-parallel.
+  // The --reasoning-parser flag is added uniformly below via
+  // _detectReasoningParser, no longer hardcoded here.
  if (n.includes('qwen3.5') || n.includes('qwen3-') && (n.includes('a10b') || n.includes('a22b') || n.includes('a3b'))) {
    opts.envVars.push('VLLM_USE_DEEP_GEMM=0', 'VLLM_USE_FLASHINFER_MOE_FP16=1', 'VLLM_USE_FLASHINFER_SAMPLER=0', 'OMP_NUM_THREADS=4');
-    opts.flags.push('--enable-expert-parallel', '--reasoning-parser qwen3');
+    opts.flags.push('--enable-expert-parallel');
    opts.tips.push('MoE optimizations: expert parallel + flashinfer MoE kernels');
  }
  // Qwen3 MoE (non-3.5)
  else if (n.includes('qwen3') && (n.includes('a10b') || n.includes('a22b') || n.includes('a3b'))) {
    opts.envVars.push('VLLM_USE_DEEP_GEMM=0', 'VLLM_USE_FLASHINFER_MOE_FP16=1');
-    opts.flags.push('--enable-expert-parallel', '--reasoning-parser qwen3');
+    opts.flags.push('--enable-expert-parallel');
    opts.tips.push('MoE optimizations: expert parallel');
  }
-  // DeepSeek MoE
-  else if (n.includes('deepseek') && (n.includes('v3') || n.includes('r1'))) {
+  // DeepSeek MoE — V3 / V3.1 / V4 (and future Vx), R1 / R2 reasoning.
+  // Anything v-{integer} or r-{integer} family from DeepSeek is MoE in
+  // current architectures. These models also require fp8 KV cache to
+  // fit at meaningful context with current tensor-parallel layouts —
+  // the launch crashes otherwise (--kv-cache-dtype auto → bf16 OOMs).
+  else if (n.includes('deepseek') && /\b(v[3-9]|v\d{2,}|r[1-9])\b/.test(n)) {
    opts.flags.push('--enable-expert-parallel');
    opts.tips.push('MoE expert parallel for DeepSeek');
+    opts.kvCacheDtype = 'fp8';
+    opts.tips.push('fp8 KV cache required — bf16 OOMs at usable context');
+  }
+  // Reasoning parser — applies independently of MoE detection. Without this
+  // flag, models like MiniMax-M2.x, DeepSeek-R1, Qwen3 reasoning, GLM-4.x,
+  // gpt-oss leak <think> blocks as plain text instead of separating them
+  // into the reasoning_content channel.
+  const _reasoningParser = _detectReasoningParser(modelName);
+  if (_reasoningParser) {
+    opts.flags.push(`--reasoning-parser ${_reasoningParser}`);
+    opts.tips.push(`Reasoning parser (${_reasoningParser}): splits <think> tokens into a separate channel`);
  }
  // Speculative decoding — pick the right MTP method per model family.
  // opts.spec.{method,tokens} seed the UI dropdown/input; the actual flag is
@@ -257,7 +275,7 @@ function _detectModelOptimizations(modelName) {
  if (n.includes('qwen3-next') || (n.includes('qwen3.5') && (n.includes('a10b') || n.includes('a22b')))) {
    specDefault = { method: 'qwen3_next_mtp', tokens: 2 };
  } else if (
-    (n.includes('deepseek') && (n.includes('v3') || n.includes('v3.1') || n.includes('r1'))) ||
+    (n.includes('deepseek') && /\b(v[3-9]|v\d{2,}|r[1-9])\b/.test(n)) ||
    n.includes('kimi-k2') || n.includes('kimi_k2') ||
    n.includes('glm-4.5') || n.includes('glm4.5') ||
    n.includes('minimax-m1') || n.includes('minimax_m1')
@@ -273,6 +291,36 @@ function _detectModelOptimizations(modelName) {
  return opts;
 }

+/** Detect the right vLLM --reasoning-parser based on model name.
+ *  Returns the parser slug (matches vLLM's official list) or null when the
+ *  model isn't a reasoning model. Without the right parser, thinking tokens
+ *  leak as plain text instead of being split into a separate channel.
+ *  Source: vllm/reasoning/__init__.py registered parsers.
+ */
+export function _detectReasoningParser(modelName) {
+  const n = (modelName || '').toLowerCase();
+  // MiniMax M2 / M2.5 / M2.7 — released with a dedicated parser. Catch M2
+  // before plain "minimax" so M2.x doesn't fall through to a wrong parser.
+  if (n.includes('minimax') && n.match(/\bm2(?:\.\d)?\b/)) return 'minimax_m2';
+  // DeepSeek-R1 / V3-Thinking / V3.1-Thinking variants. Bare V3/V3.1 (non-
+  // thinking) skip this — they're not reasoning models.
+  if (n.includes('deepseek') && (n.includes('r1') || n.includes('thinking'))) return 'deepseek_r1';
+  // Qwen3 / Qwen3.5 reasoning models. Qwen3-Coder + Qwen3-Instruct don't
+  // emit <think> blocks, so skip the parser there.
+  if (n.includes('qwen3') && !n.includes('coder') && !n.includes('instruct')) return 'qwen3';
+  // GLM-4 / GLM-4.5 / GLM-4.6 / GLM-5 with reasoning.
+  if (n.includes('glm-4') || n.includes('glm-5')) return 'glm45';
+  // OpenAI gpt-oss family.
+  if (n.includes('gpt-oss')) return 'gpt_oss';
+  // Hunyuan A13B reasoning.
+  if (n.includes('hunyuan') && n.includes('a13b')) return 'hunyuan_a13b';
+  // IBM Granite reasoning.
+  if (n.includes('granite') && (n.includes('reason') || n.includes('think'))) return 'granite';
+  // InternLM reasoning.
+  if (n.includes('internlm')) return 'internlm';
+  return null;
+}
+
 /** Detect the right vLLM tool-call-parser based on model name.
 *  Qwen tool-call formats split by generation:
 *   - Qwen3-Coder           → qwen3_coder  (XML <tool_call> with named params)
@@ -416,7 +464,10 @@ export function _buildServeCmd(f, modelName, backend) {
  const _py3Bin = _venvBin ? `${_venvBin}python3` : 'python3';
  let cmd = '';
  if (backend === 'vllm') {
-    const gpuId = f.gpu_id?.trim() || '';
+    // GPU list comes from the Row-1 button strip (data-field="gpus"); the
+    // bare "auto" input that used to back gpu_id is gone. gpu_id is read
+    // only as a fallback for older saved presets.
+    const gpuId = (f.gpus || f.gpu_id || '').toString().trim();
    if (gpuId) cmd += `CUDA_VISIBLE_DEVICES=${gpuId} `;
    if (f.moe_env) {
      const _opts = _detectModelOptimizations(modelName);
@@ -458,7 +509,10 @@ export function _buildServeCmd(f, modelName, backend) {
      cmd += ` --speculative-config '{"method":"${_specMethod}","num_speculative_tokens":${_specToks}}'`;
    }
  } else if (backend === 'sglang') {
-    const gpuId = f.gpu_id?.trim() || '';
+    // GPU list comes from the Row-1 button strip (data-field="gpus"); the
+    // bare "auto" input that used to back gpu_id is gone. gpu_id is read
+    // only as a fallback for older saved presets.
+    const gpuId = (f.gpus || f.gpu_id || '').toString().trim();
    if (gpuId) cmd += `CUDA_VISIBLE_DEVICES=${gpuId} `;
    const _extraEnv = (f.extra_env ?? '').toString().replace(/\s+/g, ' ').trim();
    if (_extraEnv) cmd += _extraEnv + ' ';
@@ -475,7 +529,9 @@ export function _buildServeCmd(f, modelName, backend) {
    if (f.enforce_eager) cmd += ' --disable-cuda-graph';
  } else if (backend === 'llamacpp') {
    const ggufPath = f._gguf_path || 'model.gguf';
-    const gpuId = f.gpu_id?.trim() || '';
+    // GPU list — read from gpus (button strip); fall back to gpu_id for
+    // backward-compat with older saved presets that pre-date the removal.
+    const gpuId = (f.gpus || f.gpu_id || '').toString().trim();
    const py = _isWindows() ? 'python' : 'python3';
    // CPU-only serve (-ngl 0): drop the GPU-only flags, otherwise the command
    // mixes "zero GPU layers" with CUDA unified-memory + flash-attn and fails to
@@ -737,6 +793,22 @@ async function _fetchDependencies() {
      return `<button class="cookbook-dep-tag cookbook-dep-install" data-dep-pip="${esc(pkg.pip)}" data-dep-target="${isLocal ? 'local' : 'remote'}">Install</button>`;
    };

+    // Per-package inline glyphs — same accent-coloured marks used in the
+    // Backend picker on the Run page, so the Dependencies row visually
+    // matches the engine you're configuring. Unknown packages get no
+    // icon (the name alone is fine for librosa, hf_transfer, etc.).
+    const _DEP_GLYPHS = {
+      vllm:    '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.4" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><path d="M3 4l7 16 7-16"/><path d="M14 4l4 9 3-9"/></svg>',
+      sglang:  '<svg width="13" height="13" viewBox="0 0 24 24" fill="currentColor" stroke="none" aria-hidden="true"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>',
+      llama_cpp: '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><circle cx="12" cy="12" r="9"/><path d="M8 12h8M12 8v8"/></svg>',
+      ollama:  '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><path d="M6 10a6 6 0 0 1 12 0v4a4 4 0 0 1-8 0v-1"/><circle cx="10" cy="9" r="1"/><circle cx="14" cy="9" r="1"/></svg>',
+      diffusers: '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><circle cx="12" cy="12" r="4"/><path d="M12 2v3M12 19v3M2 12h3M19 12h3M5 5l2 2M17 17l2 2M5 19l2-2M17 7l2-2"/></svg>',
+    };
+    const _depGlyphHtml = (name) => {
+      const g = _DEP_GLYPHS[name];
+      return g ? `<span class="cookbook-dep-glyph" aria-hidden="true" style="display:inline-flex;align-items:center;justify-content:center;width:14px;height:14px;color:var(--accent, var(--red));margin-right:5px;vertical-align:-2px;">${g}</span>` : '';
+    };
+
    const _depRow = (pkg) => {
      const isLocal = pkg.target === 'local';
      const isSystemDep = pkg.kind === 'system';
@@ -757,9 +829,16 @@ async function _fetchDependencies() {
      } else if (pkg.name === 'sglang' && pkg.installed) {
        _rebuildBtn = `<button type="button" class="cookbook-dep-tag cookbook-dep-rebuild cookbook-dep-reinstall" data-reinstall-pkg="sglang" title="Force-reinstall SGLang (pulls a matching torch). Runs as a tmux task in the Running tab.">Reinstall</button>`;
      }
+      // For backends with a recipe catalog (vllm / sglang / llama_cpp),
+      // append a caret button that toggles a per-row recipe panel below.
+      const hasRecipe = RECIPE_BACKENDS.has(pkg.name);
+      const recipeCaret = hasRecipe
+        ? `<button class="cookbook-dep-tag cookbook-dep-recipe-caret" data-dep-recipe-toggle="${esc(pkg.name)}" title="Pick a model to see the exact install commands" aria-expanded="false" style="background:none;border:1px solid var(--border);padding:2px 6px;display:inline-flex;align-items:center;cursor:pointer;"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round" style="transition:transform 0.15s"><polyline points="6 9 12 15 18 9"/></svg></button>`
+        : '';
+      const recipePanel = hasRecipe ? _recipePanelHtml(pkg.name) : '';
      return `<div class="cookbook-dep-row${winBlocked ? ' cookbook-dep-blocked' : ''}" data-pkg-name="${esc(pkg.name)}" data-dep-pip="${esc(pkg.pip || '')}" data-dep-target="${isLocal ? 'local' : 'remote'}" data-dep-kind="${esc(pkg.kind || 'python')}">`
        + `<div class="cookbook-dep-info">`
-        + `<div class="memory-item-title">${esc(pkg.name)}</div>`
+        + `<div class="memory-item-title">${_depGlyphHtml(pkg.name)}${esc(pkg.name)}</div>`
        + `<div class="memory-item-meta" style="font-size:10px;opacity:0.5;margin-top:2px;">${esc(pkg.desc)}</div>`
        + note
        + updateNote
@@ -767,9 +846,65 @@ async function _fetchDependencies() {
        + _rebuildBtn
        + `<span class="cookbook-dep-tag cookbook-dep-cat">${esc(pkg.category)}</span>`
        + _statusTag(pkg, isLocal, isSystemDep, winBlocked)
-        + `</div>`;
+        + recipeCaret
+        + `</div>`
+        + recipePanel;
    };

+    // Prepend the configured venv's activate line (pip variant only) so
+    // the user sees a paste-ready sequence; Run keeps using env_prefix to
+    // activate the same venv before the pip command. Docker variant skips
+    // the activate line — `docker pull` doesn't need a venv.
+    function _recipeDisplayText(commands, variant) {
+      if (variant === 'docker') return commands.join('\n');
+      const envPath = (_envState.envPath || '').replace(/\/+$/, '');
+      const activate = envPath
+        ? `source ${envPath}${envPath.endsWith('/bin/activate') ? '' : '/bin/activate'}`
+        : '# (activate your venv first)';
+      return [activate, ...commands].join('\n');
+    }
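A small standalone sketch of the display logic above, showing what the paste-ready text looks like for each variant. `recipeDisplayText` here takes the venv path as an explicit argument instead of reading `_envState`, and the sample paths and commands are made up for illustration.

```javascript
// Hypothetical standalone version, for illustration only.
function recipeDisplayText(commands, variant, envPath) {
  // Docker commands need no venv, so they are shown as-is.
  if (variant === 'docker') return commands.join('\n');
  const root = String(envPath || '').replace(/\/+$/, ''); // strip trailing slashes
  const activate = root
    ? `source ${root}${root.endsWith('/bin/activate') ? '' : '/bin/activate'}`
    : '# (activate your venv first)';
  return [activate, ...commands].join('\n');
}

console.log(recipeDisplayText(['pip install -U vllm'], 'pip', '/opt/venvs/serve/'));
console.log(recipeDisplayText(['pip install -U vllm'], 'pip', ''));
console.log(recipeDisplayText(['docker pull example/image'], 'docker', '/opt/venvs/serve'));
```

A path that already ends in `/bin/activate` is used unchanged, so either form of the configured venv path produces the same line.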
+
+    // Per-backend recipe panel (model picker + commands + Copy/Run).
+    // Lives directly below the row it expands and starts collapsed.
+    // The model picker lists every downloaded model from _cachedModelIds
+    // (the same set the Launch tab uses); pickRecipe() then finds the
+    // best-matching recipe for whatever the user selects, with the
+    // backend's generic entry as the fallback.
+    function _recipePanelHtml(backend) {
+      const candidates = recipesForBackend(backend);
+      if (!candidates.length) return '';
+      const downloadedIds = _cachedModelIds ? Array.from(_cachedModelIds).sort() : [];
+      const modelOptions = downloadedIds.length
+        ? downloadedIds.map(id => `<option value="${esc(id)}">${esc(id)}</option>`).join('')
+        : '';
+      // "Other" entry: selects the backend's generic recipe. It is also the
+      // only option when no models have been downloaded yet.
+      const otherOpt = `<option value="">Other (generic ${esc(backend)} install)</option>`;
+      const opts = modelOptions + otherOpt;
+      // Initial recipe: match the <select>'s first option (first downloaded model, else generic).
+      const initial = pickRecipe(backend, downloadedIds[0] || '') || candidates[0];
+      const initialVariant = RECIPE_DEFAULT_VARIANT;
+      const initialCmds = recipeCommands(initial, initialVariant);
+      const rightActive = initialVariant === 'docker' ? ' mode-right' : '';
+      return `<div class="cookbook-dep-recipe-panel" data-dep-recipe-panel="${esc(backend)}" data-dep-recipe-active-variant="${esc(initialVariant)}" style="display:none;margin:-4px 0 8px;padding:8px 12px 10px;background:rgba(0,0,0,0.04);border:1px solid var(--border);border-top:none;border-radius:0 0 6px 6px;">
+          <div style="display:flex;align-items:center;gap:8px;margin-bottom:6px;">
+            <span style="font-size:11px;opacity:0.75;flex-shrink:0;">Serving which model?</span>
+            <select class="settings-select cookbook-dep-recipe-pick" data-dep-recipe-pick="${esc(backend)}" style="flex:1;font-size:11px;padding:3px 6px;">${opts}</select>
+            <div class="mode-toggle${rightActive}" data-dep-recipe-variants="${esc(backend)}" style="flex-shrink:0;">
+              <button type="button" class="mode-toggle-btn${initialVariant === 'pip' ? ' active' : ''}" data-dep-recipe-variant="${esc(backend)}" data-variant="pip" aria-pressed="${initialVariant === 'pip'}">Pip/uv</button>
+              <button type="button" class="mode-toggle-btn${initialVariant === 'docker' ? ' active' : ''}" data-dep-recipe-variant="${esc(backend)}" data-variant="docker" aria-pressed="${initialVariant === 'docker'}">Docker</button>
+            </div>
+          </div>
+          <div style="position:relative;">
+            <pre class="cookbook-dep-recipe-cmds" data-dep-recipe-cmds="${esc(backend)}" data-dep-recipe-install="${esc(initialCmds.join('\n'))}" style="margin:0;padding:8px 36px 8px 10px;background:rgba(0,0,0,0.08);border-radius:4px;font-size:11px;line-height:1.5;overflow-x:auto;white-space:pre;">${esc(_recipeDisplayText(initialCmds, initialVariant))}</pre>
+            <button type="button" id="recipe-copy-${esc(backend)}" class="cookbook-dep-recipe-copy" data-dep-recipe-copy="${esc(backend)}" title="Copy" aria-label="Copy" style="position:absolute;top:6px;right:6px;padding:3px 5px;background:none;border:none;color:inherit;opacity:0.7;cursor:pointer;display:inline-flex;align-items:center;"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg></button>
+          </div>
+          <div style="display:flex;gap:6px;justify-content:flex-end;margin-top:6px;">
+            <button type="button" class="cookbook-dep-tag cookbook-dep-install cookbook-dep-recipe-run" data-dep-recipe-run="${esc(backend)}" style="display:inline-flex;align-items:center;gap:4px;cursor:pointer;"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor"><polygon points="5 3 19 12 5 21 5 3"/></svg>Run</button>
+          </div>
+        </div>`;
+    }
+
    const _section = (title, note, items) =>
      items.length
        ? `<div class="cookbook-dep-section"><span class="cookbook-dep-section-title">${title}</span><span class="cookbook-dep-section-note">${note}</span></div>` + items.map(_depRow).join('')
@@ -866,7 +1001,7 @@ async function _fetchDependencies() {
    }

    // Wire install buttons (not-installed packages)
-    list.querySelectorAll('.cookbook-dep-install').forEach(btn => {
+    list.querySelectorAll('.cookbook-dep-install:not(.cookbook-dep-recipe-run)').forEach(btn => {
      btn.addEventListener('click', async (e) => {
        e.stopPropagation();
        const pipName = btn.dataset.depPip;
@@ -875,6 +1010,135 @@ async function _fetchDependencies() {
      });
    });

+    // ── Recipe panel wiring (per-backend dropdown with model + commands) ──
+    // Caret toggle: shows/hides the panel directly below the backend row.
+    list.querySelectorAll('[data-dep-recipe-toggle]').forEach(btn => {
+      btn.addEventListener('click', (e) => {
+        e.stopPropagation();
+        const backend = btn.dataset.depRecipeToggle;
+        const panel = list.querySelector(`[data-dep-recipe-panel="${CSS.escape(backend)}"]`);
+        if (!panel) return;
+        const open = panel.style.display === 'none' || !panel.style.display;
+        panel.style.display = open ? 'block' : 'none';
+        btn.setAttribute('aria-expanded', open ? 'true' : 'false');
+        const caret = btn.querySelector('svg');
+        if (caret) caret.style.transform = open ? 'rotate(180deg)' : '';
+      });
+    });
+    // Re-render the <pre> for a backend using the currently-active variant
+    // (pip / docker) and the currently-picked model. Used by every input
+    // that changes which install sequence we should show.
+    function _refreshRecipePre(backend) {
+      const panel = list.querySelector(`[data-dep-recipe-panel="${CSS.escape(backend)}"]`);
+      if (!panel) return;
+      const variant = panel.dataset.depRecipeActiveVariant || RECIPE_DEFAULT_VARIANT;
+      const sel = panel.querySelector('[data-dep-recipe-pick]');
+      const recipe = pickRecipe(backend, (sel && sel.value) || '');
+      const cmds = recipeCommands(recipe, variant);
+      const pre = panel.querySelector('[data-dep-recipe-cmds]');
+      if (pre) {
+        pre.textContent = _recipeDisplayText(cmds, variant);
+        pre.dataset.depRecipeInstall = cmds.join('\n');
+      }
+    }
+    // Model select: pickRecipe matches the model id against the catalog.
+    list.querySelectorAll('[data-dep-recipe-pick]').forEach(sel => {
+      sel.addEventListener('change', () => _refreshRecipePre(sel.dataset.depRecipePick));
+    });
+    // Variant toggle (Pip/uv vs Docker): mirrors the agent/chat mode-toggle
+    // pattern — buttons get .active, container gets .mode-right when the
+    // right slot is selected so the sliding pill animates over.
+    list.querySelectorAll('[data-dep-recipe-variant]').forEach(btn => {
+      btn.addEventListener('click', (e) => {
+        e.stopPropagation();
+        const backend = btn.dataset.depRecipeVariant;
+        const variant = btn.dataset.variant;
+        const panel = list.querySelector(`[data-dep-recipe-panel="${CSS.escape(backend)}"]`);
+        if (!panel) return;
+        panel.dataset.depRecipeActiveVariant = variant;
+        const container = panel.querySelector('.mode-toggle[data-dep-recipe-variants]');
+        if (container) container.classList.toggle('mode-right', variant === 'docker');
+        panel.querySelectorAll('[data-dep-recipe-variant]').forEach(b => {
+          const on = b.dataset.variant === variant;
+          b.classList.toggle('active', on);
+          b.setAttribute('aria-pressed', on ? 'true' : 'false');
+        });
+        _refreshRecipePre(backend);
+      });
+    });
+    // Copy: drop the visible command block on the clipboard.
+    list.querySelectorAll('[data-dep-recipe-copy]').forEach(btn => {
+      btn.addEventListener('click', async (e) => {
+        e.stopPropagation();
+        const backend = btn.dataset.depRecipeCopy;
+        const pre = list.querySelector(`[data-dep-recipe-cmds="${CSS.escape(backend)}"]`);
+        if (!pre) return;
+        try {
+          await navigator.clipboard.writeText(pre.textContent);
+          uiModule.showToast('Copied');
+        } catch {
+          // Fallback for non-secure contexts: select the pre's text so
+          // the user can Ctrl+C themselves.
+          const sel = window.getSelection(); const range = document.createRange();
+          range.selectNodeContents(pre); sel.removeAllRanges(); sel.addRange(range);
+        }
+      });
+    });
+    // Run: launch the install command(s) as a tmux task on the currently-
+    // selected deps server. Activation comes from env_prefix (same plumbing
+    // the Install button uses) so the install lands in the configured venv
+    // instead of a fresh .venv in some random CWD.
+    list.querySelectorAll('[data-dep-recipe-run]').forEach(btn => {
+      btn.addEventListener('click', async (e) => {
+        e.stopPropagation();
+        const backend = btn.dataset.depRecipeRun;
+        const pre = list.querySelector(`[data-dep-recipe-cmds="${CSS.escape(backend)}"]`);
+        if (!pre) return;
+        // Use the install-only command list (no activate line): the
+        // activate line shown in the <pre> is display-only, and
+        // env_prefix performs the activation for the actual run.
+        const installRaw = pre.dataset.depRecipeInstall || pre.textContent;
+        const cmd = installRaw.split('\n').map(s => s.trim()).filter(Boolean).join(' && ');
+        const depsSel = document.getElementById('hwfit-deps-server');
+        if (depsSel) _applyServerSelection(depsSel.value);
+        const targetHost = _envState.remoteHost || 'local';
+        // Build env_prefix from the configured envPath (matches _installDep).
+        let envPrefix = '';
+        if (_envState.env === 'venv' && _envState.envPath) {
+          const p = _envState.envPath;
+          envPrefix = 'source ' + _shellQuote(p.endsWith('/bin/activate') ? p : p + '/bin/activate');
+        } else if (_envState.env === 'conda' && _envState.envPath) {
+          envPrefix = 'eval "$(conda shell.bash hook)" && conda activate ' + _shellQuote(_envState.envPath);
+        }
+        const reqBody = {
+          repo_id: `${backend} setup`,
+          cmd: cmd,
+          remote_host: _envState.remoteHost || undefined,
+          ssh_port: _getPort(_envState.remoteHost) || undefined,
+          env_prefix: envPrefix || undefined,
+          platform: _envState.platform || undefined,
+        };
+        try {
+          const res = await fetch('/api/model/serve', {
+            method: 'POST', credentials: 'same-origin',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify(reqBody),
+          });
+          const data = await res.json().catch(() => ({}));
+          if (!res.ok || !data.ok) {
+            uiModule.showToast('Run failed: ' + String(data.detail || data.error || `HTTP ${res.status}`).slice(0, 200));
+            return;
+          }
+          const payload = { repo_id: `${backend} setup`, _cmd: cmd, remote_host: _envState.remoteHost || '', _dep: true };
+          _addTask(data.session_id, `${backend} setup`, 'download', payload);
+          uiModule.showToast(`Running ${backend} setup on ${targetHost}…`);
+        } catch (err) {
+          uiModule.showToast('Run failed: ' + err.message);
+        }
+      });
+    });
+
+
    // Wire the ⋮ menu on installed packages — currently just "Update".
    function _showDepMenu(anchor) {
      document.querySelectorAll('.cookbook-dep-menu').forEach(d => d.remove());
@@ -1404,16 +1668,49 @@ function _wireTabEvents(body) {
  const dlFoldBody = document.getElementById('cookbook-dl-tab-fold-body');
  const dlFoldChevron = document.getElementById('cookbook-dl-tab-chevron');
  if (dlFold && dlFoldBody && dlFoldChevron) {
+    const _setFolded = (folded, persist = true) => {
+      // Toggle via class so CSS transition animates the height/opacity
+      // — display:none was an instant on/off and felt jarring.
+      dlFoldBody.classList.toggle('is-folded', folded);
+      dlFoldChevron.textContent = folded ? '▸' : '▾';
+      dlFold.classList.toggle('is-folded', folded);
+      if (persist) {
+        try { localStorage.setItem('cookbook_dl_tab_folded_v1', folded ? '1' : '0'); } catch {}
+      }
+    };
    dlFold.addEventListener('click', () => {
-      const folded = dlFoldBody.style.display === 'none';
-      dlFoldBody.style.display = folded ? '' : 'none';
-      dlFoldChevron.textContent = folded ? '▾' : '▸';
-      // Toggle is-folded class on the h2 so the line under it only shows when
-      // the section is collapsed (the body's content normally provides
-      // separation; with no body visible, the line gives the h2 definition).
-      dlFold.classList.toggle('is-folded', !folded);
-      try { localStorage.setItem('cookbook_dl_tab_folded_v1', folded ? '0' : '1'); } catch {}
+      const folded = dlFoldBody.classList.contains('is-folded');
+      _setFolded(!folded);
    });
+    // Auto-fold on any downward scroll inside the cookbook modal,
+    // and auto-expand when the user scrolls all the way back to the
+    // top of whichever scroller they're in. The chevron ▸ still
+    // toggles manually.
+    const _maybeFold = () => {
+      if (dlFoldBody.classList.contains('is-folded')) return;
+      _setFolded(true, /* persist */ false);
+    };
+    const _maybeExpand = () => {
+      if (!dlFoldBody.classList.contains('is-folded')) return;
+      _setFolded(false, /* persist */ false);
+    };
+    // Capture phase: scroll events don't bubble, so this is how scrolls on
+    // nested scrollers (.hwfit-list, .cookbook-body, .modal-content) reach us.
+    const _modal = dlFold.closest('#cookbook-modal') || document;
+    const _lastY = new WeakMap();
+    _modal.addEventListener('scroll', (e) => {
+      const tgt = e.target;
+      if (!tgt || typeof tgt.scrollTop !== 'number') return;
+      // Ignore scrolls that originate INSIDE the Direct Download body
+      // (e.g. the Trending models list) — those are local to the
+      // section and shouldn't auto-fold the section that owns them.
+      if (dlFoldBody.contains && (tgt === dlFoldBody || dlFoldBody.contains(tgt))) return;
+      const y = tgt.scrollTop;
+      const prev = _lastY.get(tgt) || 0;
+      if (y > prev) _maybeFold();
+      else if (y <= 0) _maybeExpand();
+      _lastY.set(tgt, y);
+    }, true);
  }
  const hfToggle = document.getElementById('cookbook-hf-latest-toggle');
  const hfArrow = document.getElementById('cookbook-hf-latest-arrow');
@@ -1571,9 +1868,9 @@ function _wireTabEvents(body) {
    document.getElementById('hwfit-server-select')?.addEventListener('change', _onServerChange);
  }

-  // Browse Ollama library — popular models from ollama.com via cached backend
-  // proxy. Click a row → fills the download input with `<name>:<size>` so the
-  // existing Download button kicks off `ollama pull`.
+  // Browse Ollama library popup removed — Engine = Ollama in the
+  // Scan / Download filter covers this use case. The handler below is a
+  // no-op now because the elements no longer exist.
  const olToggle = document.getElementById('cookbook-ollama-toggle');
  const olArrow = document.getElementById('cookbook-ollama-arrow');
  const olList = document.getElementById('cookbook-ollama-list');
@@ -1774,8 +2071,8 @@ function _renderRecipes() {

  // Tabs
  html += '<div class="cookbook-tabs">';
+  html += '<button class="cookbook-tab" data-backend="Serve"><svg width="12" height="12" viewBox="0 0 24 24" fill="currentColor" stroke="none" style="vertical-align:-1px;margin-right:3px;"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>Launch</button>';
  html += '<button class="cookbook-tab active" data-backend="Search"><svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" style="vertical-align:-1px;margin-right:3px;"><polyline points="7 14 12 19 17 14"/><line x1="12" y1="19" x2="12" y2="5"/><line x1="5" y1="21" x2="19" y2="21"/></svg>Download</button>';
-  html += '<button class="cookbook-tab" data-backend="Serve"><svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" style="vertical-align:-1px;margin-right:3px;"><rect x="2" y="2" width="20" height="8" rx="2"/><rect x="2" y="14" width="20" height="8" rx="2"/><circle cx="6" cy="6" r="1"/><circle cx="6" cy="18" r="1"/></svg>Serve</button>';
  html += '<button class="cookbook-tab" data-backend="Dependencies"><svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" style="vertical-align:-1px;margin-right:3px;"><path d="M21 16V8a2 2 0 0 0-1-1.73l-7-4a2 2 0 0 0-2 0l-7 4A2 2 0 0 0 3 8v8a2 2 0 0 0 1 1.73l7 4a2 2 0 0 0 2 0l7-4A2 2 0 0 0 21 16z"/><polyline points="3.27 6.96 12 12.01 20.73 6.96"/><line x1="12" y1="22.08" x2="12" y2="12"/></svg>Dependencies</button>';
  html += '<button class="cookbook-tab" data-backend="Settings"><svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" style="vertical-align:-1px;margin-right:3px;"><circle cx="12" cy="12" r="3"/><path d="M19.4 15a1.65 1.65 0 0 0 .33 1.82l.06.06a2 2 0 1 1-2.83 2.83l-.06-.06a1.65 1.65 0 0 0-1.82-.33 1.65 1.65 0 0 0-1 1.51V21a2 2 0 0 1-4 0v-.09A1.65 1.65 0 0 0 9 19.4a1.65 1.65 0 0 0-1.82.33l-.06.06a2 2 0 1 1-2.83-2.83l.06-.06A1.65 1.65 0 0 0 4.68 15a1.65 1.65 0 0 0-1.51-1H3a2 2 0 0 1 0-4h.09A1.65 1.65 0 0 0 4.6 9a1.65 1.65 0 0 0-.33-1.82l-.06-.06a2 2 0 1 1 2.83-2.83l.06.06A1.65 1.65 0 0 0 9 4.68a1.65 1.65 0 0 0 1-1.51V3a2 2 0 0 1 4 0v.09a1.65 1.65 0 0 0 1 1.51 1.65 1.65 0 0 0 1.82-.33l.06-.06a2 2 0 1 1 2.83 2.83l-.06.06A1.65 1.65 0 0 0 19.4 9a1.65 1.65 0 0 0 1.51 1H21a2 2 0 0 1 0 4h-.09a1.65 1.65 0 0 0-1.51 1z"/></svg>Settings</button>';
  html += '</div>';
@@ -1788,9 +2085,9 @@ function _renderRecipes() {
  // State persisted to localStorage so the fold survives reloads.
  const _dlTabFolded = (() => { try { return localStorage.getItem('cookbook_dl_tab_folded_v1') === '1'; } catch { return false; } })();
  html += '<div style="display:flex;align-items:center;gap:8px;margin-bottom:2px;">';
-  html += `<h2 id="cookbook-dl-tab-fold" class="${_dlTabFolded ? 'is-folded' : ''}" style="margin:0;padding:0;line-height:1;cursor:pointer;display:flex;align-items:center;justify-content:space-between;user-select:none;flex:1;">Download<span id="cookbook-dl-tab-chevron" style="display:inline-block;transition:transform 0.15s;font-size:1.1em;margin-left:8px;opacity:0.85;">${_dlTabFolded ? '▸' : '▾'}</span></h2>`;
+  html += `<h2 id="cookbook-dl-tab-fold" class="${_dlTabFolded ? 'is-folded' : ''}" style="margin:0;padding:0;line-height:1;cursor:pointer;display:flex;align-items:center;justify-content:space-between;user-select:none;flex:1;">Direct Download<span id="cookbook-dl-tab-chevron" style="display:inline-block;transition:transform 0.15s;font-size:1.1em;margin-left:8px;opacity:0.85;">${_dlTabFolded ? '▸' : '▾'}</span></h2>`;
  html += '</div>';
-  html += `<div id="cookbook-dl-tab-fold-body" style="${_dlTabFolded ? 'display:none;' : ''}">`;
+  html += `<div id="cookbook-dl-tab-fold-body" class="${_dlTabFolded ? 'is-folded' : ''}">`;
  html += '<p class="memory-desc doclib-desc" style="margin-top:6px;">Download from <a href="https://huggingface.co/models" target="_blank" rel="noopener" style="color:var(--accent,var(--red));text-decoration:none;"><svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-1px;margin-right:1px;"><path d="M18 13v6a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V8a2 2 0 0 1 2-2h6"/><polyline points="15 3 21 3 21 9"/><line x1="10" y1="14" x2="21" y2="3"/></svg>HuggingFace</a> by pasting model link, or download directly in the Scan section below.</p>';
  html += '<div class="hwfit-container" id="hwfit-container">';

@@ -1820,42 +2117,34 @@ function _renderRecipes() {
  // silently sending downloads to the wrong server. An empty selection means Local; the user
  // chooses a remote server explicitly via the dropdown.

-  // Manual download input
-  html += `<div style="margin-top:7px;margin-bottom:2px;display:flex;gap:4px;align-items:center;">`;
+  // Manual download input — server picker on the same row as the repo input,
+  // on the left. The standalone "add server" button is gone (use Settings).
+  html += `<div class="cookbook-dl-input" style="margin-top:7px;display:flex;gap:4px;align-items:center;">`;
  if (_es.servers.length > 1) {
-    html += `<select class="cookbook-field-input hwfit-dl-server" id="hwfit-dl-server" style="height:28px;position:relative;top:0px;">`;
+    html += `<select class="cookbook-field-input hwfit-dl-server" id="hwfit-dl-server" style="height:28px;flex-shrink:0;">`;
    html += _buildServerOpts(true);
    html += `</select>`;
  } else {
    html += `<input type="hidden" id="hwfit-dl-server" value="local" />`;
  }
-  html += `<button class="memory-toolbar-btn cookbook-dl-add-server" title="Add server in Settings" style="height:28px;">add server</button>`;
-  html += `</div>`;
-  html += `<div class="cookbook-dl-input" style="margin-top:0;">`;
-  html += `<input type="text" class="cookbook-dl-repo" id="cookbook-dl-repo" placeholder="org/model-name, qwen2.5:14b, or HF URL" />`;
+  html += `<input type="text" class="cookbook-dl-repo" id="cookbook-dl-repo" placeholder="org/model-name, qwen2.5:14b, or HF URL" style="flex:1;min-width:0;" />`;
  html += `<button class="cookbook-btn cookbook-dl-btn" id="cookbook-dl-btn">Download</button>`;
  html += `</div>`;
-  // Browse Ollama library — fetches popular models from ollama.com via the
-  // /api/cookbook/ollama/library cached proxy, click → fills the input with
-  // `<name>:<size>` so the existing Download button kicks off `ollama pull`.
-  html += `<div style="margin-top:5px;position:relative;top:-3px;">`;
-  html += `<div style="display:flex;gap:4px;align-items:center;">`;
-  html += `<button type="button" class="memory-toolbar-btn" id="cookbook-ollama-toggle" style="flex:1;text-align:left;height:26px;display:flex;align-items:center;gap:6px;border-radius:4px;">`;
-  html += `<span id="cookbook-ollama-arrow" style="display:inline-block;transition:transform 0.15s;pointer-events:none;">▸</span>`;
-  html += `<span style="pointer-events:none;">Browse Ollama library</span>`;
-  html += `</button>`;
-  html += `<button type="button" class="memory-toolbar-btn" id="cookbook-ollama-refresh" title="Refresh" style="height:26px;width:26px;padding:0;border-radius:4px;">↻</button>`;
-  html += `</div>`;
-  html += `<div id="cookbook-ollama-list" style="display:none;margin-top:4px;max-height:320px;overflow-y:auto;flex-direction:column;gap:4px;"></div>`;
-  html += `</div>`;
+  // Ollama-library browse used to live here as its own collapsible dropdown,
+  // but that duplicated the Engine filter (which already has Ollama). The
+  // standalone UI is gone — to find Ollama models, set Engine = Ollama in
+  // the Scan / Download section below.
  // Latest HF models that fit — collapsible card list
-  html += `<div style="margin-top:5px;position:relative;top:-3px;">`;
+  html += `<div style="margin-top:5px;position:relative;top:-11px;">`;
  html += `<div style="display:flex;gap:4px;align-items:center;">`;
-  html += `<button type="button" class="memory-toolbar-btn" id="cookbook-hf-latest-toggle" style="flex:1;text-align:left;height:26px;display:flex;align-items:center;gap:6px;border-radius:4px;">`;
-  html += `<span id="cookbook-hf-latest-arrow" style="display:inline-block;transition:transform 0.15s;pointer-events:none;">\u25B8</span>`;
-  html += `<span style="pointer-events:none;">Trending models that fit your hardware</span>`;
+  html += `<button type="button" class="memory-toolbar-btn" id="cookbook-hf-latest-toggle" style="flex:1;text-align:left;height:28px;font-size:11px;display:flex;align-items:center;gap:6px;border-radius:5px;">`;
+  // Trending-up icon (accent) so the section reads as "what's hot".
+  html += `<svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="var(--accent, var(--red))" stroke-width="2.2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="flex-shrink:0;pointer-events:none;"><polyline points="23 6 13.5 15.5 8.5 10.5 1 18"/><polyline points="17 6 23 6 23 12"/></svg>`;
+  html += `<span style="pointer-events:none;flex:1;">Trending models that fit your hardware</span>`;
+  // Chevron moved to the RIGHT: collapsed = pointing right, expanded
+  // = rotated 90deg into a down chevron (handled by existing toggle CSS).
+  html += `<span id="cookbook-hf-latest-arrow" style="display:inline-block;transition:transform 0.15s;pointer-events:none;opacity:0.6;font-size:11px;">\u25B8</span>`;
  html += `</button>`;
-  html += `<button type="button" class="memory-toolbar-btn" id="cookbook-hf-latest-refresh" title="Refresh" style="height:26px;width:26px;padding:0;border-radius:4px;">\u21BB</button>`;
  html += `</div>`;
  html += `<div id="cookbook-hf-latest-list" style="display:none;margin-top:4px;max-height:320px;overflow-y:auto;flex-direction:column;gap:4px;"></div>`;
  html += `</div>`;
@@ -1876,9 +2165,10 @@ function _renderRecipes() {
  // Image tab removed — text→image gen is gone from this build (only inpaint
   // remains, which uses its own settings panel). Vision (multimodal) stays.
  html += '<option value="multimodal">Vision</option></select>';
-  // Engine sits next to the type filter so the "what category / which serving
-  // path" filters live together; Quant + Context are storage-format and budget
-  // levers, grouped to the right.
+  // Search moved next to the Type filter so the two primary picks
+  // (what category + free text) sit together; the more advanced
+  // levers (Engine / Quant / Context) live to the right.
+  html += '<input type="text" class="cookbook-field-input hwfit-search" id="hwfit-search" placeholder="Search models..." style="flex:1;" />';
  html += '<span class="hwfit-engine-wrap">';
  html += '<select class="cookbook-field-input hwfit-engine" id="hwfit-engine" style="height:28px;" title="Filter by serving engine">';
  html += '<option value="">Engine</option>';
@@ -1893,7 +2183,7 @@ function _renderRecipes() {
  // quant for every model instead of silently filtering to Q4.
  html += '<span class="hwfit-quant-wrap">';
  html += '<select class="cookbook-field-input hwfit-quant" id="hwfit-quant" style="height:28px;">';
-  html += '<option value="" selected>Quant: All</option>';
+  html += '<option value="" selected>Quant</option>';
  html += '<option value="Q4_K_M">Q4</option><option value="Q8_0">Q8</option>';
  html += '<option value="Q6_K">Q6</option><option value="Q5_K_M">Q5</option>';
  html += '<option value="Q3_K_M">Q3</option><option value="Q2_K">Q2</option>';
@@ -1906,21 +2196,19 @@ function _renderRecipes() {
  html += '<label class="hwfit-ctx-control" title="Context length for fit estimates. Lower it to find more models that could fit your hardware.">';
  html += '<span>Context</span><span class="hwfit-help-chip hwfit-help-chip-inline" title="Context length. Lower it to find more models that could fit your hardware; raise it when you need longer chats or documents.">?</span><input type="range" id="hwfit-context" min="0" max="5" step="1" value="3" />';
  html += '<output id="hwfit-context-label">50k</output></label>';
-  // Search lives at the far right of the toolbar so the controls (Type/Quant/
-  // Engine/Context) read as a row of compact filters followed by free-text.
-  html += '<input type="text" class="cookbook-field-input hwfit-search" id="hwfit-search" placeholder="Search models..." style="flex:1;" />';
  html += '</div>';
  html += '<div class="hwfit-toolbar" style="margin-top:7px;">';
  html += '<select class="cookbook-field-input hwfit-server-select" id="hwfit-server-select" style="height:28px;min-width:88px;position:relative;top:0px;">';
  html += _buildServerOpts(false);
  html += '</select>';
  html += '<div class="hwfit-gpu-toggles" id="hwfit-gpu-toggles"></div>';
-  // Scan/refresh button (icon-only) where the quant dropdown used to sit.
-  html += '<button type="button" class="hwfit-gpu-btn" id="hwfit-rescan" title="Re-scan hardware" style="flex-shrink:0;position:relative;top:-3px;left:-1px;">↻ RESCAN</button>';
+  // (Rescan button removed — Edit handles manual hardware updates;
+  // automatic re-probe runs on container restart.)
  html += '<button type="button" class="hwfit-gpu-btn hwfit-hw-manual-btn" id="hwfit-hw-manual-btn" title="Set hardware manually" style="flex-shrink:0;position:relative;top:-3px;left:-1px;display:inline-flex;align-items:center;gap:3px;"><svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.2" stroke-linecap="round" stroke-linejoin="round" style="flex-shrink:0;"><path d="M12 20h9"/><path d="M16.5 3.5a2.121 2.121 0 0 1 3 3L7 19l-4 1 1-4Z"/></svg>EDIT</button>';
  // Sort state — the clickable column headers read/write this (pewds' original
  // sort paradigm). Newest is reachable by clicking the Model column header.
  html += '<select class="cookbook-field-input hwfit-sort" id="hwfit-sort" style="display:none">';
+  html += '<option value="newest" selected>Latest</option>';
  html += '<option value="fit">Fit</option><option value="score">Score</option><option value="vram">VRAM</option>';
  html += '<option value="speed">Speed</option><option value="params">Params</option>';
  html += '<option value="context">Context</option></select>';
@@ -808,7 +808,7 @@ function _winSessionCmd(task, tmuxArgs) {
  return host ? `ssh ${pf}${host} 'tmux ${tmuxArgs}' 2>/dev/null` : `tmux ${tmuxArgs} 2>/dev/null`;
 }

-function _tmuxGracefulKill(task) {
+export function _tmuxGracefulKill(task) {
  if (_isWindows(task)) {
    const host = task.remoteHost;
    const sd = host ? '$env:TEMP\\odysseus-sessions' : '$env:TEMP\\odysseus-tmux';
@@ -825,6 +825,48 @@ function _tmuxGracefulKill(task) {
  return `tmux send-keys -t ${task.sessionId} C-c 2>/dev/null; sleep 2; tmux kill-session -t ${task.sessionId} 2>/dev/null`;
 }

+// Force-kill escalation: SIGKILL each tmux pane's PID and its direct children
+// (pkill -P does not reach grandchildren), then kill the session. Use AFTER the
+// graceful kill when the process is still detected: vLLM can ignore SIGINT during
+// model init, and a stuck CUDA context can survive `tmux kill-session` alone.
+export function _tmuxForceKill(task) {
+  if (_isWindows(task)) {
+    // Windows graceful path already does Stop-Process -Force, so the same
+    // command serves as the "force" variant.
+    return _tmuxGracefulKill(task);
+  }
+  const sid = task.sessionId;
+  const inner =
+    `PIDS=$(tmux list-panes -t ${sid} -F "#{pane_pid}" 2>/dev/null); ` +
+    `if [ -n "$PIDS" ]; then ` +
+    `  for P in $PIDS; do ` +
+    `    pkill -KILL -P "$P" 2>/dev/null; ` +
+    `    kill -9 "$P" 2>/dev/null; ` +
+    `  done; ` +
+    `fi; ` +
+    `tmux kill-session -t ${sid} 2>/dev/null`;
+  if (task.remoteHost) {
+    return `ssh ${_sshPrefix(_getPort(task))}${task.remoteHost} ${_shQuote(inner)}`;
+  }
+  return inner;
+}
+
+// Returns a shell snippet that prints "ALIVE" if the tmux session still
+// exists and "DEAD" otherwise; returns null on Windows, where no check runs.
+// Used by the Stop-all escalation to decide whether to force-kill.
+export function _tmuxIsAliveCheck(task) {
+  if (_isWindows(task)) {
+    // Skip the check on Windows — the graceful path already force-kills.
+    return null;
+  }
+  const sid = task.sessionId;
+  const inner = `if tmux has-session -t ${sid} 2>/dev/null; then echo ALIVE; else echo DEAD; fi`;
+  if (task.remoteHost) {
+    return `ssh ${_sshPrefix(_getPort(task))}${task.remoteHost} ${_shQuote(inner)}`;
+  }
+  return inner;
+}
+
 function _shQuote(value) {
  return "'" + String(value ?? '').replace(/'/g, "'\\''") + "'";
 }
@@ -1643,7 +1685,7 @@ export function _renderRunningTab() {
    runTab.className = 'cookbook-tab';
    runTab.dataset.backend = 'Running';
    const _errCount = tasks.filter(t => t.status === 'error' || t.status === 'crashed').length;
-    runTab.innerHTML = `Running${activeCountHtml}${_errCount ? `<span class="cookbook-tab-error-dot"></span>` : ''}`;
+    runTab.innerHTML = `Active${activeCountHtml}${_errCount ? `<span class="cookbook-tab-error-dot"></span>` : ''}`;
    tabBar.insertBefore(runTab, tabBar.firstChild);
    runTab.addEventListener('click', () => {
      tabBar.querySelectorAll('.cookbook-tab').forEach(t => t.classList.remove('active'));
@@ -1654,7 +1696,7 @@ export function _renderRunningTab() {
    });
  } else if (runTab) {
    const _errCount2 = tasks.filter(t => t.status === 'error' || t.status === 'crashed').length;
-    runTab.innerHTML = tasks.length ? `Running${activeCountHtml}${_errCount2 ? '<span class="cookbook-tab-error-dot"></span>' : ''}` : 'Running';
+    runTab.innerHTML = tasks.length ? `Active${activeCountHtml}${_errCount2 ? '<span class="cookbook-tab-error-dot"></span>' : ''}` : 'Active';
    if (!hasContent) {
      if (runTab.classList.contains('active')) {
        const wfTab = tabBar.querySelector('.cookbook-tab[data-backend="Search"]');
@@ -1669,9 +1711,13 @@ export function _renderRunningTab() {
    group = document.createElement('div');
    group.className = 'cookbook-group hidden';
    group.dataset.backendGroup = 'Running';
-    group.innerHTML = '<div class="admin-card" style="flex:1;display:flex;flex-direction:column;overflow:hidden;">' +
+    // No `flex:1` on the card — with overflow:visible (forced via #cookbook-modal
+    // .cookbook-group > .admin-card), flex:1 collapsed the card to body height
+    // and the body's scrollHeight stopped tracking the overflowing children.
+    // Sized-to-content means cookbook-body's overflow-y:auto kicks in naturally.
+    group.innerHTML = '<div class="admin-card" style="display:flex;flex-direction:column;">' +
      '<div style="display:flex;align-items:baseline;gap:8px;margin-bottom:2px;">' +
-      '<h2 style="margin:0;padding:0;line-height:1;">Running <span id="running-count" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal">' + activeCount + '</span></h2>' +
+      '<h2 style="margin:0;padding:0;line-height:1;">Active <span id="running-count" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal">' + activeCount + '</span></h2>' +
      '</div>' +
      '<p class="memory-desc doclib-desc" style="margin-top:6px;">Active downloads and serving processes.</p>' +
      '</div>';
@@ -1751,7 +1797,7 @@ export function _renderRunningTab() {
      // green when reachable, red if any serve task on it is crashed/unreachable.
      const _secDot = (key && allTasks.some(_serveTaskFailed)) ? 'fail' : 'ok';
      const _dotTitle = key ? (_secDot === 'fail' ? 'Server not responding' : 'Reachable') : 'Local (this machine)';
-      sec.insertAdjacentHTML('afterbegin', `<div class="cookbook-section-header" data-collapse="${bodyId}"><svg class="cookbook-section-chevron" width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round"><polyline points="6 9 12 15 18 9"/></svg><span class="cookbook-srv-status ${_secDot}" title="${_dotTitle}" style="flex-shrink:0;position:relative;top:0px;"></span><span class="cookbook-section-title" style="margin:0;">${esc(sg.name)}</span><button class="cookbook-btn cookbook-stop-all-btn" data-stop-server="${esc(key)}">Stop all</button><button class="cookbook-btn cookbook-clear-btn" data-clear-server="${esc(key)}">Clear finished</button></div><div id="${bodyId}" class="cookbook-section-body"></div>`);
+      sec.insertAdjacentHTML('afterbegin', `<div class="cookbook-section-header" data-collapse="${bodyId}"><svg class="cookbook-section-chevron" width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round"><polyline points="6 9 12 15 18 9"/></svg><span class="cookbook-srv-status ${_secDot}" title="${_dotTitle}" style="flex-shrink:0;position:relative;top:0px;"></span><span class="cookbook-section-title" style="margin:0;">${esc(sg.name)}</span><button class="cookbook-btn cookbook-stop-all-btn" data-stop-server="${esc(key)}" title="Stop all running servers"><svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" stroke="none" aria-hidden="true" style="vertical-align:-1px;margin-right:4px;"><rect x="5" y="5" width="14" height="14" rx="1.5"/></svg>Stop all</button><button class="cookbook-btn cookbook-clear-btn" data-clear-server="${esc(key)}" title="Clear finished tasks"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="vertical-align:-1px;margin-right:4px;"><polyline points="3 6 5 6 21 6"/><path d="M19 6l-1 14a2 2 0 0 1-2 2H8a2 2 0 0 1-2-2L5 6"/><path d="M10 11v6"/><path d="M14 11v6"/></svg>Clear finished</button></div><div id="${bodyId}" class="cookbook-section-body"></div>`);
    }
  }

@@ -1762,9 +1808,21 @@ export function _renderRunningTab() {
    btn.addEventListener('click', async (e) => {
      e.stopPropagation();  // don't toggle the section collapse (was an inline onclick, blocked by CSP)
      const host = btn.dataset.clearServer;
-      if (!await window.styledConfirm(`Clear finished tasks on ${_serverName(host)}?`, { confirmText: 'Clear' })) return;
      const allTasks = _loadTasks();
      const toRemove = allTasks.filter(t => (t.remoteHost || '') === host && _canClearTask(t));
+      // Bail with a clear message instead of silently doing nothing when
+      // this server has no finished tasks to clear (e.g. everything is
+      // still running); the previous behavior looked like the button was dead.
+      if (!toRemove.length) {
+        const stillRunning = allTasks.filter(t => (t.remoteHost || '') === host && t.status === 'running').length;
+        const _msg = stillRunning
+          ? `No finished tasks on ${_serverName(host)} — ${stillRunning} still running. Stop them first to clear.`
+          : `No finished tasks on ${_serverName(host)}.`;
+        if (window.uiModule?.showToast) window.uiModule.showToast(_msg);
+        else alert(_msg);
+        return;
+      }
+      if (!await window.styledConfirm(`Clear ${toRemove.length} finished task${toRemove.length === 1 ? '' : 's'} on ${_serverName(host)}?`, { confirmText: 'Clear' })) return;
      const remaining = allTasks.filter(t => (t.remoteHost || '') !== host || !_canClearTask(t));
      _saveTasks(remaining);
      // Fade/slide each finished card out (same exit as the per-card clear)
@@ -2100,57 +2158,43 @@ export function _renderRunningTab() {
        dropdown.className = 'cookbook-task-dropdown';

        const items = [];
+        // ── Run section ─────────────────────────────────────────────
        // Queued download: let the user jump the queue and start it immediately
        // (downloads otherwise run one-at-a-time per server).
        if (task.type === 'download' && task.status === 'queued') {
-          items.push({ label: 'Start now', action: 'start-now', custom: () => {
+          items.push({ group: 'run', label: 'Start now', action: 'start-now', custom: () => {
            _startQueuedDownload(task);
            _renderRunningTab();
          }});
        }
        if (task.status !== 'running' && task.status !== 'queued') {
-          items.push({ label: 'Reconnect', action: 'reconnect' });
+          items.push({ group: 'run', label: 'Reconnect tmux', action: 'reconnect' });
        }
        if (task.status === 'running') {
-          items.push({ label: 'Stop', action: 'stop', danger: true });
+          items.push({ group: 'run', label: 'Stop', action: 'stop', danger: true });
        }
-        items.push({ label: 'Restart', action: 'retry' });
-        // Edit serve — open the full serve panel (same as the edit icon),
-        // switching to this task's server first so the model is found.
+        items.push({ group: 'run', label: 'Restart', action: 'retry' });
+        // ── Edit section ────────────────────────────────────────────
+        // Merged "Edit & relaunch" — opens the structured serve panel
+        // pre-filled with this task's config. The old standalone "Edit
+        // cmd & relaunch" raw-text dialog is now reachable from inside
+        // that panel (Show command). Single entry-point per task.
        if (task.type === 'serve' && task.payload?.repo_id) {
-          items.push({ label: 'Edit in serve panel', action: 'edit-panel', tooltip: 'Open the full Serve config panel pre-filled with this task — pick a different backend, change GPUs, edit env vars, then Launch from there', custom: () => _openEdit() });
+          items.push({ group: 'edit', label: 'Edit & relaunch', action: 'edit-panel', tooltip: 'Open the Serve config panel pre-filled with this task — pick a different backend, change GPUs, edit env vars or the raw cmd, then Launch.', custom: () => _openEdit() });
        }
-        // Save serve — save current launch config as a preset.
        if (task.type === 'serve' && task.payload?._cmd) {
-          items.push({ label: 'Save serve', action: 'save', custom: () => {
+          items.push({ group: 'edit', label: 'Save serve', action: 'save', custom: () => {
            if (!_saveTaskAsPreset(task)) { uiModule.showToast('Already saved'); return; }
            uiModule.showToast('Saved to presets');
            _renderRunningTab();
          }});
        }
-        // Edit command — only meaningful for serve tasks that aren't running.
-        // Lets the user tweak flags after a crash/error and relaunch.
-        if (task.type === 'serve' && task.status !== 'running' && task.payload?._cmd) {
-          items.push({ label: 'Edit cmd & relaunch', action: 'edit', tooltip: 'Edit the raw vllm/llama-server cmd string in a dialog and relaunch immediately on the same host', custom: async () => {
-            const newCmd = await _promptEditServeCmd(task.payload._cmd);
-            if (newCmd == null) return; // cancelled
-            try {
-              await fetch('/api/shell/exec', {
-                method: 'POST', credentials: 'same-origin',
-                headers: { 'Content-Type': 'application/json' },
-                body: JSON.stringify({ command: _tmuxGracefulKill(task) }),
-              });
-            } catch {}
-            _removeTask(task.sessionId);
-            // Relaunch on the task's OWN host, not the current global selection.
-            _launchServeTask(task.name, task.payload.repo_id, newCmd, task.payload._fields, task.remoteHost || '');
-          }});
-        }
+        // ── Endpoint section ────────────────────────────────────────
        // Manual endpoint registration — fallback for when auto-add fails
        // (e.g. probe timeout on a remote that's slow). Forces adding this
        // serve to the model-endpoints list regardless of prior flag state.
        if (task.type === 'serve' && task.payload?._cmd) {
-          items.push({ label: 'Register endpoint', action: 'register-endpoint', custom: async () => {
+          items.push({ group: 'endpoint', label: 'Register endpoint', action: 'register-endpoint', custom: async () => {
            const host = _connectHostFromRemote(task.remoteHost);
            const portMatch = task.payload?._cmd?.match(/--port\s+(\d+)/);
            const port = portMatch ? portMatch[1] : '8000';
@@ -2195,31 +2239,32 @@ export function _renderRunningTab() {
            }
          }});
        }
+        // ── Copy section ────────────────────────────────────────────
        if (_isWindows(task)) {
          const host = task.remoteHost;
          const sd = host ? '$env:TEMP\\odysseus-sessions' : '$env:TEMP\\odysseus-tmux';
          const logCmd = host
            ? `ssh ${_sshPrefix(_getPort(task))}${host} "powershell -Command \\"Get-Content '${sd}\\${task.sessionId}.log' -Wait\\""`
            : `powershell -Command "Get-Content (Join-Path $env:TEMP 'odysseus-tmux\\${task.sessionId}.log') -Wait"`;
-          items.push({ label: 'Copy log cmd', action: 'copy-tmux', custom: () => {
+          items.push({ group: 'copy', label: 'Copy log cmd', action: 'copy-tmux', custom: () => {
            _copyText(logCmd);
          }});
        } else {
          // Just the tmux command itself — no ssh wrapper.
          const tmuxAttach = `tmux attach -t ${task.sessionId}`;
-          items.push({ label: 'Copy tmux', action: 'copy-tmux', custom: () => {
+          items.push({ group: 'copy', label: 'Copy tmux', action: 'copy-tmux', custom: () => {
            _copyText(tmuxAttach);
          }});
        }
        if (_shouldOfferCrashReport(task)) {
-          items.push({ label: 'Copy crash report', action: 'copy-crash-report', custom: () => {
+          items.push({ group: 'copy', label: 'Copy crash report', action: 'copy-crash-report', custom: () => {
            const out = (el.querySelector('.cookbook-output-pre')?.textContent || task.output || '');
            _copyText(_buildCrashReport(task, out));
            uiModule.showToast('Copied crash report');
          }});
        }
        // Copy the last 50 lines of the task's output/log.
-        items.push({ label: 'Copy last 50 lines', action: 'copy-log', custom: () => {
+        items.push({ group: 'copy', label: 'Copy last 50 lines', action: 'copy-log', custom: () => {
          const out = (el.querySelector('.cookbook-output-pre')?.textContent || task.output || '');
          const last = out.split('\n').slice(-50).join('\n');
          if (!last.trim()) {
@@ -2233,8 +2278,10 @@ export function _renderRunningTab() {
        // the live tmux session and (for serve tasks) deletes the
        // matching model-endpoint, THEN animates the task card out.
        // Just "Remove" hid that it stops the live serve too.
+        // ── Danger section ──────────────────────────────────────────
        const _isLive = task.type === 'serve' && ['running', 'ready', 'loading', 'warming', 'starting'].includes(task.status || '');
        items.push({
+          group: 'danger',
          label: _isLive ? 'Stop and remove' : 'Remove',
          action: 'kill',
          tooltip: _isLive
@@ -2242,10 +2289,8 @@ export function _renderRunningTab() {
            : 'Remove this row',
          danger: true,
        });
-        // Cancel = mobile-only dismiss item. Same pattern as the email kebab:
-        // the `dropdown-cancel-mobile` class is hidden on desktop and styled
-        // as a separated bottom row on mobile (border-top + extra padding).
-        items.push({ label: 'Cancel', action: 'cancel', mobileOnly: true, custom: () => {} });
+        // Cancel = mobile-only dismiss item. Same pattern as the email kebab.
+        items.push({ group: 'danger', label: 'Cancel', action: 'cancel', mobileOnly: true, custom: () => {} });

        const _MENU_ICONS = {
          'start-now': '<polygon points="6 4 20 12 6 20 6 4"/>',
@@ -2262,7 +2307,18 @@ export function _renderRunningTab() {
          kill: '<path d="M3 6h18"/><path d="M19 6v14a2 2 0 0 1-2 2H7a2 2 0 0 1-2-2V6m3 0V4a2 2 0 0 1 2-2h4a2 2 0 0 1 2 2v2"/>',
          cancel: '<line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/>',
        };
+        let _lastGroup = null;
        for (const item of items) {
+          // Insert a thin divider whenever the group changes, so the
+          // user can visually scan Run / Edit / Endpoint / Copy / Danger
+          // blocks instead of one long undifferentiated list.
+          if (item.group && _lastGroup && item.group !== _lastGroup) {
+            const sep = document.createElement('div');
+            sep.className = 'cookbook-dropdown-divider';
+            sep.style.cssText = 'height:1px;margin:4px 6px;background:color-mix(in srgb, var(--fg) 12%, transparent);pointer-events:none;';
+            dropdown.appendChild(sep);
+          }
+          _lastGroup = item.group || _lastGroup;
          const div = document.createElement('div');
          div.className = 'dropdown-item-compact'
            + (item.danger ? ' cookbook-dropdown-danger' : '')
@@ -2652,7 +2708,7 @@ async function _reconnectTask(el, task) {
              // capture-pane lets the existing _reconnectTask flow pick up
              // the real state (running, finished, or truly dead).
              const _reconnectFix = {
-                label: 'Reconnect',
+                label: 'Reconnect tmux',
                action: () => {
                  _updateTask(task.sessionId, { status: 'running' });
                  el.dataset.status = 'running';
@@ -9,6 +9,7 @@ import spinnerModule from './spinner.js';
 import { providerLogo } from './providers.js';
 import { modelColor } from './chatRenderer.js';
 import { bindMenuDismiss, dismissOrRemove } from './escMenuStack.js';
+import { openCookbookDependencies } from './cookbook-diagnosis.js';

 // Shared state/functions injected by init()
 let _envState;
@@ -546,7 +547,14 @@ function _rerenderCachedModels() {
          : (_es.gpus || detectedGpuIds));
      const tpOpts = [1,2,4,8].map(n => `<option${defaultTp==String(n)?' selected':''}>${n}</option>`).join('');
      const dtypeOpts = ['auto','float16','bfloat16'].map(d => `<option value="${d}"${sv('dtype','auto')===d?' selected':''}>${d}</option>`).join('');
-      const vllmKvCacheOpts = ['auto','fp8'].map(d => `<option value="${d}"${sv('vllm_kv_cache_dtype','auto')===d?' selected':''}>${d}</option>`).join('');
+      // KV cache default — most models are fine on auto, but a few
+      // (e.g. DeepSeek-V3/V4/R1 MoE) need fp8 explicitly or the launch
+      // OOMs. _detectModelOptimizations seeds opts.kvCacheDtype for
+      // those families; honour it unless the user has a saved override.
+      const _kvOptsCheck = _detectModelOptimizations(repo);
+      const _kvAutoDefault = _kvOptsCheck?.kvCacheDtype || 'auto';
+      const _kvSelected = sv('vllm_kv_cache_dtype', _kvAutoDefault);
+      const vllmKvCacheOpts = ['auto','fp8'].map(d => `<option value="${d}"${_kvSelected===d?' selected':''}>${d}</option>`).join('');
      const _l = (name, tip) => `<span>${name}<span class="hwfit-hint" title="${tip}">?</span></span>`;
      const _ggufChoices = _runnableGgufFiles(m);
      const _savedGguf = String(sv('gguf_file', '') || '');
@@ -572,12 +580,22 @@ function _rerenderCachedModels() {
      const _arrowTitle = _modelPresets.length > 0
        ? `${_modelPresets.length} saved launch config${_modelPresets.length === 1 ? '' : 's'} for ${_repoShort} — click ▾ to load or delete`
        : `No saved launch configs for ${_repoShort} yet — click Save to add one`;
-      let _slotsHtml = `<div class="cookbook-serve-slots cookbook-saved-split">`
+      // Wrap the Save split in a <label> so it picks up the same "field
+      // title + ?-help" treatment as Backend / venv / Port / GPUs sitting
+      // beside it in Row 1. Button text is "Save" (the action), label is
+      // "Settings" (what the saved blob represents).
+      let _slotsHtml = `<label>${_l('Settings','Saved launch configurations for this model — click ▾ to load or delete')}`
+        + `<div class="cookbook-serve-slots cookbook-saved-split">`
        + `<button type="button" class="cookbook-slot-btn cookbook-saved-save" title="Save current config"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M19 21H5a2 2 0 0 1-2-2V5a2 2 0 0 1 2-2h11l5 5v11a2 2 0 0 1-2 2z"/><polyline points="17 21 17 13 7 13 7 21"/><polyline points="7 3 7 8 15 8"/></svg>Save</button>`
        + `<button type="button" class="cookbook-slot-btn cookbook-saved-arrow" title="${esc(_arrowTitle)}">${_arrowLabel}</button>`
-        + `</div>`;
+        + `</div></label>`;

      let panelHtml = `<div class="hwfit-serve-panel">`;
+      // Runtime-readiness note pinned at the top of the serve area so the
+      // user sees "vLLM ready on …" before scrolling into the configure
+      // form. Hidden until the readiness probe returns. The × button
+      // dismisses it for this panel only (re-shows on re-expand).
+      panelHtml += `<div class="hwfit-serve-runtime-note" style="display:none;font-size:11px;line-height:1.35;color:var(--fg-muted);margin:0 0 8px;padding:6px 28px 6px 10px;border-radius:5px;background:color-mix(in srgb, var(--fg) 4%, transparent);border:1px solid color-mix(in srgb, var(--border) 60%, transparent);position:relative;"><span class="hwfit-serve-runtime-text"></span><button type="button" class="hwfit-serve-runtime-close" title="Dismiss" aria-label="Dismiss" style="position:absolute;top:-8px;right:5px;background:none;border:0;color:inherit;cursor:pointer;padding:2px 4px;line-height:1;font-size:13px;opacity:0.6;"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" aria-hidden="true"><line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/></svg></button></div>`;
      // Warn when serving a model whose download hasn't fully completed —
      // the user CAN still hit Launch (vLLM/llama-server will start, then
      // crash trying to read missing shards), but they should know.
@@ -596,9 +614,19 @@ function _rerenderCachedModels() {
        ? [['llamacpp','llama.cpp'],['ollama','Ollama']]
        : [['vllm','vLLM'],['sglang','SGLang'],['llamacpp','llama.cpp'],['ollama','Ollama'],['diffusers','Diffusers']];
      const backendOpts = _backendChoices.map(([v,l]) => `<option value="${v}"${defaultBackend===v?' selected':''}>${l}</option>`).join('');
-      panelHtml += `<label>${_l('Backend','Inference engine: vLLM, SGLang, llama.cpp, Ollama, or Diffusers')}<select class="hwfit-sf" data-field="backend">${backendOpts}</select></label>`;
+      // Custom Backend picker — native <select> can't host SVG inside
+      // options, so we render a button + menu that show the backend logo
+      // beside its name. The hidden <select.hwfit-sf data-field="backend">
+      // stays as the source-of-truth so every existing change handler
+      // (updateBackendVisibility, runtime readiness, command builder)
+      // still fires via dispatchEvent('change') on selection.
+      panelHtml += `<label>${_l('Backend','Inference engine: vLLM, SGLang, llama.cpp, Ollama, or Diffusers')}<div class="hwfit-backend-picker" data-backend-picker style="position:relative;width:100%;"><select class="hwfit-sf hwfit-backend-source" data-field="backend" style="display:none;">${backendOpts}</select><button type="button" class="hwfit-backend-btn" data-backend-btn aria-haspopup="listbox" aria-expanded="false" style="display:flex;align-items:center;gap:6px;width:100%;height:28px;padding:0 8px;background:var(--bg);color:var(--fg);border:1px solid var(--border);border-radius:4px;font:inherit;font-size:11px;cursor:pointer;text-align:left;"><span class="hwfit-backend-btn-icon" data-backend-icon-slot aria-hidden="true" style="display:inline-flex;align-items:center;justify-content:center;width:16px;height:16px;color:var(--accent, var(--red));flex-shrink:0;"></span><span class="hwfit-backend-btn-label" data-backend-label style="flex:1;min-width:0;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;"></span><svg width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true" style="opacity:0.6;flex-shrink:0;"><polyline points="6 9 12 15 18 9"/></svg></button><div class="hwfit-backend-menu" data-backend-menu role="listbox" hidden style="position:absolute;top:calc(100% + 4px);left:0;right:0;z-index:100;background:var(--panel, var(--bg));border:1px solid var(--border);border-radius:6px;box-shadow:0 6px 20px rgba(0,0,0,0.22);padding:4px;"></div></div></label>`;
      panelHtml += `<input type="hidden" class="hwfit-sf" data-field="host" value="${esc(_es.remoteHost || '')}" />`;
      panelHtml += `<label>${_l('venv','Path to Python venv or conda env activate script')}<input type="text" class="hwfit-sf hwfit-sf-wide" data-field="venv" value="${esc(sv('venv', _es.envPath || _srvVenv || ''))}" placeholder="~/venv" /></label>`;
+      // Dtype lives in Row 1 (next to venv) — it's the first knob people
+      // change when matching the model to the box, so it earns top-row
+      // real estate over Row 2's launch-tuning controls.
+      panelHtml += `<label>${_l('Dtype','Data type for weights. auto picks best for GPU')}<select class="hwfit-sf" data-field="dtype">${dtypeOpts}</select></label>`;
      const defaultPort = defaultBackend === 'ollama' ? '11434' : _nextAvailablePort();
      panelHtml += `<label>${_l('Port','HTTP port for the API server')}<input type="text" class="hwfit-sf" data-field="port" value="${esc(sv('port', defaultPort))}" /></label>`;
      const _activeGpus = (defaultGpus || '').split(',').map(s => s.trim()).filter(Boolean);
@@ -609,12 +637,16 @@ function _rerenderCachedModels() {
        const on = _activeGpus.includes(String(i));
        _gpuBtnsHtml += `<button type="button" class="cookbook-gpu-btn${on ? ' active' : ''}" data-gpu="${i}">${i}</button>`;
      }
-      panelHtml += `<label>${_l('GPUs','Toggle which GPUs to use')}<div class="cookbook-gpu-group">${_gpuBtnsHtml}</div><input type="hidden" class="hwfit-sf" data-field="gpus" value="${esc(defaultGpus)}" /></label>`;
-      // Save / saved-configs split button — moved into Row 1 (next to GPUs)
-      // so it shares the same baseline as the rest of the top controls.
+      // GPUs button strip moved to Row 2 (next to GPU Mem) below. 4px
+      // margin on the left, 8px on the right — extra 4px right-side gap
+      // separates the GPU chiclets from the GPU Mem field that follows
+      // (asked-for breathing room; 4px on either side felt cramped on
+      // the GPU-Mem boundary).
+      const _gpusLabelHtml = `<label class="hwfit-gpus-label" style="margin:0 8px 0 4px;">${_l('GPUs','Toggle which GPUs to use')}<div class="cookbook-gpu-group">${_gpuBtnsHtml}</div><input type="hidden" class="hwfit-sf" data-field="gpus" value="${esc(defaultGpus)}" /></label>`;
+      // Save / saved-configs split button — sits at the right end of Row 1.
      panelHtml += _slotsHtml;
      panelHtml += `</div>`;
-      panelHtml += `<div class="hwfit-serve-runtime-note" style="display:none;font-size:11px;line-height:1.35;color:var(--fg-muted);margin-top:-4px;"></div>`;
+      // (hwfit-serve-runtime-note moved to the top of the panel — see above.)
      if (_ggufChoices.length > 1) {
        // Show the GGUF File dropdown for BOTH llama.cpp and Ollama — Ollama
        // also needs to know which exact .gguf to import via the new
@@ -631,15 +663,22 @@ function _rerenderCachedModels() {
      // TP / Context / GPU / GPU Mem / Max Seqs / Dtype. Everything else
      // (Swap, KV Cache, Attention backend, Env vars, llama.cpp batch/ubatch)
      // moved to the Advanced fold below to keep this row scannable.
-      panelHtml += `<div class="hwfit-serve-row hwfit-backend-vllm hwfit-backend-sglang hwfit-backend-llamacpp hwfit-backend-ollama">`;
+      panelHtml += `<div class="hwfit-serve-row hwfit-serve-row-core hwfit-backend-vllm hwfit-backend-sglang hwfit-backend-llamacpp hwfit-backend-ollama">`;
+      // Order: TP → Context → Max Seqs → GPUs → GPU Mem.
+      // Dtype moved up to Row 1. GPUs moved here next to GPU Mem so the
+      // "which devices + how much of them" decisions sit adjacent. Max
+      // Seqs follows Context per the "request-shape" cluster.
      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang">${_l('TP','Tensor Parallelism — split model across N GPUs')}<select class="hwfit-sf" data-field="tp">${tpOpts}</select></label>`;
      // ctx resets to the model's max on every panel open (the real ctx slider
      // lives in the Scan/Download toolbar — see cookbook.js .hwfit-ctx-control).
      panelHtml += `<label>${_l('Context','Max tokens per request — resets to the model max on every open. Lower = less VRAM')}<input type="text" class="hwfit-sf" data-field="ctx" value="${esc(m.context_length || m.context || '20000')}" /></label>`;
-      panelHtml += `<label>${_l('GPU','Which GPU to use. Leave empty for default')}<input type="text" class="hwfit-sf" data-field="gpu_id" value="${esc(sv('gpu_id', ''))}" placeholder="auto" style="width:50px;" /></label>`;
-      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang">${_l('GPU Mem','Fraction of GPU memory (0.0–1.0). Lower if OOM')}<input type="text" class="hwfit-sf" data-field="gpu_mem" value="${esc(sv('gpu_mem', '0.90'))}" /></label>`;
      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang">${_l('Max Seqs','Maximum concurrent requests. Lower = less memory. Default 4 — prosumer GPUs often OOM on vLLM default 256 during CUDA graph capture.')}<input type="text" class="hwfit-sf" data-field="max_seqs" value="${esc(sv('max_seqs', '4'))}" placeholder="4" /></label>`;
-      panelHtml += `<label>${_l('Dtype','Data type for weights. auto picks best for GPU')}<select class="hwfit-sf" data-field="dtype">${dtypeOpts}</select></label>`;
+      // GPU "auto" field removed — the GPU button strip below already
+      // writes data-field="gpus" (the canonical comma-separated device
+      // list) and the command builders now read from that single source.
+      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang">${_l('GPU Mem','Fraction of GPU memory (0.0–1.0). Lower if OOM')}<input type="text" class="hwfit-sf" data-field="gpu_mem" value="${esc(sv('gpu_mem', '0.90'))}" /></label>`;
+      // GPUs button strip: emitted last in Row 2's markup (after GPU Mem).
+      panelHtml += _gpusLabelHtml;
      panelHtml += `</div>`;
      // ── Advanced (collapsed by default) ──
      // Everything below the fold is tuning users only touch occasionally:
@@ -667,7 +706,10 @@ function _rerenderCachedModels() {
      // tuning, or any other KEY=VALUE pair that doesn't have a dedicated
      // field. After the venv activate runs, $VIRTUAL_ENV / $PATH / etc. are
      // already exported so they expand correctly here.
-      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang" style="flex:1 1 100%;">${_l('Env','Extra KEY=VALUE env-var pairs prepended to the launch (space-separated). Example: CUDACXX=$VIRTUAL_ENV/lib/python3.10/site-packages/nvidia/cuda_nvcc/bin/nvcc — points flashinfer at the venv-bundled nvcc when the system one is too old for your GPU.')}<input type="text" class="hwfit-sf" data-field="extra_env" value="${esc(sv('extra_env',''))}" placeholder="CUDACXX=/path/to/nvcc NCCL_P2P_DISABLE=1" style="width:100%;" /></label>`;
+      // grid-column: 1 / -1 makes Env span every column of the Advanced
+      // row's CSS grid (the old flex:1 1 100% did nothing in a grid
+      // container — left an empty trailing column gap on wide modals).
+      panelHtml += `<label class="hwfit-backend-vllm hwfit-backend-sglang" style="grid-column:1 / -1;">${_l('Env','Extra KEY=VALUE env-var pairs prepended to the launch (space-separated). Example: CUDACXX=$VIRTUAL_ENV/lib/python3.10/site-packages/nvidia/cuda_nvcc/bin/nvcc — points flashinfer at the venv-bundled nvcc when the system one is too old for your GPU.')}<input type="text" class="hwfit-sf" data-field="extra_env" value="${esc(sv('extra_env',''))}" placeholder="CUDACXX=/path/to/nvcc NCCL_P2P_DISABLE=1" style="width:100%;" /></label>`;
      panelHtml += `</div>`;
      // Advanced llama.cpp row (Batch / UBatch — moved out of Core for the
      // same "rarely touched" reason as the vLLM extras above).
@@ -686,11 +728,36 @@ function _rerenderCachedModels() {
      panelHtml += `<label>Height${_h('Default output height')} <input type="text" class="hwfit-sf" data-field="diff_height" value="${esc(sv('diff_height', ''))}" placeholder="1024" /></label>`;
      panelHtml += `</div>`;
      // Row 3: Checkboxes (vLLM)
+      // Order: Trust Remote → Auto Tool → Reasoning Parser (when the
+      // model has one) → Enforce Eager → Prefix Caching. Reasoning
+      // Parser was previously in a separate row below; the user wanted
+      // it inline with the other vLLM toggles between Auto Tool and
+      // Enforce Eager so the "what the model needs" decisions sit
+      // together at the top.
+      const _opts2_row3 = _detectModelOptimizations(repo);
+      const _rp_flag = _opts2_row3.flags.find(f => f.includes('--reasoning-parser'));
+      const _rp_name = _rp_flag ? _rp_flag.split(' ')[1] : '';
      panelHtml += `<div class="hwfit-serve-checks hwfit-backend-vllm hwfit-backend-sglang">`;
-      panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="enforce_eager"${sv('enforce_eager',false)?' checked':''} /> Enforce Eager${_h('Disable CUDA graphs. Slower but uses less memory')}</label>`;
      panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="trust_remote"${sv('trust_remote',false)?' checked':''} /> Trust Remote Code${_h('Allow model to run custom code from HuggingFace')}</label>`;
-      panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="prefix_cache"${sv('prefix_cache',false)?' checked':''} /> Prefix Caching${_h('Cache shared prompt prefixes across requests')}</label>`;
      panelHtml += `<label class="hwfit-sf-cb hwfit-backend-vllm"><input type="checkbox" class="hwfit-sf" data-field="auto_tool"${sv('auto_tool',false)?' checked':''} /> Auto Tool Choice${_h('Enable function/tool calling for agent mode')}</label>`;
+      if (_rp_name) panelHtml += `<label class="hwfit-sf-cb hwfit-backend-vllm"><input type="checkbox" class="hwfit-sf" data-field="reasoning_parser" data-parser="${_rp_name}" /> Reasoning Parser <span class="hwfit-parser-tag">${_rp_name}</span></label>`;
+      panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="enforce_eager"${sv('enforce_eager',false)?' checked':''} /> Enforce Eager${_h('Disable CUDA graphs. Slower but uses less memory')}</label>`;
+      panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="prefix_cache"${sv('prefix_cache',false)?' checked':''} /> Prefix Caching${_h('Cache shared prompt prefixes across requests')}</label>`;
+      // Inline the previously-second vLLM checks row so Expert Parallel /
+      // Speculative / MoE Env sit next to Prefix Caching with no gap. All
+      // three are vLLM-only — class-gated so they hide on SGLang.
+      if (_opts2_row3.flags.includes('--enable-expert-parallel')) panelHtml += `<label class="hwfit-sf-cb hwfit-backend-vllm"><input type="checkbox" class="hwfit-sf" data-field="expert_parallel" /> Expert Parallel</label>`;
+      {
+        const _specDef = _opts2_row3.spec || { method: 'mtp', tokens: 3 };
+        const _specMethod = sv('spec_method', _specDef.method);
+        const _specTokens = sv('spec_tokens', String(_specDef.tokens));
+        const _specMethods = ['mtp', 'qwen3_next_mtp', 'eagle', 'medusa', 'ngram'];
+        if (!_specMethods.includes(_specMethod)) _specMethods.unshift(_specMethod);
+        const _specOpts = _specMethods.map(m =>
+          `<option value="${m}"${m === _specMethod ? ' selected' : ''}>${m}</option>`).join('');
+        panelHtml += `<label class="hwfit-sf-cb hwfit-backend-vllm hwfit-spec-group"><input type="checkbox" class="hwfit-sf" data-field="speculative" /> Speculative <select class="hwfit-sf hwfit-spec-method" data-field="spec_method" title="vLLM --speculative-config method">${_specOpts}</select><input type="number" class="hwfit-sf hwfit-spec-tokens hwfit-spec-tokens-bare" data-field="spec_tokens" value="${esc(_specTokens)}" min="1" max="10" title="num_speculative_tokens" style="width:44px;" /><span class="hwfit-help-chip hwfit-help-chip-inline" title="MTP / speculative decoding is supported on a few model families only — turn it on when the model card explicitly recommends it. On supported models it can boost inference throughput up to ~3×; on unsupported models it will either be ignored or fail to launch." style="margin-left:6px;">?</span></label>`;
+      }
+      if (_opts2_row3.envVars.length) panelHtml += `<label class="hwfit-sf-cb hwfit-backend-vllm"><input type="checkbox" class="hwfit-sf" data-field="moe_env" /> MoE Env Vars</label>`;
      panelHtml += `</div>`;
      // Row 2c: llama.cpp fit/perf flags (set by Auto profiles, editable by hand)
      const _kvOpts = ['', 'q4_0', 'q8_0', 'f16'].map(k => `<option value="${k}"${sv('cache_type','')===k?' selected':''}>${k||'default'}</option>`).join('');
@@ -739,33 +806,16 @@ function _rerenderCachedModels() {
      panelHtml += `</div><div class="hwfit-serve-row hwfit-backend-diffusers">`;
      panelHtml += `<label>Harmonize GPU${_h('Separate GPU for img2img/harmonize. Leave empty to use same GPU')}<input type="text" class="hwfit-sf" data-field="diff_harmonize_gpu" value="${esc(sv('diff_harmonize_gpu', ''))}" placeholder="auto" style="width:50px;" /></label>`;
      panelHtml += `</div>`;
-      // Row 4: Extra args
-      panelHtml += `<div class="hwfit-serve-extra">`;
-      panelHtml += `<label>Extra args<input type="text" class="hwfit-sf" data-field="extra" value="${esc(sv('extra', ''))}" placeholder="--flag value" /></label>`;
-      panelHtml += `</div>`;
      // Model-specific optimizations. The checks row always renders for the
      // vLLM backend so the Speculative (MTP) control is ALWAYS reachable —
      // even for models the auto-detector doesn't recognize. Expert-parallel,
      // reasoning-parser and MoE-env still only appear when auto-detected.
-      const _opts2 = _detectModelOptimizations(repo);
-      panelHtml += `<div class="hwfit-serve-checks hwfit-backend-vllm">`;
-      if (_opts2.flags.includes('--enable-expert-parallel')) panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="expert_parallel" /> Expert Parallel</label>`;
-      if (_opts2.flags.some(f => f.includes('--reasoning-parser'))) { const rp = _opts2.flags.find(f => f.includes('--reasoning-parser')).split(' ')[1]; panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="reasoning_parser" data-parser="${rp}" /> Reasoning Parser <span class="hwfit-parser-tag">${rp}</span></label>`; }
-      {
-        // Speculative decoding (vLLM --speculative-config). Default OFF; the
-        // method/token defaults come from auto-detection when available,
-        // else fall back to MTP/3. Toggling the checkbox is what actually
-        // adds the flag at launch (see cookbook.js command builder).
-        const _specDef = _opts2.spec || { method: 'mtp', tokens: 3 };
-        const _specMethod = sv('spec_method', _specDef.method);
-        const _specTokens = sv('spec_tokens', String(_specDef.tokens));
-        const _specMethods = ['mtp', 'qwen3_next_mtp', 'eagle', 'medusa', 'ngram'];
-        if (!_specMethods.includes(_specMethod)) _specMethods.unshift(_specMethod);
-        const _specOpts = _specMethods.map(m =>
-          `<option value="${m}"${m === _specMethod ? ' selected' : ''}>${m}</option>`).join('');
-        panelHtml += `<label class="hwfit-sf-cb hwfit-spec-group"><input type="checkbox" class="hwfit-sf" data-field="speculative" /> Speculative <select class="hwfit-sf hwfit-spec-method" data-field="spec_method" title="vLLM --speculative-config method">${_specOpts}</select><span class="hwfit-numstep"><button type="button" class="hwfit-numstep-btn" data-step="-1" tabindex="-1" aria-label="Decrease">‹</button><input type="number" class="hwfit-sf hwfit-spec-tokens" data-field="spec_tokens" value="${esc(_specTokens)}" min="1" max="10" title="num_speculative_tokens" /><button type="button" class="hwfit-numstep-btn" data-step="1" tabindex="-1" aria-label="Increase">›</button></span><span class="hwfit-help-chip hwfit-help-chip-inline" title="MTP / speculative decoding is supported on a few model families only — turn it on when the model card explicitly recommends it. On supported models it can boost inference throughput up to ~3×; on unsupported models it will either be ignored or fail to launch." style="margin-left:6px;">?</span></label>`;
-      }
-      if (_opts2.envVars.length) panelHtml += `<label class="hwfit-sf-cb"><input type="checkbox" class="hwfit-sf" data-field="moe_env" /> MoE Env Vars</label>`;
+      // Expert Parallel / Speculative / MoE Env moved into Row 3 above so
+      // the vLLM-only toggles sit next to Prefix Caching with no gap.
+      // Extra args sits below the vLLM checks (Reasoning Parser + Spec)
+      // so it reads as "after the advanced toggles, any other flags".
+      panelHtml += `<div class="hwfit-serve-extra">`;
+      panelHtml += `<label>Extra args<input type="text" class="hwfit-sf" data-field="extra" value="${esc(sv('extra', ''))}" placeholder="--flag value" /></label>`;
      panelHtml += `</div>`;
      // ── End Advanced fold ──
      panelHtml += `</details>`;
@@ -958,37 +1008,183 @@ function _rerenderCachedModels() {
        if (ok === false) clearInterval(_vramTimer);
      }, 4000);

-      // Show/hide backend-specific sections
+      // Backend icons — accent color, rendered via currentColor. vLLM gets
+      // a stylized double-V mark, the others fall back to a recognizable
+      // glyph for the engine family. Shown beside each option in the
+      // custom picker so the dropdown lists "[V] vLLM", "[⚡] SGLang", etc.
+      const _BACKEND_GLYPHS = {
+        vllm:   '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.4" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><path d="M3 4l7 16 7-16"/><path d="M14 4l4 9 3-9"/></svg>',
+        sglang: '<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor" stroke="none" aria-hidden="true"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>',
+        llamacpp: '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><circle cx="12" cy="12" r="9"/><path d="M8 12h8M12 8v8"/></svg>',
+        ollama: '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><path d="M6 10a6 6 0 0 1 12 0v4a4 4 0 0 1-8 0v-1"/><circle cx="10" cy="9" r="1"/><circle cx="14" cy="9" r="1"/></svg>',
+        diffusers: '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" aria-hidden="true"><circle cx="12" cy="12" r="4"/><path d="M12 2v3M12 19v3M2 12h3M19 12h3M5 5l2 2M17 17l2 2M5 19l2-2M17 7l2-2"/></svg>',
+      };
+
+      // ── Custom Backend picker wiring ────────────────────────────────
+      // Reads the option list from the hidden <select.hwfit-backend-source>
+      // so the canonical (value, label) pairs come from one place.
+      const _backendPicker = panel.querySelector('[data-backend-picker]');
+      const _backendSource = panel.querySelector('.hwfit-backend-source');
+      const _backendBtn = panel.querySelector('[data-backend-btn]');
+      const _backendMenu = panel.querySelector('[data-backend-menu]');
+      const _backendBtnLabel = panel.querySelector('[data-backend-label]');
+      const _backendBtnIconSlot = _backendBtn?.querySelector('[data-backend-icon-slot]');
+
+      function _setBackendBtnState(v) {
+        if (!_backendBtn) return;
+        const opt = _backendSource?.querySelector(`option[value="${CSS.escape(v)}"]`);
+        const label = opt ? opt.textContent : v;
+        if (_backendBtnLabel) _backendBtnLabel.textContent = label;
+        if (_backendBtnIconSlot) _backendBtnIconSlot.innerHTML = _BACKEND_GLYPHS[v] || _BACKEND_GLYPHS.vllm;
+      }
+
+      function _renderBackendMenu() {
+        if (!_backendMenu || !_backendSource) return;
+        const items = Array.from(_backendSource.options).map(o => ({ value: o.value, label: o.textContent }));
+        _backendMenu.innerHTML = items.map(it => `
+          <button type="button" role="option" class="hwfit-backend-item" data-value="${esc(it.value)}" style="all:unset;display:flex;align-items:center;gap:8px;width:100%;padding:6px 9px;border-radius:5px;font-size:12px;cursor:pointer;color:var(--fg);box-sizing:border-box;">
+            <span class="hwfit-backend-item-icon" style="display:inline-flex;align-items:center;justify-content:center;width:14px;height:14px;color:var(--accent, var(--red));flex-shrink:0;">${_BACKEND_GLYPHS[it.value] || _BACKEND_GLYPHS.vllm}</span>
+            <span class="hwfit-backend-item-label" style="flex:1;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;">${esc(it.label)}</span>
+          </button>
+        `).join('');
+        // Hover styling (no global CSS rule — keep it self-contained).
+        _backendMenu.querySelectorAll('.hwfit-backend-item').forEach(btn => {
+          btn.addEventListener('mouseenter', () => { btn.style.background = 'color-mix(in srgb, var(--fg) 8%, transparent)'; });
+          btn.addEventListener('mouseleave', () => { btn.style.background = ''; });
+          btn.addEventListener('click', (ev) => {
+            ev.preventDefault();
+            ev.stopPropagation();
+            const v = btn.dataset.value;
+            if (_backendSource && _backendSource.value !== v) {
+              _backendSource.value = v;
+              _backendSource.dispatchEvent(new Event('change', { bubbles: true }));
+            }
+            _setBackendBtnState(v);
+            _closeBackendMenu();
+          });
+        });
+      }
+
+      function _openBackendMenu() {
+        if (!_backendMenu || !_backendBtn) return;
+        _backendMenu.hidden = false;
+        _backendBtn.setAttribute('aria-expanded', 'true');
+      }
+      function _closeBackendMenu() {
+        if (!_backendMenu || !_backendBtn) return;
+        _backendMenu.hidden = true;
+        _backendBtn.setAttribute('aria-expanded', 'false');
+      }
+      if (_backendBtn && _backendMenu) {
+        _backendBtn.addEventListener('click', (ev) => {
+          ev.preventDefault();
+          ev.stopPropagation();
+          if (_backendMenu.hidden) _openBackendMenu();
+          else _closeBackendMenu();
+        });
+        document.addEventListener('click', (ev) => {
+          if (!_backendMenu.hidden && !_backendPicker?.contains(ev.target)) _closeBackendMenu();
+        });
+        document.addEventListener('keydown', (ev) => {
+          if (ev.key === 'Escape' && !_backendMenu.hidden) {
+            ev.stopImmediatePropagation(); // also blocks later document-level Esc listeners
+            _closeBackendMenu();
+          }
+        }, { capture: true });
+      }
+      _renderBackendMenu();
+      _setBackendBtnState(_backendSource?.value || defaultBackend);
+
      function updateBackendVisibility() {
        const b = panel.querySelector('[data-field="backend"]')?.value || 'vllm';
        panel.querySelectorAll('[class*="hwfit-backend-"]').forEach(el => {
+          // Skip the entire backend-picker subtree — the picker's own
+          // classes (`hwfit-backend-picker`, `-btn`, `-menu`, `-item`,
+          // `-btn-icon`, `-btn-label`, `-item-icon`, `-item-label`) all
+          // match the wildcard and would get hidden as if they were
+          // "backend-specific form sections", which left the dropdown
+          // looking empty / collapsed.
+          if (el.closest('.hwfit-backend-picker')) return;
          const show = el.classList.contains(`hwfit-backend-${b}`);
          el.style.display = show ? '' : 'none';
        });
+        _setBackendBtnState(b);
      }
      updateBackendVisibility();

      async function updateRuntimeReadinessNote() {
        const note = panel.querySelector('.hwfit-serve-runtime-note');
        if (!note) return;
+        // Earlier versions mirrored the message into a small chip next to
+        // the model title at the top of the card. The readiness text now
+        // lives inside the panel at the top, not in the card title, so
+        // remove any title chip a previous version left behind before
+        // writing the note.
+        const card = panel.closest('.doclib-card, .memory-item');
+        const titleEl = card ? card.querySelector('.memory-item-title') : null;
+        const titleChip = titleEl ? titleEl.querySelector('.hwfit-serve-runtime-chip') : null;
+        if (titleChip) titleChip.remove();
        const backend = panel.querySelector('[data-field="backend"]')?.value || 'vllm';
+        const noteText = note.querySelector('.hwfit-serve-runtime-text');
+        const _writeNote = (s) => { if (noteText) noteText.textContent = s; else note.textContent = s; };
        if (!['vllm', 'sglang', 'llamacpp', 'diffusers'].includes(backend)) {
          note.style.display = 'none';
-          note.textContent = '';
+          _writeNote('');
          return;
        }
+        // Wire dismiss once per note element.
+        const _closeBtn = note.querySelector('.hwfit-serve-runtime-close');
+        if (_closeBtn && !_closeBtn._wired) {
+          _closeBtn._wired = true;
+          _closeBtn.addEventListener('click', (ev) => {
+            ev.preventDefault();
+            ev.stopPropagation();
+            note.style.display = 'none';
+            panel._runtimeNoteDismissed = true;
+          });
+        }
+        // If the user dismissed it earlier on this panel, don't re-show.
+        if (panel._runtimeNoteDismissed) return;
        const seq = (panel._runtimeReadinessSeq || 0) + 1;
        panel._runtimeReadinessSeq = seq;
        note.style.display = '';
-        note.textContent = 'Checking runtime on selected server...';
+        _writeNote('Checking runtime on selected server…');
+        note.style.borderColor = ''; note.style.background = '';
+        note.style.color = 'var(--fg-muted)';
        try {
          const { pkg, target } = await _fetchServeRuntimePackage(panel, backend);
          if (panel._runtimeReadinessSeq !== seq) return;
-          note.textContent = _runtimeNoteText(backend, pkg, target);
-          note.style.color = pkg?.installed ? 'var(--fg-muted)' : 'var(--red)';
+          _writeNote(_runtimeNoteText(backend, pkg, target));
+          if (!pkg?.installed) {
+            note.style.color = 'var(--red)';
+            note.style.borderColor = 'color-mix(in srgb, var(--red) 40%, transparent)';
+            note.style.background = 'color-mix(in srgb, var(--red) 8%, transparent)';
+            // Append an accent-color link straight to the Dependencies
+            // recipe panel for this backend so the user has one click
+            // to the fix instead of hunting for the right row.
+            if (noteText) {
+              const pkgName = pkg?.name || ({ vllm: 'vllm', sglang: 'sglang', llamacpp: 'llama_cpp', diffusers: 'diffusers' }[backend]);
+              const repo = (panel.closest('.doclib-card, .memory-item')?.dataset?.repo) || '';
+              const link = document.createElement('a');
+              link.href = '#';
+              link.textContent = ' Install in Dependencies →';
+              link.style.cssText = 'color:var(--accent, var(--red));text-decoration:underline;font-weight:600;margin-left:4px;';
+              link.addEventListener('click', (ev) => {
+                ev.preventDefault();
+                if (pkgName) openCookbookDependencies(pkgName, { expandRecipe: pkgName, model: repo });
+              });
+              noteText.appendChild(link);
+            }
+          } else {
+            // Healthy / ready → green so the user reads "good to go" at a
+            // glance instead of scanning fg-muted for a state.
+            note.style.color = 'var(--green, #4caf50)';
+            note.style.borderColor = 'color-mix(in srgb, var(--green, #4caf50) 40%, transparent)';
+            note.style.background = 'color-mix(in srgb, var(--green, #4caf50) 8%, transparent)';
+          }
        } catch (err) {
          if (panel._runtimeReadinessSeq !== seq) return;
-          note.textContent = `Runtime readiness unavailable: ${err?.message || err}`;
+          _writeNote(`Runtime readiness unavailable: ${err?.message || err}`);
          note.style.color = 'var(--fg-muted)';
        }
      }
@@ -1688,15 +1884,39 @@ function _rerenderCachedModels() {
      // Cancel button — collapses the serve config panel (same effect as
      // tapping the row to toggle it shut). Mobile users wanted an explicit
      // "back out" affordance next to Launch.
-      panel.querySelector('.hwfit-serve-cancel')?.addEventListener('click', (ev) => {
-        ev.stopPropagation();
+      const _collapsePanel = () => {
        panel._cleanupRuntimeReadiness?.();
        panel.remove();
        item.classList.remove('doclib-card-expanded');
        item.style.flexDirection = '';
        item.style.alignItems = '';
        if (list) { list.style.minHeight = ''; list.style.maxHeight = ''; }
+      };
+      panel.querySelector('.hwfit-serve-cancel')?.addEventListener('click', (ev) => {
+        ev.stopPropagation();
+        _collapsePanel();
      });
+      // Esc anywhere on the page closes the open serve panel. Skips when
+      // the user is typing in a field — they want Esc to deselect / blur
+      // those, not collapse the form they're configuring.
+      const _onEscClose = (ev) => {
+        if (ev.key !== 'Escape') return;
+        if (!panel.isConnected) {
+          document.removeEventListener('keydown', _onEscClose, true);
+          return;
+        }
+        const t = ev.target;
+        const inField = t && (
+          t.tagName === 'INPUT' || t.tagName === 'TEXTAREA' || t.tagName === 'SELECT' || t.isContentEditable
+        );
+        if (inField) return;
+        // Open dropdown/menu popovers should claim Esc first. They listen on
+        // document in the capture phase just like this handler, so only
+        // stopImmediatePropagation() on their side keeps Esc from reaching here.
+        ev.stopPropagation();
+        _collapsePanel();
+      };
+      document.addEventListener('keydown', _onEscClose, true);

      // Launch button
      panel.querySelector('.hwfit-serve-launch').addEventListener('click', async (ev) => {
@@ -1751,6 +1971,50 @@ function _rerenderCachedModels() {
          else serveState[el.dataset.field] = el.value;
        });
        serveState.backend = serveState.backend || (_detectBackend(m).backend) || 'vllm';
+
+        // Pre-launch: check our own task list for a serve already running
+        // on this host. Offer to stop+launch as the default action — the
+        // SSH-based port probe below is more thorough but it can miss
+        // when SSH glitches or `ss` isn't installed. This catches the
+        // common case instantly without waiting for a network round-trip.
+        try {
+          const _runningMod = await import('./cookbookRunning.js');
+          const _hostStr = _envState.remoteHost || '';
+          const _active = (_runningMod._loadTasks ? _runningMod._loadTasks() : []).filter(t =>
+            t && t.type === 'serve'
+            && (t.remoteHost || '') === _hostStr
+            && (t.status === 'running' || t.status === 'ready' || t._serveReady)
+          );
+          if (_active.length) {
+            const _names = _active.map(t => t.payload?.repo_id || t.repo || t.name || '?').filter(Boolean);
+            const _ok = await window.styledConfirm(
+              `${_active.length} model${_active.length === 1 ? '' : 's'} already serving on ${_hostStr || 'local'} (${_names.join(', ')}). Port 8000 will collide. Stop the running model and launch this one?`,
+              { title: 'Server already running', confirmText: 'Stop & launch', cancelText: 'Cancel' },
+            );
+            if (!_ok) { _restoreLaunchBtn(); return; }
+            // Kill each active serve; prefer the rendered Stop button so
+            // endpoint cleanup + Ollama unload run normally. Fall back to
+            // a raw tmux kill when the Active tab isn't in the DOM.
+            for (const t of _active) {
+              try {
+                const _el = document.querySelector(`.cookbook-task[data-task-id="${CSS.escape(String(t.sessionId))}"]`);
+                const _btn = _el?.querySelector('.cookbook-task-action-stop');
+                if (_btn) {
+                  _btn.click();
+                } else if (_runningMod._tmuxGracefulKill) {
+                  await fetch('/api/shell/exec', {
+                    method: 'POST', credentials: 'same-origin',
+                    headers: { 'Content-Type': 'application/json' },
+                    body: JSON.stringify({ command: _runningMod._tmuxGracefulKill(t) }),
+                  });
+                }
+              } catch (_killErr) { /* best-effort */ }
+            }
+            // Give the OS a beat to release port 8000.
+            await new Promise(r => setTimeout(r, 2500));
+          }
+        } catch (_e) { /* best-effort */ }
+
        const backendWarning = _serveBackendWarning(m, repo, serveState.backend, serveState);
        if (backendWarning) {
          _restoreLaunchBtn();
@@ -87,7 +87,8 @@ import * as Modals from './modalManager.js';
  }

  function _accountCanSend(account) {
-    return !!(account && account.smtp_host && account.smtp_user && account.has_smtp_password);
+    if (!account || !account.smtp_host || !account.smtp_user) return false;
+    return !!(account.has_smtp_password || account.oauth_provider);
  }

  async function _resolveComposeSendAccountId() {
@@ -2472,6 +2473,8 @@ import * as Modals from './modalManager.js';
    }
    // Hide toolbar items that have no clean WYSIWYG equivalent in email (Code).
    document.querySelectorAll('.md-toolbar-email-hide').forEach(el => { el.style.display = 'none'; });
+    // Show email-only toolbar items (AI reply button).
+    document.querySelectorAll('.md-toolbar-email-only').forEach(el => { el.style.display = 'inline-flex'; });
    if (emailHeader) emailHeader.style.display = '';
    if (emailActions) emailActions.style.display = '';
    // Emails have their own complete footer (Close / More / Send), so hide the
@@ -2864,6 +2867,8 @@ import * as Modals from './modalManager.js';
    if (emailActions) emailActions.style.display = 'none';
    // Restore toolbar items that were hidden for email (Code dropdown).
    document.querySelectorAll('.md-toolbar-email-hide').forEach(el => { el.style.display = ''; });
+    // Re-hide email-only toolbar items (AI reply button).
+    document.querySelectorAll('.md-toolbar-email-only').forEach(el => { el.style.display = 'none'; });
    // Restore the generic documents action bar + its bottom footer (Close /
    // Copy / Export) for non-email docs.
    const docActions = document.getElementById('doc-editor-actions');
@@ -3206,7 +3211,95 @@ import * as Modals from './modalManager.js';
    renderTabs();
  }

-  async function _aiReply() {
+  // Fast/Full + optional context popover for the doc-editor email Reply button.
+  // Mirrors the email reader's AI reply choice popover so the UX is identical:
+  // textarea for an optional steering note, then Fast (lightning) or Full
+  // (concentric dot) buttons; both feed into _aiReply with the chosen mode.
+  let _docAiReplyChoiceMenu = null;
+  function _closeDocAiReplyChoice() {
+    if (_docAiReplyChoiceMenu) {
+      try { _docAiReplyChoiceMenu.remove(); } catch (_) {}
+      _docAiReplyChoiceMenu = null;
+    }
+  }
+  function _showDocAiReplyChoice(btn) {
+    _closeDocAiReplyChoice();
+    if (!btn) return;
+    const rect = btn.getBoundingClientRect();
+    const menu = document.createElement('div');
+    menu.className = 'doc-ai-reply-choice';
+    const menuMaxW = Math.min(240, window.innerWidth - 16);
+    const left = Math.max(8, Math.min(rect.left, window.innerWidth - menuMaxW - 8));
+    const estHeight = 150;
+    const spaceBelow = window.innerHeight - rect.bottom - 8;
+    const spaceAbove = rect.top - 8;
+    const top = (spaceBelow >= estHeight || spaceBelow >= spaceAbove)
+      ? Math.max(8, Math.min(rect.bottom + 6, window.innerHeight - estHeight - 8))
+      : Math.max(8, rect.top - estHeight - 6);
+    menu.style.cssText = [
+      'position:fixed',
+      `left:${left}px`,
+      `top:${top}px`,
+      `max-width:${menuMaxW}px`,
+      'box-sizing:border-box',
+      'z-index:10060',
+      'display:flex',
+      'gap:6px',
+      'padding:6px',
+      'background:var(--bg,#111)',
+      'border:1px solid var(--border,#333)',
+      'border-radius:7px',
+      'box-shadow:0 8px 24px rgba(0,0,0,.28)',
+    ].join(';');
+    menu.innerHTML = `
+      <div style="display:flex;flex-direction:column;gap:6px;min-width:200px;">
+        <textarea data-note-input rows="2" placeholder="Add context (optional)" style="width:100%;box-sizing:border-box;resize:vertical;min-height:42px;font-family:inherit;font-size:11px;padding:5px 6px;border-radius:5px;border:1px solid var(--border,#333);background:var(--bg-elev,#1a1a1a);color:var(--fg);"></textarea>
+        <div style="display:flex;align-items:center;gap:4px;">
+          <button class="memory-toolbar-btn" data-mode="ai-reply-fast" title="Shorter, faster draft" style="display:inline-flex;align-items:center;justify-content:center;gap:5px;flex:1;">
+            <svg width="11" height="11" viewBox="0 0 24 24" fill="var(--accent, var(--red))" aria-hidden="true"><polygon points="13 2 3 14 12 14 11 22 21 10 12 10 13 2"/></svg>
+            Fast
+          </button>
+          <button class="memory-toolbar-btn" data-mode="ai-reply-full" title="Fuller reply with more context" style="display:inline-flex;align-items:center;justify-content:center;gap:5px;flex:1;">
+            <svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" aria-hidden="true" style="color:var(--accent, var(--red));"><circle cx="12" cy="12" r="6"/></svg>
+            Full
+          </button>
+        </div>
+      </div>
+    `;
+    const noteInput = menu.querySelector('[data-note-input]');
+    setTimeout(() => noteInput?.focus(), 0);
+    menu.addEventListener('mousedown', (ev) => ev.stopPropagation());
+    menu.addEventListener('click', async (ev) => {
+      const choice = ev.target.closest('[data-mode]');
+      if (!choice) return;
+      ev.preventDefault();
+      ev.stopPropagation();
+      const mode = choice.getAttribute('data-mode') || 'ai-reply-fast';
+      const noteHint = (noteInput?.value || '').trim();
+      _closeDocAiReplyChoice();
+      await _aiReply({ mode, noteHint });
+    });
+    document.body.appendChild(menu);
+    _docAiReplyChoiceMenu = menu;
+    const outsideClose = (ev) => {
+      if (menu.contains(ev.target)) return;
+      document.removeEventListener('click', outsideClose, true);
+      _closeDocAiReplyChoice();
+    };
+    setTimeout(() => document.addEventListener('click', outsideClose, true), 0);
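+    // Esc to close. Detach without swallowing the key if this menu is already gone.
+    const escClose = (ev) => {
+      if (ev.key !== 'Escape') return;
+      document.removeEventListener('keydown', escClose, true);
+      if (_docAiReplyChoiceMenu !== menu) return;
+      ev.stopPropagation();
+      _closeDocAiReplyChoice();
+    };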
+    document.addEventListener('keydown', escClose, true);
+  }
+
+  async function _aiReply(opts = {}) {
+    const { mode = 'auto', noteHint = '' } = (opts || {});
    const to = document.getElementById('doc-email-to')?.value?.trim() || '';
    const subject = document.getElementById('doc-email-subject')?.value?.trim() || '';
    const textarea = document.getElementById('doc-editor-textarea');
@@ -3251,32 +3344,43 @@ import * as Modals from './modalManager.js';
    if (btn) { btn.disabled = true; btn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:3px"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>Drafting...'; }

    try {
+      // Empty-compose path: if there's no original body, send a placeholder
+      // so the backend's "no body" guard doesn't fail. The user_hint carries
+      // the user's compose intent; the model uses To/Subject + that hint.
+      const bodyForApi = currentBody || (noteHint ? '(no prior email — compose a new message based on the To, Subject, and user instructions)' : currentBody);
+      const fastFlag = mode === 'ai-reply-fast' ? true
+                     : mode === 'ai-reply-full' ? false
+                     : shouldUseFastAiReply();
      const res = await fetch(`${API_BASE}/api/email/ai-reply`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          to: to,
          subject: subject,
-          original_body: currentBody,
+          original_body: bodyForApi,
          model: currentModel,
          session_id: currentSessionId,
          message_id: inReplyTo,
          uid: sourceUid,
          folder: sourceFolder,
-          fast: shouldUseFastAiReply(),
+          fast: fastFlag,
+          user_hint: noteHint || '',
        }),
      });
      const data = await res.json();
      if (data.success && data.reply) {
-        const cleanReply = cleanAiReplyText(data.reply);
-        const lines = currentBody.split('\n');
-        const quoteIdx = lines.findIndex(l => l.startsWith('On ') && l.includes(' wrote:'));
-        let newBody = '';
-        if (quoteIdx > 0) {
-          newBody = cleanReply + '\n\n' + lines.slice(quoteIdx).join('\n');
-        } else {
-          newBody = cleanReply + (currentBody ? '\n\n' + currentBody : '');
-        }
+        let cleanReply = cleanAiReplyText(data.reply);
+        // Strip any "On <date>, <name> wrote:" attribution + everything
+        // after it from the AI's output — the model sometimes re-quotes
+        // the original thread, and we already have the real quote in
+        // currentBody. Without this, AI's invented quote stacked on top
+        // of the real one and looked like the history had been "edited".
+        cleanReply = cleanReply.replace(/\n*^On\b[^\n]*\bwrote:[\s\S]*$/m, '').trim();
+        // Never overwrite the existing draft (user's typed text + the
+        // quoted history below it). Always prepend the AI suggestion so
+        // the user can read it, copy parts, or delete it — but their
+        // own work and the original quote are untouched.
+        const newBody = currentBody ? cleanReply + '\n\n' + currentBody : cleanReply;
        await _streamEmailBodyText(textarea, newBody);
        if (uiModule) uiModule.showToast(`AI draft inserted (${data.model_used || 'AI'})`);
      } else {
@@ -3285,7 +3389,7 @@ import * as Modals from './modalManager.js';
    } catch (e) {
      if (uiModule) uiModule.showError('Failed to generate AI reply');
    } finally {
-      if (btn) { btn.disabled = false; btn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:3px"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg>AI Reply'; }
+      if (btn) { btn.disabled = false; btn.innerHTML = '<svg width="12" height="12" viewBox="0 0 24 24" fill="currentColor" style="color:var(--accent, var(--red));flex-shrink:0;position:relative;top:-1px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg><span style="font-size:11px;margin-left:4px;">Reply</span>'; }
    }
  }

@@ -3813,7 +3917,6 @@ import * as Modals from './modalManager.js';
        <button id="doc-export-pdf-btn" class="doc-action-icon-btn" title="Export PDF" style="display:none;opacity:0.7;gap:4px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z"/><polyline points="14 2 14 8 20 8"/><line x1="12" y1="18" x2="12" y2="12"/><polyline points="9 15 12 18 15 15"/></svg> <span style="font-size:11px;">Export PDF</span></button>
        <button id="doc-pdf-view-btn" class="doc-action-icon-btn" title="Toggle PDF view" style="display:none;opacity:0.7;gap:4px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M14 2H6a2 2 0 0 0-2 2v16a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8z"/><polyline points="14 2 14 8 20 8"/></svg> <span style="font-size:11px;">PDF</span></button>
        <select id="doc-language-select" class="doc-language-select">
-          <option value="">type</option>
          <option value="python">python</option>
          <option value="javascript">javascript</option>
          <option value="typescript">typescript</option>
@@ -3851,22 +3954,24 @@ import * as Modals from './modalManager.js';
        </button>
        <div id="doc-email-fields" class="doc-email-fields">
          <div class="email-field" style="position:relative">
-            <label>To</label>
+            <span class="email-field-prefix">To</span>
            <input type="text" id="doc-email-to" placeholder="recipient@example.com" autocomplete="off" />
            <div id="doc-email-to-suggestions" class="email-autocomplete" style="display:none"></div>
            <button type="button" id="doc-email-show-cc" class="email-cc-toggle" title="Show Cc/Bcc">Cc</button>
          </div>
          <div class="email-field" id="doc-email-cc-row" style="display:none;position:relative">
-            <label>Cc</label>
-            <input type="text" id="doc-email-cc" placeholder="cc@example.com" autocomplete="off" />
+            <span class="email-field-prefix">Cc</span>
+            <input type="text" id="doc-email-cc" placeholder="cc@example.com, example2" autocomplete="off" />
            <div id="doc-email-cc-suggestions" class="email-autocomplete" style="display:none"></div>
+            <button type="button" class="email-cc-close" data-cc-close title="Hide Cc/Bcc" aria-label="Hide Cc/Bcc"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round"><line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/></svg></button>
          </div>
          <div class="email-field" id="doc-email-bcc-row" style="display:none;position:relative">
-            <label>Bcc</label>
+            <span class="email-field-prefix">Bcc</span>
            <input type="text" id="doc-email-bcc" placeholder="bcc@example.com" autocomplete="off" />
            <div id="doc-email-bcc-suggestions" class="email-autocomplete" style="display:none"></div>
+            <button type="button" class="email-cc-close" data-cc-close title="Hide Cc/Bcc" aria-label="Hide Cc/Bcc"><svg width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round"><line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/></svg></button>
          </div>
-          <div class="email-field"><label>Subject</label><input type="text" id="doc-email-subject" placeholder="Subject" /></div>
+          <div class="email-field" style="position:relative"><span class="email-field-prefix">Subject</span><input type="text" id="doc-email-subject" placeholder="" /></div>
          <div id="doc-email-attachments" class="email-attachments" style="display:none"></div>
          <div id="doc-email-compose-atts" class="email-compose-atts" style="display:none"></div>
        </div>
@@ -3879,13 +3984,14 @@ import * as Modals from './modalManager.js';
      <div class="doc-md-toolbar" id="doc-md-toolbar" style="display:none">
        <div class="md-toolbar-items" id="md-toolbar-items">
          <span class="md-view-toggle" id="doc-md-view-toggle" style="display:none" role="group" aria-label="Edit or preview">
-            <button type="button" class="md-view-opt" data-mdview="edit" title="Edit source"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M11 4H4a2 2 0 0 0-2 2v14a2 2 0 0 0 2 2h14a2 2 0 0 0 2-2v-7"/><path d="M18.5 2.5a2.12 2.12 0 0 1 3 3L12 15l-4 1 1-4 9.5-9.5z"/></svg></button>
-            <button type="button" class="md-view-opt" data-mdview="preview" title="Preview"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></button>
+            <button type="button" class="md-view-opt" data-mdview="edit" title="Edit source (Ctrl+Alt+M to toggle)"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M11 4H4a2 2 0 0 0-2 2v14a2 2 0 0 0 2 2h14a2 2 0 0 0 2-2v-7"/><path d="M18.5 2.5a2.12 2.12 0 0 1 3 3L12 15l-4 1 1-4 9.5-9.5z"/></svg></button>
+            <button type="button" class="md-view-opt" data-mdview="preview" title="Preview (Ctrl+Alt+M to toggle)"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M1 12s4-8 11-8 11 8 11 8-4 8-11 8-11-8-11-8z"/><circle cx="12" cy="12" r="3"/></svg></button>
          </span>
          <span class="md-view-toggle" id="doc-render-view-toggle" style="display:none" role="group" aria-label="Code or run">
            <button type="button" class="md-view-opt" data-renderview="code" title="Edit code"><svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polyline points="16 18 22 12 16 6"/><polyline points="8 6 2 12 8 18"/></svg></button>
            <button type="button" class="md-view-opt" data-renderview="run" title="Run / Preview"><svg width="13" height="13" viewBox="0 0 24 24" fill="currentColor" stroke="none"><polygon points="5 3 19 12 5 21 5 3"/></svg></button>
          </span>
+          <button id="doc-email-ai-reply-btn" class="doc-action-icon-btn md-toolbar-email-only" type="button" title="Draft a reply with AI (Fast / Full + optional context)" style="display:none;align-items:center;gap:4px;"><svg width="12" height="12" viewBox="0 0 24 24" fill="currentColor" style="color:var(--accent, var(--red));flex-shrink:0;position:relative;top:-1px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg><span style="font-size:11px;">Reply</span></button>
          <button id="doc-fontsize-btn" class="doc-action-icon-btn" title="Font size" style="position:relative;width:28px;height:26px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="opacity:0.7;"><path d="M4 7V4h16v3"/><path d="M12 4v16"/><path d="M8 20h8"/></svg><span class="doc-fontsize-levels"><i data-sz="s">S</i><i data-sz="m">M</i><i data-sz="l">L</i></span></button>
          <button id="doc-diff-toggle-btn" class="doc-action-icon-btn" title="Compare changes" style="opacity:0.7;display:none;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M12 3v18"/><path d="M5 12H2l5-5 5 5H9"/><path d="M19 12h3l-5 5-5-5h3"/></svg></button>
          <span class="md-toolbar-sep"></span>
@@ -4395,6 +4501,24 @@ import * as Modals from './modalManager.js';
        }
      });
    }
+    // Ctrl+Alt+M (Cmd+Opt+M on macOS) flips Edit ↔ Preview on a markdown
+    // doc. Bound once globally; gated on the doc panel being open and the
+    // active doc being markdown (a missing language is treated as markdown)
+    // so it doesn't fire while the user is typing in a non-markdown context.
+    if (!window._docMdToggleBound) {
+      window._docMdToggleBound = true;
+      document.addEventListener('keydown', (e) => {
+        if ((e.ctrlKey || e.metaKey) && e.altKey && !e.shiftKey && (e.key === 'm' || e.key === 'M' || e.code === 'KeyM')) {
+          if (!isOpen) return;
+          const doc = activeDocId && docs.get(activeDocId);
+          const lang = (doc?.language || 'markdown').toLowerCase();
+          if (lang !== 'markdown') return;
+          e.preventDefault();
+          toggleMarkdownPreview();
+          _syncHeaderActions();
+        }
+      });
+    }
    document.getElementById('doc-email-draft-btn')?.addEventListener('click', () => {
      document.getElementById('doc-email-more-menu').style.display = 'none';
      _saveDraft();
@@ -4409,7 +4533,11 @@ import * as Modals from './modalManager.js';
      document.getElementById('doc-email-more-menu').style.display = 'none';
      _scheduleSend(anchor);
    });
-    document.getElementById('doc-email-ai-reply-btn')?.addEventListener('click', _aiReply);
+    document.getElementById('doc-email-ai-reply-btn')?.addEventListener('click', (ev) => {
+      ev.preventDefault();
+      ev.stopPropagation();
+      _showDocAiReplyChoice(ev.currentTarget);
+    });

    const collapseBtn = document.getElementById('doc-email-collapse-btn');
    if (collapseBtn && !collapseBtn._emailCollapseWired) {
@@ -4489,6 +4617,25 @@ import * as Modals from './modalManager.js';
      _syncEmailHeaderSummary();
    });

+    // Cc/Bcc close — X buttons inside the Cc and Bcc fields hide both
+    // rows + clear their inputs + restore the Cc opener on the To row.
+    document.querySelectorAll('[data-cc-close]').forEach(closeBtn => {
+      closeBtn.addEventListener('click', (ev) => {
+        ev.stopPropagation();
+        const ccRow = document.getElementById('doc-email-cc-row');
+        const bccRow = document.getElementById('doc-email-bcc-row');
+        const ccInput = document.getElementById('doc-email-cc');
+        const bccInput = document.getElementById('doc-email-bcc');
+        if (ccRow) ccRow.style.display = 'none';
+        if (bccRow) bccRow.style.display = 'none';
+        if (ccInput) ccInput.value = '';
+        if (bccInput) bccInput.value = '';
+        const ccToggle = document.getElementById('doc-email-show-cc');
+        if (ccToggle) ccToggle.style.display = '';
+        _syncEmailHeaderSummary();
+      });
+    });
+
    // Autocomplete for To / Cc / Bcc — typed fragment after the last
    // comma triggers contact search; Enter / Tab / click on a suggestion
    // appends "<email>, " so the user can keep typing more recipients.
@@ -8527,6 +8674,19 @@ import * as Modals from './modalManager.js';
    // `body:has(.doc-editor-pane.doc-fullscreen) .doc-divider-collapse` slides
    // it into a forced-inside position). Hiding the divider here would hide
    // the chevron with it.
+
+    // Hide the tab bar during the layout shift so any in-flight smooth
+    // scroll / reflow doesn't visibly "fly" the active tab across the
+    // pane as it expands. Restored after the layout settles.
+    const tabBar = document.getElementById('doc-tab-bar');
+    if (tabBar) {
+      tabBar.style.visibility = 'hidden';
+      clearTimeout(tabBar._fsHideTimer);
+      tabBar._fsHideTimer = setTimeout(() => {
+        tabBar.style.visibility = '';
+      }, 240);
+    }
+
    if (pane.classList.contains('doc-fullscreen')) {
      pane.classList.remove('doc-fullscreen');
      if (container) container.style.display = '';
@@ -22,8 +22,8 @@ const _replyIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none"
 const _archiveIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><rect x="2" y="3" width="20" height="5" rx="1"/><path d="M4 8v11a2 2 0 0 0 2 2h12a2 2 0 0 0 2-2V8"/><path d="M10 12h4"/></svg>';
 const _deleteIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 6h18"/><path d="M8 6V4a2 2 0 0 1 2-2h4a2 2 0 0 1 2 2v2"/><path d="M19 6v14a2 2 0 0 1-2 2H7a2 2 0 0 1-2-2V6"/></svg>';
 const _unreadIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor"/></svg>';
-const _starIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polygon points="12 2 15.09 8.26 22 9.27 17 14.14 18.18 21.02 12 17.77 5.82 21.02 7 14.14 2 9.27 8.91 8.26 12 2"/></svg>';
-const _starFilledIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><polygon points="12 2 15.09 8.26 22 9.27 17 14.14 18.18 21.02 12 17.77 5.82 21.02 7 14.14 2 9.27 8.91 8.26 12 2"/></svg>';
+const _starIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M19 21l-7-5-7 5V5a2 2 0 0 1 2-2h10a2 2 0 0 1 2 2z"/></svg>';
+const _starFilledIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M19 21l-7-5-7 5V5a2 2 0 0 1 2-2h10a2 2 0 0 1 2 2z"/></svg>';
 const _bellIcon = '<svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M18 8A6 6 0 0 0 6 8c0 7-3 9-3 9h18s-3-2-3-9"/><path d="M13.73 21a2 2 0 0 1-3.46 0"/></svg>';
 const _icon = (svg) => `<span class="dropdown-icon">${svg}</span>`;
 const _replySeparator = '---------- Previous message ----------';
@@ -74,6 +74,11 @@ window.addEventListener('email-answered', (e) => {
    item.classList.remove('email-unread');
    const check = item.querySelector('.email-done-check');
    if (check) check.classList.add('active');
+    // Auto-marked by sending a reply: flash the row so the user notices
+    // the state change. The class is removed on a timer so the animation
+    // doesn't replay on re-renders.
+    item.classList.add('email-auto-done-flash');
+    setTimeout(() => item.classList.remove('email-auto-done-flash'), 1200);
  });
 });
 let _loading = false;
@@ -113,19 +118,19 @@ export function init(documentModule) {
      } catch (_) {}
      if (opts.compose) { _composeNew(); return; }
      if (opts.email) {
-        await _openEmail(opts.email, null, opts.emailData, opts.mode || 'reply');
+        await _openEmail(opts.email, null, opts.emailData, opts.mode || 'reply', opts.noteHint || '');
      }
    },
  });
  _watchDocOpenToReDockEmail();
 }

-export async function openReplyDraft(uid, folder = 'INBOX', mode = 'reply') {
+export async function openReplyDraft(uid, folder = 'INBOX', mode = 'reply', prefilledBody = '') {
  if (!uid) return;
  const previousFolder = _currentFolder;
  _currentFolder = folder || 'INBOX';
  try {
-    await _openEmail({ uid: String(uid), subject: '' }, null, null, mode || 'reply');
+    await _openEmail({ uid: String(uid), subject: '' }, null, null, mode || 'reply', '', prefilledBody || '');
  } finally {
    _currentFolder = previousFolder || _currentFolder;
  }
@@ -525,11 +530,6 @@ function _createEmailItem(em) {
      </div>
      <div class="email-subject">${_esc(em.subject)}${unreadIcon}${attachIcon}${tagPills}${spamTag}</div>
    </div>
-    <div class="email-menu-wrap">
-      <button class="hamburger email-menu-btn" title="Actions">
-        <svg width="14" height="14" viewBox="0 0 24 24" fill="currentColor"><circle cx="12" cy="5" r="2"/><circle cx="12" cy="12" r="2"/><circle cx="12" cy="19" r="2"/></svg>
-      </button>
-    </div>
  `;

  // Click sender name → filter list to that sender
@@ -562,17 +562,10 @@ function _createEmailItem(em) {

  // Click to open — do NOT close sidebar
  item.addEventListener('click', (e) => {
-    if (e.target.closest('.email-menu-wrap')) return;
    if (item.dataset.swipeBlock === '1') return;
    _openEmail(em, item);
  });

-  const menuWrap = item.querySelector('.email-menu-wrap');
-  menuWrap.addEventListener('click', (e) => {
-    e.stopPropagation();
-    _showEmailMenu(em, menuWrap, item);
-  });
-
  // Swipe left to archive (mobile). Mirrors sidebar-layout.js swipe pattern.
  if ('ontouchstart' in window) {
    let startX = 0, startY = 0, dx = 0, dy = 0, swiping = false, swiped = false;
@@ -580,7 +573,6 @@ function _createEmailItem(em) {
    const VERT_CANCEL = 30;     // px vertical motion cancels swipe (treat as scroll)

    item.addEventListener('touchstart', (e) => {
-      if (e.target.closest('.email-menu-wrap')) return;
      const t = e.touches[0];
      startX = t.clientX; startY = t.clientY;
      dx = 0; dy = 0; swiping = true; swiped = false;
@@ -638,10 +630,13 @@ function _createEmailItem(em) {
  return item;
 }

-async function _openEmail(em, itemEl, preloadedData = null, mode = 'reply') {
+async function _openEmail(em, itemEl, preloadedData = null, mode = 'reply', noteHint = '', prefilledBody = '') {
  const aiReplyMode = mode === 'ai-reply-fast' ? 'fast' : (mode === 'ai-reply-full' ? 'full' : '');
  const wantsAiReply = mode === 'ai-reply' || !!aiReplyMode;
-  let aiSuggestedBody = null;
+  // Body pre-fill from the agent's open_email_reply tool call takes the
+  // same insertion slot as an AI-suggested body — both land just before
+  // the quoted-original block.
+  let aiSuggestedBody = (typeof prefilledBody === 'string' && prefilledBody.trim()) ? prefilledBody.trim() : null;
  if (wantsAiReply) {
    // Fall through to reply-all (not plain reply) so the generated AI
    // draft addresses everyone on the original thread. On single-
@@ -698,6 +693,7 @@ async function _openEmail(em, itemEl, preloadedData = null, mode = 'reply') {
              uid: String(em.uid || ''),
              folder: _currentFolder,
              fast: aiReplyMode ? aiReplyMode === 'fast' : _shouldUseFastAiReply(data),
+              user_hint: (noteHint || '').trim() || undefined,
            }),
          });
          const result = await res.json();
@@ -18,6 +18,80 @@ let selectedIds = new Set();

 const MEMORY_CATEGORIES = ['fact', 'identity', 'preference', 'contact', 'project', 'goal', 'task'];

+// Sort-option icons for the custom Memory sort picker (and Skills picker
+// once it reuses the same markup). Each value maps to a 13px Feather-style
+// SVG so the icon visually distinguishes Newest / Oldest / A-Z / Most used.
+const _MEMORY_SORT_ICONS = {
+  newest: '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><circle cx="12" cy="12" r="10"/><polyline points="12 6 12 12 16 14"/></svg>',
+  oldest: '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 12a9 9 0 1 0 3-6.7L3 8"/><polyline points="3 3 3 8 8 8"/><polyline points="12 7 12 12 16 14"/></svg>',
+  alpha:  '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 4h6"/><path d="M3 10h6"/><path d="M3 16h4"/><path d="M14 4l4 12"/><path d="M16 12h4"/><polyline points="17 18 21 14 17 10"/><line x1="21" y1="14" x2="13" y2="14"/></svg>',
+  uses:   '<svg width="13" height="13" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M8.5 14.5A2.5 2.5 0 0 0 11 12c0-1.38-.5-2-1-3-1.072-2.143-.224-4.054 2-6 .5 2.5 2 4.9 4 6.5 2 1.6 3 3.5 3 5.5a7 7 0 1 1-14 0c0-1.153.433-2.294 1-3a2.5 2.5 0 0 0 2.5 2.5z"/></svg>',
+};
+
+function _memorySortIcon(value) {
+  return _MEMORY_SORT_ICONS[value] || _MEMORY_SORT_ICONS.newest;
+}
+
+function _renderMemorySortPickerCurrent() {
+  const sel = document.getElementById('memory-sort');
+  const btn = document.getElementById('memory-sort-btn');
+  if (!sel || !btn) return;
+  const value = sel.value || 'newest';
+  const opt = sel.querySelector(`option[value="${CSS.escape(value)}"]`);
+  const label = opt ? opt.textContent : value;
+  const iconWrap = btn.querySelector('.memory-sort-icon-cur');
+  const labelEl = btn.querySelector('.memory-sort-label');
+  if (iconWrap) iconWrap.innerHTML = _memorySortIcon(value);
+  if (labelEl) labelEl.textContent = label;
+}
+
+function _initMemorySortPicker() {
+  const sel = document.getElementById('memory-sort');
+  const picker = document.getElementById('memory-sort-picker');
+  const btn = document.getElementById('memory-sort-btn');
+  const menu = document.getElementById('memory-sort-menu');
+  if (!sel || !picker || !btn || !menu || picker._wired) return;
+  picker._wired = true;
+
+  const items = Array.from(sel.children)
+    .filter(o => o.tagName === 'OPTION')
+    .map(o => ({ value: o.value, label: o.textContent }));
+
+  menu.innerHTML = items.map(it => `
+    <button type="button" role="option" class="memory-sort-item" data-value="${it.value}">
+      <span class="memory-sort-item-icon">${_memorySortIcon(it.value)}</span>
+      <span class="memory-sort-item-label">${it.label}</span>
+    </button>
+  `).join('');
+
+  const close = () => { menu.hidden = true; btn.setAttribute('aria-expanded', 'false'); };
+  const open  = () => { menu.hidden = false; btn.setAttribute('aria-expanded', 'true'); };
+
+  btn.addEventListener('click', (e) => {
+    e.stopPropagation();
+    if (menu.hidden) open(); else close();
+  });
+  menu.addEventListener('click', (e) => {
+    const item = e.target.closest('.memory-sort-item');
+    if (!item) return;
+    sel.value = item.dataset.value;
+    sel.dispatchEvent(new Event('change', { bubbles: true }));
+    _renderMemorySortPickerCurrent();
+    close();
+  });
+  document.addEventListener('click', (e) => {
+    if (!menu.hidden && !picker.contains(e.target)) close();
+  });
+  document.addEventListener('keydown', (e) => {
+    if (e.key === 'Escape' && !menu.hidden) {
+      e.stopPropagation();
+      close();
+    }
+  }, { capture: true });
+
+  _renderMemorySortPickerCurrent();
+}
+
 function _ensureNewMemoryCategorySelect() {
  const sel = document.getElementById('new-memory-category');
  if (!sel || sel.dataset.wired === '1') return;
@@ -334,13 +408,16 @@ export async function loadMemories() {

 // ---- Bulk select mode ----

+const _SELECT_BTN_DOT_SVG = '<svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:3px;"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor" stroke="none"/></svg>';
+const _SELECT_BTN_X_SVG = '<svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" style="vertical-align:-2px;margin-right:3px;"><line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/></svg>';
+
 function enterSelectMode() {
  selectMode = true;
  selectedIds.clear();
  const bulkBar = document.getElementById('memory-bulk-bar');
  const selectBtn = document.getElementById('memory-select-btn');
  if (bulkBar) bulkBar.classList.remove('hidden');
-  if (selectBtn) { selectBtn.classList.add('active'); selectBtn.textContent = 'Cancel'; }
+  if (selectBtn) { selectBtn.classList.add('active'); selectBtn.innerHTML = _SELECT_BTN_X_SVG + 'Cancel'; }
  updateBulkCount();
  renderMemoryList();
 }
@@ -352,7 +429,7 @@ function exitSelectMode() {
  const selectBtn = document.getElementById('memory-select-btn');
  const selectAll = document.getElementById('memory-select-all');
  if (bulkBar) bulkBar.classList.add('hidden');
-  if (selectBtn) { selectBtn.classList.remove('active'); selectBtn.textContent = 'Select'; }
+  if (selectBtn) { selectBtn.classList.remove('active'); selectBtn.innerHTML = _SELECT_BTN_DOT_SVG + 'Select'; }
  if (selectAll) selectAll.checked = false;
  renderMemoryList();
 }
@@ -449,7 +526,7 @@ export async function tidyMemories() {
    const data = await res.json();
    if ((data.removed || 0) === 0) {
      if (tidySpinner) tidySpinner.destroy();
-      if (tidyBtn) { tidyBtn.disabled = false; tidyBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy'; }
+      if (tidyBtn) { tidyBtn.disabled = false; tidyBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;color:var(--accent, var(--red));"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy'; }
      showToast('Already clean');
      return;
    }
@@ -492,7 +569,7 @@ export async function tidyMemories() {
      tidyBtn.disabled = false;
      tidyBtn.style.border = '';
      tidyBtn.style.background = '';
-      tidyBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy';
+      tidyBtn.innerHTML = '<svg width="11" height="11" viewBox="0 0 24 24" fill="currentColor" style="vertical-align:-1px;margin-right:2px;color:var(--accent, var(--red));"><path d="M12 0L14.59 8.41L23 12L14.59 15.59L12 24L9.41 15.59L1 12L9.41 8.41Z"/></svg> Tidy';
    }
  }
 }
@@ -1387,6 +1464,7 @@ document.addEventListener('DOMContentLoaded', () => {
      renderMemoryList();
    });
  }
+  _initMemorySortPicker();

  const tidyBtn = document.getElementById('memory-tidy-btn');
  if (tidyBtn) tidyBtn.addEventListener('click', tidyMemories);
@@ -302,6 +302,7 @@ function _anchorLeftDock(content) {
  }
 }

+export function collapseSidebarToRail() { return _collapseSidebarToRail(); }
 function _collapseSidebarToRail() {
  const sidebar = document.getElementById('sidebar');
  const rail = document.getElementById('icon-rail');
@@ -808,7 +809,10 @@ export function makeEdgeDockController(modal, side = 'right', dockClass) {
    handle.style.bottom = '0';
    handle.style.width = '10px';
    handle.style.cursor = 'col-resize';
-    handle.style.background = 'linear-gradient(to right, transparent 0 3px, color-mix(in srgb, var(--accent, var(--red)) 35%, transparent) 3px 7px, transparent 7px 10px)';
+    // Invisible at rest, accent stripe fades in on hover (see
+    // .edge-dock-resize-handle CSS rule).
+    handle.style.background = 'transparent';
+    handle.style.transition = 'background 0.18s ease';
    handle.style.pointerEvents = 'auto';
    handle.style.touchAction = 'none';
    handle.style.display = 'none';
@@ -147,4 +147,31 @@ export function providerLabel(endpointUrl) {
  return host.replace(/^api\./i, "");
 }

-export default { providerLogo, providerLabel };
+// Map endpoint URL → logo SVG using the same model-id regex catalog.
+// Tests host + port + path so loopback servers (e.g. Ollama on
+// localhost:11434) still match by port. Falls back to null when nothing
+// recognises the URL, so callers can render a neutral placeholder.
+export function providerLogoFromUrl(url) {
+  if (!url) return null;
+  let host = '', port = '', path = '';
+  try {
+    const u = new URL(url);
+    host = u.hostname; port = u.port; path = u.pathname || '';
+  } catch (_) {
+    const raw = String(url).replace(/^[a-z]+:\/\//i, '');
+    const slashIdx = raw.indexOf('/');
+    const hostport = slashIdx >= 0 ? raw.slice(0, slashIdx) : raw;
+    path = slashIdx >= 0 ? raw.slice(slashIdx) : '';
+    const colon = hostport.lastIndexOf(':');
+    host = colon >= 0 ? hostport.slice(0, colon) : hostport;
+    port = colon >= 0 ? hostport.slice(colon + 1) : '';
+  }
+  // Build candidate strings to test against the provider catalog.
+  const candidates = [host, port ? `${host}:${port}` : '', port ? `:${port}` : '', path].filter(Boolean);
+  for (const [re, svg] of _PROVIDERS) {
+    if (candidates.some(c => re.test(c))) return svg;
+  }
+  return null;
+}
+
+export default { providerLogo, providerLabel, providerLogoFromUrl };
@@ -7,6 +7,26 @@ import createResearchSynapse from '../researchSynapse.js';
 import spinnerModule from '../spinner.js';
 import { sortModelIds } from '../modelSort.js';

+// Rotating research textarea placeholders — pick one at random each
+// time the panel is rendered so the example keeps feeling fresh.
+const _RESEARCH_HINTS = [
+  "e.g. Trace Odysseus's ten-year journey home from Troy — every island, monster, and detour, and why each one cost him",
+  "e.g. Compare Rust and Go for building a high-throughput web API in 2026",
+  "e.g. Fact-check whether honey actually never spoils",
+  "e.g. How to roast a duck so the skin stays crispy",
+  "e.g. The collapse of Bronze Age civilizations — leading theories and the evidence behind each",
+  "e.g. Best M.2 NVMe SSDs under $200 for a home AI workstation",
+  "e.g. Why do cats knead with their paws? Cover the leading behavioural explanations",
+  "e.g. Side effects and benefits of long-term creatine supplementation",
+  "e.g. How does end-to-end encryption work in Signal, step by step",
+  "e.g. The history of the printing press in East Asia, 700 CE → 1600 CE",
+];
+function _pickResearchHint() {
+  const i = Math.floor(Math.random() * _RESEARCH_HINTS.length);
+  // Escape & and double quotes so the hint splices safely into a placeholder="…" attribute.
+  return _RESEARCH_HINTS[i].replace(/&/g, '&amp;').replace(/"/g, '&quot;');
+}
+
 // jobId -> { synapse, status } — survives across _renderJobs() rebuilds so
 // the SVG keeps its accumulated nodes/edges between progress events.
 const _jobSynapses = new Map();
@@ -49,13 +69,12 @@ try { _settingsCollapsed = localStorage.getItem(_COLLAPSE_KEY) === '1'; } catch

 function _saveSettingsToStorage() {
  try {
-    const activeCat = document.querySelector('.research-cat.active');
    localStorage.setItem(_SETTINGS_KEY, JSON.stringify({
      max_rounds: document.getElementById('research-rounds')?.value || '0',
      search_provider: document.getElementById('research-search-provider')?.value || '',
      endpoint_id: document.getElementById('research-endpoint')?.value || '',
      model: document.getElementById('research-model')?.value || '',
-      category: activeCat?.dataset.cat || '',
+      category: document.getElementById('research-category')?.value || '',
    }));
  } catch {}
 }
@@ -346,15 +365,14 @@ function _buildPanelHTML() {
    </div>
    <div class="modal-body research-pane-body" data-no-swipe-dismiss>
      <div class="research-new-job">
-        <div style="display:flex;align-items:baseline;gap:8px;margin-bottom:2px;">
-          <h2 style="margin:0;padding:0;line-height:1;">Research <span id="research-stats" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal"></span></h2>
+        <div style="display:flex;align-items:center;gap:8px;margin-bottom:2px;">
+          <h2 style="margin:0;padding:0;line-height:1;display:inline-flex;align-items:center;gap:6px;"><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="var(--accent, var(--red))" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="flex-shrink:0;"><path d="M6 18h8"/><path d="M3 22h18"/><path d="M14 22a7 7 0 1 0 0-14h-1"/><path d="M9 14h2"/><path d="M9 12a2 2 0 0 1-2-2V6h4v4a2 2 0 0 1-2 2Z"/><path d="M12 6V3a1 1 0 0 0-1-1H9a1 1 0 0 0-1 1v3"/></svg>Research <span id="research-stats" class="memory-count" style="font-size:0.6em;opacity:0.6;font-weight:normal;position:relative;top:4px;"></span></h2>
        </div>
-        <p class="memory-desc doclib-desc" style="margin-top:6px;display:flex;align-items:center;gap:6px;">
-          <svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="flex-shrink:0;opacity:0.8;"><path d="M6 18h8"/><path d="M3 22h18"/><path d="M14 22a7 7 0 1 0 0-14h-1"/><path d="M9 14h2"/><path d="M9 12a2 2 0 0 1-2-2V6h4v4a2 2 0 0 1-2 2Z"/><path d="M12 6V3a1 1 0 0 0-1-1H9a1 1 0 0 0-1 1v3"/></svg>
+        <p class="memory-desc doclib-desc" style="margin-top:2px;display:flex;align-items:center;gap:6px;flex-wrap:wrap;">
          <span>Multi-step web research with an LLM-in-the-loop agent</span>
+          <span id="research-no-past-hint" style="display:none;font-size:11px;opacity:0.7;position:relative;top:-4px;">— past runs in <button type="button" class="research-library-link" style="background:none;border:none;padding:0;font:inherit;color:var(--accent, var(--red));cursor:pointer;text-decoration:underline;">Library, Research</button></span>
        </p>
-        <div id="research-no-past-hint" class="memory-desc doclib-desc" style="display:none;margin-top:-2px;font-size:11px;opacity:0.7;">All past research found in <button type="button" class="research-library-link">Library, Research</button></div>
-        <textarea id="research-query" class="research-query" placeholder="e.g. Trace Odysseus's ten-year journey home from Troy — every island, monster, and detour, and what each one cost him." rows="4"></textarea>
+        <textarea id="research-query" class="research-query" placeholder="${_pickResearchHint()}" rows="4"></textarea>
        <div class="research-category-row" id="research-category-row">
          <button class="research-cat active" data-cat="" title="LLM auto-detects the best format">Auto</button>
          <button class="research-cat" data-cat="product">Product</button>
@@ -363,13 +381,23 @@ function _buildPanelHTML() {
          <button class="research-cat" data-cat="factcheck">Fact-check</button>
        </div>
        <button id="research-settings-toggle" class="research-settings-toggle${chevronCls}">
-          Settings<span class="research-settings-chevron">${_chevronIcon}</span>
+          <svg width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:4px;opacity:0.85;flex-shrink:0;"><circle cx="12" cy="12" r="3"/><path d="M19.4 15a1.65 1.65 0 0 0 .33 1.82l.06.06a2 2 0 0 1 0 2.83 2 2 0 0 1-2.83 0l-.06-.06a1.65 1.65 0 0 0-1.82-.33 1.65 1.65 0 0 0-1 1.51V21a2 2 0 0 1-2 2 2 2 0 0 1-2-2v-.09A1.65 1.65 0 0 0 9 19.4a1.65 1.65 0 0 0-1.82.33l-.06.06a2 2 0 0 1-2.83 0 2 2 0 0 1 0-2.83l.06-.06a1.65 1.65 0 0 0 .33-1.82 1.65 1.65 0 0 0-1.51-1H3a2 2 0 0 1-2-2 2 2 0 0 1 2-2h.09A1.65 1.65 0 0 0 4.6 9a1.65 1.65 0 0 0-.33-1.82l-.06-.06a2 2 0 0 1 0-2.83 2 2 0 0 1 2.83 0l.06.06a1.65 1.65 0 0 0 1.82.33H9a1.65 1.65 0 0 0 1-1.51V3a2 2 0 0 1 2-2 2 2 0 0 1 2 2v.09a1.65 1.65 0 0 0 1 1.51 1.65 1.65 0 0 0 1.82-.33l.06-.06a2 2 0 0 1 2.83 0 2 2 0 0 1 0 2.83l-.06.06a1.65 1.65 0 0 0-.33 1.82V9a1.65 1.65 0 0 0 1.51 1H21a2 2 0 0 1 2 2 2 2 0 0 1-2 2h-.09a1.65 1.65 0 0 0-1.51 1z"/></svg>Settings<span class="research-settings-chevron">${_chevronIcon}</span>
        </button>
        <div id="research-settings-body" class="research-settings-row"${settingsHidden}>
          <label class="research-setting">
-            <span class="research-setting-label">Rounds</span>
+            <span class="research-setting-label">Rounds <span class="hwfit-help-chip hwfit-help-chip-inline" title="How many search → read → reflect rounds the agent runs. More rounds = deeper coverage, longer wait, more tokens.">?</span></span>
            <select id="research-rounds">${roundOpts}</select>
          </label>
+          <label class="research-setting">
+            <span class="research-setting-label">Format <span class="hwfit-help-chip hwfit-help-chip-inline" title="Auto lets the LLM pick the output shape. Override when you specifically want a Compare table, How-to, Product, or Fact-check.">?</span></span>
+            <select id="research-category">
+              <option value="" selected>Auto</option>
+              <option value="product">Product</option>
+              <option value="comparison">Compare</option>
+              <option value="howto">How-to</option>
+              <option value="factcheck">Fact-check</option>
+            </select>
+          </label>
          <label class="research-setting">
            <span class="research-setting-label">Search engine</span>
            <select id="research-search-provider">${providerOpts}</select>
@@ -418,8 +446,8 @@ function _dismissKeyboard(input) {

 /** Reset the category selector back to "Auto" (called after each start). */
 function _resetCategoryToAuto() {
-  document.querySelectorAll('.research-cat').forEach(b =>
-    b.classList.toggle('active', (b.dataset.cat || '') === ''));
+  const sel = document.getElementById('research-category');
+  if (sel) sel.value = '';
 }

 function _wireEvents(pane) {
@@ -433,13 +461,6 @@ function _wireEvents(pane) {
  pane.querySelector('#research-start-btn').addEventListener('click', _handleStart);
  pane.querySelector('#research-add-btn').addEventListener('click', _handleAdd);

-  pane.querySelectorAll('.research-cat').forEach(btn => {
-    btn.addEventListener('click', () => {
-      pane.querySelectorAll('.research-cat').forEach(b => b.classList.remove('active'));
-      btn.classList.add('active');
-    });
-  });
-
  pane.querySelector('#research-settings-toggle').addEventListener('click', () => {
    const body = document.getElementById('research-settings-body');
    const btn = document.getElementById('research-settings-toggle');
@@ -465,8 +486,7 @@ function _wireEvents(pane) {
 }

 function _readSettings() {
-  const activeCat = document.querySelector('.research-cat.active');
-  const category = activeCat?.dataset.cat || undefined;
+  const category = document.getElementById('research-category')?.value || undefined;
  const settings = {
    max_rounds: parseInt(document.getElementById('research-rounds')?.value || '0', 10),
    search_provider: document.getElementById('research-search-provider')?.value || undefined,
@@ -505,9 +525,8 @@ function _editJob(job) {
  }
  // Restore category
  const cat = job.category || '';
-  document.querySelectorAll('.research-cat').forEach(b => {
-    b.classList.toggle('active', b.dataset.cat === cat);
-  });
+  const catSel = document.getElementById('research-category');
+  if (catSel) catSel.value = cat;
  // Restore settings
  const s = job.settings || {};
  const roundsEl = document.getElementById('research-rounds');
@@ -594,9 +613,8 @@ function _restoreSavedSettings() {
  const saved = _loadSettingsFromStorage();
  if (!saved) return;
  if (saved.category !== undefined) {
-    document.querySelectorAll('.research-cat').forEach(b => {
-      b.classList.toggle('active', b.dataset.cat === saved.category);
-    });
+    const catSel = document.getElementById('research-category');
+    if (catSel) catSel.value = saved.category;
  }
  // Rounds intentionally defaults to "Auto" on every open — don't restore.
  // Users can pick a specific cap each time if needed.
@@ -785,22 +803,26 @@ function _renderJobs() {
    });
    const body = document.createElement('div');
    body.className = 'research-section-body';
-    // Hint inside the "Past research" header (second line, styled like the main
-    // Research description) — past research is kept in the Library's Research tab.
+    // Past Research header: link goes INLINE next to the title instead
+    // of on a second row. Append it to the title span as a small chip.
    if (key === 'past') {
-      const hint = document.createElement('div');
-      hint.className = 'memory-desc doclib-desc research-library-hint';
-      hint.innerHTML = 'All past research found in <button type="button" class="research-library-link">Library, Research</button>';
-      hint.querySelector('.research-library-link').addEventListener('click', (e) => {
-        e.stopPropagation();
-        // Close the research panel first so the Library opens ABOVE it on mobile
-        // (otherwise it stacks under the full-screen panel).
-        closePanel();
-        if (window.documentModule && window.documentModule.openLibrary) {
-          window.documentModule.openLibrary({ tab: 'research' });
-        }
-      });
-      header.appendChild(hint);
+      const titleEl = header.querySelector('.research-section-title');
+      if (titleEl) {
+        const hint = document.createElement('span');
+        hint.className = 'research-library-hint research-library-hint-inline';
+        hint.style.cssText = 'margin-left:8px;font-size:10.5px;opacity:0.65;font-weight:normal;';
+        hint.innerHTML = '— all in <button type="button" class="research-library-link" style="background:none;border:none;padding:0;font:inherit;color:var(--accent, var(--red));cursor:pointer;text-decoration:underline;">Library, Research</button>';
+        hint.querySelector('.research-library-link').addEventListener('click', (e) => {
+          e.stopPropagation();
+          // Close the research panel first so the Library opens ABOVE it on mobile
+          // (otherwise it stacks under the full-screen panel).
+          closePanel();
+          if (window.documentModule && window.documentModule.openLibrary) {
+            window.documentModule.openLibrary({ tab: 'research' });
+          }
+        });
+        titleEl.appendChild(hint);
+      }
    }
    arr.forEach(j => body.appendChild(_buildJobCard(j)));
    sec.appendChild(header);
@@ -2258,8 +2258,8 @@ if (document.readyState === 'loading') {
 // Shared global listener to close all session dropdowns on click-away or Escape
 function _initDropdownDismiss() {
  document.addEventListener('click', (e) => {
-    if (e.target.closest('.session-dropdown-menu')) return;
-    document.querySelectorAll('.session-dropdown-menu').forEach(d => d.style.display = 'none');
+    if (e.target.closest('.session-dropdown-menu, .session-folder-submenu')) return;
+    document.querySelectorAll('.session-dropdown-menu, .session-folder-submenu').forEach(d => d.style.display = 'none');
  });
  // Watch the sidebar — when it's hidden (any path: hamburger, swipe, mobile
  // collapse), close any open session dropdowns so they don't orphan over
@@ -2268,14 +2268,16 @@ function _initDropdownDismiss() {
  if (_sb) {
    new MutationObserver(() => {
      if (_sb.classList.contains('hidden')) {
-        document.querySelectorAll('.session-dropdown-menu, .folder-submenu').forEach(d => d.style.display = 'none');
+        document.querySelectorAll('.session-dropdown-menu, .session-folder-submenu').forEach(d => d.style.display = 'none');
      }
    }).observe(_sb, { attributes: true, attributeFilter: ['class'] });
  }
  document.addEventListener('keydown', (e) => {
-    if (e.key === 'Escape') {
-      document.querySelectorAll('.session-dropdown-menu').forEach(d => d.style.display = 'none');
-    }
+    if (e.key !== 'Escape') return;
+    // Esc must dismiss both the parent dropdown AND the Move-to-folder
+    // submenu in one keypress — previously only the dropdown closed and
+    // the submenu was left orphaned on screen.
+    document.querySelectorAll('.session-dropdown-menu, .session-folder-submenu').forEach(d => d.style.display = 'none');
  });
 }

@@ -91,7 +91,18 @@ export async function loadSkills(cascade = false) {
  try {
    const res = await fetch(`${API}/api/skills`);
    const data = await res.json();
-    skills = data.skills || [];
+    // Dedupe by name (case-insensitive) — the API has occasionally
+    // returned the same skill twice (built-in shadow + user copy, or
+    // a write-then-read race), and rendering both made the duplicate
+    // detector mark BOTH entries as the "recommended" keeper.
+    const _seen = new Set();
+    skills = (data.skills || []).filter(sk => {
+      const k = String(sk?.name || sk?.id || '').toLowerCase();
+      if (!k) return true;
+      if (_seen.has(k)) return false;
+      _seen.add(k);
+      return true;
+    });
    _loadSkillApprovalThreshold();
    // Built-in capabilities are no longer surfaced in the Skills menu.
    loaded = true;
@@ -392,21 +403,11 @@ function _openSkillMenu(btn, card, sk, name, isPublished) {
  };
  if (isPublished) mk(_ICON.unpublish, 'Unpublish', {}, () => _setSkillStatus(name, 'draft'));
  else mk(_ICON.approve, 'Publish', {}, () => _setSkillStatus(name, 'published'));
-  mk(_ICON.edit, 'Edit', {}, async () => {
-    if (!card.classList.contains('doclib-card-expanded')) await _expandSkillCard(card, name);
-    _toggleSkillEdit(card, name);
-  });
-  mk(_ICON.test, 'Test', {}, () => _testSkill(card, name));
-  // Audit kicks off the bulk audit-all loop (test → judge → fix → retry → demote).
-  // Starts at the top of the list and walks down.
-  mk(_ICON.test, 'Audit', {}, () => _auditAllSkills());
-  mk(_ICON.del, 'Delete', { danger: true }, () => _deleteSkill(name, card));
-
-  // Select — enters bulk-select mode and pre-selects this skill. Same pattern
-  // as the email/documents/brain Select item, with the email bullet icon.
+  // Select — moved up to 2nd so it sits next to Publish/Unpublish
+  // (bulk actions cluster at the top of the menu).
  const selItem = document.createElement('button');
  selItem.className = 'skill-kebab-item';
-  selItem.innerHTML = '<span style="display:inline-flex;width:14px;height:14px;align-items:center;justify-content:center;"><span style="font-size:16px;line-height:1;">●</span></span><span>Select</span>';
+  selItem.innerHTML = '<svg class="memory-select-btn-icon" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="flex-shrink:0;"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor" stroke="none"/></svg><span>Select</span>';
  selItem.addEventListener('click', (e) => {
    e.stopPropagation();
    menu.remove();
@@ -416,6 +417,15 @@ function _openSkillMenu(btn, card, sk, name, isPublished) {
  });
  menu.appendChild(selItem);

+  mk(_ICON.edit, 'Edit', {}, async () => {
+    if (!card.classList.contains('doclib-card-expanded')) await _expandSkillCard(card, name);
+    _toggleSkillEdit(card, name);
+  });
+  mk(_ICON.test, 'Test', {}, () => _testSkill(card, name));
+  // Audit kicks off the bulk audit-all loop (test → judge → fix → retry → demote).
+  mk(_ICON.test, 'Audit', {}, () => _auditAllSkills());
+  mk(_ICON.del, 'Delete', { danger: true }, () => _deleteSkill(name, card));
+
  // Mobile-only Cancel — mirrors the email/documents/brain popup pattern.
  // CSS hides `.dropdown-cancel-mobile` on desktop where outside-click
  // already dismisses cleanly.
@@ -1597,13 +1607,16 @@ function _renderAuditPanel(panel, st) {

 // ---- Select mode / bulk actions ----

+const _SKILLS_SELECT_BTN_DOT_SVG = '<svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:3px;"><circle cx="12" cy="12" r="10"/><circle cx="12" cy="12" r="3" fill="currentColor" stroke="none"/></svg>';
+const _SKILLS_SELECT_BTN_X_SVG = '<svg class="memory-select-btn-icon" width="11" height="11" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2.5" stroke-linecap="round" style="vertical-align:-2px;margin-right:3px;"><line x1="18" y1="6" x2="6" y2="18"/><line x1="6" y1="6" x2="18" y2="18"/></svg>';
+
 function _enterSelectMode() {
  _selectMode = true;
  _selectedNames.clear();
  const bar = document.getElementById('skills-bulk-bar');
  const btn = document.getElementById('skills-select-btn');
  if (bar) bar.classList.remove('hidden');
-  if (btn) { btn.classList.add('active'); btn.textContent = 'Cancel'; }
+  if (btn) { btn.classList.add('active'); btn.innerHTML = _SKILLS_SELECT_BTN_X_SVG + 'Cancel'; }
  _updateBulkBar();
  renderSkillsList();
 }
@@ -1615,7 +1628,7 @@ function _exitSelectMode() {
  const btn = document.getElementById('skills-select-btn');
  const all = document.getElementById('skills-select-all');
  if (bar) bar.classList.add('hidden');
-  if (btn) { btn.classList.remove('active'); btn.textContent = 'Select'; }
+  if (btn) { btn.classList.remove('active'); btn.innerHTML = _SKILLS_SELECT_BTN_DOT_SVG + 'Select'; }
  if (all) all.checked = false;
  renderSkillsList();
 }
@@ -1077,9 +1077,23 @@ function _showForm(existing, initTaskType, initTriggerType) {
    typeOpts.innerHTML = '';
    if (taskType === 'llm' || taskType === 'research') {
      const placeholder = taskType === 'research' ? 'What should be researched?' : 'What should the AI do?';
+      const _personaOpts = [
+        ['', 'Default (no persona)'],
+        ['socrates', 'Socrates'],
+        ['razor', 'Razor'],
+        ['nietzsche', 'Nietzsche'],
+        ['spark', 'Spark'],
+        ['odysseus', 'Odysseus'],
+      ];
+      const _curPersona = (existing?.character_id || '').toLowerCase();
+      const _personaOptsHtml = _personaOpts.map(([v, label]) =>
+        `<option value="${v}" ${v === _curPersona ? 'selected' : ''}>${label}</option>`).join('');
      typeOpts.innerHTML = `
        <label class="task-form-label">${taskType === 'research' ? 'Research question' : 'Prompt'}</label>
        <textarea id="task-form-prompt" class="task-form-input task-form-textarea" rows="4" placeholder="${placeholder}">${existing?.prompt || ''}</textarea>
+
+        <label class="task-form-label">Persona <span style="opacity:0.5;font-weight:normal;font-size:10px;">(optional — biases the output voice)</span></label>
+        <select id="task-form-persona" class="task-form-input">${_personaOptsHtml}</select>
      `;
    } else {
      typeOpts.innerHTML = `
@@ -1437,7 +1451,11 @@ function _showForm(existing, initTaskType, initTriggerType) {
        return;
      }
      payload.prompt = prompt;
+      const personaVal = document.getElementById('task-form-persona')?.value || '';
+      payload.character_id = personaVal;
    } else {
+      // Non-llm/research tasks: explicitly clear any persona on switch.
+      payload.character_id = '';
      const action = document.getElementById('task-form-action')?.value;
      if (!action) {
        if (uiModule) uiModule.showError('Select an action');
@@ -2482,12 +2500,15 @@ function _renderMainView() {

 // ---- Modal ----

-export function openTasks(focusId) {
+export function openTasks(focusId, opts) {
+  const o = opts || {};
  if (_open) {
-    // Already open — just focus the requested task.
+    // Already open — just focus the requested task / apply filter.
+    if (o.filter !== undefined) { _taskFilter = o.filter; _renderList(); }
    if (focusId) _focusTask(focusId);
    return;
  }
+  if (o.filter !== undefined) _taskFilter = o.filter;
  _pendingFocusTaskId = focusId || null;
  _open = true;
  _tasksCascadeNext = true;
@@ -0,0 +1,104 @@
+from src import ai_interaction
+
+
+class _GenerationResponse:
+    status_code = 200
+    text = ""
+
+    def __init__(self, image_url):
+        self._image_url = image_url
+
+    def json(self):
+        return {"data": [{"url": self._image_url}]}
+
+
+class _DownloadResponse:
+    status_code = 503
+    content = b""
+
+
+def _patch_generation(monkeypatch, image_url):
+    async def _post(self, url, json, headers):
+        return _GenerationResponse(image_url)
+
+    class _AsyncClient:
+        def __init__(self, *args, **kwargs):
+            pass
+
+        async def __aenter__(self):
+            return self
+
+        async def __aexit__(self, *exc):
+            return False
+
+        post = _post
+
+    import httpx
+    import src.settings as settings
+
+    monkeypatch.setattr(settings, "load_settings", lambda: {})
+    monkeypatch.setattr(httpx, "AsyncClient", _AsyncClient)
+    monkeypatch.setattr(
+        ai_interaction,
+        "_resolve_model",
+        lambda model_spec, owner=None: (
+            "https://api.openai.example/v1/chat/completions",
+            "dall-e-3",
+            {"Authorization": "Bearer test"},
+        ),
+    )
+
+
+async def test_generate_image_validates_provider_url_before_download(monkeypatch):
+    import httpx
+    import src.url_safety as url_safety
+
+    provider_url = "https://images.example.com/generated.png?sig=abc"
+    events = []
+    _patch_generation(monkeypatch, provider_url)
+
+    def _check_outbound_url(url, *, block_private=False):
+        events.append(("check", url, block_private))
+        return True, "ok"
+
+    def _get(url, *, timeout):
+        events.append(("get", url, timeout))
+        return _DownloadResponse()
+
+    monkeypatch.setattr(url_safety, "check_outbound_url", _check_outbound_url)
+    monkeypatch.setattr(httpx, "get", _get)
+
+    result = await ai_interaction.do_generate_image("draw a chair\ndall-e-3")
+
+    assert result["image_url"] == provider_url
+    assert events == [
+        ("check", provider_url, False),
+        ("get", provider_url, 60),
+    ]
+
+
+async def test_generate_image_rejects_unsafe_provider_url_without_download(monkeypatch):
+    import httpx
+    import src.url_safety as url_safety
+
+    unsafe_url = "http://169.254.169.254/latest/meta-data"
+    events = []
+    _patch_generation(monkeypatch, unsafe_url)
+
+    def _check_outbound_url(url, *, block_private=False):
+        events.append(("check", url, block_private))
+        return False, "link-local address blocked (SSRF metadata risk): 169.254.169.254"
+
+    def _get(url, *, timeout):
+        raise AssertionError("unsafe provider image URL must not be downloaded")
+
+    monkeypatch.setattr(url_safety, "check_outbound_url", _check_outbound_url)
+    monkeypatch.setattr(httpx, "get", _get)
+
+    result = await ai_interaction.do_generate_image("draw a chair\ndall-e-3")
+
+    assert result["error"] == (
+        "Image API returned unsafe image URL: "
+        "link-local address blocked (SSRF metadata risk): 169.254.169.254"
+    )
+    assert events == [("check", unsafe_url, False)]
@@ -502,3 +502,77 @@ def test_delete_token_owner_check_skipped_when_auth_disabled(monkeypatch, token_
    resp = delete_token(request=req, token_id="tok123")
    assert resp == {"status": "deleted"}
    fake_session.delete.assert_called_once_with(fake_token)
+
+
+# ---------------------------------------------------------------------------
+# 7. PATCH /api/tokens/{id} — non-object JSON bodies must not 500
+# ---------------------------------------------------------------------------
+
+
+def test_update_token_with_array_body_does_not_500(monkeypatch, token_routes_mod):
+    """PATCH body of [] must be normalised to {} and not raise."""
+    monkeypatch.setenv("AUTH_ENABLED", "true")
+    mod = token_routes_mod
+
+    token = SimpleNamespace(
+        id="tok123", name="original", owner="alice",
+        token_prefix="ody_orig", scopes="email:read", is_active=True,
+    )
+    fake_session = MagicMock()
+    fake_session.query.return_value.filter.return_value.first.return_value = token
+    monkeypatch.setattr(mod, "get_db_session", lambda: _db_ctx(fake_session))
+
+    invalidator = MagicMock()
+    req = _patch_request(invalidator, [])
+    update_token = _get_handler(mod, "PATCH", "/tokens/{token_id}")
+    resp = asyncio.run(update_token(request=req, token_id="tok123"))
+
+    # Name and scopes must be unchanged — payload was normalised to {}
+    assert token.name == "original"
+    assert token.scopes == "email:read"
+    assert resp["name"] == "original"
+
+
+def test_update_token_with_null_body_does_not_500(monkeypatch, token_routes_mod):
+    """PATCH body of null must be normalised to {} and not raise."""
+    monkeypatch.setenv("AUTH_ENABLED", "true")
+    mod = token_routes_mod
+
+    token = SimpleNamespace(
+        id="tok123", name="original", owner="alice",
+        token_prefix="ody_orig", scopes="chat", is_active=True,
+    )
+    fake_session = MagicMock()
+    fake_session.query.return_value.filter.return_value.first.return_value = token
+    monkeypatch.setattr(mod, "get_db_session", lambda: _db_ctx(fake_session))
+
+    invalidator = MagicMock()
+    req = _patch_request(invalidator, None)
+    update_token = _get_handler(mod, "PATCH", "/tokens/{token_id}")
+    asyncio.run(update_token(request=req, token_id="tok123"))
+
+    assert token.name == "original"
+    assert token.scopes == "chat"
+
+
+def test_update_token_normal_object_still_works(monkeypatch, token_routes_mod):
+    """Normal dict payload continues to update fields as before."""
+    monkeypatch.setenv("AUTH_ENABLED", "true")
+    mod = token_routes_mod
+
+    token = SimpleNamespace(
+        id="tok123", name="original", owner="alice",
+        token_prefix="ody_orig", scopes="email:read", is_active=True,
+    )
+    fake_session = MagicMock()
+    fake_session.query.return_value.filter.return_value.first.return_value = token
+    monkeypatch.setattr(mod, "get_db_session", lambda: _db_ctx(fake_session))
+
+    invalidator = MagicMock()
+    req = _patch_request(invalidator, {"name": "updated"})
+    update_token = _get_handler(mod, "PATCH", "/tokens/{token_id}")
+    resp = asyncio.run(update_token(request=req, token_id="tok123"))
+
+    assert token.name == "updated"
+    assert resp["name"] == "updated"
+    invalidator.assert_called_once()
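
The three PATCH tests above pin one behaviour: a JSON body that is not an object is treated as an empty patch instead of raising. A minimal sketch of that normalisation follows. The helper names `normalise_patch_body` and `apply_token_patch` are hypothetical, as is the field list; the real handler in the token routes is not shown in this diff.

```python
from types import SimpleNamespace


def normalise_patch_body(body):
    """Treat any non-object JSON value (list, null, string, number) as an empty patch."""
    return body if isinstance(body, dict) else {}


def apply_token_patch(token, body):
    """Copy only recognised fields from the patch onto the token (hypothetical helper)."""
    patch = normalise_patch_body(body)
    for field in ("name", "scopes"):  # assumed field list, for illustration only
        if field in patch:
            setattr(token, field, patch[field])
    return token


token = SimpleNamespace(name="original", scopes="email:read")
apply_token_patch(token, [])                   # array body: nothing changes
apply_token_patch(token, None)                 # null body: nothing changes
apply_token_patch(token, {"name": "updated"})  # object body: name is updated
```

Normalising once at the top of the handler keeps every later `payload.get(...)` call safe, which is presumably why the tests describe the body as "normalised to {}".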
@@ -0,0 +1,88 @@
+r"""do_manage_calendar must honour abbreviated reminder phrasings like "mins"/"hrs".
+
+`_reminder_minutes` parsed the reminder offset with regexes anchored on
+`(?:m|min|minute|minutes)\b` / `(?:h|hr|hour|hours)\b`. The trailing `\b`
+made the very common plural abbreviations "mins" and "hrs" fail to match
+(after "min" the next char "s" is a word char, so no boundary), so a request
+like ``reminder_minutes: "5 mins"`` silently produced no reminder at all —
+even though the sibling duration parser (no `\b`) already accepted them.
+"""
+
+import json
+import sys
+import uuid
+
+import pytest
+
+from tests.helpers.import_state import clear_fake_database_modules
+from tests.helpers.sqlite_db import make_temp_sqlite
+
+clear_fake_database_modules()
+
+import core.database as cdb
+from core.database import Note
+
+_TS, _ENGINE, _TMPDB = make_temp_sqlite(cdb.Base.metadata)
+
+
+@pytest.fixture(autouse=True)
+def _bind_temp_db(monkeypatch):
+    monkeypatch.setitem(sys.modules, "core.database", cdb)
+    parent = sys.modules.get("core")
+    if parent is not None:
+        monkeypatch.setattr(parent, "database", cdb, raising=False)
+    monkeypatch.setattr(cdb, "SessionLocal", _TS)
+    yield
+
+
+async def _create_with_reminder(reminder, owner):
+    from src.tool_implementations import do_manage_calendar
+
+    payload = {
+        "action": "create_event",
+        "summary": "Dentist",
+        # Far-future so the reminder is never "already passed".
+        "dtstart": "2030-01-01T10:00:00",
+        "reminder_minutes": reminder,
+    }
+    return await do_manage_calendar(json.dumps(payload), owner=owner)
+
+
+@pytest.mark.parametrize("reminder,expected", [
+    ("5 mins", 5),
+    ("10 mins", 10),
+    ("2 hrs", 120),
+    ("1 hr", 60),
+    ("15 minutes", 15),   # regression: long form still works
+    ("30m", 30),          # regression: bare unit still works
+])
+async def test_reminder_minutes_accepts_abbreviations(reminder, expected):
+    owner = "tester-" + uuid.uuid4().hex[:6]
+    res = await _create_with_reminder(reminder, owner)
+    assert res.get("exit_code") == 0, res
+    assert f"reminder {expected} min before" in res.get("response", ""), res
+
+    db = _TS()
+    try:
+        note = (
+            db.query(Note)
+            .filter(Note.owner == owner, Note.title == "Reminder: Dentist")
+            .first()
+        )
+        assert note is not None, "reminder note should have been created"
+    finally:
+        db.close()
+
+
+async def test_no_reminder_when_offset_absent():
+    owner = "tester-" + uuid.uuid4().hex[:6]
+    from src.tool_implementations import do_manage_calendar
+
+    payload = {
+        "action": "create_event",
+        "summary": "No Reminder Event",
+        "dtstart": "2030-02-01T10:00:00",
+    }
+    res = await do_manage_calendar(json.dumps(payload), owner=owner)
+    assert res.get("exit_code") == 0, res
+    assert "min before" not in res.get("response", ""), res
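
The module docstring of the reminder test above attributes the bug to a trailing `\b` after the unit alternation. The sketch below reproduces that effect in isolation. `ANCHORED` and `RELAXED` are hypothetical patterns that only approximate the project's regexes; the actual `_reminder_minutes` implementation is not part of this diff.

```python
import re

# Hypothetical patterns for illustration; the real ones live in _reminder_minutes.
# ANCHORED mirrors the alternation quoted in the docstring, including the trailing \b.
ANCHORED = re.compile(r"(\d+)\s*(?:m|min|minute|minutes)\b", re.IGNORECASE)
# RELAXED is one possible fix: allow an optional plural "s" before the boundary.
RELAXED = re.compile(r"(\d+)\s*(?:mins?|minutes?|m)\b", re.IGNORECASE)


def parse_minutes(pattern, text):
    """Return the minute count matched by pattern, or None when nothing matches."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


# "5 mins": after "min" the next character is "s", a word character,
# so \b cannot match there and ANCHORED finds nothing.
anchored_result = parse_minutes(ANCHORED, "5 mins")
relaxed_result = parse_minutes(RELAXED, "5 mins")
```

Keeping the `\b` while widening the alternation is only one option; dropping the boundary entirely, as the docstring says the sibling duration parser does, would also accept the abbreviations.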
@@ -7,12 +7,39 @@ in ``remoteHost`` would be injected into that command.
 These pin validation on the host/port before they reach the ssh string, matching
 the validators the rest of the cookbook routes already apply.
 """
+import asyncio
+
 import pytest
 from fastapi import HTTPException
+from starlette.requests import Request

 import routes.codex_routes as codex_routes


+def _route_endpoint(path: str, method: str):
+    router = codex_routes.setup_codex_routes()
+    for route in router.routes:
+        if route.path == path and method in route.methods:
+            return route.endpoint
+    raise AssertionError(f"{method} {path} route not found")
+
+
+def _launch_request() -> Request:
+    request = Request(
+        {
+            "type": "http",
+            "method": "POST",
+            "path": "/api/codex/cookbook/adopt",
+            "headers": [],
+            "state": {},
+        }
+    )
+    request.state.api_token = True
+    request.state.api_token_owner = "alice"
+    request.state.api_token_scopes = ["cookbook:launch"]
+    return request
+
+
 def test_rejects_remote_host_with_shell_metacharacters():
    task = {"remoteHost": "box; rm -rf ~", "sshPort": ""}
    with pytest.raises(HTTPException) as exc:
@@ -47,3 +74,26 @@ def test_default_ssh_port_omits_flag():
    )
    assert host == "box"
    assert port_flag == ""
+
+
+def test_adopt_rejects_ssh_option_host_before_shell(monkeypatch):
+    calls = []
+
+    async def fail_if_shell_runs(*args, **kwargs):
+        calls.append((args, kwargs))
+        raise RuntimeError("shell should not run for invalid host")
+
+    monkeypatch.setattr(asyncio, "create_subprocess_shell", fail_if_shell_runs)
+
+    endpoint = _route_endpoint("/api/codex/cookbook/adopt", "POST")
+    body = {
+        "tmux_session": "serve_abc123",
+        "model": "org/model",
+        "host": "-oProxyCommand=sh",
+    }
+
+    with pytest.raises(HTTPException) as exc:
+        asyncio.run(endpoint(_launch_request(), body))
+
+    assert exc.value.status_code == 400
+    assert calls == []
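
The adopt test above expects a host such as `-oProxyCommand=sh` to be rejected before any shell command is built, because a value starting with `-` would be read by `ssh` as an option instead of a hostname. A minimal sketch of an allow-list check with that property follows. The function `validate_ssh_host` and its pattern are assumptions made for illustration, and it raises `ValueError` where the real route raises an `HTTPException` with status 400.

```python
import re

# Assumed allow-list: an optional "user@" prefix, then a hostname that starts with
# a letter or digit and contains only letters, digits, dots, hyphens and underscores.
_HOST_RE = re.compile(
    r"(?:[A-Za-z0-9_][A-Za-z0-9._-]*@)?[A-Za-z0-9][A-Za-z0-9._-]*"
)


def validate_ssh_host(host):
    """Return host unchanged when it is safe to place in an ssh command line."""
    if not isinstance(host, str) or not _HOST_RE.fullmatch(host):
        raise ValueError("invalid remote host")
    return host


def is_valid_ssh_host(host):
    """Boolean convenience wrapper around validate_ssh_host."""
    try:
        validate_ssh_host(host)
    except ValueError:
        return False
    return True
```

Validating against an allow-list before string interpolation covers both cases in this test file: shell metacharacters (`box; rm -rf ~`) and option injection (a leading `-`).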
@@ -0,0 +1,50 @@
+import json
+
+import pytest
+
+from src import cookbook_serve_lifecycle as lifecycle
+
+
+@pytest.mark.asyncio
+async def test_tick_persists_only_successfully_stopped_serves(tmp_path, monkeypatch):
+    state_path = tmp_path / "cookbook_state.json"
+    state_path.write_text(
+        json.dumps({
+            "tasks": [
+                {
+                    "id": "stop-succeeds",
+                    "type": "serve",
+                    "status": "running",
+                    "_scheduledStopAtMs": 0,
+                },
+                {
+                    "id": "stop-fails",
+                    "type": "serve",
+                    "status": "running",
+                    "_scheduledStopAtMs": 0,
+                },
+            ]
+        }),
+        encoding="utf-8",
+    )
+
+    async def fake_stop_serve(session_id, remote_host="", ssh_port=""):
+        return session_id == "stop-succeeds"
+
+    async def fake_delete_endpoint(task):
+        return None
+
+    monkeypatch.setattr(lifecycle, "COOKBOOK_STATE_FILE", str(state_path))
+    monkeypatch.setattr(lifecycle, "_stop_serve", fake_stop_serve)
+    monkeypatch.setattr(lifecycle, "_delete_endpoint_for_task", fake_delete_endpoint)
+
+    await lifecycle._tick()
+
+    tasks = {
+        task["id"]: task
+        for task in json.loads(state_path.read_text(encoding="utf-8"))["tasks"]
+    }
+    assert tasks["stop-succeeds"]["status"] == "stopped"
+    assert tasks["stop-succeeds"]["_scheduledStopAtMs"] is None
+    assert tasks["stop-fails"]["status"] == "running"
+    assert tasks["stop-fails"]["_scheduledStopAtMs"] == 0
@@ -0,0 +1,580 @@
+"""Tests for the Google OAuth2 email helpers.
+
+Covers the security-critical surface added for Google Workspace / .edu
+IMAP/SMTP support:
+
+- `make_oauth_state` / `verify_oauth_state` — HMAC-signed OAuth state so the
+  callback can't be CSRF'd or have its account_id/owner tampered with.
+- `_smtp_ready` — an OAuth account (no stored password) must still count as
+  send-capable; a host+user-only account without password or OAuth must not.
+- `_xoauth2_raw` / `_xoauth2_bytes` — SASL XOAUTH2 framing for SMTP/IMAP.
+- `_refresh_google_token` — token refresh stores result encrypted; failure is
+  silent (no token/secret in logs or return value).
+- `_get_valid_google_token` — uses cached token when fresh; calls refresh when
+  expired.
+- `google_oauth_callback` (real route) — invalid/tampered/missing state and
+  provider errors return generic redirects with no PII; owner mismatch refuses
+  the token write; a valid owner writes encrypted tokens only to the intended
+  account.
+- `list_email_accounts` (real route) — exposes OAuth status but never token
+  values.
+- `_imap_connect` — password accounts use login(); OAuth accounts use XOAUTH2.
+
+Route tests pull the live endpoint out of `setup_email_routes()` and call it
+directly — they pin the real handler, not a re-implementation. The ASGI app is
+not booted; outbound HTTP is mocked and the DB is an isolated in-memory SQLite.
+"""
+
+import base64
+import json
+import time
+import unittest.mock as mock
+
+import pytest
+
+
+# ── OAuth state signing ──────────────────────────────────────────
+
+def test_oauth_state_round_trips_account_and_owner():
+    from routes.email_helpers import make_oauth_state, verify_oauth_state
+
+    state = make_oauth_state("acct-123", "user@example.com")
+    payload = verify_oauth_state(state)
+
+    assert payload is not None
+    assert payload["a"] == "acct-123"
+    assert payload["o"] == "user@example.com"
+    assert payload["n"]  # nonce present
+
+
+def test_oauth_state_nonce_is_unique_per_call():
+    from routes.email_helpers import make_oauth_state, verify_oauth_state
+
+    a = verify_oauth_state(make_oauth_state("acct", "o"))
+    b = verify_oauth_state(make_oauth_state("acct", "o"))
+    assert a["n"] != b["n"]
+
+
+def test_oauth_state_rejects_tampered_account_id():
+    from routes.email_helpers import make_oauth_state, verify_oauth_state
+
+    state = make_oauth_state("acct-123", "user@example.com")
+    decoded = base64.urlsafe_b64decode(state.encode()).decode()
+    payload_str, sig = decoded.rsplit("|", 1)
+    payload = json.loads(payload_str)
+    payload["a"] = "evil-acct"  # attacker swaps the target account
+    forged = base64.urlsafe_b64encode(
+        (json.dumps(payload, separators=(",", ":")) + "|" + sig).encode()
+    ).decode()
+
+    assert verify_oauth_state(forged) is None
+
+
+def test_oauth_state_rejects_forged_signature():
+    from routes.email_helpers import make_oauth_state, verify_oauth_state
+
+    state = make_oauth_state("acct-123", "user@example.com")
+    decoded = base64.urlsafe_b64decode(state.encode()).decode()
+    payload_str, _ = decoded.rsplit("|", 1)
+    forged = base64.urlsafe_b64encode((payload_str + "|" + "deadbeef" * 8).encode()).decode()
+
+    assert verify_oauth_state(forged) is None
+
+
+@pytest.mark.parametrize("garbage", ["", "not-base64-at-all", "###", "a|b|c"])
+def test_oauth_state_rejects_garbage(garbage):
+    from routes.email_helpers import verify_oauth_state
+
+    assert verify_oauth_state(garbage) is None
+
+
+# ── _smtp_ready: OAuth accounts have no password but can still send ──
+
+def test_smtp_ready_true_for_oauth_account_without_password():
+    from routes.email_routes import _smtp_ready
+
+    cfg = {
+        "smtp_host": "smtp.gmail.com",
+        "smtp_user": "me@nyu.edu",
+        "smtp_password": "",
+        "oauth_provider": "google",
+    }
+    assert _smtp_ready(cfg) is True
+
+
+def test_smtp_ready_true_for_password_account():
+    from routes.email_routes import _smtp_ready
+
+    cfg = {
+        "smtp_host": "smtp.example.com",
+        "smtp_user": "me@example.com",
+        "smtp_password": "app-password",
+        "oauth_provider": "",
+    }
+    assert _smtp_ready(cfg) is True
+
+
+def test_smtp_ready_false_without_password_or_oauth():
+    from routes.email_routes import _smtp_ready
+
+    cfg = {
+        "smtp_host": "smtp.example.com",
+        "smtp_user": "me@example.com",
+        "smtp_password": "",
+        "oauth_provider": "",
+    }
+    assert _smtp_ready(cfg) is False
+
+
+def test_smtp_ready_false_without_host():
+    from routes.email_routes import _smtp_ready
+
+    cfg = {"smtp_host": "", "smtp_user": "me@x.com", "oauth_provider": "google"}
+    assert _smtp_ready(cfg) is False
+
+
+# ── XOAUTH2 SASL framing ─────────────────────────────────────────
+
+def test_xoauth2_raw_is_unencoded_sasl_frame():
+    from routes.email_helpers import _xoauth2_raw
+
+    assert _xoauth2_raw("me@nyu.edu", "tok123") == "user=me@nyu.edu\x01auth=Bearer tok123\x01\x01"
+
+
+def test_xoauth2_bytes_is_raw_frame_encoded():
+    from routes.email_helpers import _xoauth2_bytes
+
+    assert _xoauth2_bytes("me@nyu.edu", "tok123") == b"user=me@nyu.edu\x01auth=Bearer tok123\x01\x01"
+
+
+# ── Helpers for in-memory DB fixtures ────────────────────────────
+
+def _make_db():
+    """Return (Session, SessionFactory) backed by an isolated in-memory SQLite DB.
+
+    Used to test DB-touching helpers without the real database.
+    The factory lets tests open a fresh session after the helper closes its own.
+    """
+    from sqlalchemy import create_engine
+    from sqlalchemy.orm import sessionmaker
+    from core.database import Base
+    engine = create_engine("sqlite:///:memory:", connect_args={"check_same_thread": False})
+    Base.metadata.create_all(engine)
+    Factory = sessionmaker(bind=engine)
+    return Factory(), Factory
+
+
+def _make_account(session, account_id="acct-1", owner="alice", **kwargs):
+    """Insert a minimal EmailAccount row and return it."""
+    from core.database import EmailAccount
+    row = EmailAccount(
+        id=account_id,
+        owner=owner,
+        name=kwargs.get("name", "Test"),
+        from_address=kwargs.get("from_address", "test@example.com"),
+        imap_host=kwargs.get("imap_host", "imap.gmail.com"),
+        imap_port=kwargs.get("imap_port", 993),
+        imap_user=kwargs.get("imap_user", "test@example.com"),
+        smtp_host=kwargs.get("smtp_host", "smtp.gmail.com"),
+        smtp_port=kwargs.get("smtp_port", 587),
+        smtp_user=kwargs.get("smtp_user", "test@example.com"),
+    )
+    for k, v in kwargs.items():
+        if hasattr(row, k):
+            setattr(row, k, v)
+    session.add(row)
+    session.commit()
+    return row
+
+
+# ── Token encryption at rest ─────────────────────────────────────
+
+def test_refresh_token_stored_encrypted_not_raw():
+    """_refresh_google_token must encrypt the new access token before writing it
+    to the DB — storing the raw token string would expose credentials at rest."""
+    from src.secret_storage import encrypt as _enc, decrypt as _dec
+    from core.database import EmailAccount
+
+    raw_token = "ya29.test_access_token_raw"
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-r", owner="bob",
+                  oauth_refresh_token=_enc("refresh-tok-xyz"))
+    db.close()
+
+    fake_resp = mock.MagicMock()
+    fake_resp.raise_for_status = mock.MagicMock()
+    fake_resp.json.return_value = {"access_token": raw_token, "expires_in": 3600}
+
+    with mock.patch("httpx.post", return_value=fake_resp), \
+         mock.patch("core.database.SessionLocal", Factory), \
+         mock.patch("routes.email_helpers.os.environ.get", side_effect=lambda k, d="": {
+             "GOOGLE_OAUTH_CLIENT_ID": "cid", "GOOGLE_OAUTH_CLIENT_SECRET": "csec"
+         }.get(k, d)):
+        from routes.email_helpers import _refresh_google_token
+        result = _refresh_google_token("acct-r")
+
+    verify_db = Factory()
+    row = verify_db.query(EmailAccount).filter(EmailAccount.id == "acct-r").first()
+    stored = row.oauth_access_token
+    verify_db.close()
+
+    assert result == raw_token, "function should return the plain access token to callers"
+    assert stored != raw_token, "raw token must not be stored directly in the DB"
+    assert _dec(stored) == raw_token, "stored value must decrypt back to the raw token"
+
+
+def test_refresh_stores_encrypted_expiry_not_token():
+    """oauth_token_expiry stores only a timestamp, never the token value."""
+    from src.secret_storage import encrypt as _enc
+    from core.database import EmailAccount
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-e", owner="bob",
+                  oauth_refresh_token=_enc("ref-tok"))
+    db.close()
+
+    fake_resp = mock.MagicMock()
+    fake_resp.raise_for_status = mock.MagicMock()
+    fake_resp.json.return_value = {"access_token": "ya29.secret", "expires_in": 3600}
+
+    with mock.patch("httpx.post", return_value=fake_resp), \
+         mock.patch("core.database.SessionLocal", Factory), \
+         mock.patch("routes.email_helpers.os.environ.get", side_effect=lambda k, d="": {
+             "GOOGLE_OAUTH_CLIENT_ID": "cid", "GOOGLE_OAUTH_CLIENT_SECRET": "csec"
+         }.get(k, d)):
+        from routes.email_helpers import _refresh_google_token
+        _refresh_google_token("acct-e")
+
+    verify_db = Factory()
+    row = verify_db.query(EmailAccount).filter(EmailAccount.id == "acct-e").first()
+    expiry = row.oauth_token_expiry
+    verify_db.close()
+
+    assert "ya29" not in (expiry or ""), \
+        "token_expiry must be a timestamp, not the token string"
+
+
+# ── Real OAuth callback route ─────────────────────────────────────
+#
+# These pull the actual google_oauth_callback endpoint out of the router and
+# invoke it — they pin the real route's behaviour, not a re-implementation, so
+# they fail if the ownership/state guards are ever removed or weakened.
+
+def _callback_endpoint():
+    """Return the live google_oauth_callback endpoint from the email router."""
+    from routes.email_routes import setup_email_routes
+    router = setup_email_routes()
+    for route in router.routes:
+        if route.path == "/api/email/oauth/google/callback" and "GET" in getattr(route, "methods", set()):
+            return route.endpoint
+    raise AssertionError("google_oauth_callback route not found")
+
+
+class _FakeRequest:
+    """Minimal stand-in for starlette Request — the callback only reads headers."""
+    headers = {"host": "localhost:7000"}
+
+
+def _location(resp):
+    """Pull the redirect target out of a RedirectResponse."""
+    return resp.headers["location"]
+
+
+@pytest.mark.asyncio
+async def test_callback_missing_code_returns_generic_error():
+    """No `code` query param → generic error redirect, with no account id, owner,
+    or state echoed back into the URL."""
+    from routes.email_helpers import make_oauth_state
+
+    callback = _callback_endpoint()
+    state = make_oauth_state("acct-1", "alice")
+    resp = await callback(code=None, state=state, error=None, request=_FakeRequest())
+
+    loc = _location(resp)
+    assert "email_oauth_error=missing_code" in loc
+    assert "acct-1" not in loc, "account id must not appear in redirect URL"
+    assert "alice" not in loc, "owner must not appear in redirect URL"
+
+
+@pytest.mark.asyncio
+async def test_callback_provider_error_returns_generic_error():
+    """An `error` from Google → generic error redirect, no raw provider text."""
+    callback = _callback_endpoint()
+    resp = await callback(code=None, state=None, error="access_denied", request=_FakeRequest())
+
+    loc = _location(resp)
+    assert "email_oauth_error=google_error" in loc
+    assert "access_denied" not in loc, "raw provider error must not leak into redirect"
+
+
+@pytest.mark.asyncio
+async def test_callback_tampered_state_returns_generic_error_no_leak():
+    """Tampered/invalid state → invalid_state redirect; the auth code and any
+    token must never appear in the redirect URL."""
+    callback = _callback_endpoint()
+    resp = await callback(code="4/secret-auth-code", state="not-a-valid-state",
+                          error=None, request=_FakeRequest())
+
+    loc = _location(resp)
+    assert "email_oauth_error=invalid_state" in loc
+    assert "4/secret-auth-code" not in loc, "auth code must not leak into redirect"
+    assert "token" not in loc
+
+
+@pytest.mark.asyncio
+async def test_callback_owner_mismatch_does_not_write_tokens():
+    """A signed, valid state whose owner does not match the target account's
+    owner must NOT write tokens — this blocks one authenticated user from
+    binding their Google account onto another user's mailbox row.
+    """
+    from routes.email_helpers import make_oauth_state
+    from core.database import EmailAccount
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-x", owner="alice")
+    db.close()
+
+    # Token-exchange + userinfo would succeed — the point is the ownership gate
+    # rejects the write *before* trusting them.
+    token_resp = mock.MagicMock()
+    token_resp.raise_for_status = mock.MagicMock()
+    token_resp.json.return_value = {"access_token": "ya29.attacker", "refresh_token": "r", "expires_in": 3600}
+    userinfo_resp = mock.MagicMock()
+    userinfo_resp.is_success = True
+    userinfo_resp.json.return_value = {"email": "bob@evil.com", "name": "Bob"}
+
+    # State is genuinely signed, but for owner "bob" — not the row owner "alice".
+    state = make_oauth_state("acct-x", "bob")
+
+    with mock.patch("httpx.post", return_value=token_resp), \
+         mock.patch("httpx.get", return_value=userinfo_resp), \
+         mock.patch("core.database.SessionLocal", Factory):
+        callback = _callback_endpoint()
+        resp = await callback(code="4/code", state=state, error=None, request=_FakeRequest())
+
+    loc = _location(resp)
+    assert "email_oauth_error=ownership_error" in loc
+
+    verify_db = Factory()
+    row = verify_db.query(EmailAccount).filter(EmailAccount.id == "acct-x").first()
+    token_after = row.oauth_access_token
+    verify_db.close()
+    assert token_after is None, "no token may be written when ownership check fails"
+
+
+@pytest.mark.asyncio
+async def test_callback_valid_owner_writes_encrypted_tokens_to_intended_account():
+    """A signed state whose owner matches the target account writes the tokens —
+    and only to that account, stored encrypted (raw token never persisted)."""
+    from routes.email_helpers import make_oauth_state
+    from src.secret_storage import decrypt as _dec
+    from core.database import EmailAccount
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-v", owner="alice", imap_host="", smtp_host="")
+    _make_account(db, account_id="acct-other", owner="alice")  # must stay untouched
+    db.close()
+
+    raw_access = "ya29.legit_access_token"
+    raw_refresh = "1//legit_refresh_token"
+    token_resp = mock.MagicMock()
+    token_resp.raise_for_status = mock.MagicMock()
+    token_resp.json.return_value = {"access_token": raw_access, "refresh_token": raw_refresh, "expires_in": 3600}
+    userinfo_resp = mock.MagicMock()
+    userinfo_resp.is_success = True
+    userinfo_resp.json.return_value = {"email": "alice@nyu.edu", "name": "Alice"}
+
+    state = make_oauth_state("acct-v", "alice")
+
+    with mock.patch("httpx.post", return_value=token_resp), \
+         mock.patch("httpx.get", return_value=userinfo_resp), \
+         mock.patch("core.database.SessionLocal", Factory):
+        callback = _callback_endpoint()
+        resp = await callback(code="4/code", state=state, error=None, request=_FakeRequest())
+
+    assert "email_oauth_success=1" in _location(resp)
+
+    verify_db = Factory()
+    target = verify_db.query(EmailAccount).filter(EmailAccount.id == "acct-v").first()
+    other = verify_db.query(EmailAccount).filter(EmailAccount.id == "acct-other").first()
+    verify_db.close()
+
+    assert target.oauth_provider == "google"
+    assert target.oauth_access_token != raw_access, "access token must be stored encrypted"
+    assert _dec(target.oauth_access_token) == raw_access
+    assert _dec(target.oauth_refresh_token) == raw_refresh
+    assert other.oauth_access_token is None, "tokens must only touch the intended account"
+
+
+# ── Token refresh scenarios ───────────────────────────────────────
+
+def test_get_valid_google_token_uses_cached_when_fresh():
+    """_get_valid_google_token must NOT call refresh when the stored token is
+    still valid (expiry - 60s buffer > now). Refresh is an outbound HTTP call
+    that should only happen when genuinely needed."""
+    from src.secret_storage import encrypt as _enc
+    from routes.email_helpers import _get_valid_google_token
+
+    future_expiry = str(int(time.time()) + 7200)  # 2 hours from now
+    cfg = {
+        "account_id": "acct-fresh",
+        "oauth_access_token": _enc("ya29.fresh_token"),
+        "oauth_token_expiry": future_expiry,
+    }
+
+    with mock.patch("routes.email_helpers._refresh_google_token") as mock_refresh:
+        result = _get_valid_google_token("acct-fresh", cfg)
+
+    assert result == "ya29.fresh_token"
+    mock_refresh.assert_not_called()
+
+
+def test_get_valid_google_token_refreshes_when_expired():
+    """_get_valid_google_token must call refresh when the token is expired."""
+    from src.secret_storage import encrypt as _enc
+    from routes.email_helpers import _get_valid_google_token
+
+    past_expiry = str(int(time.time()) - 10)  # already expired
+    cfg = {
+        "account_id": "acct-exp",
+        "oauth_access_token": _enc("ya29.old_token"),
+        "oauth_token_expiry": past_expiry,
+    }
+
+    with mock.patch("routes.email_helpers._refresh_google_token", return_value="ya29.new_token") as mock_refresh:
+        result = _get_valid_google_token("acct-exp", cfg)
+
+    mock_refresh.assert_called_once_with("acct-exp")
+    assert result == "ya29.new_token"
+
+
+def test_refresh_failure_returns_none_no_secret_raised():
+    """When the refresh HTTP call fails, _refresh_google_token must return None
+    silently. It must not raise an exception or surface token/secret details."""
+    from src.secret_storage import encrypt as _enc
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-fail", owner="dave",
+                  oauth_refresh_token=_enc("ref-tok"))
+    db.close()
+
+    failing_resp = mock.MagicMock()
+    failing_resp.raise_for_status.side_effect = Exception("401 Unauthorized")
+
+    with mock.patch("httpx.post", return_value=failing_resp), \
+         mock.patch("core.database.SessionLocal", Factory), \
+         mock.patch("routes.email_helpers.os.environ.get", side_effect=lambda k, d="": {
+             "GOOGLE_OAUTH_CLIENT_ID": "cid", "GOOGLE_OAUTH_CLIENT_SECRET": "csec"
+         }.get(k, d)):
+        from routes.email_helpers import _refresh_google_token
+        result = _refresh_google_token("acct-fail")
+
+    assert result is None, "failed refresh must return None, not raise"
+
+
+def test_refresh_without_credentials_returns_none():
+    """_refresh_google_token must return None immediately when the OAuth client
+    credentials are not configured — no DB query, no HTTP call."""
+    with mock.patch("routes.email_helpers.os.environ.get", return_value=""):
+        from routes.email_helpers import _refresh_google_token
+        result = _refresh_google_token("acct-any")
+
+    assert result is None
+
+
+# ── Password-account regression ───────────────────────────────────
+
+def test_imap_connect_uses_login_for_password_accounts():
+    """Existing password-auth IMAP accounts must still call conn.login() and
+    must NOT trigger the XOAUTH2 authenticate path."""
+    from routes.email_helpers import _imap_connect
+
+    mock_conn = mock.MagicMock()
+    # _imap_connect calls _get_email_config internally — mock it to return our cfg.
+    cfg = {
+        "imap_host": "imap.gmail.com",
+        "imap_port": 993,
+        "imap_starttls": False,
+        "imap_user": "me@gmail.com",
+        "imap_password": "app-password-xyz",
+        "oauth_provider": "",
+        "account_id": "acct-pw",
+    }
+
+    with mock.patch("routes.email_helpers._open_imap_connection", return_value=mock_conn), \
+         mock.patch("routes.email_helpers._get_email_config", return_value=cfg):
+        _imap_connect("acct-pw", owner="alice")
+
+    mock_conn.login.assert_called_once_with("me@gmail.com", "app-password-xyz")
+    mock_conn.authenticate.assert_not_called()
+
+
+def test_imap_connect_uses_xoauth2_for_oauth_accounts():
+    """OAuth accounts must call conn.authenticate('XOAUTH2', ...) and must NOT
+    call conn.login() — which would fail without a password."""
+    from routes.email_helpers import _imap_connect
+    from src.secret_storage import encrypt as _enc
+
+    mock_conn = mock.MagicMock()
+    future_expiry = str(int(time.time()) + 7200)
+    cfg = {
+        "imap_host": "imap.gmail.com",
+        "imap_port": 993,
+        "imap_starttls": False,
+        "imap_user": "me@nyu.edu",
+        "imap_password": "",
+        "oauth_provider": "google",
+        "account_id": "acct-oauth",
+        "oauth_access_token": _enc("ya29.live_token"),
+        "oauth_token_expiry": future_expiry,
+    }
+
+    with mock.patch("routes.email_helpers._open_imap_connection", return_value=mock_conn), \
+         mock.patch("routes.email_helpers._get_email_config", return_value=cfg):
+        _imap_connect("acct-oauth", owner="alice")
+
+    mock_conn.authenticate.assert_called_once()
+    assert mock_conn.authenticate.call_args[0][0] == "XOAUTH2"
+    mock_conn.login.assert_not_called()
+
+
+@pytest.mark.asyncio
+async def test_account_list_response_does_not_expose_token_values():
+    """The /accounts list route is the client-facing account inventory. It must
+    expose `oauth_provider` (so the UI can show OAuth status) but never the
+    access/refresh token values, encrypted or otherwise — only boolean
+    has_*_password flags and the provider name."""
+    from routes.email_routes import setup_email_routes
+    from src.secret_storage import encrypt as _enc
+
+    raw_access = "ya29.super_secret_access_token"
+    raw_refresh = "1//super_secret_refresh_token"
+
+    db, Factory = _make_db()
+    _make_account(db, account_id="acct-list", owner="alice",
+                  oauth_provider="google",
+                  oauth_access_token=_enc(raw_access),
+                  oauth_refresh_token=_enc(raw_refresh))
+    db.close()
+
+    router = setup_email_routes()
+    list_accounts = None
+    for route in router.routes:
+        if route.path == "/api/email/accounts" and "GET" in getattr(route, "methods", set()):
+            list_accounts = route.endpoint
+            break
+    assert list_accounts is not None, "accounts list route not found"
+
+    with mock.patch("core.database.SessionLocal", Factory):
+        result = await list_accounts(owner="alice")
+
+    blob = json.dumps(result)
+    assert raw_access not in blob, "raw access token must not appear in list response"
+    assert raw_refresh not in blob, "raw refresh token must not appear in list response"
+    assert _enc(raw_access) not in blob, "encrypted token must not be sent to the client either"
+
+    acct = result["accounts"][0]
+    assert acct["oauth_provider"] == "google"   # status is exposed
+    assert "oauth_access_token" not in acct      # token value is not
+    assert "oauth_refresh_token" not in acct
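Reviewer note on the state-signing tests in the file above: they treat the OAuth state as URL-safe base64 of `<compact JSON payload>|<signature>`, with payload keys `a` (account id), `o` (owner) and `n` (nonce). The sketch below is one way such a pair of helpers could be written so that those tests would pass. It is not the code in `routes/email_helpers.py`: the key `_STATE_KEY`, the HMAC-SHA256 hex signature and the nonce length are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import secrets

# Hypothetical signing key, for illustration only. A real deployment would
# load a secret from configuration instead of hard-coding it.
_STATE_KEY = b"example-signing-key"


def _sign(payload_str: str) -> str:
    """Hex HMAC-SHA256 of the serialized payload (assumed algorithm)."""
    return hmac.new(_STATE_KEY, payload_str.encode(), hashlib.sha256).hexdigest()


def make_oauth_state(account_id: str, owner: str) -> str:
    """Pack account id, owner and a fresh nonce into a signed state string."""
    payload_str = json.dumps(
        {"a": account_id, "o": owner, "n": secrets.token_hex(8)},
        separators=(",", ":"),
    )
    return base64.urlsafe_b64encode(
        f"{payload_str}|{_sign(payload_str)}".encode()
    ).decode()


def verify_oauth_state(state: str):
    """Return the payload dict when the signature matches, otherwise None."""
    try:
        decoded = base64.urlsafe_b64decode(state.encode()).decode()
        payload_str, sig = decoded.rsplit("|", 1)
        # Compare as bytes in constant time; never trust the payload first.
        if not hmac.compare_digest(sig.encode(), _sign(payload_str).encode()):
            return None
        return json.loads(payload_str)
    except (ValueError, UnicodeDecodeError):
        # Bad base64, missing "|" separator, or malformed JSON.
        return None
```

The signature covers the exact serialized payload, so swapping `a` or `o` after signing invalidates it. That is the property `test_oauth_state_rejects_tampered_account_id` pins.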
@@ -41,8 +41,10 @@ def _seed(tmp_path):


 def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
-    monkeypatch.chdir(tmp_path)
    SessionLocal = _seed(tmp_path)
+    # GALLERY_IMAGE_DIR is an absolute path fixed at import, so a chdir can't
+    # redirect the delete; point the resolver at the seeded tmp dir directly.
+    monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
    monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")

    # A session whose commit always fails, to simulate a DB error mid-delete.
@@ -67,8 +69,8 @@ def test_file_kept_when_commit_fails(tmp_path, monkeypatch):


 def test_file_removed_on_successful_delete(tmp_path, monkeypatch):
-    monkeypatch.chdir(tmp_path)
    SessionLocal = _seed(tmp_path)
+    monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
    monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
    monkeypatch.setattr(gallery_routes, "SessionLocal", SessionLocal)

@@ -2,7 +2,14 @@ import os
 from pathlib import Path

 import pytest
+from fastapi import FastAPI
 from fastapi import HTTPException
+from fastapi.testclient import TestClient
+from sqlalchemy import create_engine
+from sqlalchemy.orm import sessionmaker
+from sqlalchemy.pool import NullPool
+
+from core.database import Base, GalleryImage


 def _gallery_module():
@@ -53,6 +60,57 @@ def test_gallery_image_path_rejects_symlink_escape(tmp_path, monkeypatch):
    assert exc.value.status_code == 400


+def test_gallery_replace_rejects_symlink_escape(tmp_path, monkeypatch):
+    gallery_routes = _gallery_module()
+    image_dir = tmp_path / "generated_images"
+    image_dir.mkdir()
+    outside = tmp_path / "outside.png"
+    outside.write_bytes(b"outside image root")
+    link = image_dir / "escape.png"
+    try:
+        os.symlink(outside, link)
+    except (AttributeError, NotImplementedError, OSError) as exc:
+        pytest.skip(f"symlinks unavailable: {exc}")
+
+    engine = create_engine(
+        f"sqlite:///{tmp_path / 'gallery.db'}",
+        connect_args={"check_same_thread": False},
+        poolclass=NullPool,
+    )
+    Base.metadata.create_all(engine)
+    SessionLocal = sessionmaker(bind=engine, autoflush=False, autocommit=False)
+    db = SessionLocal()
+    try:
+        db.add(
+            GalleryImage(
+                id="img-1",
+                filename="escape.png",
+                prompt="escape",
+                owner="alice",
+                is_active=True,
+            )
+        )
+        db.commit()
+    finally:
+        db.close()
+
+    monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", image_dir)
+    monkeypatch.setattr(gallery_routes, "SessionLocal", SessionLocal)
+    monkeypatch.setattr(gallery_routes, "get_current_user", lambda request: "alice")
+
+    app = FastAPI()
+    app.include_router(gallery_routes.setup_gallery_routes())
+    client = TestClient(app)
+
+    response = client.post(
+        "/api/gallery/img-1/replace",
+        files={"image": ("replacement.png", b"replacement bytes", "image/png")},
+    )
+
+    assert response.status_code == 400
+    assert outside.read_bytes() == b"outside image root"
+
+
 def test_gallery_file_operations_use_confining_resolver():
    source = Path("routes/gallery_routes.py").read_text(encoding="utf-8")

@@ -1,4 +1,4 @@
-from services.hwfit.fit import _lookup_bandwidth
+from services.hwfit.fit import _lookup_apple_bandwidth, _lookup_bandwidth


 def test_m3_max_bandwidth_uses_gpu_cores():
@@ -35,6 +35,25 @@ def test_non_apple_gpu_does_not_match_apple_bandwidth():


 def test_non_apple_gpu_with_cores_does_not_match():
-    """NVIDIA GPU with core count should not match Apple bandwidth."""
-    assert _lookup_bandwidth({"gpu_name": "NVIDIA GeForce RTX 4090", "gpu_cores": 128}) is None
-    assert _lookup_bandwidth({"gpu_name": "AMD Radeon RX 9070 XT", "gpu_cores": 64}) is None
+    """A non-Apple GPU that happens to carry a gpu_cores count must not be
+    matched by the APPLE bandwidth path. This asserts the Apple-specific
+    matcher directly: _lookup_bandwidth would (correctly) return these cards'
+    real bandwidth from the general GPU table (e.g. the RTX 4090's 1008 GB/s),
+    which is a different code path and not what this guard is about.
+    """
+    assert _lookup_apple_bandwidth({"gpu_name": "NVIDIA GeForce RTX 4090", "gpu_cores": 128}) is None
+    assert _lookup_apple_bandwidth({"gpu_name": "AMD Radeon RX 9070 XT", "gpu_cores": 64}) is None
+
+
+def test_apple_string_input_resolves_conservative_tier():
+    """Bare-string callers must still get Apple bandwidth. #2564 moved the
+    Apple tiers out of the generic GPU table into the dict-only Apple helper,
+    so _lookup_bandwidth("Apple M3 Max") (no gpu_cores) regressed to None;
+    string inputs now route through the Apple helper and get the conservative
+    (lowest) tier for the model."""
+    assert _lookup_bandwidth("Apple M3 Max") == 300
+    assert _lookup_bandwidth("Apple M4 Max") == 410
+    assert _lookup_bandwidth("Apple M5 Max") == 460
+    # Non-Apple strings still fall through to the generic table.
+    assert _lookup_bandwidth("NVIDIA GeForce RTX 4090") == 1008
+    assert _lookup_bandwidth("Totally Unknown GPU") is None
@@ -1286,6 +1286,14 @@ class _ImmediateThread:
        self.target()


+class _NoopThread:
+    def __init__(self, target, daemon=None):
+        self.target = target
+
+    def start(self):
+        return None
+
+
 def _wait_for(predicate, timeout=2.0):
    deadline = time.time() + timeout
    while time.time() < deadline:
@@ -1313,6 +1321,7 @@ def _route_ep(
    pinned_models=None,
    refresh_mode="auto",
    refresh_timeout=None,
+    owner=None,
 ):
    return SimpleNamespace(
        id=id,
@@ -1329,7 +1338,7 @@ def _route_ep(
        model_refresh_interval=None,
        model_refresh_timeout=refresh_timeout,
        supports_tools=None,
-        owner=None,
+        owner=owner,
        created_at=None,
        updated_at=None,
    )
@@ -1342,6 +1351,72 @@ def _route_request():
    )


+def test_api_models_rejects_api_token_without_chat_scope(monkeypatch):
+    router = model_routes.setup_model_routes(model_discovery=None)
+
+    def fail_session():
+        raise AssertionError("model DB should not be queried without chat scope")
+
+    monkeypatch.setattr(model_routes, "SessionLocal", fail_session)
+
+    request = SimpleNamespace(
+        state=SimpleNamespace(
+            current_user="api",
+            api_token=True,
+            api_token_owner="alice",
+            api_token_scopes=["documents:read"],
+        ),
+        app=SimpleNamespace(
+            state=SimpleNamespace(
+                auth_manager=SimpleNamespace(is_configured=True, is_admin=lambda user: False),
+            ),
+        ),
+    )
+
+    with pytest.raises(HTTPException) as exc:
+        _route_endpoint(router, "/api/models")(request)
+
+    assert exc.value.status_code == 403
+    assert "chat" in str(exc.value.detail)
+
+
+def test_api_models_scopes_api_token_to_token_owner(monkeypatch):
+    rows = [
+        _route_ep("alice", "http://alice.example/v1", cached_models=["alice-model"], owner="alice"),
+        _route_ep("shared", "http://shared.example/v1", cached_models=["shared-model"], owner=None),
+        _route_ep("bob", "http://bob.example/v1", cached_models=["bob-model"], owner="bob"),
+    ]
+    db = _RouteDb(rows)
+    router = model_routes.setup_model_routes(model_discovery=None)
+    admin_checks = []
+
+    monkeypatch.setattr(model_routes, "ModelEndpoint", _RouteModelEndpoint)
+    monkeypatch.setattr(model_routes, "SessionLocal", lambda: db)
+    monkeypatch.setattr(threading, "Thread", _NoopThread)
+
+    request = SimpleNamespace(
+        state=SimpleNamespace(
+            current_user="api",
+            api_token=True,
+            api_token_owner="alice",
+            api_token_scopes=["chat"],
+        ),
+        app=SimpleNamespace(
+            state=SimpleNamespace(
+                auth_manager=SimpleNamespace(
+                    is_configured=True,
+                    is_admin=lambda user: admin_checks.append(user) or False,
+                ),
+            ),
+        ),
+    )
+
+    result = _route_endpoint(router, "/api/models")(request)
+
+    assert [item["endpoint_name"] for item in result["items"]] == ["alice", "shared"]
+    assert admin_checks == ["alice"]
+
+
 def test_api_models_returns_cached_proxy_models_without_refresh_probe(monkeypatch):
    row = _route_ep(
        "proxy",
@@ -0,0 +1,188 @@
+"""Owner-scoped note routes must fail closed when the request has no identity.
+
+The notes CRUD routes resolved the acting user with bare get_current_user().
+A request that reached them carrying no identity (auth-middleware regression,
+SSRF from a sibling service) therefore came through as user=None — and the
+queries treat None as the single-user mode, i.e. blanket access to every
+account's notes: list everything, read/update/delete/pin/archive any row,
+reorder globally.
+
+require_user() already encodes the correct policy — 401 when auth is
+configured, while the documented anonymous modes (AUTH_ENABLED=false,
+LOCALHOST_BYPASS on loopback, unconfigured first-run) still pass — and
+fire-reminder in the same file already used it. The CRUD routes now resolve
+the owner through it too.
+
+Test transport note: these drive the ASGI app through ``httpx.ASGITransport``
+and ``httpx.AsyncClient`` rather than ``starlette.testclient.TestClient``.
+TestClient runs the app inside a background event-loop thread spun up by
+``anyio.from_thread.start_blocking_portal`` and then dispatches each sync
+endpoint onto *another* worker thread; on some anyio/httpx/platform
+combinations that two-thread handshake deadlocks and ``TestClient(app).get(...)``
+simply hangs. ASGITransport runs the whole request on the test's own event
+loop — no portal thread, no BaseHTTPMiddleware — so the suite is portable.
+Identity is injected by a pure-ASGI shim that writes the same
+``request.state`` fields the real auth middleware sets.
+"""
+import uuid
+from types import SimpleNamespace
+
+import httpx
+import pytest
+from fastapi import FastAPI
+from sqlalchemy import create_engine
+from sqlalchemy.orm import sessionmaker
+from sqlalchemy.pool import NullPool
+
+import core.database as cdb
+from core.database import Note
+import routes.note_routes as nr
+
+
+# A deliberately NON-loopback peer. require_user has loopback fall-throughs
+# (unconfigured first-run, LOCALHOST_BYPASS); pinning a public-looking client
+# keeps every assertion below about the *configured-auth* path and not an
+# accidental loopback bypass — the same reason the old fixture leaned on
+# TestClient's non-loopback "testclient" host.
+_PEER = ("203.0.113.7", 54321)
+
+
+class _Identity:
+    """Pure-ASGI shim mirroring what the auth middleware writes onto
+    request.state. Pure-ASGI on purpose — it stays off Starlette's
+    BaseHTTPMiddleware + sync-TestClient path, the source of the
+    ``TestClient(app).get(...)`` hang. No x-test-user header => no identity,
+    the exact state an auth-middleware regression would produce."""
+
+    def __init__(self, app):
+        self.app = app
+
+    async def __call__(self, scope, receive, send):
+        if scope["type"] == "http":
+            headers = dict(scope.get("headers") or [])
+            state = scope.setdefault("state", {})
+            user = headers.get(b"x-test-user")
+            if user:
+                state["current_user"] = user.decode()
+            if headers.get(b"x-test-api-token"):
+                state["current_user"] = "api"
+                state["api_token"] = True
+        await self.app(scope, receive, send)
+
+
+def _temp_db(tmp_path):
+    """Note routes over a fresh temp DB; returns the session factory."""
+    engine = create_engine(
+        f"sqlite:///{tmp_path / 'notes.db'}",
+        connect_args={"check_same_thread": False},
+        poolclass=NullPool,
+    )
+    cdb.Base.metadata.create_all(engine)
+    return sessionmaker(bind=engine)
+
+
+def _build_app(factory, *, configured=True):
+    app = FastAPI()
+    app.state.auth_manager = SimpleNamespace(is_configured=configured)
+    app.include_router(nr.setup_note_routes())
+    return _Identity(app)
+
+
+def _client(app):
+    """AsyncClient over the ASGI app with a non-loopback peer. Caller drives
+    it inside ``async with``."""
+    transport = httpx.ASGITransport(app=app, client=_PEER)
+    return httpx.AsyncClient(transport=transport, base_url="http://notes.test")
+
+
+@pytest.fixture
+def env(monkeypatch, tmp_path):
+    """Configured-auth world: AUTH_ENABLED=true, auth_manager.is_configured,
+    no LOCALHOST_BYPASS. Identity comes only from the x-test-user header
+    (mirroring the auth middleware); no header => no identity, the exact state
+    an auth-middleware regression leaves behind. Seeds one note each for alice
+    and bob. Returns (app, factory)."""
+    factory = _temp_db(tmp_path)
+    monkeypatch.setattr(nr, "SessionLocal", factory)
+    monkeypatch.setenv("AUTH_ENABLED", "true")
+    monkeypatch.delenv("LOCALHOST_BYPASS", raising=False)
+
+    app = _build_app(factory)
+
+    db = factory()
+    db.add(Note(id="note-alice", owner="alice", title="a", content="x",
+                items='[{"text": "t", "done": false}]'))
+    db.add(Note(id="note-bob", owner="bob", title="b", content="y"))
+    db.commit()
+    db.close()
+    return app, factory
+
+
+async def test_no_identity_fails_closed_on_every_owner_scoped_route(env):
+    app, _ = env
+    async with _client(app) as c:
+        assert (await c.get("/api/notes")).status_code == 401
+        assert (await c.get("/api/notes/note-alice")).status_code == 401
+        assert (await c.put("/api/notes/note-alice", json={"title": "pwn"})).status_code == 401
+        assert (await c.delete("/api/notes/note-alice")).status_code == 401
+        assert (await c.post("/api/notes/note-alice/pin")).status_code == 401
+        assert (await c.post("/api/notes/note-alice/archive")).status_code == 401
+        assert (await c.post("/api/notes/note-alice/items/0/toggle")).status_code == 401
+        assert (await c.post("/api/notes/reorder", json={"ids": ["note-bob", "note-alice"]})).status_code == 401
+        assert (await c.post("/api/notes", json={"title": "ghost"})).status_code == 401
+
+
+async def test_no_identity_did_not_mutate_anything(env):
+    app, factory = env
+    async with _client(app) as c:
+        await c.put("/api/notes/note-alice", json={"title": "pwn"})
+        await c.post("/api/notes/note-alice/pin")
+        await c.delete("/api/notes/note-bob")
+    db = factory()
+    rows = {n.id: n for n in db.query(Note).all()}
+    db.close()
+    assert set(rows) == {"note-alice", "note-bob"}
+    assert rows["note-alice"].title == "a"
+    assert not rows["note-alice"].pinned
+
+
+async def test_authenticated_user_still_scoped_to_own_notes(env):
+    app, _ = env
+    alice = {"x-test-user": "alice"}
+    async with _client(app) as c:
+        listed = (await c.get("/api/notes", headers=alice)).json()["notes"]
+        assert [n["id"] for n in listed] == ["note-alice"]
+        assert (await c.get("/api/notes/note-alice", headers=alice)).status_code == 200
+        # Someone else's note stays a 404 (don't reveal it exists).
+        assert (await c.get("/api/notes/note-bob", headers=alice)).status_code == 404
+        assert (await c.put("/api/notes/note-alice", json={"title": "mine"}, headers=alice)).status_code == 200
+
+
+async def test_api_token_pseudo_user_is_rejected(env):
+    """Bearer tokens must use the scope-aware API routes (require_user's
+    existing contract), not slip into cookie-session routes as user 'api'."""
+    app, _ = env
+    async with _client(app) as c:
+        r = await c.get("/api/notes", headers={"x-test-api-token": "1"})
+    assert r.status_code == 403
+
+
+async def test_auth_disabled_keeps_single_user_mode_working(monkeypatch, tmp_path):
+    """AUTH_ENABLED=false is the operator's explicit anonymous mode: no
+    identity must still mean full single-user access (issue #622 contract),
+    even with a stale configured auth.json on disk."""
+    factory = _temp_db(tmp_path)
+    monkeypatch.setattr(nr, "SessionLocal", factory)
+    monkeypatch.setenv("AUTH_ENABLED", "false")
+
+    app = _build_app(factory)
+
+    db = factory()
+    db.add(Note(id="n1", owner=None, title="solo", content="x"))
+    db.commit()
+    db.close()
+
+    async with _client(app) as c:
+        assert [n["id"] for n in (await c.get("/api/notes")).json()["notes"]] == ["n1"]
+        assert (await c.put("/api/notes/n1", json={"title": "still mine"})).status_code == 200
+        assert (await c.post("/api/notes/n1/pin")).status_code == 200
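The policy these tests pin can be sketched in plain Python. This is only an illustration under stated assumptions: `resolve_owner`, `RequestInfo`, and `AuthError` are hypothetical names that do not appear in the PR, and the project's real `require_user` works on a FastAPI request and may order its checks differently. The status codes and the anonymous modes are taken from the tests and docstring above.

```python
from dataclasses import dataclass
from typing import Optional


class AuthError(Exception):
    """Hypothetical stand-in for an HTTP error carrying a status code."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code


@dataclass
class RequestInfo:
    """Hypothetical flattened view of the fields the policy reads."""
    current_user: Optional[str] = None
    api_token: bool = False
    client_host: str = "203.0.113.7"   # non-loopback by default, like _PEER
    auth_enabled: bool = True          # AUTH_ENABLED
    auth_configured: bool = True       # auth_manager.is_configured
    localhost_bypass: bool = False     # LOCALHOST_BYPASS


_LOOPBACK = {"127.0.0.1", "::1"}


def resolve_owner(req: RequestInfo) -> Optional[str]:
    """Return the acting user, None for a documented anonymous mode, or raise."""
    # Operator explicitly disabled auth: single-user mode, no identity needed.
    if not req.auth_enabled:
        return None
    # The bearer-token pseudo-user must not slip into cookie-session routes.
    if req.api_token:
        raise AuthError(403, "API tokens must use the scope-aware routes")
    if req.current_user:
        return req.current_user
    # Loopback fall-throughs: unconfigured first run, or LOCALHOST_BYPASS.
    on_loopback = req.client_host in _LOOPBACK
    if on_loopback and (not req.auth_configured or req.localhost_bypass):
        return None
    # Everything else fails closed instead of degrading to user=None.
    raise AuthError(401, "Authentication required")
```

The point of the design is the last line: a missing identity is an error unless one of the explicitly documented anonymous modes applies, so `None` can only reach the queries when the operator opted into single-user access.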
@@ -0,0 +1,74 @@
+import asyncio
+import os
+from pathlib import Path
+
+from routes import personal_routes
+
+
+class _FakePersonalDocs:
+    def __init__(self):
+        self.excluded = []
+
+    def exclude_file(self, filepath):
+        self.excluded.append(filepath)
+
+
+class _FakeRAG:
+    def __init__(self):
+        self.deleted_sources = []
+
+    def delete_by_source(self, filepath):
+        self.deleted_sources.append(filepath)
+        return 1
+
+
+def _delete_endpoint(personal_docs):
+    router = personal_routes.setup_personal_routes(personal_docs, None, True)
+    for route in router.routes:
+        if getattr(route, "path", "") == "/api/personal/file" and "DELETE" in getattr(route, "methods", set()):
+            return route.endpoint
+    raise AssertionError("DELETE /api/personal/file endpoint not found")
+
+
+def test_delete_file_refuses_symlink_directory_escape(tmp_path, monkeypatch):
+    uploads = tmp_path / "uploads"
+    uploads.mkdir()
+    outside = tmp_path / "outside"
+    outside.mkdir()
+    victim = outside / "victim.txt"
+    victim.write_text("keep me", encoding="utf-8")
+    os.symlink(outside, uploads / "linked")
+
+    docs = _FakePersonalDocs()
+    rag = _FakeRAG()
+    monkeypatch.setattr(personal_routes, "UPLOADS_DIR", str(uploads))
+    monkeypatch.setattr(personal_routes, "get_rag_manager", lambda: rag)
+
+    filepath = str(uploads / "linked" / "victim.txt")
+    result = asyncio.run(_delete_endpoint(docs)(filepath=filepath, owner="alice", _admin=None))
+
+    assert result["deleted_from_disk"] is False
+    assert victim.read_text(encoding="utf-8") == "keep me"
+    assert docs.excluded == [filepath]
+    assert rag.deleted_sources == [filepath]
+
+
+def test_delete_file_removes_regular_file_inside_upload_root(tmp_path, monkeypatch):
+    uploads = tmp_path / "uploads"
+    uploads.mkdir()
+    uploaded_file = uploads / "alice" / "notes.txt"
+    uploaded_file.parent.mkdir()
+    uploaded_file.write_text("delete me", encoding="utf-8")
+
+    docs = _FakePersonalDocs()
+    rag = _FakeRAG()
+    monkeypatch.setattr(personal_routes, "UPLOADS_DIR", str(uploads))
+    monkeypatch.setattr(personal_routes, "get_rag_manager", lambda: rag)
+
+    filepath = str(uploaded_file)
+    result = asyncio.run(_delete_endpoint(docs)(filepath=filepath, owner="alice", _admin=None))
+
+    assert result["deleted_from_disk"] is True
+    assert not uploaded_file.exists()
+    assert docs.excluded == [filepath]
+    assert rag.deleted_sources == [filepath]
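The containment check that the two deletion tests exercise might look roughly like the following. It is a sketch, not the route's actual code: `is_inside_root` is a hypothetical helper name, and the real handler in `personal_routes` may resolve paths differently. It relies only on `os.path.realpath` and `os.path.commonpath` from the standard library.

```python
import os


def is_inside_root(filepath: str, root: str) -> bool:
    """True if filepath, after resolving symlinks, is a regular file under root."""
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(filepath)
    try:
        inside = os.path.commonpath([real_root, real_path]) == real_root
    except ValueError:
        # Raised for paths on different drives (Windows) or mixed abs/relative.
        return False
    # The root itself and directories are never deletable "files".
    return inside and real_path != real_root and os.path.isfile(real_path)
```

Resolving symlinks before comparing is what makes `uploads/linked/victim.txt` fail the check: its real path lives under `outside/`, so the file on disk is left alone even though the textual path starts with the upload root.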
@@ -107,10 +107,8 @@ class TestBuildersRejectLookalikeHosts:
        assert build_chat_url("https://notanthropic.com") == "https://notanthropic.com/chat/completions"

    def test_lookalike_anthropic_models_is_openai(self):
-        # Must hit the generic OpenAI branch, not Anthropic — assert the
-        # provider directly since both branches now end in /v1/models.
        assert llm_core._detect_provider("https://anthropic.com.evil.com") == "openai"
-        assert build_models_url("https://anthropic.com.evil.com") == "https://anthropic.com.evil.com/v1/models"
+        assert build_models_url("https://anthropic.com.evil.com") == "https://anthropic.com.evil.com/models"

    def test_anthropic_domain_in_path_is_openai(self):
        assert build_chat_url("https://myproxy.internal/anthropic.com/v1") == "https://myproxy.internal/anthropic.com/v1/chat/completions"
@@ -122,9 +120,8 @@ class TestBuildersRejectLookalikeHosts:
        assert build_chat_url("https://notollama.com") == "https://notollama.com/chat/completions"

    def test_lookalike_ollama_models_is_openai(self):
-        # Must hit the generic OpenAI branch, not Ollama.
        assert llm_core._detect_provider("https://notollama.com") == "openai"
-        assert build_models_url("https://notollama.com") == "https://notollama.com/v1/models"
+        assert build_models_url("https://notollama.com") == "https://notollama.com/models"


 class TestBuildersLocalAndDockerEndpoints:
@@ -1,16 +1,18 @@
-"""Regression guard for issue #1390 — the README banner / ASCII art was not in a
-fenced code block, so GitHub's markdown collapsed its leading whitespace and the
-box-drawing rules, rendering it misaligned instead of monospace-as-typed.
+"""Regression guard for the README title presentation.

-This pins that the decorative banner stays inside a ``` code fence.
+Originally (#1390) the README opened with an ASCII-art banner that had to live
+inside a ``` code fence, otherwise GitHub's markdown collapsed its leading
+whitespace and box-drawing rules and rendered it misaligned. The README refresh
+(#4306) dropped that banner in favour of a centered wordmark image, so the guard
+now pins the wordmark identity instead, while still catching the original failure
+mode if an un-fenced ASCII banner is ever reintroduced.
 """
 from pathlib import Path

 README = Path(__file__).resolve().parent.parent / "README.md"

-# Distinctive bits of the banner (box-drawing rule + the kaomoji version line).
+# Box-drawing rule from the legacy ASCII banner (the #1390 failure mode).
 _RULE = "─" * 10
-_BANNER_LINE = "Odysseus vers. 1.0"


 def _fenced_segments(text: str):
@@ -20,15 +22,18 @@ def _fenced_segments(text: str):
    return parts[1::2]


-def test_readme_banner_is_inside_a_code_fence():
+def test_readme_opens_with_wordmark_title():
+    # The README must still open with a recognizable Odysseus title: now the
+    # centered wordmark image rather than an H1 / ASCII banner.
+    head = "\n".join(README.read_text(encoding="utf-8").splitlines()[:15])
+    assert 'alt="Odysseus"' in head, "README must open with the Odysseus wordmark image"
+
+
+def test_reintroduced_ascii_banner_stays_fenced():
+    # Defensive: if a box-drawing banner is ever added back, it must be fenced so
+    # GitHub renders it monospace-as-typed (the original #1390 regression).
    text = README.read_text(encoding="utf-8")
-    assert _BANNER_LINE in text, "banner line missing from README"
+    if _RULE not in text:
+        return
    inside = "\n".join(_fenced_segments(text))
-    assert _BANNER_LINE in inside, "banner version line must be inside a ``` code fence"
-    assert _RULE in inside, "banner rule line must be inside a ``` code fence"
-
-
-def test_readme_title_stays_a_heading():
-    # The H1 must remain a real heading, not get swallowed into the fence.
-    first = README.read_text(encoding="utf-8").splitlines()[0]
-    assert first.strip() == "# Odysseus"
+    assert _RULE in inside, "ASCII banner rule must be inside a ``` code fence"
@@ -0,0 +1,50 @@
+import os
+import sys
+from unittest import mock
+import pytest
+from src.runtime_paths import get_app_root, get_default_data_dir
+
+
+def test_get_app_root_normal_run():
+    """Verify that get_app_root returns the repository root (the parent of src/) when not frozen."""
+    with mock.patch.object(sys, "frozen", False, create=True):
+        app_root = get_app_root()
+        # Verify it is a valid directory path and matches expected parent structure
+        assert os.path.isdir(app_root)
+        assert os.path.exists(os.path.join(app_root, "src"))
+
+
+def test_get_app_root_frozen_with_meipass():
+    """Verify that get_app_root returns the sys._MEIPASS directory when frozen by PyInstaller."""
+    mock_meipass = os.path.abspath("mock_meipass_dir")
+    with mock.patch.object(sys, "frozen", True, create=True), \
+         mock.patch.object(sys, "_MEIPASS", mock_meipass, create=True):
+        app_root = get_app_root()
+        assert app_root == mock_meipass
+
+
+def test_get_app_root_frozen_without_meipass():
+    """Verify that get_app_root falls back to the sys.executable parent directory when frozen but _MEIPASS is absent."""
+    mock_exe_path = os.path.join(os.path.abspath("mock_exe_dir"), "Odysseus.exe")
+    with mock.patch.object(sys, "frozen", True, create=True), \
+         mock.patch.object(sys, "executable", mock_exe_path, create=True):
+        # Remove sys._MEIPASS if it exists in the test process environment
+        if hasattr(sys, "_MEIPASS"):
+            delattr(sys, "_MEIPASS")
+        app_root = get_app_root()
+        assert app_root == os.path.abspath("mock_exe_dir")
+
+
+def test_get_default_data_dir_normal():
+    """Verify that get_default_data_dir resolves to get_app_root() / 'data' when not frozen."""
+    with mock.patch.object(sys, "frozen", False, create=True):
+        res = get_default_data_dir()
+        assert res == os.path.join(get_app_root(), "data")
+
+
+def test_get_default_data_dir_frozen():
+    """Verify that get_default_data_dir resolves to a persistent user path under ~ when frozen."""
+    with mock.patch.object(sys, "frozen", True, create=True):
+        res = get_default_data_dir()
+        expected = os.path.join(os.path.expanduser("~"), ".odysseus", "data")
+        assert res == expected
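For orientation, the behaviour these tests describe could be implemented along these lines. This is a sketch under assumptions, not the contents of `src/runtime_paths.py`: the real functions take no arguments, whereas this version takes a hypothetical `module_file` parameter so it can run outside the repository. `sys.frozen` and `sys._MEIPASS` are the attributes PyInstaller sets in a bundled app.

```python
import os
import sys


def get_app_root(module_file: str) -> str:
    """Return the directory that bundled resources are resolved against."""
    if getattr(sys, "frozen", False):
        # PyInstaller unpacks resources into sys._MEIPASS when it is set.
        meipass = getattr(sys, "_MEIPASS", None)
        if meipass:
            return meipass
        # Frozen without _MEIPASS: fall back to the executable's directory.
        return os.path.dirname(os.path.abspath(sys.executable))
    # Source checkout (assumed layout): <repo>/src/<module>.py -> <repo>
    return os.path.dirname(os.path.dirname(os.path.abspath(module_file)))


def get_default_data_dir(module_file: str) -> str:
    """Return where user data lives by default."""
    if getattr(sys, "frozen", False):
        # A frozen bundle may be read-only or temporary, so use a home path.
        return os.path.join(os.path.expanduser("~"), ".odysseus", "data")
    return os.path.join(get_app_root(module_file), "data")
```

Keeping the frozen data directory under the user's home, rather than next to the unpacked bundle, matches what `test_get_default_data_dir_frozen` expects and keeps data persistent across runs.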