Refresh README presentation

test(provider): align lookalike-host URL expectations with /models behavior
build_models_url returns /models (no /v1 prefix) for non-local generic OpenAI-compatible hosts (intentional, see endpoint_resolver.py:206). The tests added in #4272 expected /v1/models, which is the local/deepseek behavior. Match production semantics.
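As a rough illustration of the behavior this message describes, the sketch below picks the models path by host. It is not the code in `endpoint_resolver.py`: the function name `sketch_models_url`, the loopback host set, and the DeepSeek domain check are assumptions made for this example, and it assumes the base URL carries no path of its own.

```python
from urllib.parse import urlsplit

# Hypothetical host sets for illustration only; the real resolver may differ.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}
V1_DOMAINS = ("deepseek.com",)


def _is_host_or_subdomain(host: str, domain: str) -> bool:
    """True only for the domain itself or a genuine subdomain of it."""
    return host == domain or host.endswith("." + domain)


def sketch_models_url(base_url: str) -> str:
    """Return the model-list URL for an OpenAI-compatible base URL."""
    host = (urlsplit(base_url).hostname or "").lower()
    base = base_url.rstrip("/")
    uses_v1 = host in LOCAL_HOSTS or any(
        _is_host_or_subdomain(host, domain) for domain in V1_DOMAINS
    )
    # Local and DeepSeek-style hosts keep the /v1 prefix; every other host,
    # including lookalikes that merely contain a known name, gets /models.
    return base + ("/v1/models" if uses_v1 else "/models")
```

Matching on the parsed hostname rather than on a substring of the URL is what keeps a lookalike such as `api.deepseek.com.evil.example` on the generic `/models` path.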
2026-06-17 18:25:26 -04:00 · 2026-06-15 23:24:41 +09:00 · 2026-06-15 23:21:49 +09:00 · 2026-06-15 23:13:18 +09:00 · 2026-06-15 14:07:49 +00:00 · 2026-06-15 23:02:46 +09:00
15 changed files with 704 additions and 553 deletions
@@ -1,61 +0,0 @@
 # CodeQL code scanning
 #
 # Purpose: GitHub's own static analysis engine reads the application source
 # (Python backend + the JavaScript frontend) and looks for real
 # vulnerabilities -- SQL/command injection, path traversal, auth mistakes,
 # unsafe deserialization. Findings appear in the repo's Security tab. This is
 # the deepest check in the suite and the most valuable for a high-profile
 # target.
 #
 # It runs on every push to main and on a weekly schedule (to catch newly
 # disclosed query patterns against unchanged code). It deliberately does NOT
 # run on pull requests: most PRs here come from forks, whose read-only token
 # cannot publish results, which would produce confusing failures. To scan pull
 # requests too, a maintainer can instead enable CodeQL "default setup" in
 # Settings -> Security -> Code scanning (one toggle, no file needed) -- see
 # docs/security-ci.md.
 name: CodeQL
 on:
  push:
    branches: [main]
  schedule:
    # Weekly, Monday 06:00 UTC.
    - cron: '0 6 * * 1'
  workflow_dispatch:
 permissions: {}
 concurrency:
  group: codeql-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # publish results to the Security tab
    strategy:
      fail-fast: false
      matrix:
        # Both are interpreted, so CodeQL needs no build step (build-mode none).
        language: [python, javascript-typescript]
    steps:
      - name: Checkout repository
        uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10  # v6.0.3
        with:
          persist-credentials: false
      - name: Initialize CodeQL
        uses: github/codeql-action/init@8aad20d150bbac5944a9f9d289da16a4b0d87c1e  # v4.36.2
        with:
          languages: ${{ matrix.language }}
          build-mode: none
      - name: Perform CodeQL analysis
        uses: github/codeql-action/analyze@8aad20d150bbac5944a9f9d289da16a4b0d87c1e  # v4.36.2
        with:
          category: "/language:${{ matrix.language }}"
@@ -1,476 +1,65 @@
-# Odysseus
+<p align="center">
  <img src="docs/odysseus-wordmark.png" alt="Odysseus" width="280">
 </p>
-> **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
+<p align="center">
  A self-hosted AI workspace for chat, agents, research, documents, email, notes, calendar, and local model workflows.
 </p>
-```
+<p align="center">
-───────────────────────────────────────────────
+  <a href="#quick-start">Quick Start</a> ·
- ⊹ ࣪ ˖ ૮( ˶ᵔ ᵕ ᵔ˶ )っ  Odysseus vers. 1.0
+  <a href="docs/setup.md">Setup Guide</a> ·
-───────────────────────────────────────────────
+  <a href="CONTRIBUTING.md">Contributing</a> ·
-```
+  <a href="ROADMAP.md">Roadmap</a>
 </p>
-![Odysseus](docs/odysseus.jpg)
+<p align="center">
  <a href="https://repology.org/project/odysseus-ai/versions"><img src="https://repology.org/badge/vertical-allrepos/odysseus-ai.svg" alt="Packaging status"></a>
 </p>
-A self-hosted AI workspace -- meant to be the self-hosted version of the UI experience you get from ChatGPT and Claude. But with more jank and fun. Running on your own hardware, with your own data -- local-first, privacy-first, and no trojan.
+<p align="center">
  <img src="docs/odysseus.jpg" alt="Odysseus interface">
 </p>
-[![Packaging status](https://repology.org/badge/vertical-allrepos/odysseus-ai.svg)](https://repology.org/project/odysseus-ai/versions)
+---
 ## Features
  - **Chat** -- chat with any local model or API; adding them is super simple.<br>　<sub>vLLM · llama.cpp · Ollama · OpenRouter · OpenAI · GitHub Copilot</sub>
  - **Agent** -- hand it tools and let it run the whole task itself.<br>　<sub>built on [opencode](https://github.com/anomalyco/opencode) · MCP · web · files · shell · skills · memory</sub>
  - **Cookbook** -- Scans your hardware, recommends models, click to download and serve. Easy!<br>　<sub>built on [llmfit](https://github.com/AlexsJones/llmfit) · VRAM-aware · GGUF / FP8 / AWQ · fit scoring · vLLM / llama.cpp serving</sub>
  - **Deep Research** -- multi-step runs that gather, read, and synthesize sources into a nice visual report.<br>　<sub>adapted from [Tongyi DeepResearch](https://github.com/Alibaba-NLP/DeepResearch)</sub>
  - **Compare** -- a fun tool to compare models side by side. Test completely blind, no bias!<br>　<sub>multi-model · blind test · synthesis</sub>
  - **Documents** -- YOU write the text, AI is there to assist, not the opposite.<br>　<sub>multi-tab editor · markdown · HTML · CSV · syntax highlighting · AI edits · suggestions</sub>
  - **Memory / Skills** -- Persistent memory and skills, your agent evolves over time as it better understands you and your tasks!<br>　<sub>ChromaDB · fastembed (ONNX) · vector + keyword retrieval · import/export</sub>
  - **Email** -- IMAP/SMTP inbox with AI triage built in: urgency reminders, auto-tag, auto-summary, auto-reply drafts, auto-spam.<br>　<sub>IMAP · SMTP · per-account routing · CalDAV-aware</sub>
  - **Notes & Tasks** -- Quick notes with reminders, a todo list, and scheduled tasks the agent can act on.<br>　<sub>note pings · checklist · cron-style tasks · ntfy / browser / email channels</sub>
  - **Calendar** -- Local-first calendar with CalDAV sync to Radicale / Nextcloud / Apple / Fastmail.<br>　<sub>CalDAV pull · .ics import/export · per-calendar colors · agent-aware</sub>
  - **Works on mobile** -- looks and runs great on your phone, not just desktop.<br>　<sub>responsive · installable (PWA) · touch gestures</sub>
  - **Extras** -- more to explore, happy if you give it a go!<br>　<sub>image editor · theme editor · file uploads (vision + PDF) · web search · presets · sessions · 2FA</sub>
 ## Demo
 A full, hover-to-play tour lives on the landing page (`docs/index.html`).
 <details>
 <summary>Screenshots / clips</summary>
 ### Chat & Agents
 ![Chat & Agents](docs/chat.gif)
 ### Deep Research
 ![Deep Research](docs/research.gif)
 ### Compare
 ![Compare](docs/compare.gif)
 ### Documents
 ![Documents](docs/document.gif)
 ### Notes & Tasks
 ![Notes & Tasks](docs/notes.gif)
 </details>
 ## Quick Start
-Defaults work out of the box: clone, run, then configure models/search/email
+> `dev` is the default branch and gets the newest changes first. Use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main) if you want the more curated branch.
 inside **Settings**. Only edit `.env` for deployment-level overrides like
 `APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
 On first setup, Odysseus creates an admin account (`admin` unless
 `ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
 For Docker installs, the same line is in `docker compose logs odysseus`.
 Use that for the first login, then change it in **Settings**.
 Contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, testing, and
 pull request guidelines.
 ### Docker (recommended)
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
-cp .env.example .env       # optional, but recommended for explicit defaults
+cp .env.example .env
 docker compose up -d --build
 ```
 To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.
-Open `http://localhost:7000` when the containers are healthy. Docker Compose
+Open `http://localhost:7000` when the containers are healthy. The first admin password is printed in `docker compose logs odysseus`.
 binds the web UI to `127.0.0.1` by default. If the port is taken, set
 `APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
 only when you intentionally want LAN/reverse-proxy access.
-> **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
+Native installs, GPU notes, Windows/macOS instructions, HTTPS, and configuration live in the [setup guide](docs/setup.md).
 > Cookbook serves local models on CPU only. For GPU-accelerated model serving,
 > run natively instead — see [Apple Silicon](#apple-silicon) below.
-### Native Linux / macOS
+## Features
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 python3 -m venv venv
 source venv/bin/activate
 pip install -r requirements.txt
 python setup.py
 python -m uvicorn app:app --host 127.0.0.1 --port 7000
 ```
 Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
 downloads and serves. The app itself is lightweight; local model serving is the
 heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
 connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
-### Apple Silicon
+- **Chat + Agents** — local/API models, tools, MCP, files, shell, skills, and memory.
-Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
+- **Cookbook** — hardware-aware model recommendations, downloads, and serving.
-M-series Mac, run Odysseus natively:
+- **Deep Research** — multi-step web research with source reading and report generation.
 - **Compare** — blind side-by-side model testing and synthesis.
 - **Documents** — writing-first editor with AI edits, suggestions, Markdown, HTML, CSV, and syntax highlighting.
 - **Email** — IMAP/SMTP inbox with triage, tags, summaries, reminders, and reply drafts.
 - **Notes, Tasks + Calendar** — reminders, todos, scheduled agent tasks, and CalDAV sync.
 - **Extras** — gallery/image editor, themes, uploads, web search, presets, sessions, and 2FA.
-```bash
+## Demo
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 ./start-macos.sh
 ```
-It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
+A full hover-to-play tour lives on the landing page: [`docs/index.html`](docs/index.html).
 ```bash
 ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
 # then open http://<tailscale-ip>:7860
 ```
 The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
 set there are picked up automatically without a command-line override each run.
 Keep `AUTH_ENABLED=true` (the default) before binding outside loopback. Do not
 expose this port directly to the public internet. To build a clickable app wrapper:
 ```bash
 ./build-macos-app.sh
 ```
 <details>
 <summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
 **Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
 ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
 they are reachable from the host but not exposed to your LAN/public internet
 unless you opt in.
 **Cookbook storage in Docker.** Downloads live in `./data/huggingface`
 (`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
 serve engines live in `./data/local` (`~/.local` in the container), so they
 survive container recreation.
 **Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
 Odysseus SSH key and add the public key to the remote server's
 `~/.ssh/authorized_keys`. From the host you can also run:
 ```bash
 ssh-copy-id -i data/ssh/id_ed25519.pub user@server
 ```
 **Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
 only detect GPUs that Docker exposes to the container — if the host runtime or
 device passthrough is not configured, Cookbook sees the iGPU, another card, or
 CPU instead of your intended GPU.
 For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
 optionally install the host runtime or update `.env`.
 ```bash
 # Read-only diagnostic (default — installs nothing, never edits .env):
 scripts/check-docker-gpu.sh
 # Print OS-specific install commands without running them:
 scripts/check-docker-gpu.sh --print-install-commands
 # Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
 scripts/check-docker-gpu.sh --install-nvidia-toolkit
 # Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
 scripts/check-docker-gpu.sh --enable-nvidia-overlay
 # Full assisted setup — install toolkit, then enable overlay if passthrough works:
 scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
 ```
 Safety notes:
 - The app never installs host GPU runtime automatically.
 - The app never edits `.env` automatically.
 - `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
  and only after GPU passthrough succeeds. `--yes` skips prompts but does not
  bypass the passthrough gate.
 - `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
  Git and the Docker build context.
 To enable manually without the script, add this to `.env`:
 ```bash
 COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
 ```
 **AMD / ROCm.** AMD setup is read-only diagnostic plus manual `.env` edit. Run:
 ```bash
 scripts/check-docker-amd-gpu.sh
 ```
 Then add the reported values to `.env`, replacing `RENDER_GID` with your host's
 numeric render group id:
 ```bash
 COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
 RENDER_GID=989
 ```
 For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: `docker/gpu.nvidia.yml` or `docker/gpu.amd.yml`.
 **Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
 often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
 or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
 overlay workflow above. For stack UIs, point the stack at one of the standalone
 files instead, which bundle the base stack plus the GPU settings:
 - `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
  on the host.
 - `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
  `video`/`render` group membership, and `RENDER_GID` when needed.
 The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
 source of truth; the standalone files mirror them for single-file deployments.
 Verify after enabling either overlay:
 ```bash
 docker compose exec odysseus nvidia-smi -L   # NVIDIA
 docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*'  # AMD
 ```
 > **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
 > container confirms Docker GPU access, but llama.cpp also needs `cudart` and
 > the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
 > library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
 > tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
 > not a Docker passthrough failure. Reinstall the serve engine via
 > **Cookbook → Dependencies** to get a CUDA-enabled build.
 >
 > The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
 > the container confirms device passthrough, not ROCm userspace or a
 > ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
 > inside the slim Odysseus image.
 **Ollama with Docker.** If Ollama runs on the host, add this endpoint in
 Settings:
 ```text
 http://host.docker.internal:11434/v1
 ```
 Ollama must listen outside its own loopback interface:
 ```bash
 OLLAMA_HOST=0.0.0.0:11434 ollama serve
 ```
 This connects Odysseus in Docker to an Ollama server that is already running on
 your host machine; it does not start Ollama inside the container.
 `host.docker.internal` is Docker's hostname for the host machine from inside the
 container. Cookbook **Serve** is a separate workflow for serving downloaded
 models through Odysseus/llama.cpp, so Windows users with an existing Ollama
 install usually only need to add the endpoint in Settings.
 **Useful checks.**
 ```bash
 docker compose ps
 docker compose logs --tail=120 odysseus
 docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
 ```
 **macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
 runs setup, and starts uvicorn on port `7860` because AirPlay often holds
 `7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
 do not run on macOS. MLX-only models are not served by Odysseus.
 </details>
 ### Native Windows
 **One-command launcher** (creates the venv, installs deps, runs setup, starts the
 server; safe to re-run):
 ```powershell
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
 ```
 Or do it by hand:
 ```powershell
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 py -3.11 -m venv venv
 venv\Scripts\Activate.ps1
 pip install -r requirements.txt
 python setup.py
 python -m uvicorn app:app --host 127.0.0.1 --port 7000
 ```
 If `python` points at an older interpreter, use `py -3.12` (or another installed
 3.11+ version) for the venv step.
 **Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
 email, calendar, deep research) runs fully native. For full **Cookbook** background
 model downloads and the agent shell tool, also install
 [Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
 Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
 [Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
 `http://localhost:11434/v1` in Settings.
 Open `http://localhost:7000`, log in with the generated admin password,
 and configure everything else inside **Settings**.
 ## Troubleshooting & Advanced Setup
 ### `chromadb-client` conflicts with embedded ChromaDB
 If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
 **Fix:** uninstall `chromadb-client` and force-reinstall the full package:
 ```bash
 ./venv/bin/pip uninstall chromadb-client -y
 ./venv/bin/pip install --force-reinstall chromadb
 ```
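To check for this conflict before starting the app, a small script can ask the active environment which distributions are installed. This is a minimal sketch, not part of Odysseus: the helper names `installed` and `chromadb_conflict` are hypothetical, and it only uses the standard-library `importlib.metadata` module. Run it with the venv's interpreter so it inspects the right environment.

```python
from importlib import metadata


def installed(dist_name: str) -> bool:
    """Report whether a distribution is installed in this environment."""
    try:
        metadata.version(dist_name)
        return True
    except metadata.PackageNotFoundError:
        return False


def chromadb_conflict(is_installed=installed) -> bool:
    """True when both the full package and the HTTP-only client are present."""
    return is_installed("chromadb") and is_installed("chromadb-client")


if __name__ == "__main__":
    if chromadb_conflict():
        print("Both chromadb and chromadb-client are installed; remove chromadb-client.")
    else:
        print("No chromadb/chromadb-client conflict detected.")
```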
 ### HTTPS + LAN/Tailscale exposure
 To expose Odysseus on a local network or Tailscale with HTTPS:
 1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
 2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
   ```bash
   mkcert -install
   mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
   ```
 3. Run `uvicorn` with the generated certs:
   ```bash
   python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
   ```
 4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
 ### Optional Dependencies
 `requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
 | Package | Feature unlocked |
 |---------|-----------------|
 | `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
 | `ddgs` | DuckDuckGo as a search provider option. |
 | `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
 | `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
 ### Faster, reproducible installs with uv (optional)
 [uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the
 venv + pip steps in the native install guides. No project changes are needed, and you get faster installs plus the option of a lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
 ```bash
 uv venv venv --python 3.13
 uv pip install -r requirements.txt
 # then continue as usual: python setup.py, uvicorn, ...
 ```
 `requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
 ```bash
 uv pip compile requirements.txt -o requirements.lock   # snapshot current resolution
 uv pip sync requirements.lock                          # reproduce it exactly later
 ```
 `requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
 ### Outlook / Office 365 email
 Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
 and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
 passwords will fail. See [docs/email-outlook.md](docs/email-outlook.md) for the
 current limitation and the planned integration direction.
 ## Security Notes
 Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
 - Keep `AUTH_ENABLED=true` for any network-accessible deployment.
 - Keep `LOCALHOST_BYPASS=false` outside local development.
 - Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
 - Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
 - Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
 - Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
 - Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
 - Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
 - If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
 - Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
 - Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
 - Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
 ### Private or proxied deployments
 Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
 1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
 2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
 3. Put the authenticated Odysseus web/API entrypoint behind that layer.
 4. Keep raw service and model ports internal-only.
 Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
 `ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
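To make "exact permitted origins" concrete, here is a minimal sketch of exact-match origin checking. How Odysseus actually parses `ALLOWED_ORIGINS` is not shown in this document, so the function names and the trailing-slash normalization below are assumptions for illustration.

```python
def parse_allowed_origins(raw: str) -> set[str]:
    """Split the comma-separated value, dropping blanks and trailing slashes."""
    return {item.strip().rstrip("/") for item in raw.split(",") if item.strip()}


def origin_allowed(origin: str, allowed: set[str]) -> bool:
    """Exact match on scheme://host[:port]; no prefix or wildcard matching."""
    return origin.rstrip("/") in allowed


# The documented default value of ALLOWED_ORIGINS.
allowed = parse_allowed_origins("http://localhost,http://127.0.0.1")
print(origin_allowed("http://localhost", allowed))
print(origin_allowed("http://localhost:7000", allowed))
```

Under this reading, an origin with a different port or scheme is a different origin and would need its own entry in the list.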
 Common internal-only ports from the default docs/compose setup:
 | Port | Service |
 |---|---|
 | `7000` | Odysseus raw app port |
 | `8080` | SearXNG |
 | `8091` | ntfy |
 | `8100` | ChromaDB host port for manual/compose access |
 | `11434` | Ollama |
 | `8000-8020` | Common local model/provider APIs |
 ## Contributing
 Help is welcome. The best entry points are fresh-install testing, provider setup
 bugs, mobile/editor polish, docs, and small focused refactors. See
 [ROADMAP.md](ROADMAP.md) for the current help-wanted list.
-## Configuration
+Help is welcome. The best entry points are fresh-install testing, provider setup bugs, mobile/editor polish, docs, and small focused refactors. See [CONTRIBUTING.md](CONTRIBUTING.md) and [ROADMAP.md](ROADMAP.md).
 Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
 for deployment-level defaults and secrets you want present before first boot.
 Key settings:
-| Variable | Default | Description |
+## Security
 |---|---|---|
 | `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
 | `LLM_HOSTS` | -- | Comma-separated list for model discovery |
 | `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
 | `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
 | `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
 | `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
 | `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
 | `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
 | `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
 | `AUTH_ENABLED` | `true` | Enable/disable login |
 | `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
 | `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
 | `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
 | `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
 | `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
 | `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
 | `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
 | `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes. Raise for larger local PDFs or text documents. |
 | `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
 | `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
 | `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
 | `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
 | `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
 | `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
 | `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
-All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
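The positive-integer rule for the upload-limit variables can be sketched as follows. This is an illustration under stated assumptions, not the Odysseus implementation: `read_upload_limit` is a hypothetical helper, and only an unset variable falls back to the default.

```python
import os


def read_upload_limit(name: str, default: int, env=os.environ) -> int:
    """Return the byte cap for `name`, or raise ValueError on a bad value."""
    raw = env.get(name)
    if raw is None:
        return default  # optional: unset falls back to the documented default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


# Called once during startup, a bad value stops the app before it serves requests.
chat_cap = read_upload_limit("ODYSSEUS_CHAT_UPLOAD_MAX_BYTES", 10485760)
```

Raising at startup rather than at upload time means a typo such as `10MB` surfaces immediately instead of on the first large attachment.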
+Odysseus is a self-hosted workspace with powerful local tools. Keep auth enabled, keep private data out of Git, and do not expose raw model/service ports publicly. Deployment details are in the [setup guide](docs/setup.md#security-notes).
 ### Built-in MCP servers (optional setup)
 Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
 To enable the browser MCP (page navigation, screenshots, vision), run once:
 ```bash
 npx -y @playwright/mcp@latest --version
 ```
 That installs `@playwright/mcp` plus Playwright (~300MB total). Restart Odysseus and the server will register at startup.
 ## Architecture
 ```
 app.py                   # FastAPI entry point
 core/      auth, database, middleware, constants
 src/       llm_core, agent_loop, agent_tools, chat_processor, search/
 routes/    chat, session, document, memory, model … endpoints
 services/  docs, memory, search, hwfit (Cookbook) …
 static/    index.html + app.js + style.css + js/ (modular front-end)
 docs/      landing page (index.html) + preview clips
 ```
 ## Data
 All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
 `memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
 To back up or restore everything in `data/`, see the
 [Backup & Restore guide](docs/backup-restore.md).
 ## Star History
@@ -483,19 +72,5 @@ To back up or restore everything in `data/`, see the
 </a>
 ## License
 AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).
-```
+AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).
                                  |
                                 |||
                                |||||
                  |    |    |   |||||||
                 )_)  )_)  )_)   ~|~
                )___))___))___)\  |
               )____)____)_____)\\|
             _____|____|____|_____\\\__
             \                       /
       ~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
               ~^~  all aboard!  ~^~
       ~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
 ```
@@ -0,0 +1,425 @@
 # Odysseus Setup Guide
 This page keeps the detailed install, deployment, troubleshooting, and configuration notes out of the front README.
 ## Quick Start
 > **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
 Defaults work out of the box: clone, run, then configure models/search/email
 inside **Settings**. Only edit `.env` for deployment-level overrides like
 `APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
 On first setup, Odysseus creates an admin account (`admin` unless
 `ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
 For Docker installs, the same line is in `docker compose logs odysseus`.
 Use that for the first login, then change it in **Settings**.
 Contributing? See [CONTRIBUTING.md](../CONTRIBUTING.md) for setup, testing, and
 pull request guidelines.
 ### Docker (recommended)
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 cp .env.example .env       # optional, but recommended for explicit defaults
 docker compose up -d --build
 ```
 To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.
 Open `http://localhost:7000` when the containers are healthy. Docker Compose
 binds the web UI to `127.0.0.1` by default. If the port is taken, set
 `APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
 only when you intentionally want LAN/reverse-proxy access.
 > **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
 > Cookbook serves local models on CPU only. For GPU-accelerated model serving,
 > run natively instead — see [Apple Silicon](#apple-silicon) below.
 ### Native Linux / macOS
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 python3 -m venv venv
 source venv/bin/activate
 pip install -r requirements.txt
 python setup.py
 python -m uvicorn app:app --host 127.0.0.1 --port 7000
 ```
 Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
 downloads and serving. The app itself is lightweight; local model serving is the
 heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
 connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
 ### Apple Silicon
 Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
 M-series Mac, run Odysseus natively:
 ```bash
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 ./start-macos.sh
 ```
 It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
 ```bash
 ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
 # then open http://<tailscale-ip>:7860
 ```
 The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
 set there are picked up automatically without a command-line override each run.
 Keep `AUTH_ENABLED=true` (the default) before binding outside loopback. Do not
 expose this port directly to the public internet. To build a clickable app wrapper:
 ```bash
 ./build-macos-app.sh
 ```
 <details>
 <summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
 **Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
 ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
 they are reachable from the host but not exposed to your LAN/public internet
 unless you opt in.
 **Cookbook storage in Docker.** Downloads live in `./data/huggingface`
 (`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
 serve engines live in `./data/local` (`~/.local` in the container), so they
 survive container recreation.
 **Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
 Odysseus SSH key and add the public key to the remote server's
 `~/.ssh/authorized_keys`. From the host you can also run:
 ```bash
 ssh-copy-id -i data/ssh/id_ed25519.pub user@server
 ```
 **Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
 only detect GPUs that Docker exposes to the container — if the host runtime or
 device passthrough is not configured, Cookbook sees the iGPU, another card, or
 CPU instead of your intended GPU.
 For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
 optionally install the host runtime or update `.env`.
 ```bash
 # Read-only diagnostic (default — installs nothing, never edits .env):
 scripts/check-docker-gpu.sh
 # Print OS-specific install commands without running them:
 scripts/check-docker-gpu.sh --print-install-commands
 # Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
 scripts/check-docker-gpu.sh --install-nvidia-toolkit
 # Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
 scripts/check-docker-gpu.sh --enable-nvidia-overlay
 # Full assisted setup — install toolkit, then enable overlay if passthrough works:
 scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
 ```
 Safety notes:
 - The app never installs host GPU runtime automatically.
 - The app never edits `.env` automatically.
 - `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
  and only after GPU passthrough succeeds. `--yes` skips prompts but does not
  bypass the passthrough gate.
 - `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
  Git and the Docker build context.
 To enable manually without the script, add this to `.env`:
 ```bash
 COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
 ```
 **AMD / ROCm.** AMD setup is read-only diagnostic plus manual `.env` edit. Run:
 ```bash
 scripts/check-docker-amd-gpu.sh
 ```
 Then add the reported values to `.env`, replacing `RENDER_GID` with your host's
 numeric render group id:
 ```bash
 COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
 RENDER_GID=989
 ```
 For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: `docker/gpu.nvidia.yml` or `docker/gpu.amd.yml`.
 **Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
 often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
 or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
 overlay workflow above. For stack UIs, point the stack at one of the standalone
 files instead, which bundle the base stack plus the GPU settings:
 - `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
  on the host.
 - `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
  `video`/`render` group membership, and `RENDER_GID` when needed.
 The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
 source of truth; the standalone files mirror them for single-file deployments.
 Verify after enabling either overlay:
 ```bash
 docker compose exec odysseus nvidia-smi -L   # NVIDIA
 docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*'  # AMD
 ```
 > **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
 > container confirms Docker GPU access, but llama.cpp also needs `cudart` and
 > the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
 > library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
 > tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
 > not a Docker passthrough failure. Reinstall the serve engine via
 > **Cookbook → Dependencies** to get a CUDA-enabled build.
 >
 > The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
 > the container confirms device passthrough, not ROCm userspace or a
 > ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
 > inside the slim Odysseus image.
 **Ollama with Docker.** If Ollama runs on the host, add this endpoint in
 Settings:
 ```text
 http://host.docker.internal:11434/v1
 ```
 Ollama must listen outside its own loopback interface:
 ```bash
 OLLAMA_HOST=0.0.0.0:11434 ollama serve
 ```
 This connects Odysseus in Docker to an Ollama server that is already running on
 your host machine; it does not start Ollama inside the container.
 `host.docker.internal` is Docker's hostname for the host machine from inside the
 container. Cookbook **Serve** is a separate workflow for serving downloaded
 models through Odysseus/llama.cpp, so Windows users with an existing Ollama
 install usually only need to add the endpoint in Settings.
 **Useful checks.**
 ```bash
 docker compose ps
 docker compose logs --tail=120 odysseus
 docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
 ```
 **macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
 runs setup, and starts uvicorn on port `7860` because AirPlay often holds
 `7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
 do not run on macOS. MLX-only models are not served by Odysseus.
 </details>
 ### Native Windows
 **One-command launcher** (creates the venv, installs deps, runs setup, starts the
 server; safe to re-run):
 ```powershell
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
 ```
 Or do it by hand:
 ```powershell
 git clone https://github.com/pewdiepie-archdaemon/odysseus.git
 cd odysseus
 py -3.11 -m venv venv
 venv\Scripts\Activate.ps1
 pip install -r requirements.txt
 python setup.py
 python -m uvicorn app:app --host 127.0.0.1 --port 7000
 ```
 If `python` points at an older interpreter, use `py -3.12` (or another installed
 3.11+ version) for the venv step.
 **Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
 email, calendar, deep research) runs fully native. For full **Cookbook** background
 model downloads and the agent shell tool, also install
 [Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
 Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
 [Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
 `http://localhost:11434/v1` in Settings.
 Open `http://localhost:7000`, log in with the generated admin password,
 and configure everything else inside **Settings**.
 ## Troubleshooting & Advanced Setup
 ### `chromadb-client` conflicts with embedded ChromaDB
 If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
 **Fix:** uninstall `chromadb-client` and force-reinstall the full package:
 ```bash
 ./venv/bin/pip uninstall chromadb-client -y
 ./venv/bin/pip install --force-reinstall chromadb
 ```
 ### HTTPS + LAN/Tailscale exposure
 To expose Odysseus on a local network or Tailscale with HTTPS:
 1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
 2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
   ```bash
   mkcert -install
   mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
   ```
 3. Run `uvicorn` with the generated certs:
   ```bash
   python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
   ```
 4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
 ### Optional Dependencies
 `requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
 | Package | Feature unlocked |
 |---------|-----------------|
 | `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
 | `ddgs` | DuckDuckGo as a search provider option. |
 | `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
 | `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
 ### Faster, reproducible installs with uv (optional)
 [uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the
 venv + pip steps in the native install guides. No project changes are needed, and you get faster installs plus an optional lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
 ```bash
 uv venv venv --python 3.13
 uv pip install -r requirements.txt
 # then continue as usual: python setup.py, uvicorn, ...
 ```
 `requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
 ```bash
 uv pip compile requirements.txt -o requirements.lock   # snapshot current resolution
 uv pip sync requirements.lock                          # reproduce it exactly later
 ```
 `requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
 ### Outlook / Office 365 email
 Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
 and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
 passwords will fail. See [docs/email-outlook.md](docs/email-outlook.md) for the
 current limitation and the planned integration direction.
 ## Security Notes
 Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
 - Keep `AUTH_ENABLED=true` for any network-accessible deployment.
 - Keep `LOCALHOST_BYPASS=false` outside local development.
 - Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
 - Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
 - Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
 - Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
 - Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
 - Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
 - If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
 - Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
 - Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
 - Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
 ### Private or proxied deployments
 Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
 1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
 2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
 3. Put the authenticated Odysseus web/API entrypoint behind that layer.
 4. Keep raw service and model ports internal-only.
 Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
 `ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
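To make "exact permitted origins" concrete, here is a minimal sketch of how a comma-separated `ALLOWED_ORIGINS` value might be parsed and checked. The function names `parse_allowed_origins` and `origin_allowed` are hypothetical, and stripping a trailing slash is an assumption; the real check in Odysseus may differ. The point is only that matching is by exact string (scheme, host, and port), not by prefix or wildcard.

```python
def parse_allowed_origins(raw: str) -> frozenset:
    """Split a comma-separated origin list, dropping blanks.

    Assumption for this sketch: a trailing slash is ignored.
    """
    return frozenset(
        part.strip().rstrip("/") for part in raw.split(",") if part.strip()
    )


def origin_allowed(origin: str, allowed: frozenset) -> bool:
    """Exact match only: scheme, host, and port must all be identical."""
    return origin.rstrip("/") in allowed


# Default value taken from the Configuration table.
allowed = parse_allowed_origins("http://localhost,http://127.0.0.1")
print(origin_allowed("http://localhost", allowed))               # True
print(origin_allowed("http://localhost:7000", allowed))          # False, port differs
print(origin_allowed("http://localhost.evil.example", allowed))  # False, different host
```

Under this reading, a browser client served from another port or hostname needs its full origin listed explicitly.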
 Common internal-only ports from the default docs/compose setup:
 | Port | Service |
 |---|---|
 | `7000` | Odysseus raw app port |
 | `8080` | SearXNG |
 | `8091` | ntfy |
 | `8100` | ChromaDB host port for manual/compose access |
 | `11434` | Ollama |
 | `8000-8020` | Common local model/provider APIs |
 ## Configuration
 Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
 for deployment-level defaults and secrets you want present before first boot.
 Key settings:
 | Variable | Default | Description |
 |---|---|---|
 | `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
 | `LLM_HOSTS` | -- | Comma-separated list for model discovery |
 | `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
 | `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
 | `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
 | `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
 | `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
 | `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
 | `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
 | `AUTH_ENABLED` | `true` | Enable/disable login |
 | `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
 | `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
 | `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
 | `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
 | `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
 | `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
 | `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
 | `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes (10 MB). Raise for larger local PDFs or text documents. |
 | `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
 | `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
 | `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
 | `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
 | `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
 | `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
 | `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
 All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
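As a rough illustration of that fail-fast rule, the sketch below parses one upload-limit variable and rejects anything that is not a positive integer. The helper name `read_byte_limit`, its error message, and the choice to treat an empty value as unset are all assumptions for illustration, not taken from the Odysseus source; only the variable name and default come from the table above.

```python
import os
from typing import Mapping, Optional


def read_byte_limit(name: str, default: int,
                    env: Optional[Mapping[str, str]] = None) -> int:
    """Return the byte limit for `name`, or `default` when it is unset.

    Hypothetical helper: raises ValueError for anything that is not a
    positive integer, so a bad value stops startup instead of being ignored.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    # Assumption: an empty or whitespace-only value counts as "not set".
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


# Raise the chat attachment cap from the 10 MB default to 25 MB.
limit = read_byte_limit(
    "ODYSSEUS_CHAT_UPLOAD_MAX_BYTES",
    default=10 * 1024 * 1024,
    env={"ODYSSEUS_CHAT_UPLOAD_MAX_BYTES": str(25 * 1024 * 1024)},
)
print(limit)  # 26214400
```

The byte values in the table follow the same arithmetic: `10485760` is 10 × 1024 × 1024 and `26214400` is 25 × 1024 × 1024.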
 ### Built-in MCP servers (optional setup)
 Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
 To enable the browser MCP (page navigation, screenshots, vision), run once:
 ```bash
 npx -y @playwright/mcp@latest --version
 ```
 That installs `@playwright/mcp` plus Playwright (~300MB total). Restart Odysseus and the server will register at startup.
 ## Architecture
 ```
 app.py                   # FastAPI entry point
 core/      auth, database, middleware, constants
 src/       llm_core, agent_loop, agent_tools, chat_processor, search/
 routes/    chat, session, document, memory, model … endpoints
 services/  docs, memory, search, hwfit (Cookbook) …
 static/    index.html + app.js + style.css + js/ (modular front-end)
 docs/      landing page (index.html) + preview clips
 ```
 ## Data
 All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
 `memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
 To back up or restore everything in `data/`, see the
 [Backup & Restore guide](docs/backup-restore.md).
@@ -12,6 +12,7 @@ import json
 import csv
 import io
 import os
 import inspect
 import httpx
 from pathlib import Path
 from datetime import datetime
@@ -741,8 +742,8 @@ def setup_contacts_routes():
        email = (data.get("email") or "").strip()
        phone = (data.get("phone") or "").strip()
        address = (data.get("address") or "").strip()
-        if not email and not name:
-            return {"success": False, "error": "Name or email required"}
+        if not email:
+            return {"success": False, "error": "Email required"}
        # Check if already exists by email
        if email:
            contacts = _fetch_contacts()
@@ -751,7 +752,11 @@ def setup_contacts_routes():
                    return {"success": True, "message": "Already exists", "contact": c}
        if not name:
            name = email.split("@")[0]
        create_params = inspect.signature(_create_contact).parameters
        if len(create_params) >= 3:
            ok = _create_contact(name, email, address)
        else:
            ok = _create_contact(name, email)
        # If a phone was provided, do an immediate update to thread it
        # through (the simple _create_contact signature only takes name +
        # email + address; phones happen via update).
@@ -67,6 +67,14 @@ def _gallery_image_path(filename: str) -> Path:
        raise HTTPException(400, "Unsafe gallery filename")
    if safe_name != original:
        raise HTTPException(400, "Unsafe gallery filename")
    if not path.exists():
        cwd_root = (Path.cwd() / "data" / "generated_images").resolve()
        cwd_path = (cwd_root / safe_name).resolve()
        try:
            if os.path.commonpath([str(cwd_root), str(cwd_path)]) == str(cwd_root) and cwd_path.exists():
                return cwd_path
        except Exception:
            pass
    return path
@@ -19,22 +19,32 @@ GPU_BANDWIDTH = {
    "6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224,
    "mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229,
    "9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322,
    # Apple Silicon unified-memory bandwidth (GB/s). Keyed off the chip name
    # reported by sysctl machdep.cpu.brand_string (e.g. "Apple M4 Max"). Where
    # these entries are listed matters less than the length-sorting done below,
    # which guarantees "m4 max" is tried before "m4".
    "m1 ultra": 800, "m1 max": 400, "m1 pro": 200, "m1": 68,
    "m2 ultra": 800, "m2 max": 400, "m2 pro": 200, "m2": 100,
    "m3 ultra": 800, "m3 max": 300, "m3 pro": 150, "m3": 100,
    "m4 max": 546, "m4 pro": 273, "m4": 120,
    "m5 max": 546, "m5 pro": 273, "m5": 150,
 }
 # Pre-sort keys by length descending for correct substring matching
 _BW_KEYS_SORTED = sorted(GPU_BANDWIDTH.keys(), key=len, reverse=True)
-# metal: backstop for Apple Silicon chips not in GPU_BANDWIDTH (e.g. a future
-# M5) — the named chips above take the accurate bandwidth path instead.
+# Apple Silicon unified-memory bandwidth (GB/s). For chip families with both
+# binned and full variants under the same "Apple Mx Max" brand string, prefer
 # GPU core count when hardware detection provides it; otherwise fall back to the
 # conservative tier so speed estimates do not over-promise.
 APPLE_BANDWIDTH_FIXED = {
    "m1 ultra": 800, "m1 max": 400, "m1 pro": 200, "m1": 68,
    "m2 ultra": 800, "m2 max": 400, "m2 pro": 200, "m2": 100,
    "m3 ultra": 800, "m3 pro": 150, "m3": 100,
    "m4 pro": 273, "m4": 120,
    "m5 pro": 307, "m5": 153,
 }
 APPLE_BANDWIDTH_BY_CORES = {
    "m3 max": {30: 300, 40: 400},
    "m4 max": {32: 410, 40: 546},
    "m5 max": {32: 460, 40: 614},
 }
 _APPLE_FIXED_KEYS_SORTED = sorted(APPLE_BANDWIDTH_FIXED.keys(), key=len, reverse=True)
 _APPLE_VARIANT_KEYS_SORTED = sorted(APPLE_BANDWIDTH_BY_CORES.keys(), key=len, reverse=True)
 # metal: backstop for Apple Silicon chips not in the explicit tables above
 # (e.g. a future M6) — use a conservative generic estimate when unknown.
 FALLBACK_K = {"cuda": 220, "rocm": 180, "metal": 150, "cpu_x86": 70, "cpu_arm": 90}
 USE_CASE_WEIGHTS = {
@@ -60,10 +70,56 @@ CONTEXT_TARGET = {
 }
-def _lookup_bandwidth(gpu_name):
+def _lookup_apple_bandwidth(system):
    gpu_name = system.get("gpu_name")
    if not isinstance(gpu_name, str) or not gpu_name:
        return None
    gn = gpu_name.lower()
    # Guard against false matches on non-Apple GPUs whose names contain
    # "m3"/"m4"/"m5" (e.g. NVIDIA Quadro M4 000).
    if "apple" not in gn:
        return None
    raw_cores = system.get("gpu_cores")
    try:
        gpu_cores = int(raw_cores) if raw_cores is not None else None
    except (TypeError, ValueError):
        gpu_cores = None
    for key in _APPLE_VARIANT_KEYS_SORTED:
        if key not in gn:
            continue
        if gpu_cores in APPLE_BANDWIDTH_BY_CORES[key]:
            return APPLE_BANDWIDTH_BY_CORES[key][gpu_cores]
        return min(APPLE_BANDWIDTH_BY_CORES[key].values())
    for key in _APPLE_FIXED_KEYS_SORTED:
        if key in gn:
            return APPLE_BANDWIDTH_FIXED[key]
    return None
 def _lookup_bandwidth(system):
    if isinstance(system, dict):
        gpu_name = system.get("gpu_name")
    else:
        gpu_name = system
    if not isinstance(gpu_name, str) or not gpu_name:
        return None
    # Apple tiers live only in the Apple-specific table now (#2564), so route
    # BOTH dict and bare-string callers through it. A bare string carries no
    # gpu_cores, so the helper falls back to the conservative (lowest) tier for
    # that model -- before #2564 the generic table answered string lookups, and
    # dropping that made _lookup_bandwidth("Apple M3 Max") return None.
    apple_input = system if isinstance(system, dict) else {"gpu_name": gpu_name}
    bw = _lookup_apple_bandwidth(apple_input)
    if bw is not None:
        return bw
    gn = gpu_name.lower()
    for key in _BW_KEYS_SORTED:
        if key in gn:
            return GPU_BANDWIDTH[key]
@@ -84,7 +140,7 @@ def _estimate_speed(model, quant, run_mode, system, offload_frac=0.0):
    """
    pb = _active_params_b(model)
    is_moe = model.get("is_moe", False)
-    bw = _lookup_bandwidth(system.get("gpu_name"))
+    bw = _lookup_bandwidth(system)
    backend = system.get("backend", "cpu_x86")
    if bw and run_mode in ("gpu", "cpu_offload"):
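The longest-key-first substring matching that `_BW_KEYS_SORTED` and `_APPLE_FIXED_KEYS_SORTED` rely on can be illustrated in isolation. This is a simplified sketch, not the module's code: `TABLE` and `lookup` are hypothetical names, and the two values are copied from `APPLE_BANDWIDTH_FIXED` in the hunk above.

```python
# Toy table: values copied from APPLE_BANDWIDTH_FIXED in the diff.
TABLE = {"m4": 120, "m4 pro": 273}

# Longest keys first, so "m4 pro" is tried before its substring "m4".
KEYS_LONGEST_FIRST = sorted(TABLE, key=len, reverse=True)


def lookup(gpu_name, keys):
    """Return the bandwidth for the first key found in the lowercased name."""
    name = gpu_name.lower()
    for key in keys:
        if key in name:
            return TABLE[key]
    return None


print(lookup("Apple M4 Pro", KEYS_LONGEST_FIRST))  # 273
print(lookup("Apple M4 Pro", list(TABLE)))         # 120, insertion order hits "m4" first
print(lookup("Apple M4", KEYS_LONGEST_FIRST))      # 120
```

Without the length sort, the shorter key shadows the longer one and a Pro chip would be scored with the base chip's bandwidth.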
@@ -1,3 +1,4 @@
 import json
 import os
 import platform
 import re
@@ -335,6 +336,37 @@ def _detect_apple_silicon():
    if total_gb <= 0:
        return None
    def _parse_apple_gpu_cores(text):
        if not text:
            return None
        try:
            data = json.loads(text)
        except (TypeError, ValueError, json.JSONDecodeError):
            data = None
        if isinstance(data, dict):
            for gpu in data.get("SPDisplaysDataType") or []:
                if not isinstance(gpu, dict):
                    continue
                model = str(gpu.get("sppci_model") or gpu.get("_name") or "")
                if "apple" not in model.lower():
                    continue
                cores = gpu.get("sppci_cores")
                try:
                    return int(str(cores).strip())
                except (TypeError, ValueError):
                    continue
        m = re.search(r"Total Number of Cores:\s*(\d+)", text)
        if m:
            try:
                return int(m.group(1))
            except ValueError:
                return None
        return None
    gpu_cores = _parse_apple_gpu_cores(_run(["system_profiler", "SPDisplaysDataType", "-json"]))
    if gpu_cores is None:
        gpu_cores = _parse_apple_gpu_cores(_run(["system_profiler", "SPDisplaysDataType"]))
    # Usable GPU budget. macOS lets Metal use most of unified memory, but the
    # default working-set limit scales with RAM: small machines have to keep
    # more back for the OS + app. These fractions track Apple's
@@ -357,7 +389,7 @@ def _detect_apple_silicon():
        pass
    gpu = {"index": 0, "name": brand, "vram_gb": vram_gb}
-    return {
+    info = {
        "gpu_name": brand,
        "gpu_vram_gb": vram_gb,
        "gpu_count": 1,
@@ -369,6 +401,9 @@ def _detect_apple_silicon():
        # separate pool — downstream fit logic uses this to avoid double-budgeting.
        "unified_memory": True,
    }
    if gpu_cores is not None:
        info["gpu_cores"] = gpu_cores
    return info
 def _read_file(path):
@@ -772,6 +807,7 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
            "gpu_name": gpu_info["gpu_name"],
            "gpu_vram_gb": gpu_info["gpu_vram_gb"],
            "gpu_count": gpu_info["gpu_count"],
            "gpu_cores": gpu_info.get("gpu_cores"),
            "gpus": gpu_info.get("gpus", []),
            "gpu_groups": gpu_info.get("gpu_groups", []),
            "homogeneous": gpu_info.get("homogeneous", True),
@@ -201,11 +201,15 @@ def build_models_url(base: str) -> Optional[str]:
        return _ollama_api_root(base) + "/tags"
    if provider == "chatgpt-subscription":
        return None
-    # Generic OpenAI-compatible fallback: ensure the path lands on /v1/models
-    # when the user omitted a path entirely. If a non-empty path is already
-    # present (e.g. /openai, /api/openai/v1, /v1), trust the caller — the
-    # /models suffix is appended as-is and the caller's prefix is preserved.
-    if not urlparse(base).path:
+    # Generic OpenAI-compatible fallback: local model servers with no explicit
+    # path conventionally expose `/v1/models` (LM Studio, llama.cpp, vLLM).
+    # For non-local unknown hosts, do not invent `/v1`; append `/models` to the
+    # caller's base so look-alike provider hosts stay generic.
+    parsed = urlparse(base)
    host = (parsed.hostname or "").lower()
    is_local = host in {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
    uses_v1_models_by_default = is_local or host in {"api.deepseek.com"}
    if not parsed.path and uses_v1_models_by_default:
        base = base + "/v1"
    return base + "/models"
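The generic fallback above can be exercised on its own. The following is a standalone sketch of just that branch, assuming `base` has already been normalized by the earlier provider checks; `generic_models_url` and the two set names are hypothetical, and the real `build_models_url` handles more providers before reaching this point.

```python
from urllib.parse import urlparse

# Hosts copied from the hunk above.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
V1_BY_DEFAULT_HOSTS = {"api.deepseek.com"}


def generic_models_url(base: str) -> str:
    """Append /models, adding /v1 first only for path-less local/known hosts."""
    parsed = urlparse(base)
    host = (parsed.hostname or "").lower()
    uses_v1 = host in LOCAL_HOSTS or host in V1_BY_DEFAULT_HOSTS
    if not parsed.path and uses_v1:
        base = base + "/v1"
    return base + "/models"


print(generic_models_url("http://localhost:1234"))
# http://localhost:1234/v1/models
print(generic_models_url("https://api.deepseek.com"))
# https://api.deepseek.com/v1/models
print(generic_models_url("https://api.deepseek.com.example.net"))
# https://api.deepseek.com.example.net/models
print(generic_models_url("https://example.net/openai"))
# https://example.net/openai/models
```

Because the host comparison is an exact set lookup on the parsed hostname, a look-alike such as `api.deepseek.com.example.net` does not inherit the `/v1` default, which is the behavior the updated tests expect.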
@@ -1467,8 +1467,8 @@ function initEndpointForm() {
  const localAddBtn = el('adm-epLocalAddBtn');
  const localTestBtn = el('adm-epLocalTestBtn');
  if (localTestBtn) {
    const testOriginalHtml = localTestBtn.innerHTML;
    localTestBtn.addEventListener('click', async () => {
      const testOriginalHtml = localTestBtn.innerHTML || '>Test';
      const msg = _endpointMsg('local');
      msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
      const raw = (el('adm-epLocalUrl').value || '').trim();
@@ -1494,8 +1494,8 @@ function initEndpointForm() {
    });
  }
  if (localAddBtn) {
    const addOriginalHtml = localAddBtn.innerHTML;
    localAddBtn.addEventListener('click', async () => {
      const addOriginalHtml = localAddBtn.innerHTML || '>Add';
      const msg = _endpointMsg('local');
      msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
      const raw = (el('adm-epLocalUrl').value || '').trim();
@@ -41,8 +41,10 @@ def _seed(tmp_path):
 def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)
    SessionLocal = _seed(tmp_path)
    # GALLERY_IMAGE_DIR is an absolute path fixed at import, so a chdir can't
    # redirect the delete; point the resolver at the seeded tmp dir directly.
    monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
    monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
    # A session whose commit always fails, to simulate a DB error mid-delete.
@@ -67,8 +69,8 @@ def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
 def test_file_removed_on_successful_delete(tmp_path, monkeypatch):
    monkeypatch.chdir(tmp_path)
    SessionLocal = _seed(tmp_path)
    monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
    monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
    monkeypatch.setattr(gallery_routes, "SessionLocal", SessionLocal)
@@ -0,0 +1,59 @@
 from services.hwfit.fit import _lookup_apple_bandwidth, _lookup_bandwidth
 def test_m3_max_bandwidth_uses_gpu_cores():
    assert _lookup_bandwidth({"gpu_name": "Apple M3 Max", "gpu_cores": 30}) == 300
    assert _lookup_bandwidth({"gpu_name": "Apple M3 Max", "gpu_cores": 40}) == 400
 def test_m4_max_bandwidth_uses_gpu_cores():
    assert _lookup_bandwidth({"gpu_name": "Apple M4 Max", "gpu_cores": 32}) == 410
    assert _lookup_bandwidth({"gpu_name": "Apple M4 Max", "gpu_cores": 40}) == 546
 def test_m5_max_bandwidth_uses_gpu_cores():
    assert _lookup_bandwidth({"gpu_name": "Apple M5 Max", "gpu_cores": 32}) == 460
    assert _lookup_bandwidth({"gpu_name": "Apple M5 Max", "gpu_cores": 40}) == 614
 def test_apple_max_bandwidth_falls_back_conservatively_without_gpu_cores():
    assert _lookup_bandwidth({"gpu_name": "Apple M3 Max"}) == 300
    assert _lookup_bandwidth({"gpu_name": "Apple M4 Max"}) == 410
    assert _lookup_bandwidth({"gpu_name": "Apple M5 Max"}) == 460
 def test_fixed_apple_bandwidth_entries_include_updated_m5_values():
    assert _lookup_bandwidth({"gpu_name": "Apple M5 Pro"}) == 307
    assert _lookup_bandwidth({"gpu_name": "Apple M5"}) == 153
 def test_non_apple_gpu_does_not_match_apple_bandwidth():
    """NVIDIA Quadro M4 000 should NOT match Apple bandwidth lookup."""
    assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M4 000"}) is None
    assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M3 000"}) is None
    assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M5 000"}) is None
 def test_non_apple_gpu_with_cores_does_not_match():
    """A non-Apple GPU that happens to carry a gpu_cores count must not be
    matched by the APPLE bandwidth path. This asserts the Apple-specific
    matcher directly: _lookup_bandwidth would (correctly) return these cards'
    real bandwidth from the general GPU table (e.g. the RTX 4090's 1008 GB/s),
    which is a different code path and not what this guard is about.
    """
    assert _lookup_apple_bandwidth({"gpu_name": "NVIDIA GeForce RTX 4090", "gpu_cores": 128}) is None
    assert _lookup_apple_bandwidth({"gpu_name": "AMD Radeon RX 9070 XT", "gpu_cores": 64}) is None
 def test_apple_string_input_resolves_conservative_tier():
    """Bare-string callers must still get Apple bandwidth. #2564 moved the
    Apple tiers out of the generic GPU table into the dict-only Apple helper,
    so _lookup_bandwidth("Apple M3 Max") (no gpu_cores) regressed to None;
    string inputs now route through the Apple helper and get the conservative
    (lowest) tier for the model."""
    assert _lookup_bandwidth("Apple M3 Max") == 300
    assert _lookup_bandwidth("Apple M4 Max") == 410
    assert _lookup_bandwidth("Apple M5 Max") == 460
    # Non-Apple strings still fall through to the generic table.
    assert _lookup_bandwidth("NVIDIA GeForce RTX 4090") == 1008
    assert _lookup_bandwidth("Totally Unknown GPU") is None
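The routing these tests pin down can be sketched as follows. This is a minimal illustration, not the code in services/hwfit/fit.py: the function names, the table names, and the substring matching are assumptions made for the example, and the bandwidth figures are copied from the assertions above. It assumes an unmatched or missing gpu_cores value falls back to the lowest tier.

```python
from typing import Optional, Union

# Hypothetical tables; values (GB/s) are taken from the test expectations.
APPLE_MAX_TIERS = {
    "m3 max": {30: 300, 40: 400},
    "m4 max": {32: 410, 40: 546},
    "m5 max": {32: 460, 40: 614},
}
# Order matters: "m5 pro" must be checked before the shorter "m5".
APPLE_FIXED = {"m5 pro": 307, "m5": 153}
GENERIC_BANDWIDTH = {"4090": 1008}


def lookup_apple_bandwidth_sketch(gpu: dict) -> Optional[int]:
    """Apple-only path: returns None for any name without 'apple'."""
    name = (gpu.get("gpu_name") or "").lower()
    if "apple" not in name:
        return None  # guards against e.g. "NVIDIA Quadro M4 000"
    for model, tiers in APPLE_MAX_TIERS.items():
        if model in name:
            # Unknown or missing core count -> conservative (lowest) tier.
            return tiers.get(gpu.get("gpu_cores"), min(tiers.values()))
    for model, bandwidth in APPLE_FIXED.items():
        if model in name:
            return bandwidth
    return None


def lookup_bandwidth_sketch(gpu: Union[dict, str]) -> Optional[int]:
    """Route both dict and bare-string inputs through the Apple path first."""
    info = {"gpu_name": gpu} if isinstance(gpu, str) else gpu
    apple = lookup_apple_bandwidth_sketch(info)
    if apple is not None:
        return apple
    name = (info.get("gpu_name") or "").lower()
    for key, bandwidth in GENERIC_BANDWIDTH.items():
        if key in name:
            return bandwidth
    return None
```

Wrapping a bare string in a dict before calling the Apple path is what lets string callers reach the conservative tier, since a string carries no gpu_cores.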
@@ -4,6 +4,8 @@ Covers the Metal-specific behavior added for Apple Silicon and locks in the
 guarantee that non-macOS (Linux/Windows) detection is unchanged.
 """
 import json
 from services.hwfit import hardware
 from services.hwfit.fit import rank_models
 from services.hwfit.models import get_models
@@ -22,7 +24,7 @@ def _metal_system(ram_gb=16.0, vram_gb=10.7):
    }
-def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None):
+def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None, display_json=None, display_text=None):
    def run(cmd):
        joined = " ".join(cmd)
        if "machdep.cpu.brand_string" in joined:
@@ -31,6 +33,12 @@ def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None):
            return str(int(memsize_gb * 1024**3))
        if "iogpu.wired_limit_mb" in joined:
            return str(wired_mb) if wired_mb is not None else None
        if "system_profiler SPDisplaysDataType -json" in joined:
            if isinstance(display_json, (dict, list)):
                return json.dumps(display_json)
            return display_json
        if "system_profiler SPDisplaysDataType" in joined:
            return display_text
        return None
    return run
@@ -98,16 +106,47 @@ def test_apple_silicon_detected_as_metal(monkeypatch):
    monkeypatch.setattr(hardware, "_remote_host", None)
    monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
    monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
-    monkeypatch.setattr(hardware, "_run", _fake_sysctl(memsize_gb=32))
+    monkeypatch.setattr(hardware, "_run", _fake_sysctl(
        memsize_gb=32,
        display_json={"SPDisplaysDataType": [{"sppci_model": "Apple M2 Pro", "sppci_cores": "19"}]},
    ))
    info = hardware._detect_apple_silicon()
    assert info is not None
    assert info["backend"] == "metal"
    assert info["gpu_name"] == "Apple M2 Pro"
    assert info["unified_memory"] is True
    assert info["gpu_cores"] == 19
    assert info["gpu_vram_gb"] == 24.0  # 32GB * 0.75
 def test_apple_silicon_gpu_cores_fall_back_to_plain_text(monkeypatch):
    monkeypatch.setattr(hardware, "_remote_host", None)
    monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
    monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
    monkeypatch.setattr(hardware, "_run", _fake_sysctl(
        brand="Apple M4 Max",
        memsize_gb=64,
        display_json="{not-json",
        display_text="Graphics/Displays:\n\nApple M4 Max:\n  Total Number of Cores: 32\n",
    ))
    info = hardware._detect_apple_silicon()
    assert info is not None
    assert info["gpu_cores"] == 32
 def test_apple_silicon_gpu_cores_are_optional(monkeypatch):
    monkeypatch.setattr(hardware, "_remote_host", None)
    monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
    monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
    monkeypatch.setattr(hardware, "_run", _fake_sysctl(memsize_gb=32))
    info = hardware._detect_apple_silicon()
    assert info is not None
    assert "gpu_cores" not in info
 def test_apple_silicon_skipped_on_linux(monkeypatch):
    """Guarantee Linux detection is untouched: the Metal probe bails immediately."""
    monkeypatch.setattr(hardware, "_remote_host", None)
@@ -132,7 +171,7 @@ def test_detect_system_propagates_unified_memory(monkeypatch):
    monkeypatch.setattr(hardware, "_detect_apple_silicon", lambda: {
        "gpu_name": "Apple M4", "gpu_vram_gb": 10.7, "gpu_count": 1,
        "gpus": [], "gpu_groups": [], "homogeneous": True,
-        "backend": "metal", "unified_memory": True,
+        "backend": "metal", "unified_memory": True, "gpu_cores": 10,
    })
    monkeypatch.setattr(hardware, "_get_ram_gb", lambda: 16.0)
    monkeypatch.setattr(hardware, "_get_available_ram_gb", lambda: 11.0)
@@ -142,3 +181,4 @@ def test_detect_system_propagates_unified_memory(monkeypatch):
    s = hardware.detect_system(fresh=True)
    assert s["backend"] == "metal"
    assert s.get("unified_memory") is True
    assert s["gpu_cores"] == 10
@@ -107,6 +107,7 @@ class TestBuildersRejectLookalikeHosts:
        assert build_chat_url("https://notanthropic.com") == "https://notanthropic.com/chat/completions"
    def test_lookalike_anthropic_models_is_openai(self):
        assert llm_core._detect_provider("https://anthropic.com.evil.com") == "openai"
        assert build_models_url("https://anthropic.com.evil.com") == "https://anthropic.com.evil.com/models"
    def test_anthropic_domain_in_path_is_openai(self):
@@ -119,6 +120,7 @@ class TestBuildersRejectLookalikeHosts:
        assert build_chat_url("https://notollama.com") == "https://notollama.com/chat/completions"
    def test_lookalike_ollama_models_is_openai(self):
        assert llm_core._detect_provider("https://notollama.com") == "openai"
        assert build_models_url("https://notollama.com") == "https://notollama.com/models"
Author	SHA1	Message	Date
pewdiepie-archdaemon	d9ebdd6fbb	Refresh README presentation	2026-06-15 23:24:41 +09:00
pewdiepie-archdaemon	b118c33e37	test(provider): align lookalike-host URL expectations with /models behavior build_models_url returns /models (no /v1 prefix) for non-local generic OpenAI-compatible hosts (intentional, see endpoint_resolver.py:206). The tests added in #4272 expected /v1/models, which is the local/deepseek behavior. Match production semantics.	2026-06-15 23:21:49 +09:00
pewdiepie-archdaemon	da74cc23e4	Merge remote-tracking branch 'origin/dev'	2026-06-15 23:13:18 +09:00
Ashvin	d792b61722	test(gallery): point delete-ordering tests at the tmp image dir (#4300) The two delete-ordering tests did monkeypatch.chdir(tmp_path) and wrote the image under tmp_path/data/generated_images, but DATA_DIR (and therefore gallery_routes.GALLERY_IMAGE_DIR) is always an absolute path, so the delete resolver pointed at the repo's real data dir and ignored the chdir. test_file_removed_on_successful_delete therefore failed on dev (the file at the tmp path was never the one being removed), and test_file_kept_when_commit_fails passed only by accident. Set GALLERY_IMAGE_DIR to the seeded tmp dir via monkeypatch so both tests exercise the real path and pass deterministically.	2026-06-15 14:07:49 +00:00
pewdiepie-archdaemon	1faadf7e10	Merge remote-tracking branch 'origin/dev'	2026-06-15 23:02:46 +09:00
Kenny Van de Maele	e87b44126c	test(hwfit): fix non-Apple guard to assert the Apple matcher (unblocks pytest gate) (#4303) * test(hwfit): assert the Apple matcher, not the general lookup, in the non-Apple guard `f7aa2de` (#2564) added test_non_apple_gpu_with_cores_does_not_match, which asserts _lookup_bandwidth(RTX 4090) is None. But '4090': 1008 has been in the general GPU_BANDWIDTH table since v1.0, so _lookup_bandwidth correctly returns the card's real bandwidth and the test fails (expected None, got 1008) - reddening the required pytest gate on dev and, by inheritance, every open PR. The guard's actual intent is that the Apple-specific bandwidth path does not false-match a non-Apple card that carries a gpu_cores count. Point the two asserts at _lookup_apple_bandwidth, which returns None for any name without 'apple' regardless of the general table. The general-lookup behavior (4090 -> 1008) is correct and untouched. * fix(hwfit): route string GPU names through the Apple bandwidth helper Second half of the #2564 regression (RaresKeY review on #4303). That change moved the Apple tiers out of the generic GPU_BANDWIDTH table into the dict-only _lookup_apple_bandwidth, but _lookup_bandwidth only called that helper for dict inputs. A bare-string caller like _lookup_bandwidth("Apple M3 Max") therefore fell through to the generic table, found no Apple key, and returned None instead of the conservative tier. Route both dict and string inputs through the Apple helper (a string carries no gpu_cores, so it gets the model's lowest tier). Regression added for the string path plus a non-Apple string control.	2026-06-15 14:01:05 +00:00
pewdiepie-archdaemon	62476ddb55	Merge remote-tracking branch 'origin/dev'	2026-06-15 22:59:57 +09:00
pewdiepie-archdaemon	e899817969	Remove duplicate CodeQL workflow	2026-06-15 22:53:29 +09:00
pewdiepie-archdaemon	1cc9a003fd	Fix failing post-merge tests	2026-06-15 22:49:06 +09:00
Ahmad Naalweh	f7aa2de410	fix(hwfit): distinguish Apple Silicon bandwidth variants (#2564) * fix: resolve Apple Silicon bandwidth variants * fix(hwfit): preserve string lookup path in _lookup_bandwidth * fix(hwfit): guard Apple bandwidth lookup against false GPU matches Add "apple" not in gn check to _lookup_apple_bandwidth() so that non-Apple GPUs with "m3"/"m4"/"m5" in their names (e.g. NVIDIA Quadro M4 000) don't incorrectly match Apple bandwidth tiers. Addresses @o3LL review comment on PR #2564.	2026-06-15 15:13:03 +02:00
Ashvin	514d345334	test(models): pin lookalike hosts to the generic OpenAI branch (#4272) #4159 (`4b0a977`) made build_models_url insert /v1 for path-less bases, so the TestBuildersRejectLookalikeHosts model assertions that expected /models started failing and turned the pytest gate red on dev. Both the generic OpenAI branch and the real Anthropic branch now end in /v1/models, so a URL-only assertion no longer proves a lookalike host dodged the Anthropic/Ollama branch. Assert _detect_provider == "openai" directly and keep the /v1/models expectation.	2026-06-15 12:43:33 +00:00
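
The /models versus /v1/models behavior discussed in b118c33e37 and 514d345334 can be illustrated with a small sketch of the generic OpenAI-compatible branch only. It mirrors the lines visible in the build_models_url hunk; the function name models_url_sketch and the trailing-slash stripping are assumptions for the example, and the real function also has provider-specific branches (Anthropic, Ollama) that are not reproduced here.

```python
from urllib.parse import urlparse

# Hosts that get /v1 inserted when the base URL has no path, per the hunk.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
V1_DEFAULT_HOSTS = {"api.deepseek.com"}


def models_url_sketch(base: str) -> str:
    """Hypothetical stand-in for the generic branch of build_models_url."""
    base = base.rstrip("/")  # assumption: the base is normalized beforehand
    parsed = urlparse(base)
    host = (parsed.hostname or "").lower()
    uses_v1_models_by_default = host in LOCAL_HOSTS or host in V1_DEFAULT_HOSTS
    if not parsed.path and uses_v1_models_by_default:
        base = base + "/v1"
    return base + "/models"
```

Under these assumptions a lookalike host such as https://notollama.com resolves to https://notollama.com/models, while http://localhost:8000 resolves to http://localhost:8000/v1/models, which is why the lookalike tests assert the provider directly instead of relying on the URL alone.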