mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-17 18:25:26 -04:00
Compare commits
11 Commits
6d507f8128
...
d9ebdd6fbb
| Author | SHA1 | Date | |
|---|---|---|---|
| | d9ebdd6fbb | | |
| | b118c33e37 | | |
| | da74cc23e4 | | |
| | d792b61722 | | |
| | 1faadf7e10 | | |
| | e87b44126c | | |
| | 62476ddb55 | | |
| | e899817969 | | |
| | 1cc9a003fd | | |
| | f7aa2de410 | | |
| | 514d345334 | | |
@@ -1,61 +0,0 @@
# CodeQL code scanning
#
# Purpose: GitHub's own static analysis engine reads the application source
# (Python backend + the JavaScript frontend) and looks for real
# vulnerabilities -- SQL/command injection, path traversal, auth mistakes,
# unsafe deserialization. Findings appear in the repo's Security tab. This is
# the deepest check in the suite and the most valuable for a high-profile
# target.
#
# It runs on every push to main and on a weekly schedule (to catch newly
# disclosed query patterns against unchanged code). It deliberately does NOT
# run on pull requests: most PRs here come from forks, whose read-only token
# cannot publish results, which would produce confusing failures. To scan pull
# requests too, a maintainer can instead enable CodeQL "default setup" in
# Settings -> Security -> Code scanning (one toggle, no file needed) -- see
# docs/security-ci.md.

name: CodeQL

on:
  push:
    branches: [main]
  schedule:
    # Weekly, Monday 06:00 UTC.
    - cron: '0 6 * * 1'
  workflow_dispatch:

permissions: {}

concurrency:
  group: codeql-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # publish results to the Security tab
    strategy:
      fail-fast: false
      matrix:
        # Both are interpreted, so CodeQL needs no build step (build-mode none).
        language: [python, javascript-typescript]
    steps:
      - name: Checkout repository
        uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
        with:
          persist-credentials: false

      - name: Initialize CodeQL
        uses: github/codeql-action/init@8aad20d150bbac5944a9f9d289da16a4b0d87c1e # v4.36.2
        with:
          languages: ${{ matrix.language }}
          build-mode: none

      - name: Perform CodeQL analysis
        uses: github/codeql-action/analyze@8aad20d150bbac5944a9f9d289da16a4b0d87c1e # v4.36.2
        with:
          category: "/language:${{ matrix.language }}"
@@ -1,476 +1,65 @@
|
|||||||
# Odysseus
|
<p align="center">
|
||||||
|
<img src="docs/odysseus-wordmark.png" alt="Odysseus" width="280">
|
||||||
|
</p>
|
||||||
|
|
||||||
> **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
|
<p align="center">
|
||||||
|
A self-hosted AI workspace for chat, agents, research, documents, email, notes, calendar, and local model workflows.
|
||||||
|
</p>
|
||||||
|
|
||||||
```
|
<p align="center">
|
||||||
───────────────────────────────────────────────
|
<a href="#quick-start">Quick Start</a> ·
|
||||||
⊹ ࣪ ˖ ૮( ˶ᵔ ᵕ ᵔ˶ )っ Odysseus vers. 1.0
|
<a href="docs/setup.md">Setup Guide</a> ·
|
||||||
───────────────────────────────────────────────
|
<a href="CONTRIBUTING.md">Contributing</a> ·
|
||||||
```
|
<a href="ROADMAP.md">Roadmap</a>
|
||||||
|
</p>
|
||||||
|
|
||||||

|
<p align="center">
|
||||||
|
<a href="https://repology.org/project/odysseus-ai/versions"><img src="https://repology.org/badge/vertical-allrepos/odysseus-ai.svg" alt="Packaging status"></a>
|
||||||
|
</p>
|
||||||
|
|
||||||
A self-hosted AI workspace -- meant to be the self-hosted version of the UI experience you get from ChatGPT and Claude. But with more jank and fun. Running on your own hardware, with your own data -- local-first, privacy-first, and no trojan.
|
<p align="center">
|
||||||
|
<img src="docs/odysseus.jpg" alt="Odysseus interface">
|
||||||
|
</p>
|
||||||
|
|
||||||
[](https://repology.org/project/odysseus-ai/versions)
|
---
|
||||||
|
|
||||||
## Features
|
|
||||||
- **Chat** -- chat with any local model or API; adding them is super simple.<br> <sub>vLLM · llama.cpp · Ollama · OpenRouter · OpenAI · GitHub Copilot</sub>
|
|
||||||
- **Agent** -- hand it tools and let it run the whole task itself.<br> <sub>built on [opencode](https://github.com/anomalyco/opencode) · MCP · web · files · shell · skills · memory</sub>
|
|
||||||
- **Cookbook** -- Scans your hardware, recommends models, click to download and serve.. easy!<br> <sub>built on [llmfit](https://github.com/AlexsJones/llmfit) · VRAM-aware · GGUF / FP8 / AWQ · fit scoring · vLLM / llama.cpp serving</sub>
|
|
||||||
- **Deep Research** -- multi-step runs that gather, read, and synthesize sources into a nice visual report.<br> <sub>adapted from [Tongyi DeepResearch](https://github.com/Alibaba-NLP/DeepResearch)</sub>
|
|
||||||
- **Compare** -- a fun tool to compare models side by side. Test completely blind, no bias!<br> <sub>multi-model · blind test · synthesis</sub>
|
|
||||||
- **Documents** -- YOU write the text, AI is there to assist, not the opposite.<br> <sub>multi-tab editor · markdown · HTML · CSV · syntax highlighting · AI edits · suggestions</sub>
|
|
||||||
- **Memory / Skills** -- Persistent memory and skills, your agent evolves over time as it better understands you and your tasks!<br> <sub>ChromaDB · fastembed (ONNX) · vector + keyword retrieval · import/export</sub>
|
|
||||||
- **Email** -- IMAP/SMTP inbox with AI triage built in: urgency reminders, auto-tag, auto-summary, auto-reply drafts, auto-spam.<br> <sub>IMAP · SMTP · per-account routing · CalDAV-aware</sub>
|
|
||||||
- **Notes & Tasks** -- Quick notes with reminders, a todo list, and scheduled tasks the agent can act on.<br> <sub>note pings · checklist · cron-style tasks · ntfy / browser / email channels</sub>
|
|
||||||
- **Calendar** -- Local-first calendar with CalDAV sync to Radicale / Nextcloud / Apple / Fastmail.<br> <sub>CalDAV pull · .ics import/export · per-calendar colors · agent-aware</sub>
|
|
||||||
- **Works on mobile** -- looks and runs great on your phone, not just desktop.<br> <sub>responsive · installable (PWA) · touch gestures</sub>
|
|
||||||
- **Extras** -- more to explore, happy if you give it a go!<br> <sub>image editor · theme editor · file uploads (vision + PDF) · web search · presets · sessions · 2FA</sub>
|
|
||||||
|
|
||||||
## Demo
|
|
||||||
A full, hover-to-play tour lives on the landing page (`docs/index.html`).
|
|
||||||
|
|
||||||
<details>
|
|
||||||
<summary>Screenshots / clips</summary>
|
|
||||||
|
|
||||||
### Chat & Agents
|
|
||||||

|
|
||||||
### Deep Research
|
|
||||||

|
|
||||||
### Compare
|
|
||||||

|
|
||||||
### Documents
|
|
||||||

|
|
||||||
### Notes & Tasks
|
|
||||||

|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
## Quick Start
|
## Quick Start
|
||||||
|
|
||||||
Defaults work out of the box: clone, run, then configure models/search/email
|
> `dev` is the default branch and gets the newest changes first. Use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main) if you want the more curated branch.
|
||||||
inside **Settings**. Only edit `.env` for deployment-level overrides like
|
|
||||||
`APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
|
|
||||||
|
|
||||||
On first setup, Odysseus creates an admin account (`admin` unless
|
|
||||||
`ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
|
|
||||||
For Docker installs, the same line is in `docker compose logs odysseus`.
|
|
||||||
Use that for the first login, then change it in **Settings**.
|
|
||||||
|
|
||||||
Contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, testing, and
|
|
||||||
pull request guidelines.
|
|
||||||
|
|
||||||
### Docker (recommended)
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
cd odysseus
|
cd odysseus
|
||||||
cp .env.example .env # optional, but recommended for explicit defaults
|
cp .env.example .env
|
||||||
docker compose up -d --build
|
docker compose up -d --build
|
||||||
```
|
```
|
||||||
To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.
|
|
||||||
|
|
||||||
Open `http://localhost:7000` when the containers are healthy. Docker Compose
|
Open `http://localhost:7000` when the containers are healthy. The first admin password is printed in `docker compose logs odysseus`.
|
||||||
binds the web UI to `127.0.0.1` by default. If the port is taken, set
|
|
||||||
`APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
|
|
||||||
only when you intentionally want LAN/reverse-proxy access.
|
|
||||||
|
|
||||||
> **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
|
Native installs, GPU notes, Windows/macOS instructions, HTTPS, and configuration live in the [setup guide](docs/setup.md).
|
||||||
> Cookbook serves local models on CPU only. For GPU-accelerated model serving,
|
|
||||||
> run natively instead — see [Apple Silicon](#apple-silicon) below.
|
|
||||||
|
|
||||||
### Native Linux / macOS
|
## Features
|
||||||
```bash
|
|
||||||
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
|
||||||
cd odysseus
|
|
||||||
python3 -m venv venv
|
|
||||||
source venv/bin/activate
|
|
||||||
pip install -r requirements.txt
|
|
||||||
python setup.py
|
|
||||||
python -m uvicorn app:app --host 127.0.0.1 --port 7000
|
|
||||||
```
|
|
||||||
Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
|
|
||||||
downloads and serves. The app itself is lightweight; local model serving is the
|
|
||||||
heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
|
|
||||||
connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
|
|
||||||
|
|
||||||
### Apple Silicon
|
- **Chat + Agents** — local/API models, tools, MCP, files, shell, skills, and memory.
|
||||||
Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
|
- **Cookbook** — hardware-aware model recommendations, downloads, and serving.
|
||||||
M-series Mac, run Odysseus natively:
|
- **Deep Research** — multi-step web research with source reading and report generation.
|
||||||
|
- **Compare** — blind side-by-side model testing and synthesis.
|
||||||
|
- **Documents** — writing-first editor with AI edits, suggestions, Markdown, HTML, CSV, and syntax highlighting.
|
||||||
|
- **Email** — IMAP/SMTP inbox with triage, tags, summaries, reminders, and reply drafts.
|
||||||
|
- **Notes, Tasks + Calendar** — reminders, todos, scheduled agent tasks, and CalDAV sync.
|
||||||
|
- **Extras** — gallery/image editor, themes, uploads, web search, presets, sessions, and 2FA.
|
||||||
|
|
||||||
```bash
|
## Demo
|
||||||
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
|
||||||
cd odysseus
|
|
||||||
./start-macos.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
|
A full hover-to-play tour lives on the landing page: [`docs/index.html`](docs/index.html).
|
||||||
|
|
||||||
```bash
|
|
||||||
ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
|
|
||||||
# then open http://<tailscale-ip>:7860
|
|
||||||
```
|
|
||||||
|
|
||||||
The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
|
|
||||||
set there are picked up automatically without a command-line override each run.
|
|
||||||
|
|
||||||
Keep `AUTH_ENABLED=true` (the default) before binding outside loopback. Do not
|
|
||||||
expose this port directly to the public internet. To build a clickable app wrapper:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
./build-macos-app.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
<details>
|
|
||||||
<summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
|
|
||||||
|
|
||||||
**Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
|
|
||||||
ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
|
|
||||||
they are reachable from the host but not exposed to your LAN/public internet
|
|
||||||
unless you opt in.
|
|
||||||
|
|
||||||
**Cookbook storage in Docker.** Downloads live in `./data/huggingface`
|
|
||||||
(`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
|
|
||||||
serve engines live in `./data/local` (`~/.local` in the container), so they
|
|
||||||
survive container recreation.
|
|
||||||
|
|
||||||
**Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
|
|
||||||
Odysseus SSH key and add the public key to the remote server's
|
|
||||||
`~/.ssh/authorized_keys`. From the host you can also run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
ssh-copy-id -i data/ssh/id_ed25519.pub user@server
|
|
||||||
```
|
|
||||||
|
|
||||||
**Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
|
|
||||||
only detect GPUs that Docker exposes to the container — if the host runtime or
|
|
||||||
device passthrough is not configured, Cookbook sees the iGPU, another card, or
|
|
||||||
CPU instead of your intended GPU.
|
|
||||||
|
|
||||||
For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
|
|
||||||
optionally install the host runtime or update `.env`.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# Read-only diagnostic (default — installs nothing, never edits .env):
|
|
||||||
scripts/check-docker-gpu.sh
|
|
||||||
|
|
||||||
# Print OS-specific install commands without running them:
|
|
||||||
scripts/check-docker-gpu.sh --print-install-commands
|
|
||||||
|
|
||||||
# Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
|
|
||||||
scripts/check-docker-gpu.sh --install-nvidia-toolkit
|
|
||||||
|
|
||||||
# Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
|
|
||||||
scripts/check-docker-gpu.sh --enable-nvidia-overlay
|
|
||||||
|
|
||||||
# Full assisted setup — install toolkit, then enable overlay if passthrough works:
|
|
||||||
scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
|
|
||||||
```
|
|
||||||
|
|
||||||
Safety notes:
- The app never installs host GPU runtime automatically.
- The app never edits `.env` automatically.
- `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
  and only after GPU passthrough succeeds. `--yes` skips prompts but does not
  bypass the passthrough gate.
- `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
  Git and the Docker build context.
|
|
||||||
|
|
||||||
To enable manually without the script, add this to `.env`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
|
|
||||||
```
|
|
||||||
|
|
||||||
**AMD / ROCm.** AMD setup is a read-only diagnostic followed by a manual `.env` edit. Run:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
scripts/check-docker-amd-gpu.sh
|
|
||||||
```
|
|
||||||
|
|
||||||
Then add the reported values to `.env`, replacing `RENDER_GID` with your host's
|
|
||||||
numeric render group id:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
|
|
||||||
RENDER_GID=989
|
|
||||||
```
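
If you want to double-check the numeric render group id independently of the diagnostic script, the lookup can be sketched in Python. This is a minimal sketch, not part of Odysseus: `group_gid` and `render_gid_line` are hypothetical helper names, the group is assumed to be called `render` on your host, and the `grp` module is only available on Unix-like systems.

```python
import grp  # Unix-only standard library module
from typing import Optional


def group_gid(name: str) -> Optional[int]:
    """Return the numeric id of a host group, or None if the group does not exist."""
    try:
        return grp.getgrnam(name).gr_gid
    except KeyError:
        return None


def render_gid_line(gid: int) -> str:
    """Format the line to paste into .env for the AMD overlay."""
    return f"RENDER_GID={gid}"


if __name__ == "__main__":
    gid = group_gid("render")  # assumption: the render group is named "render"
    if gid is None:
        print("no 'render' group found on this host")
    else:
        print(render_gid_line(gid))
```

Run it on the host rather than inside the container, since the value has to match the host's group database.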
|
|
||||||
|
|
||||||
For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: `docker/gpu.nvidia.yml` or `docker/gpu.amd.yml`.
|
|
||||||
|
|
||||||
**Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
|
|
||||||
often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
|
|
||||||
or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
|
|
||||||
overlay workflow above. For stack UIs, point the stack at one of the standalone
|
|
||||||
files instead, which bundle the base stack plus the GPU settings:
|
|
||||||
|
|
||||||
- `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
|
|
||||||
on the host.
|
|
||||||
- `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
|
|
||||||
`video`/`render` group membership, and `RENDER_GID` when needed.
|
|
||||||
|
|
||||||
The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
|
|
||||||
source of truth; the standalone files mirror them for single-file deployments.
|
|
||||||
|
|
||||||
Verify after enabling either overlay:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose exec odysseus nvidia-smi -L # NVIDIA
|
|
||||||
docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*' # AMD
|
|
||||||
```
|
|
||||||
|
|
||||||
> **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
|
|
||||||
> container confirms Docker GPU access, but llama.cpp also needs `cudart` and
|
|
||||||
> the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
|
|
||||||
> library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
|
|
||||||
> tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
|
|
||||||
> not a Docker passthrough failure. Reinstall the serve engine via
|
|
||||||
> **Cookbook → Dependencies** to get a CUDA-enabled build.
|
|
||||||
>
|
|
||||||
> The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
|
|
||||||
> the container confirms device passthrough, not ROCm userspace or a
|
|
||||||
> ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
|
|
||||||
> inside the slim Odysseus image.
|
|
||||||
|
|
||||||
**Ollama with Docker.** If Ollama runs on the host, add this endpoint in
|
|
||||||
Settings:
|
|
||||||
|
|
||||||
```text
|
|
||||||
http://host.docker.internal:11434/v1
|
|
||||||
```
|
|
||||||
|
|
||||||
Ollama must listen outside its own loopback interface:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
OLLAMA_HOST=0.0.0.0:11434 ollama serve
|
|
||||||
```
|
|
||||||
|
|
||||||
This connects Odysseus in Docker to an Ollama server that is already running on
|
|
||||||
your host machine; it does not start Ollama inside the container.
|
|
||||||
`host.docker.internal` is Docker's hostname for the host machine from inside the
|
|
||||||
container. Cookbook **Serve** is a separate workflow for serving downloaded
|
|
||||||
models through Odysseus/llama.cpp, so Windows users with an existing Ollama
|
|
||||||
install usually only need to add the endpoint in Settings.
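
To confirm the endpoint is reachable before adding it in Settings, a small probe can help. The sketch below assumes the Ollama server exposes the OpenAI-compatible model listing at `<base>/models`; `models_url` and `list_models` are hypothetical helper names, not Odysseus functions.

```python
import json
import urllib.error
import urllib.request
from typing import List, Optional


def models_url(base_url: str) -> str:
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def list_models(base_url: str, timeout: float = 3.0) -> Optional[List[str]]:
    """Return the model ids the endpoint reports, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, DNS failure, timeout, or a non-JSON reply.
        return None
    if not isinstance(payload, dict):
        return None
    return [str(item.get("id", "")) for item in payload.get("data", [])]
```

Calling `list_models("http://host.docker.internal:11434/v1")` from inside the Odysseus container exercises the same network path the app uses. A `None` result usually means Ollama is still bound to its own loopback interface or is not running.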
|
|
||||||
|
|
||||||
**Useful checks.**
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose ps
|
|
||||||
docker compose logs --tail=120 odysseus
|
|
||||||
docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
|
|
||||||
```
|
|
||||||
|
|
||||||
**macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
|
|
||||||
runs setup, and starts uvicorn on port `7860` because AirPlay often holds
|
|
||||||
`7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
|
|
||||||
do not run on macOS. MLX-only models are not served by Odysseus.
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
### Native Windows
|
|
||||||
|
|
||||||
**One-command launcher** (creates the venv, installs deps, runs setup, starts the
|
|
||||||
server; safe to re-run):
|
|
||||||
|
|
||||||
```powershell
|
|
||||||
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
|
||||||
cd odysseus
|
|
||||||
powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
|
|
||||||
```
|
|
||||||
|
|
||||||
Or do it by hand:
|
|
||||||
|
|
||||||
```powershell
|
|
||||||
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
|
||||||
cd odysseus
|
|
||||||
py -3.11 -m venv venv
|
|
||||||
venv\Scripts\Activate.ps1
|
|
||||||
pip install -r requirements.txt
|
|
||||||
python setup.py
|
|
||||||
python -m uvicorn app:app --host 127.0.0.1 --port 7000
|
|
||||||
```
|
|
||||||
|
|
||||||
If `python` points at an older interpreter, use `py -3.12` (or another installed
|
|
||||||
3.11+ version) for the venv step.
|
|
||||||
|
|
||||||
**Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
|
|
||||||
email, calendar, deep research) runs fully native. For full **Cookbook** background
|
|
||||||
model downloads and the agent shell tool, also install
|
|
||||||
[Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
|
|
||||||
Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
|
|
||||||
[Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
|
|
||||||
`http://localhost:11434/v1` in Settings.
|
|
||||||
|
|
||||||
Open `http://localhost:7000`, log in with the generated admin password,
|
|
||||||
and configure everything else inside **Settings**.
|
|
||||||
|
|
||||||
## Troubleshooting & Advanced Setup
|
|
||||||
|
|
||||||
### `chromadb-client` conflicts with embedded ChromaDB
|
|
||||||
If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
|
|
||||||
|
|
||||||
**Fix:** uninstall `chromadb-client` and force-reinstall the full package:
|
|
||||||
```bash
|
|
||||||
./venv/bin/pip uninstall chromadb-client -y
|
|
||||||
./venv/bin/pip install --force-reinstall chromadb
|
|
||||||
```
|
|
||||||
|
|
||||||
### HTTPS + LAN/Tailscale exposure
|
|
||||||
To expose Odysseus on a local network or Tailscale with HTTPS:
|
|
||||||
1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
|
|
||||||
2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
|
|
||||||
```bash
|
|
||||||
mkcert -install
|
|
||||||
mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
|
|
||||||
```
|
|
||||||
3. Run `uvicorn` with the generated certs:
|
|
||||||
```bash
|
|
||||||
python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
|
|
||||||
```
|
|
||||||
4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
|
|
||||||
|
|
||||||
### Optional Dependencies
|
|
||||||
`requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
|
|
||||||
|
|
||||||
| Package | Feature unlocked |
|---------|-----------------|
| `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
| `ddgs` | DuckDuckGo as a search provider option. |
| `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
| `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
|
|
||||||
|
|
||||||
### Faster, reproducible installs with uv (optional)
|
|
||||||
[uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the venv + pip steps in the native install guides. No project changes are needed; you get faster installs and, if you want one, a lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv venv venv --python 3.13
|
|
||||||
uv pip install -r requirements.txt
|
|
||||||
# then continue as usual: python setup.py, uvicorn, ...
|
|
||||||
```
|
|
||||||
|
|
||||||
`requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
uv pip compile requirements.txt -o requirements.lock # snapshot current resolution
|
|
||||||
uv pip sync requirements.lock # reproduce it exactly later
|
|
||||||
```
|
|
||||||
|
|
||||||
`requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
|
|
||||||
|
|
||||||
### Outlook / Office 365 email
|
|
||||||
Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
|
|
||||||
and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
|
|
||||||
passwords will fail. See [docs/email-outlook.md](docs/email-outlook.md) for the
|
|
||||||
current limitation and the planned integration direction.
|
|
||||||
|
|
||||||
## Security Notes
|
|
||||||
Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
|
|
||||||
|
|
||||||
- Keep `AUTH_ENABLED=true` for any network-accessible deployment.
|
|
||||||
- Keep `LOCALHOST_BYPASS=false` outside local development.
|
|
||||||
- Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
|
|
||||||
- Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
|
|
||||||
- Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
|
|
||||||
- Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
|
|
||||||
- Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
|
|
||||||
- Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
|
|
||||||
- If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
|
|
||||||
- Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
|
|
||||||
- Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
|
|
||||||
- Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
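
The bind-address and auth rules in the list above can be turned into a quick pre-flight check. This is a minimal sketch under stated assumptions: `is_loopback_bind` and `bind_warnings` are hypothetical helpers, not Odysseus code, and the inputs stand in for the `APP_BIND`, `AUTH_ENABLED`, and `LOCALHOST_BYPASS` values from `.env`.

```python
import ipaddress
from typing import List


def is_loopback_bind(host: str) -> bool:
    """True when the bind address only accepts connections from this machine."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (some other hostname): treat it as non-loopback.
        return False


def bind_warnings(host: str, auth_enabled: bool, localhost_bypass: bool) -> List[str]:
    """Return warnings for settings that are risky on a non-loopback bind."""
    if is_loopback_bind(host):
        return []
    warnings = []
    if not auth_enabled:
        warnings.append("AUTH_ENABLED should be true when binding outside loopback")
    if localhost_bypass:
        warnings.append("LOCALHOST_BYPASS should be false outside local development")
    return warnings
```

For example, `bind_warnings("0.0.0.0", auth_enabled=False, localhost_bypass=True)` returns two warnings, while any loopback bind returns none.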
|
|
||||||
|
|
||||||
### Private or proxied deployments
|
|
||||||
Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
|
|
||||||
|
|
||||||
1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
3. Put the authenticated Odysseus web/API entrypoint behind that layer.
4. Keep raw service and model ports internal-only.
|
|
||||||
|
|
||||||
Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
|
|
||||||
`ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
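
"Exact" matters here: an origin includes the scheme, host, and port, so `http://localhost` and `http://localhost:7000` are different origins. The sketch below illustrates exact matching against a comma-separated list. It is an illustration of the idea, not Odysseus's actual CORS code, and both function names are hypothetical.

```python
from typing import FrozenSet


def parse_allowed_origins(raw: str) -> FrozenSet[str]:
    """Split a comma-separated ALLOWED_ORIGINS value into a set of origins."""
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def is_origin_allowed(origin: str, allowed: FrozenSet[str]) -> bool:
    """Exact string match: no prefix, suffix, or wildcard matching."""
    return origin in allowed
```

With the documented default `http://localhost,http://127.0.0.1`, a browser client served from `http://localhost:7000` on a different origin would need its own entry.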
|
|
||||||
|
|
||||||
Common internal-only ports from the default docs/compose setup:
|
|
||||||
|
|
||||||
| Port | Service |
|---|---|
| `7000` | Odysseus raw app port |
| `8080` | SearXNG |
| `8091` | ntfy |
| `8100` | ChromaDB host port for manual/compose access |
| `11434` | Ollama |
| `8000-8020` | Common local model/provider APIs |
|
|
||||||
|
|
||||||
## Contributing
|
## Contributing
|
||||||
Help is welcome. The best entry points are fresh-install testing, provider setup
|
|
||||||
bugs, mobile/editor polish, docs, and small focused refactors. See
|
|
||||||
[ROADMAP.md](ROADMAP.md) for the current help-wanted list.
|
|
||||||
|
|
||||||
## Configuration
|
Help is welcome. The best entry points are fresh-install testing, provider setup bugs, mobile/editor polish, docs, and small focused refactors. See [CONTRIBUTING.md](CONTRIBUTING.md) and [ROADMAP.md](ROADMAP.md).
|
||||||
Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
|
|
||||||
for deployment-level defaults and secrets you want present before first boot.
|
|
||||||
Key settings:
|
|
||||||
|
|
||||||
| Variable | Default | Description |
|
## Security
|
||||||
|---|---|---|
|
|
||||||
| `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
|
|
||||||
| `LLM_HOSTS` | -- | Comma-separated list for model discovery |
|
|
||||||
| `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
|
|
||||||
| `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
|
|
||||||
| `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
|
|
||||||
| `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
|
|
||||||
| `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
|
|
||||||
| `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
|
|
||||||
| `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
|
|
||||||
| `AUTH_ENABLED` | `true` | Enable/disable login |
|
|
||||||
| `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
|
|
||||||
| `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
|
|
||||||
| `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
|
|
||||||
| `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
|
|
||||||
| `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
|
|
||||||
| `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
|
|
||||||
| `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
|
|
||||||
| `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes. Raise for larger local PDFs or text documents. |
|
|
||||||
| `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
|
|
||||||
| `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
|
|
||||||
| `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
|
|
||||||
| `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
|
|
||||||
| `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
|
|
||||||
| `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
|
|
||||||
| `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
|
|
||||||
|
|
||||||
All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
|
Odysseus is a self-hosted workspace with powerful local tools. Keep auth enabled, keep private data out of Git, and do not expose raw model/service ports publicly. Deployment details are in the [setup guide](docs/setup.md#security-notes).
|
||||||
|
|
||||||
### Built-in MCP servers (optional setup)
|
|
||||||
|
|
||||||
Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
|
|
||||||
|
|
||||||
To enable the browser MCP (page navigation, screenshots, vision), run once:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
npx -y @playwright/mcp@latest --version
|
|
||||||
```
|
|
||||||
|
|
||||||
That installs `@playwright/mcp` plus Playwright (~300MB total). Restart Odysseus and the server will register at startup.
|
|
||||||
|
|
||||||
## Architecture
|
|
||||||
```
|
|
||||||
app.py # FastAPI entry point
|
|
||||||
core/ auth, database, middleware, constants
|
|
||||||
src/ llm_core, agent_loop, agent_tools, chat_processor, search/
|
|
||||||
routes/ chat, session, document, memory, model … endpoints
|
|
||||||
services/ docs, memory, search, hwfit (Cookbook) …
|
|
||||||
static/ index.html + app.js + style.css + js/ (modular front-end)
|
|
||||||
docs/ landing page (index.html) + preview clips
|
|
||||||
```
|
|
||||||
|
|
||||||
## Data
|
|
||||||
All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
|
|
||||||
`memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
|
|
||||||
|
|
||||||
To back up or restore everything in `data/`, see the
|
|
||||||
[Backup & Restore guide](docs/backup-restore.md).
|
|
||||||
|
|
||||||
## Star History
|
## Star History
|
||||||
|
|
||||||
@@ -483,19 +72,5 @@ To back up or restore everything in `data/`, see the
|
|||||||
</a>
|
</a>
|
||||||
|
|
||||||
## License
|
## License
|
||||||
AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).
|
|
||||||
|
|
||||||
```
|
AGPL-3.0-or-later -- see [LICENSE](LICENSE) and [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md).
|
||||||
|
|
|
||||||
|||
|
|
||||||
|||||
|
|
||||||
| | | |||||||
|
|
||||||
)_) )_) )_) ~|~
|
|
||||||
)___))___))___)\ |
|
|
||||||
)____)____)_____)\\|
|
|
||||||
_____|____|____|_____\\\__
|
|
||||||
\ /
|
|
||||||
~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
|
|
||||||
~^~ all aboard! ~^~
|
|
||||||
~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~~^~^~
|
|
||||||
```
|
|
||||||
|
|||||||
Binary file not shown.
|
After Width: | Height: | Size: 16 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 45 KiB After Width: | Height: | Size: 52 KiB |
+425
@@ -0,0 +1,425 @@
|
|||||||
|
# Odysseus Setup Guide
|
||||||
|
|
||||||
|
This page keeps the detailed install, deployment, troubleshooting, and configuration notes out of the front README.
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
> **Branch note:** `dev` is the default branch and contains the latest development changes, but it may be unstable. For the more stable curated branch, use [`main`](https://github.com/pewdiepie-archdaemon/odysseus/tree/main).
|
||||||
|
|
||||||
|
Defaults work out of the box: clone, run, then configure models/search/email
|
||||||
|
inside **Settings**. Only edit `.env` for deployment-level overrides like
|
||||||
|
`APP_BIND`, `APP_PORT`, `AUTH_ENABLED`, `DATABASE_URL`, or a pre-seeded admin password.
|
||||||
|
|
||||||
|
On first setup, Odysseus creates an admin account (`admin` unless
|
||||||
|
`ODYSSEUS_ADMIN_USER` is set) and prints a temporary password in the terminal.
|
||||||
|
For Docker installs, the same line is in `docker compose logs odysseus`.
|
||||||
|
Use that for the first login, then change it in **Settings**.
|
||||||
|
|
||||||
|
Contributing? See [CONTRIBUTING.md](CONTRIBUTING.md) for setup, testing, and
|
||||||
|
pull request guidelines.
|
||||||
|
|
||||||
|
### Docker (recommended)
|
||||||
|
```bash
|
||||||
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
|
cd odysseus
|
||||||
|
cp .env.example .env # optional, but recommended for explicit defaults
|
||||||
|
docker compose up -d --build
|
||||||
|
```
|
||||||
|
To include optional extras in the image (PDF viewer, Office extraction; includes AGPL PyMuPDF), build with `docker compose build --build-arg INSTALL_OPTIONAL=true` before `up`.
|
||||||
|
|
||||||
|
Open `http://localhost:7000` when the containers are healthy. Docker Compose
|
||||||
|
binds the web UI to `127.0.0.1` by default. If the port is taken, set
|
||||||
|
`APP_PORT=7001` in `.env` and recreate the container. Set `APP_BIND=0.0.0.0`
|
||||||
|
only when you intentionally want LAN/reverse-proxy access.
|
||||||
|
|
||||||
|
> **On Apple Silicon (M-series) Macs:** Docker can't reach the Metal GPU, so
|
||||||
|
> Cookbook serves local models on CPU only. For GPU-accelerated model serving,
|
||||||
|
> run natively instead — see [Apple Silicon](#apple-silicon) below.
|
||||||
|
|
||||||
|
### Native Linux / macOS
|
||||||
|
```bash
|
||||||
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
|
cd odysseus
|
||||||
|
python3 -m venv venv
|
||||||
|
source venv/bin/activate
|
||||||
|
pip install -r requirements.txt
|
||||||
|
python setup.py
|
||||||
|
python -m uvicorn app:app --host 127.0.0.1 --port 7000
|
||||||
|
```
|
||||||
|
Requirements: Python 3.11+. Cookbook also needs `tmux` for background model
|
||||||
|
downloads and serves. The app itself is lightweight; local model serving is the
|
||||||
|
heavy part and depends on the model, runtime, GPU, and VRAM, so small hosts can
|
||||||
|
connect to API or remote model servers instead. Use `--host 0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
|
||||||
|
|
||||||
|
### Apple Silicon
|
||||||
|
Docker on macOS cannot use the Metal GPU. For GPU-accelerated Cookbook on an
|
||||||
|
M-series Mac, run Odysseus natively:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
|
cd odysseus
|
||||||
|
./start-macos.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
It launches at `http://127.0.0.1:7860`. To expose it to your phone over a trusted LAN/VPN such as Tailscale, bind all interfaces:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ODYSSEUS_HOST=0.0.0.0 ./start-macos.sh
|
||||||
|
# then open http://<tailscale-ip>:7860
|
||||||
|
```
|
||||||
|
|
||||||
|
The script also reads `.env` at startup, so `APP_BIND=0.0.0.0` and `APP_PORT`
|
||||||
|
set there are picked up automatically without a command-line override each run.
|
||||||
|
|
||||||
|
Keep `AUTH_ENABLED=true` (the default) before binding outside loopback. Do not
|
||||||
|
expose this port directly to the public internet. To build a clickable app wrapper:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
./build-macos-app.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
<details>
|
||||||
|
<summary>Cookbook, GPU, Ollama, and troubleshooting notes</summary>
|
||||||
|
|
||||||
|
**Docker bundled services.** Compose starts Odysseus, ChromaDB, SearXNG, and
|
||||||
|
ntfy. Odysseus and the bundled service ports bind to `127.0.0.1` by default, so
|
||||||
|
they are reachable from the host but not exposed to your LAN/public internet
|
||||||
|
unless you opt in.
|
||||||
|
|
||||||
|
**Cookbook storage in Docker.** Downloads live in `./data/huggingface`
|
||||||
|
(`~/.cache/huggingface` in the container). Cookbook-installed Python CLIs and
|
||||||
|
serve engines live in `./data/local` (`~/.local` in the container), so they
|
||||||
|
survive container recreation.
|
||||||
|
|
||||||
|
**Remote servers.** In **Cookbook -> Settings -> Servers**, generate the
|
||||||
|
Odysseus SSH key and add the public key to the remote server's
|
||||||
|
`~/.ssh/authorized_keys`. From the host you can also run:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
ssh-copy-id -i data/ssh/id_ed25519.pub user@server
|
||||||
|
```
|
||||||
|
|
||||||
|
**Docker GPU overlays.** CPU-only users can skip this section. Cookbook can
|
||||||
|
only detect GPUs that Docker exposes to the container — if the host runtime or
|
||||||
|
device passthrough is not configured, Cookbook sees the iGPU, another card, or
|
||||||
|
CPU instead of your intended GPU.
|
||||||
|
|
||||||
|
For NVIDIA, `scripts/check-docker-gpu.sh` diagnoses GPU passthrough and can
|
||||||
|
optionally install the host runtime or update `.env`.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Read-only diagnostic (default — installs nothing, never edits .env):
|
||||||
|
scripts/check-docker-gpu.sh
|
||||||
|
|
||||||
|
# Print OS-specific install commands without running them:
|
||||||
|
scripts/check-docker-gpu.sh --print-install-commands
|
||||||
|
|
||||||
|
# Install NVIDIA Container Toolkit on Ubuntu/Debian (requires sudo):
|
||||||
|
scripts/check-docker-gpu.sh --install-nvidia-toolkit
|
||||||
|
|
||||||
|
# Write COMPOSE_FILE to .env (only when GPU passthrough is confirmed working):
|
||||||
|
scripts/check-docker-gpu.sh --enable-nvidia-overlay
|
||||||
|
|
||||||
|
# Full assisted setup — install toolkit, then enable overlay if passthrough works:
|
||||||
|
scripts/check-docker-gpu.sh --install-nvidia-toolkit --enable-nvidia-overlay
|
||||||
|
```
|
||||||
|
|
||||||
|
Safety notes:
|
||||||
|
- The app never installs the host GPU runtime automatically.
|
||||||
|
- The app never edits `.env` automatically.
|
||||||
|
- `.env` is only modified when `--enable-nvidia-overlay` is explicitly passed,
|
||||||
|
and only after GPU passthrough succeeds. `--yes` skips prompts but does not
|
||||||
|
bypass the passthrough gate.
|
||||||
|
- `.env.bak.*` backups created by `--enable-nvidia-overlay` are ignored by
|
||||||
|
Git and the Docker build context.
|
||||||
|
|
||||||
|
To enable manually without the script, add this to `.env`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
|
||||||
|
```
|
||||||
|
|
||||||
|
**AMD / ROCm.** AMD setup is a read-only diagnostic followed by a manual `.env` edit. Run:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
scripts/check-docker-amd-gpu.sh
|
||||||
|
```
|
||||||
|
|
||||||
|
Then add the reported values to `.env`, replacing `RENDER_GID` with your host's
|
||||||
|
numeric render group id:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
|
||||||
|
RENDER_GID=989
|
||||||
|
```
|
||||||
|
|
||||||
|
For NVIDIA/AMD GPU support, also read the comments in the selected overlay file: `docker/gpu.nvidia.yml` or `docker/gpu.amd.yml`.
|
||||||
|
|
||||||
|
**Stack-management UIs (Portainer, Coolify, Dockhand, etc.).** These tools
|
||||||
|
often accept only a single Compose file and do not reliably honor `COMPOSE_FILE`
|
||||||
|
or multiple `-f` overlays. CLI users should keep using the `COMPOSE_FILE`
|
||||||
|
overlay workflow above. For stack UIs, point the stack at one of the standalone
|
||||||
|
files instead, which bundle the base stack plus the GPU settings:
|
||||||
|
|
||||||
|
- `docker-compose.gpu-nvidia.yml` — still requires the NVIDIA Container Toolkit
|
||||||
|
on the host.
|
||||||
|
- `docker-compose.gpu-amd.yml` — still requires host ROCm/kfd/DRI setup, the
|
||||||
|
`video`/`render` group membership, and `RENDER_GID` when needed.
|
||||||
|
|
||||||
|
The base `docker-compose.yml` plus the `docker/gpu.*.yml` overlays remain the
|
||||||
|
source of truth; the standalone files mirror them for single-file deployments.
|
||||||
|
|
||||||
|
Verify after enabling either overlay:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose exec odysseus nvidia-smi -L # NVIDIA
|
||||||
|
docker compose exec odysseus sh -lc 'test -e /dev/kfd && test -d /dev/dri && ls -l /dev/kfd /dev/dri/renderD*' # AMD
|
||||||
|
```
|
||||||
|
|
||||||
|
> **GPU passthrough ≠ llama.cpp CUDA.** `nvidia-smi` passing inside the
|
||||||
|
> container confirms Docker GPU access, but llama.cpp also needs `cudart` and
|
||||||
|
> the CUDA Toolkit at runtime. If Cookbook logs show `Unable to find cudart
|
||||||
|
> library`, `Could NOT find CUDAToolkit`, `CUDA Toolkit not found`, or
|
||||||
|
> tensors/layers assigned to CPU, that is a Cookbook/llama.cpp build issue —
|
||||||
|
> not a Docker passthrough failure. Reinstall the serve engine via
|
||||||
|
> **Cookbook → Dependencies** to get a CUDA-enabled build.
|
||||||
|
>
|
||||||
|
> The same split applies to AMD/ROCm: seeing `/dev/kfd` and `/dev/dri` inside
|
||||||
|
> the container confirms device passthrough, not ROCm userspace or a
|
||||||
|
> ROCm-enabled vLLM/llama.cpp build. `rocm-smi` and `rocminfo` are not expected
|
||||||
|
> inside the slim Odysseus image.
|
||||||
|
|
||||||
|
**Ollama with Docker.** If Ollama runs on the host, add this endpoint in
|
||||||
|
Settings:
|
||||||
|
|
||||||
|
```text
|
||||||
|
http://host.docker.internal:11434/v1
|
||||||
|
```
|
||||||
|
|
||||||
|
Ollama must listen outside its own loopback interface:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
OLLAMA_HOST=0.0.0.0:11434 ollama serve
|
||||||
|
```
|
||||||
|
|
||||||
|
This connects Odysseus in Docker to an Ollama server that is already running on
|
||||||
|
your host machine; it does not start Ollama inside the container.
|
||||||
|
`host.docker.internal` is Docker's hostname for the host machine from inside the
|
||||||
|
container. Cookbook **Serve** is a separate workflow for serving downloaded
|
||||||
|
models through Odysseus/llama.cpp, so Windows users with an existing Ollama
|
||||||
|
install usually only need to add the endpoint in Settings.
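
Before adding the endpoint in Settings, it can help to confirm that it answers. The sketch below is illustrative only: the helper names `models_url` and `list_models` are hypothetical, and it assumes the endpoint exposes the OpenAI-compatible `/models` listing with a top-level `data` array of objects carrying an `id`.

```python
import json
import urllib.request


def models_url(base_url):
    """Join an OpenAI-compatible base URL with the model-listing path."""
    return base_url.rstrip("/") + "/models"


def list_models(base_url, timeout=5.0):
    """Return the model ids reported by the endpoint.

    Raises urllib.error.URLError if the server is unreachable, which is the
    typical symptom when Ollama still listens on loopback only.
    """
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumed response shape: {"data": [{"id": "..."}, ...]}
    return [item["id"] for item in payload.get("data", [])]
```

Run from inside the Odysseus container, `list_models("http://host.docker.internal:11434/v1")` exercises the same network path the app uses; from the host itself, `http://localhost:11434/v1` is the equivalent address.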
|
||||||
|
|
||||||
|
**Useful checks.**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
docker compose ps
|
||||||
|
docker compose logs --tail=120 odysseus
|
||||||
|
docker compose logs odysseus | grep -E 'ChromaDB|MemoryVectorStore|DEGRADED'
|
||||||
|
```
|
||||||
|
|
||||||
|
**macOS details.** `start-macos.sh` installs Homebrew deps, creates the venv,
|
||||||
|
runs setup, and starts uvicorn on port `7860` because AirPlay often holds
|
||||||
|
`7000`. It uses llama.cpp/Ollama for Metal. vLLM/SGLang are CUDA/ROCm-only and
|
||||||
|
do not run on macOS. MLX-only models are not served by Odysseus.
|
||||||
|
|
||||||
|
</details>
|
||||||
|
|
||||||
|
### Native Windows
|
||||||
|
|
||||||
|
**One-command launcher** (creates the venv, installs deps, runs setup, starts the
|
||||||
|
server; safe to re-run):
|
||||||
|
|
||||||
|
```powershell
|
||||||
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
|
cd odysseus
|
||||||
|
powershell -ExecutionPolicy Bypass -File .\launch-windows.ps1
|
||||||
|
```
|
||||||
|
|
||||||
|
Or do it by hand:
|
||||||
|
|
||||||
|
```powershell
|
||||||
|
git clone https://github.com/pewdiepie-archdaemon/odysseus.git
|
||||||
|
cd odysseus
|
||||||
|
py -3.11 -m venv venv
|
||||||
|
venv\Scripts\Activate.ps1
|
||||||
|
pip install -r requirements.txt
|
||||||
|
python setup.py
|
||||||
|
python -m uvicorn app:app --host 127.0.0.1 --port 7000
|
||||||
|
```
|
||||||
|
|
||||||
|
If `python` points at an older interpreter, use `py -3.12` (or another installed
|
||||||
|
3.11+ version) for the venv step.
|
||||||
|
|
||||||
|
**Requirements:** Python 3.11+. The core app (chat, agent, memory, documents,
|
||||||
|
email, calendar, deep research) runs fully native. For full **Cookbook** background
|
||||||
|
model downloads and the agent shell tool, also install
|
||||||
|
[Git for Windows](https://git-scm.com/download/win) (provides `bash.exe`).
|
||||||
|
Local GPU *serving* of vLLM/SGLang needs Linux/WSL2; for a local model on Windows,
|
||||||
|
[Ollama](https://ollama.com/download) is the easiest path — point Odysseus at
|
||||||
|
`http://localhost:11434/v1` in Settings.
|
||||||
|
|
||||||
|
Open `http://localhost:7000`, log in with the generated admin password,
|
||||||
|
and configure everything else inside **Settings**.
|
||||||
|
|
||||||
|
## Troubleshooting & Advanced Setup
|
||||||
|
|
||||||
|
### `chromadb-client` conflicts with embedded ChromaDB
|
||||||
|
If `chromadb-client` (the lightweight HTTP-only package) is installed alongside the full `chromadb` package, Odysseus starts but ChromaDB silently falls back to HTTP-only mode and fails.
|
||||||
|
|
||||||
|
**Fix:** uninstall `chromadb-client` and force-reinstall the full package:
|
||||||
|
```bash
|
||||||
|
./venv/bin/pip uninstall chromadb-client -y
|
||||||
|
./venv/bin/pip install --force-reinstall chromadb
|
||||||
|
```
|
||||||
|
|
||||||
|
### HTTPS + LAN/Tailscale exposure
|
||||||
|
To expose Odysseus on a local network or Tailscale with HTTPS:
|
||||||
|
1. Change the bind address to `0.0.0.0` in `.env` (`APP_BIND=0.0.0.0` or `ODYSSEUS_HOST=0.0.0.0`).
|
||||||
|
2. Generate a locally-trusted cert for your LAN/Tailscale IPs using [mkcert](https://github.com/FiloSottile/mkcert):
|
||||||
|
```bash
|
||||||
|
mkcert -install
|
||||||
|
mkcert -cert-file cert.pem -key-file key.pem 192.168.1.100 tailscale-ip
|
||||||
|
```
|
||||||
|
3. Run `uvicorn` with the generated certs:
|
||||||
|
```bash
|
||||||
|
python -m uvicorn app:app --host 0.0.0.0 --port 7000 --ssl-certfile=cert.pem --ssl-keyfile=key.pem
|
||||||
|
```
|
||||||
|
4. Install the `mkcert` CA on any other device you want to access Odysseus from (e.g., for iOS, email the `rootCA.pem` to yourself, install the profile, and trust it in Certificate Trust Settings).
|
||||||
|
|
||||||
|
### Optional Dependencies
|
||||||
|
`requirements-optional.txt` contains packages that unlock extra features. It is not installed by default.
|
||||||
|
|
||||||
|
| Package | Feature unlocked |
|
||||||
|
|---------|-----------------|
|
||||||
|
| `faster-whisper` | Local speech-to-text (microphone -> text) via the "local" STT provider. |
|
||||||
|
| `ddgs` | DuckDuckGo as a search provider option. |
|
||||||
|
| `PyMuPDF` | PDF page rendering in the side viewer panel and form-filling. (Note: AGPL-3.0) |
|
||||||
|
| `markitdown` | Office/EPUB document text extraction (converts .docx/.xlsx/.pptx/.xls/.epub to Markdown). |
|
||||||
|
|
||||||
|
### Faster, reproducible installs with uv (optional)
|
||||||
|
[uv](https://docs.astral.sh/uv/) works as a drop-in replacement for the
|
||||||
|
venv + pip steps in the native install guides. No project changes are needed; it gives faster installs and can produce a lockfile for reproducible environments. After [installing `uv`](https://docs.astral.sh/uv/getting-started/installation/), use:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
uv venv venv --python 3.13
|
||||||
|
uv pip install -r requirements.txt
|
||||||
|
# then continue as usual: python setup.py, uvicorn, ...
|
||||||
|
```
|
||||||
|
|
||||||
|
`requirements.txt` is intentionally unpinned, so two installs at different times can produce different package versions. If you want a reproducible environment (e.g. across your own machines, or to roll back after a bad upgrade), snapshot and restore exact versions with:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
uv pip compile requirements.txt -o requirements.lock # snapshot current resolution
|
||||||
|
uv pip sync requirements.lock # reproduce it exactly later
|
||||||
|
```
|
||||||
|
|
||||||
|
`requirements.lock` is gitignored and platform-specific (compile it on the OS you deploy to). Regenerate it deliberately when you want to take upgrades. The plain `uv pip install -r requirements.txt` keeps following the unpinned requirements like pip does.
|
||||||
|
|
||||||
|
### Outlook / Office 365 email
|
||||||
|
Odysseus email accounts currently use IMAP/SMTP username-password auth. Outlook
|
||||||
|
and Microsoft 365 generally require OAuth instead, so normal Microsoft mailbox
|
||||||
|
passwords will fail. See [docs/email-outlook.md](docs/email-outlook.md) for the
|
||||||
|
current limitation and the planned integration direction.
|
||||||
|
|
||||||
|
## Security Notes
|
||||||
|
Odysseus is a self-hosted workspace with powerful local tools: shell access, file uploads, model downloads, web research, email/calendar integrations, and API tokens. Treat it like an admin console.
|
||||||
|
|
||||||
|
- Keep `AUTH_ENABLED=true` for any network-accessible deployment.
|
||||||
|
- Keep `LOCALHOST_BYPASS=false` outside local development.
|
||||||
|
- Use `SECURE_COOKIES=true` when Odysseus is served through HTTPS by a trusted reverse proxy or private access gateway.
|
||||||
|
- Do not expose it directly to the public internet without HTTPS and a trusted reverse proxy or private access layer.
|
||||||
|
- Keep `.env`, `data/`, `logs/`, databases, uploads, generated media, backups, auth/session files, API keys, and model/provider tokens out of Git and private shares. They are ignored by default.
|
||||||
|
- Review `data/auth.json` after first boot: disable open signup unless you intentionally want it, make only your own account admin, and keep demo/test accounts non-admin.
|
||||||
|
- Non-admin users do not get shell/Python/file read/write by default, and admin-only routes/tools such as MCP management, API tokens, webhooks, model/cookbook serving, backup/vault, and app settings are admin-gated. Other features are controlled by per-user privileges, so review each user's privileges before exposing a deployment.
|
||||||
|
- Rotate any API keys or tokens that were ever pasted into a shared chat, demo, screenshot, or log.
|
||||||
|
- If you enable API tokens or webhooks, create separate tokens per integration and delete unused ones.
|
||||||
|
- Prefer binding manual development runs to `127.0.0.1`; bind to `0.0.0.0` only when you intentionally want LAN/reverse-proxy access.
|
||||||
|
- Keep ChromaDB, SearXNG, ntfy, Ollama, vLLM, llama.cpp, databases, and raw model/provider APIs internal-only. Expose only the authenticated Odysseus web/API entrypoint through your trusted proxy or private access layer.
|
||||||
|
- Before publishing a fork, run `git status --short` and confirm no private files from `.env`, `data/`, `logs/`, uploads, backups, or local databases are staged.
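
The first three bullets above can be turned into a quick pre-exposure check. This is a minimal sketch, not part of Odysseus: `check_network_deployment` and `_is_true` are hypothetical helpers, the accepted truthy spellings are an assumption about how booleans are parsed, and the fallback values mirror the defaults in the configuration table.

```python
def _is_true(value):
    """Assumed boolean parsing; the app's own rules may differ."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def check_network_deployment(env):
    """Return warnings for settings the notes above advise against."""
    warnings = []
    if not _is_true(env.get("AUTH_ENABLED", "true")):
        warnings.append("AUTH_ENABLED should stay true for network-accessible deployments")
    if _is_true(env.get("LOCALHOST_BYPASS", "false")):
        warnings.append("LOCALHOST_BYPASS should be false outside local development")
    if not _is_true(env.get("SECURE_COOKIES", "false")):
        warnings.append("SECURE_COOKIES should be true when served through HTTPS")
    return warnings
```

Passing `dict(os.environ)` (or the parsed contents of `.env`) returns an empty list when all three settings match the recommendations for an HTTPS-proxied deployment.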
|
||||||
|
|
||||||
|
### Private or proxied deployments
|
||||||
|
Odysseus serves plain HTTP on its app port. Docker Compose binds Odysseus and the bundled services to `127.0.0.1` by default, so a typical production/private setup is:
|
||||||
|
|
||||||
|
1. Keep Odysseus on localhost, for example `127.0.0.1:7000`.
|
||||||
|
2. Terminate HTTPS at a trusted reverse proxy or private access gateway.
|
||||||
|
3. Put the authenticated Odysseus web/API entrypoint behind that layer.
|
||||||
|
4. Keep raw service and model ports internal-only.
|
||||||
|
|
||||||
|
Cloudflare Access, Tailscale, Caddy, nginx, and Traefik can all fit this pattern; none are required by Odysseus. If your access layer reaches Odysseus on the same host, proxy to `http://127.0.0.1:7000` and keep `AUTH_ENABLED=true`, `LOCALHOST_BYPASS=false`, and `SECURE_COOKIES=true`.
|
||||||
|
`ALLOWED_ORIGINS` lists exact permitted origins for cross-origin browser/API clients; ordinary same-origin reverse-proxy access usually does not need a special CORS entry.
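
To illustrate what "exact permitted origins" means, here is a small sketch under a strict reading of that phrase. The function names are hypothetical and the real middleware may normalize origins differently; the point is only that matching is plain string equality on scheme, host, and port, with no wildcard or prefix matching.

```python
def parse_allowed_origins(raw):
    """Split a comma-separated ALLOWED_ORIGINS value into a set of origins."""
    return {item.strip() for item in raw.split(",") if item.strip()}


def is_origin_allowed(origin, allowed):
    """Exact comparison: no wildcard, prefix, or subdomain matching."""
    return origin in allowed


# The documented default value.
allowed = parse_allowed_origins("http://localhost,http://127.0.0.1")
```

Under this reading, a cross-origin client served from a different scheme or port needs its own entry, for example `https://notes.example.internal` (a placeholder hostname).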
|
||||||
|
|
||||||
|
Common internal-only ports from the default docs/compose setup:
|
||||||
|
|
||||||
|
| Port | Service |
|
||||||
|
|---|---|
|
||||||
|
| `7000` | Odysseus raw app port |
|
||||||
|
| `8080` | SearXNG |
|
||||||
|
| `8091` | ntfy |
|
||||||
|
| `8100` | ChromaDB host port for manual/compose access |
|
||||||
|
| `11434` | Ollama |
|
||||||
|
| `8000-8020` | Common local model/provider APIs |
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
Most setup is done inside the app with `/setup` or **Settings**. Use `.env`
|
||||||
|
for deployment-level defaults and secrets you want present before first boot.
|
||||||
|
Key settings:
|
||||||
|
|
||||||
|
| Variable | Default | Description |
|
||||||
|
|---|---|---|
|
||||||
|
| `LLM_HOST` | `localhost` | Your LLM server (e.g. `llm-host.local:8000`) |
|
||||||
|
| `LLM_HOSTS` | -- | Comma-separated list for model discovery |
|
||||||
|
| `OPENAI_API_KEY` | -- | Optional OpenAI key. Prefer adding providers in the app unless pre-seeding. |
|
||||||
|
| `SEARXNG_INSTANCE` | `http://localhost:8080` | SearXNG URL. Docker overrides this to `http://searxng:8080`. |
|
||||||
|
| `SEARXNG_SECRET` | generated on first Docker boot | Optional SearXNG cookie/CSRF secret. Leave blank unless you need to pin it. |
|
||||||
|
| `APP_BIND` | `127.0.0.1` | Docker Compose host bind address for the web UI. Use `0.0.0.0` only for intentional LAN/reverse-proxy access. |
|
||||||
|
| `APP_PORT` | `7000` | Docker Compose host port for the web UI. |
|
||||||
|
| `APP_DATA_DIR` | `./data` | Docker Compose host directory for application data volumes. |
|
||||||
|
| `APP_LOGS_DIR` | `./logs` | Docker Compose host directory for application logs. |
|
||||||
|
| `AUTH_ENABLED` | `true` | Enable/disable login |
|
||||||
|
| `LOCALHOST_BYPASS` | `false` | Development-only auth bypass for loopback requests. Keep false for shared/network deployments. |
|
||||||
|
| `ALLOWED_ORIGINS` | `http://localhost,http://127.0.0.1` | Comma-separated exact permitted origins for cross-origin browser/API clients. |
|
||||||
|
| `SECURE_COOKIES` | `false` | Set true when serving Odysseus through HTTPS at a trusted proxy or private access gateway. |
|
||||||
|
| `DATABASE_URL` | `sqlite:///./data/app.db` | Database connection string |
|
||||||
|
| `CHROMADB_HOST` | `localhost` | ChromaDB host for vector memory. Docker overrides this to `chromadb`. |
|
||||||
|
| `CHROMADB_PORT` | `8100` | ChromaDB port for manual host runs. Docker overrides this to `8000`. |
|
||||||
|
| `EMBEDDING_URL` | -- | OpenAI-compatible embeddings endpoint |
|
||||||
|
| `ODYSSEUS_CHAT_UPLOAD_MAX_BYTES` | `10485760` | Chat/agent attachment cap in bytes (10 MB). Raise for larger local PDFs or text documents. |
|
||||||
|
| `ODYSSEUS_GALLERY_UPLOAD_MAX_BYTES` | `104857600` | Gallery image upload cap in bytes (100 MB). |
|
||||||
|
| `ODYSSEUS_GALLERY_TRANSFORM_UPLOAD_MAX_BYTES` | `26214400` | Gallery transform input cap in bytes (25 MB). |
|
||||||
|
| `ODYSSEUS_MEMORY_IMPORT_MAX_BYTES` | `10485760` | Memory import file cap in bytes (10 MB). |
|
||||||
|
| `ODYSSEUS_PERSONAL_UPLOAD_MAX_BYTES` | `26214400` | Personal document upload cap in bytes (25 MB). |
|
||||||
|
| `ODYSSEUS_EMAIL_COMPOSE_UPLOAD_MAX_BYTES` | `26214400` | Email compose attachment cap in bytes (25 MB). |
|
||||||
|
| `ODYSSEUS_STT_MAX_AUDIO_BYTES` | `26214400` | Speech-to-text audio cap in bytes (25 MB). |
|
||||||
|
| `ODYSSEUS_ICS_MAX_BYTES` | `10485760` | Calendar `.ics` import cap in bytes (10 MB). |
|
||||||
|
|
||||||
|
All upload-limit vars are validated (must be a positive integer) and optional; an invalid value fails fast at startup.
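
The validation rule can be sketched as follows. This is an illustration of the described behavior, not the project's implementation: the helper name `read_upload_limit`, the use of `ValueError`, and treating a blank value as "unset" are all assumptions. Only the variable names and defaults come from the table.

```python
import os

# 10 * 1024 * 1024 = 10485760, the documented default for chat uploads.
DEFAULT_CHAT_UPLOAD_MAX_BYTES = 10 * 1024 * 1024


def read_upload_limit(name, default, environ=None):
    """Return a positive integer byte limit, or raise ValueError."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default  # the variable is optional
    try:
        value = int(raw.strip())
    except ValueError:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value
```

Calling such a helper once for every limit during startup is what makes a bad value fail fast, instead of surfacing later on the first upload.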
|
||||||
|
|
||||||
|
### Built-in MCP servers (optional setup)
|
||||||
|
|
||||||
|
Odysseus auto-registers a few built-in MCP servers at startup. The npx-based ones (currently the browser server, `@playwright/mcp`) only start when their npm package is already in the local npx cache. If a package isn't cached, that server is skipped with a startup log message explaining what to do, so a fresh install does not block on a multi-minute npm download or hang if Playwright system deps are missing.
|
||||||
|
|
||||||
|
To enable the browser MCP (page navigation, screenshots, vision), run once:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
npx -y @playwright/mcp@latest --version
|
||||||
|
```
|
||||||
|
|
||||||
|
That installs `@playwright/mcp` plus Playwright (~300MB total). Restart Odysseus and the server will register at startup.
|
||||||
|
|
||||||
|
## Architecture
|
||||||
|
```
|
||||||
|
app.py # FastAPI entry point
|
||||||
|
core/ auth, database, middleware, constants
|
||||||
|
src/ llm_core, agent_loop, agent_tools, chat_processor, search/
|
||||||
|
routes/ chat, session, document, memory, model … endpoints
|
||||||
|
services/ docs, memory, search, hwfit (Cookbook) …
|
||||||
|
static/ index.html + app.js + style.css + js/ (modular front-end)
|
||||||
|
docs/ landing page (index.html) + preview clips
|
||||||
|
```
|
||||||
|
|
||||||
|
## Data
|
||||||
|
All user data lives in `data/` (gitignored): `app.db` (sessions, messages, documents),
|
||||||
|
`memory.json`, `presets.json`, `uploads/`, `personal_docs/`, `chroma/`, `settings.json`.
|
||||||
|
|
||||||
|
To back up or restore everything in `data/`, see the
|
||||||
|
[Backup & Restore guide](docs/backup-restore.md).
|
||||||
@@ -12,6 +12,7 @@ import json
|
|||||||
import csv
|
import csv
|
||||||
import io
|
import io
|
||||||
import os
|
import os
|
||||||
|
import inspect
|
||||||
import httpx
|
import httpx
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from datetime import datetime
|
from datetime import datetime
|
||||||
@@ -741,8 +742,8 @@ def setup_contacts_routes():
|
|||||||
email = (data.get("email") or "").strip()
|
email = (data.get("email") or "").strip()
|
||||||
phone = (data.get("phone") or "").strip()
|
phone = (data.get("phone") or "").strip()
|
||||||
address = (data.get("address") or "").strip()
|
address = (data.get("address") or "").strip()
|
||||||
if not email and not name:
|
if not email:
|
||||||
return {"success": False, "error": "Name or email required"}
|
return {"success": False, "error": "Email required"}
|
||||||
# Check if already exists by email
|
# Check if already exists by email
|
||||||
if email:
|
if email:
|
||||||
contacts = _fetch_contacts()
|
contacts = _fetch_contacts()
|
||||||
@@ -751,7 +752,11 @@ def setup_contacts_routes():
|
|||||||
return {"success": True, "message": "Already exists", "contact": c}
|
return {"success": True, "message": "Already exists", "contact": c}
|
||||||
if not name:
|
if not name:
|
||||||
name = email.split("@")[0]
|
name = email.split("@")[0]
|
||||||
|
create_params = inspect.signature(_create_contact).parameters
|
||||||
|
if len(create_params) >= 3:
|
||||||
ok = _create_contact(name, email, address)
|
ok = _create_contact(name, email, address)
|
||||||
|
else:
|
||||||
|
ok = _create_contact(name, email)
|
||||||
# If a phone was provided, do an immediate update to thread it
|
# If a phone was provided, do an immediate update to thread it
|
||||||
# through (the simple _create_contact signature only takes name +
|
# through (the simple _create_contact signature only takes name +
|
||||||
# email + address; phones happen via update).
|
# email + address; phones happen via update).
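
Reviewer sketch of the arity check introduced in this hunk. It shows the same `inspect.signature` technique in isolation; `call_create` and the two stand-in creators are hypothetical and exist only for illustration.

```python
import inspect


def call_create(create_fn, name, email, address=""):
    """Pass `address` only when the target accepts at least three parameters."""
    params = inspect.signature(create_fn).parameters
    if len(params) >= 3:
        return create_fn(name, email, address)
    return create_fn(name, email)


def create_with_address(name, email, address):
    return (name, email, address)


def create_without_address(name, email):
    return (name, email)
```

Counting parameters treats `*args` or keyword-only parameters as ordinary entries, so this approach is only safe while the candidate signatures are as simple as the two shown.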
|
||||||
|
|||||||
@@ -67,6 +67,14 @@ def _gallery_image_path(filename: str) -> Path:
|
|||||||
raise HTTPException(400, "Unsafe gallery filename")
|
raise HTTPException(400, "Unsafe gallery filename")
|
||||||
if safe_name != original:
|
if safe_name != original:
|
||||||
raise HTTPException(400, "Unsafe gallery filename")
|
raise HTTPException(400, "Unsafe gallery filename")
|
||||||
|
if not path.exists():
|
||||||
|
cwd_root = (Path.cwd() / "data" / "generated_images").resolve()
|
||||||
|
cwd_path = (cwd_root / safe_name).resolve()
|
||||||
|
try:
|
||||||
|
if os.path.commonpath([str(cwd_root), str(cwd_path)]) == str(cwd_root) and cwd_path.exists():
|
||||||
|
return cwd_path
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
return path
|
return path
|
||||||
|
|
||||||
|
|
||||||
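Reviewer note: the fallback added above only returns the working-directory copy when the resolved path stays inside the resolved root. A minimal sketch of that containment check, under the assumption that a missing or escaping path should yield `None`; `resolve_inside` is a hypothetical name, not a function in the repository.

```python
import os
from pathlib import Path


def resolve_inside(root, name):
    """Return root/name if it resolves to an existing path under root, else None."""
    root = Path(root).resolve()
    candidate = (root / name).resolve()
    try:
        inside = os.path.commonpath([str(root), str(candidate)]) == str(root)
    except ValueError:
        # commonpath raises ValueError for paths it cannot compare
        # (for example, different drives on Windows).
        return None
    return candidate if inside and candidate.exists() else None
```

Resolving both sides before comparing matters: a `..` segment or a symlink is collapsed first, so the comparison is made on the real location rather than on the string the caller supplied.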
|
|||||||
+69
-13
@@ -19,22 +19,32 @@ GPU_BANDWIDTH = {
|
|||||||
"6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224,
|
"6950 xt": 576, "6900 xt": 512, "6800 xt": 512, "6800": 512, "6700 xt": 384, "6600 xt": 256, "6600": 224,
|
||||||
"mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229,
|
"mi300x": 5300, "mi300": 5300, "mi250x": 3277, "mi250": 3277, "mi210": 1638, "mi100": 1229,
|
||||||
"9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322,
|
"9070 xt": 624, "9070": 488, "9060 xt": 322, "9060": 322,
|
||||||
# Apple Silicon unified-memory bandwidth (GB/s). Keyed off the chip name
|
|
||||||
# reported by sysctl machdep.cpu.brand_string (e.g. "Apple M4 Max"). Listed
|
|
||||||
# before the bare "m_" keys matters less than length-sorting (done below),
|
|
||||||
# which guarantees "m4 max" is tried before "m4".
|
|
||||||
"m1 ultra": 800, "m1 max": 400, "m1 pro": 200, "m1": 68,
|
|
||||||
"m2 ultra": 800, "m2 max": 400, "m2 pro": 200, "m2": 100,
|
|
||||||
"m3 ultra": 800, "m3 max": 300, "m3 pro": 150, "m3": 100,
|
|
||||||
"m4 max": 546, "m4 pro": 273, "m4": 120,
|
|
||||||
"m5 max": 546, "m5 pro": 273, "m5": 150,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
# Pre-sort keys by length descending for correct substring matching
|
# Pre-sort keys by length descending for correct substring matching
|
||||||
_BW_KEYS_SORTED = sorted(GPU_BANDWIDTH.keys(), key=len, reverse=True)
|
_BW_KEYS_SORTED = sorted(GPU_BANDWIDTH.keys(), key=len, reverse=True)
|
||||||
|
|
||||||
# metal: backstop for Apple Silicon chips not in GPU_BANDWIDTH (e.g. a future
|
# Apple Silicon unified-memory bandwidth (GB/s). For chip families with both
|
||||||
# M5) — the named chips above take the accurate bandwidth path instead.
|
# binned and full variants under the same "Apple Mx Max" brand string, prefer
|
||||||
|
# GPU core count when hardware detection provides it; otherwise fall back to the
|
||||||
|
# conservative tier so speed estimates do not over-promise.
|
||||||
|
APPLE_BANDWIDTH_FIXED = {
|
||||||
|
"m1 ultra": 800, "m1 max": 400, "m1 pro": 200, "m1": 68,
|
||||||
|
"m2 ultra": 800, "m2 max": 400, "m2 pro": 200, "m2": 100,
|
||||||
|
"m3 ultra": 800, "m3 pro": 150, "m3": 100,
|
||||||
|
"m4 pro": 273, "m4": 120,
|
||||||
|
"m5 pro": 307, "m5": 153,
|
||||||
|
}
|
||||||
|
APPLE_BANDWIDTH_BY_CORES = {
|
||||||
|
"m3 max": {30: 300, 40: 400},
|
||||||
|
"m4 max": {32: 410, 40: 546},
|
||||||
|
"m5 max": {32: 460, 40: 614},
|
||||||
|
}
|
||||||
|
_APPLE_FIXED_KEYS_SORTED = sorted(APPLE_BANDWIDTH_FIXED.keys(), key=len, reverse=True)
|
||||||
|
_APPLE_VARIANT_KEYS_SORTED = sorted(APPLE_BANDWIDTH_BY_CORES.keys(), key=len, reverse=True)
|
||||||
|
|
||||||
|
# metal: backstop for Apple Silicon chips not in the explicit tables above
|
||||||
|
# (e.g. a future M6) — use a conservative generic estimate when unknown.
|
||||||
FALLBACK_K = {"cuda": 220, "rocm": 180, "metal": 150, "cpu_x86": 70, "cpu_arm": 90}
|
FALLBACK_K = {"cuda": 220, "rocm": 180, "metal": 150, "cpu_x86": 70, "cpu_arm": 90}
|
||||||
|
|
||||||
USE_CASE_WEIGHTS = {
|
USE_CASE_WEIGHTS = {
|
||||||
@@ -60,10 +70,56 @@ CONTEXT_TARGET = {
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def _lookup_bandwidth(gpu_name):
|
def _lookup_apple_bandwidth(system):
|
||||||
|
gpu_name = system.get("gpu_name")
|
||||||
if not isinstance(gpu_name, str) or not gpu_name:
|
if not isinstance(gpu_name, str) or not gpu_name:
|
||||||
return None
|
return None
|
||||||
gn = gpu_name.lower()
|
gn = gpu_name.lower()
|
||||||
|
|
||||||
|
# Guard against false matches on non-Apple GPUs whose names contain
|
||||||
|
# "m3"/"m4"/"m5" (e.g. NVIDIA Quadro M4 000).
|
||||||
|
if "apple" not in gn:
|
||||||
|
return None
|
||||||
|
|
||||||
|
raw_cores = system.get("gpu_cores")
|
||||||
|
try:
|
||||||
|
gpu_cores = int(raw_cores) if raw_cores is not None else None
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
gpu_cores = None
|
||||||
|
|
||||||
|
for key in _APPLE_VARIANT_KEYS_SORTED:
|
||||||
|
if key not in gn:
|
||||||
|
continue
|
||||||
|
if gpu_cores in APPLE_BANDWIDTH_BY_CORES[key]:
|
||||||
|
return APPLE_BANDWIDTH_BY_CORES[key][gpu_cores]
|
||||||
|
return min(APPLE_BANDWIDTH_BY_CORES[key].values())
|
||||||
|
|
||||||
|
for key in _APPLE_FIXED_KEYS_SORTED:
|
||||||
|
if key in gn:
|
||||||
|
return APPLE_BANDWIDTH_FIXED[key]
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def _lookup_bandwidth(system):
|
||||||
|
if isinstance(system, dict):
|
||||||
|
gpu_name = system.get("gpu_name")
|
||||||
|
else:
|
||||||
|
gpu_name = system
|
||||||
|
|
||||||
|
if not isinstance(gpu_name, str) or not gpu_name:
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Apple tiers live only in the Apple-specific table now (#2564), so route
|
||||||
|
# BOTH dict and bare-string callers through it. A bare string carries no
|
||||||
|
# gpu_cores, so the helper falls back to the conservative (lowest) tier for
|
||||||
|
# that model -- before #2564 the generic table answered string lookups, and
|
||||||
|
# dropping that made _lookup_bandwidth("Apple M3 Max") return None.
|
||||||
|
apple_input = system if isinstance(system, dict) else {"gpu_name": gpu_name}
|
||||||
|
bw = _lookup_apple_bandwidth(apple_input)
|
||||||
|
if bw is not None:
|
||||||
|
return bw
|
||||||
|
|
||||||
|
gn = gpu_name.lower()
|
||||||
for key in _BW_KEYS_SORTED:
|
for key in _BW_KEYS_SORTED:
|
||||||
if key in gn:
|
if key in gn:
|
||||||
return GPU_BANDWIDTH[key]
|
return GPU_BANDWIDTH[key]
|
||||||
@@ -84,7 +140,7 @@ def _estimate_speed(model, quant, run_mode, system, offload_frac=0.0):
|
|||||||
"""
|
"""
|
||||||
pb = _active_params_b(model)
|
pb = _active_params_b(model)
|
||||||
is_moe = model.get("is_moe", False)
|
is_moe = model.get("is_moe", False)
|
||||||
bw = _lookup_bandwidth(system.get("gpu_name"))
|
bw = _lookup_bandwidth(system)
|
||||||
backend = system.get("backend", "cpu_x86")
|
backend = system.get("backend", "cpu_x86")
|
||||||
|
|
||||||
if bw and run_mode in ("gpu", "cpu_offload"):
|
if bw and run_mode in ("gpu", "cpu_offload"):
|
||||||
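Reviewer note: a condensed sketch of the Apple lookup introduced above, using a subset of the table values from this diff. It is not the module's code: `apple_bandwidth`, `VARIANTS` and `FIXED` are illustrative names, and the two-argument signature is an assumption made to keep the example short.

```python
# Subset of the values from APPLE_BANDWIDTH_BY_CORES / APPLE_BANDWIDTH_FIXED.
VARIANTS = {"m3 max": {30: 300, 40: 400}, "m4 max": {32: 410, 40: 546}}
FIXED = {"m4 pro": 273, "m4": 120}


def apple_bandwidth(gpu_name, gpu_cores=None):
    gn = gpu_name.lower()
    if "apple" not in gn:
        # Guard: names such as "Quadro M4 000" contain "m4" but are not Apple chips.
        return None
    for key, tiers in VARIANTS.items():
        if key in gn:
            # Unknown or missing core count: fall back to the lowest tier.
            return tiers.get(gpu_cores, min(tiers.values()))
    # Longest key first so "m4 pro" is tried before the bare "m4".
    for key in sorted(FIXED, key=len, reverse=True):
        if key in gn:
            return FIXED[key]
    return None
```

Checking the per-core table before the fixed table is what keeps "Apple M4 Max" from being answered by the shorter "m4" entry.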
|
|||||||
@@ -1,3 +1,4 @@
|
|||||||
|
import json
|
||||||
import os
|
import os
|
||||||
import platform
|
import platform
|
||||||
import re
|
import re
|
||||||
@@ -335,6 +336,37 @@ def _detect_apple_silicon():
|
|||||||
if total_gb <= 0:
|
if total_gb <= 0:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
|
def _parse_apple_gpu_cores(text):
|
||||||
|
if not text:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
data = json.loads(text)
|
||||||
|
except (TypeError, ValueError, json.JSONDecodeError):
|
||||||
|
data = None
|
||||||
|
if isinstance(data, dict):
|
||||||
|
for gpu in data.get("SPDisplaysDataType") or []:
|
||||||
|
if not isinstance(gpu, dict):
|
||||||
|
continue
|
||||||
|
model = str(gpu.get("sppci_model") or gpu.get("_name") or "")
|
||||||
|
if "apple" not in model.lower():
|
||||||
|
continue
|
||||||
|
cores = gpu.get("sppci_cores")
|
||||||
|
try:
|
||||||
|
return int(str(cores).strip())
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
continue
|
||||||
|
m = re.search(r"Total Number of Cores:\s*(\d+)", text)
|
||||||
|
if m:
|
||||||
|
try:
|
||||||
|
return int(m.group(1))
|
||||||
|
except ValueError:
|
||||||
|
return None
|
||||||
|
return None
|
||||||
|
|
||||||
|
gpu_cores = _parse_apple_gpu_cores(_run(["system_profiler", "SPDisplaysDataType", "-json"]))
|
||||||
|
if gpu_cores is None:
|
||||||
|
gpu_cores = _parse_apple_gpu_cores(_run(["system_profiler", "SPDisplaysDataType"]))
|
||||||
|
|
||||||
# Usable GPU budget. macOS lets Metal use most of unified memory, but the
|
# Usable GPU budget. macOS lets Metal use most of unified memory, but the
|
||||||
# default working-set limit scales with RAM: small machines have to keep
|
# default working-set limit scales with RAM: small machines have to keep
|
||||||
# more back for the OS + app. These fractions track Apple's
|
# more back for the OS + app. These fractions track Apple's
|
||||||
@@ -357,7 +389,7 @@ def _detect_apple_silicon():
|
|||||||
pass
|
pass
|
||||||
|
|
||||||
gpu = {"index": 0, "name": brand, "vram_gb": vram_gb}
|
gpu = {"index": 0, "name": brand, "vram_gb": vram_gb}
|
||||||
return {
|
info = {
|
||||||
"gpu_name": brand,
|
"gpu_name": brand,
|
||||||
"gpu_vram_gb": vram_gb,
|
"gpu_vram_gb": vram_gb,
|
||||||
"gpu_count": 1,
|
"gpu_count": 1,
|
||||||
@@ -369,6 +401,9 @@ def _detect_apple_silicon():
|
|||||||
# separate pool — downstream fit logic uses this to avoid double-budgeting.
|
# separate pool — downstream fit logic uses this to avoid double-budgeting.
|
||||||
"unified_memory": True,
|
"unified_memory": True,
|
||||||
}
|
}
|
||||||
|
if gpu_cores is not None:
|
||||||
|
info["gpu_cores"] = gpu_cores
|
||||||
|
return info
|
||||||
|
|
||||||
|
|
||||||
def _read_file(path):
|
def _read_file(path):
|
||||||
@@ -772,6 +807,7 @@ def detect_system(host="", ssh_port="", platform="", fresh=False):
|
|||||||
"gpu_name": gpu_info["gpu_name"],
|
"gpu_name": gpu_info["gpu_name"],
|
||||||
"gpu_vram_gb": gpu_info["gpu_vram_gb"],
|
"gpu_vram_gb": gpu_info["gpu_vram_gb"],
|
||||||
"gpu_count": gpu_info["gpu_count"],
|
"gpu_count": gpu_info["gpu_count"],
|
||||||
|
"gpu_cores": gpu_info.get("gpu_cores"),
|
||||||
"gpus": gpu_info.get("gpus", []),
|
"gpus": gpu_info.get("gpus", []),
|
||||||
"gpu_groups": gpu_info.get("gpu_groups", []),
|
"gpu_groups": gpu_info.get("gpu_groups", []),
|
||||||
"homogeneous": gpu_info.get("homogeneous", True),
|
"homogeneous": gpu_info.get("homogeneous", True),
|
||||||
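Reviewer note: the core-count probe above tries JSON first and a plain-text regex second. A simplified sketch of that two-step parse is shown here; it omits the diff's check that the GPU model name contains "apple", and `parse_gpu_cores` is a hypothetical name. The `SPDisplaysDataType`, `sppci_cores` and `Total Number of Cores` strings are taken from the diff and its tests.

```python
import json
import re


def parse_gpu_cores(text):
    """Extract a GPU core count from system_profiler output (JSON or plain text)."""
    if not text:
        return None
    try:
        data = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        data = None
    if isinstance(data, dict):
        for gpu in data.get("SPDisplaysDataType") or []:
            if not isinstance(gpu, dict):
                continue
            cores = str(gpu.get("sppci_cores", "")).strip()
            if cores.isdigit():
                return int(cores)
    # Plain-text fallback for output that is not valid JSON.
    m = re.search(r"Total Number of Cores:\s*(\d+)", text)
    return int(m.group(1)) if m else None
```

Returning `None` rather than raising lets the caller leave `gpu_cores` out of the result, which is what the bandwidth lookup treats as "use the conservative tier".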
|
|||||||
@@ -201,11 +201,15 @@ def build_models_url(base: str) -> Optional[str]:
|
|||||||
return _ollama_api_root(base) + "/tags"
|
return _ollama_api_root(base) + "/tags"
|
||||||
if provider == "chatgpt-subscription":
|
if provider == "chatgpt-subscription":
|
||||||
return None
|
return None
|
||||||
# Generic OpenAI-compatible fallback: ensure the path lands on /v1/models
|
# Generic OpenAI-compatible fallback: local model servers with no explicit
|
||||||
# when the user omitted a path entirely. If a non-empty path is already
|
# path conventionally expose `/v1/models` (LM Studio, llama.cpp, vLLM).
|
||||||
# present (e.g. /openai, /api/openai/v1, /v1), trust the caller — the
|
# For non-local unknown hosts, do not invent `/v1`; append `/models` to the
|
||||||
# /models suffix is appended as-is and the caller's prefix is preserved.
|
# caller's base so look-alike provider hosts stay generic.
|
||||||
if not urlparse(base).path:
|
parsed = urlparse(base)
|
||||||
|
host = (parsed.hostname or "").lower()
|
||||||
|
is_local = host in {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
|
||||||
|
uses_v1_models_by_default = is_local or host in {"api.deepseek.com"}
|
||||||
|
if not parsed.path and uses_v1_models_by_default:
|
||||||
base = base + "/v1"
|
base = base + "/v1"
|
||||||
return base + "/models"
|
return base + "/models"
|
||||||
|
|
||||||
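Reviewer note: the generic fallback above now adds `/v1` only for local hosts and an explicit allow-list. A minimal sketch of just that branch, assuming the provider-specific cases were handled earlier and that `base` carries no trailing slash; `generic_models_url`, `LOCAL_HOSTS` and `V1_HOSTS` are illustrative names.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1", "host.docker.internal"}
V1_HOSTS = {"api.deepseek.com"}


def generic_models_url(base):
    parsed = urlparse(base)
    host = (parsed.hostname or "").lower()
    # Only invent "/v1" when the caller gave no path AND the host is one
    # that conventionally serves /v1/models.
    if not parsed.path and (host in LOCAL_HOSTS or host in V1_HOSTS):
        base = base + "/v1"
    return base + "/models"
```

Matching on the parsed hostname rather than on a substring of the URL is what keeps look-alike hosts such as `anthropic.com.evil.com` on the generic `/models` path.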
|
|||||||
+2
-2
@@ -1467,8 +1467,8 @@ function initEndpointForm() {
|
|||||||
const localAddBtn = el('adm-epLocalAddBtn');
|
const localAddBtn = el('adm-epLocalAddBtn');
|
||||||
const localTestBtn = el('adm-epLocalTestBtn');
|
const localTestBtn = el('adm-epLocalTestBtn');
|
||||||
if (localTestBtn) {
|
if (localTestBtn) {
|
||||||
const testOriginalHtml = localTestBtn.innerHTML;
|
|
||||||
localTestBtn.addEventListener('click', async () => {
|
localTestBtn.addEventListener('click', async () => {
|
||||||
|
const testOriginalHtml = localTestBtn.innerHTML || '>Test';
|
||||||
const msg = _endpointMsg('local');
|
const msg = _endpointMsg('local');
|
||||||
msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
|
msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
|
||||||
const raw = (el('adm-epLocalUrl').value || '').trim();
|
const raw = (el('adm-epLocalUrl').value || '').trim();
|
||||||
@@ -1494,8 +1494,8 @@ function initEndpointForm() {
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
if (localAddBtn) {
|
if (localAddBtn) {
|
||||||
const addOriginalHtml = localAddBtn.innerHTML;
|
|
||||||
localAddBtn.addEventListener('click', async () => {
|
localAddBtn.addEventListener('click', async () => {
|
||||||
|
const addOriginalHtml = localAddBtn.innerHTML || '>Add';
|
||||||
const msg = _endpointMsg('local');
|
const msg = _endpointMsg('local');
|
||||||
msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
|
msg.textContent = ''; msg.className = 'adm-ep-inline-msg';
|
||||||
const raw = (el('adm-epLocalUrl').value || '').trim();
|
const raw = (el('adm-epLocalUrl').value || '').trim();
|
||||||
|
|||||||
@@ -41,8 +41,10 @@ def _seed(tmp_path):
|
|||||||
|
|
||||||
|
|
||||||
def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
|
def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
|
||||||
monkeypatch.chdir(tmp_path)
|
|
||||||
SessionLocal = _seed(tmp_path)
|
SessionLocal = _seed(tmp_path)
|
||||||
|
# GALLERY_IMAGE_DIR is an absolute path fixed at import, so a chdir can't
|
||||||
|
# redirect the delete; point the resolver at the seeded tmp dir directly.
|
||||||
|
monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
|
||||||
monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
|
monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
|
||||||
|
|
||||||
# A session whose commit always fails, to simulate a DB error mid-delete.
|
# A session whose commit always fails, to simulate a DB error mid-delete.
|
||||||
@@ -67,8 +69,8 @@ def test_file_kept_when_commit_fails(tmp_path, monkeypatch):
|
|||||||
|
|
||||||
|
|
||||||
def test_file_removed_on_successful_delete(tmp_path, monkeypatch):
|
def test_file_removed_on_successful_delete(tmp_path, monkeypatch):
|
||||||
monkeypatch.chdir(tmp_path)
|
|
||||||
SessionLocal = _seed(tmp_path)
|
SessionLocal = _seed(tmp_path)
|
||||||
|
monkeypatch.setattr(gallery_routes, "GALLERY_IMAGE_DIR", tmp_path / "data" / "generated_images")
|
||||||
monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
|
monkeypatch.setattr(gallery_routes, "get_current_user", lambda r: "alice")
|
||||||
monkeypatch.setattr(gallery_routes, "SessionLocal", SessionLocal)
|
monkeypatch.setattr(gallery_routes, "SessionLocal", SessionLocal)
|
||||||
|
|
||||||
|
|||||||
@@ -0,0 +1,59 @@
|
|||||||
|
from services.hwfit.fit import _lookup_apple_bandwidth, _lookup_bandwidth
|
||||||
|
|
||||||
|
|
||||||
|
def test_m3_max_bandwidth_uses_gpu_cores():
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M3 Max", "gpu_cores": 30}) == 300
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M3 Max", "gpu_cores": 40}) == 400
|
||||||
|
|
||||||
|
|
||||||
|
def test_m4_max_bandwidth_uses_gpu_cores():
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M4 Max", "gpu_cores": 32}) == 410
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M4 Max", "gpu_cores": 40}) == 546
|
||||||
|
|
||||||
|
|
||||||
|
def test_m5_max_bandwidth_uses_gpu_cores():
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M5 Max", "gpu_cores": 32}) == 460
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M5 Max", "gpu_cores": 40}) == 614
|
||||||
|
|
||||||
|
|
||||||
|
def test_apple_max_bandwidth_falls_back_conservatively_without_gpu_cores():
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M3 Max"}) == 300
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M4 Max"}) == 410
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M5 Max"}) == 460
|
||||||
|
|
||||||
|
|
||||||
|
def test_fixed_apple_bandwidth_entries_include_updated_m5_values():
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M5 Pro"}) == 307
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "Apple M5"}) == 153
|
||||||
|
|
||||||
|
|
||||||
|
def test_non_apple_gpu_does_not_match_apple_bandwidth():
|
||||||
|
"""NVIDIA Quadro M4 000 should NOT match Apple bandwidth lookup."""
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M4 000"}) is None
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M3 000"}) is None
|
||||||
|
assert _lookup_bandwidth({"gpu_name": "NVIDIA Quadro M5 000"}) is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_non_apple_gpu_with_cores_does_not_match():
|
||||||
|
"""A non-Apple GPU that happens to carry a gpu_cores count must not be
|
||||||
|
matched by the APPLE bandwidth path. This asserts the Apple-specific
|
||||||
|
matcher directly: _lookup_bandwidth would (correctly) return these cards'
|
||||||
|
real bandwidth from the general GPU table (e.g. the RTX 4090's 1008 GB/s),
|
||||||
|
which is a different code path and not what this guard is about.
|
||||||
|
"""
|
||||||
|
assert _lookup_apple_bandwidth({"gpu_name": "NVIDIA GeForce RTX 4090", "gpu_cores": 128}) is None
|
||||||
|
assert _lookup_apple_bandwidth({"gpu_name": "AMD Radeon RX 9070 XT", "gpu_cores": 64}) is None
|
||||||
|
|
||||||
|
|
||||||
|
def test_apple_string_input_resolves_conservative_tier():
|
||||||
|
"""Bare-string callers must still get Apple bandwidth. #2564 moved the
|
||||||
|
Apple tiers out of the generic GPU table into the dict-only Apple helper,
|
||||||
|
so _lookup_bandwidth("Apple M3 Max") (no gpu_cores) regressed to None;
|
||||||
|
string inputs now route through the Apple helper and get the conservative
|
||||||
|
(lowest) tier for the model."""
|
||||||
|
assert _lookup_bandwidth("Apple M3 Max") == 300
|
||||||
|
assert _lookup_bandwidth("Apple M4 Max") == 410
|
||||||
|
assert _lookup_bandwidth("Apple M5 Max") == 460
|
||||||
|
# Non-Apple strings still fall through to the generic table.
|
||||||
|
assert _lookup_bandwidth("NVIDIA GeForce RTX 4090") == 1008
|
||||||
|
assert _lookup_bandwidth("Totally Unknown GPU") is None
|
||||||
@@ -4,6 +4,8 @@ Covers the Metal-specific behavior added for Apple Silicon and locks in the
|
|||||||
guarantee that non-macOS (Linux/Windows) detection is unchanged.
|
guarantee that non-macOS (Linux/Windows) detection is unchanged.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
|
||||||
from services.hwfit import hardware
|
from services.hwfit import hardware
|
||||||
from services.hwfit.fit import rank_models
|
from services.hwfit.fit import rank_models
|
||||||
from services.hwfit.models import get_models
|
from services.hwfit.models import get_models
|
||||||
@@ -22,7 +24,7 @@ def _metal_system(ram_gb=16.0, vram_gb=10.7):
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None):
|
def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None, display_json=None, display_text=None):
|
||||||
def run(cmd):
|
def run(cmd):
|
||||||
joined = " ".join(cmd)
|
joined = " ".join(cmd)
|
||||||
if "machdep.cpu.brand_string" in joined:
|
if "machdep.cpu.brand_string" in joined:
|
||||||
@@ -31,6 +33,12 @@ def _fake_sysctl(brand="Apple M2 Pro", memsize_gb=32, wired_mb=None):
|
|||||||
return str(int(memsize_gb * 1024**3))
|
return str(int(memsize_gb * 1024**3))
|
||||||
if "iogpu.wired_limit_mb" in joined:
|
if "iogpu.wired_limit_mb" in joined:
|
||||||
return str(wired_mb) if wired_mb is not None else None
|
return str(wired_mb) if wired_mb is not None else None
|
||||||
|
if "system_profiler SPDisplaysDataType -json" in joined:
|
||||||
|
if isinstance(display_json, (dict, list)):
|
||||||
|
return json.dumps(display_json)
|
||||||
|
return display_json
|
||||||
|
if "system_profiler SPDisplaysDataType" in joined:
|
||||||
|
return display_text
|
||||||
return None
|
return None
|
||||||
return run
|
return run
|
||||||
|
|
||||||
@@ -98,16 +106,47 @@ def test_apple_silicon_detected_as_metal(monkeypatch):
|
|||||||
monkeypatch.setattr(hardware, "_remote_host", None)
|
monkeypatch.setattr(hardware, "_remote_host", None)
|
||||||
monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
|
monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
|
||||||
monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
|
monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
|
||||||
monkeypatch.setattr(hardware, "_run", _fake_sysctl(memsize_gb=32))
|
monkeypatch.setattr(hardware, "_run", _fake_sysctl(
|
||||||
|
memsize_gb=32,
|
||||||
|
display_json={"SPDisplaysDataType": [{"sppci_model": "Apple M2 Pro", "sppci_cores": "19"}]},
|
||||||
|
))
|
||||||
|
|
||||||
info = hardware._detect_apple_silicon()
|
info = hardware._detect_apple_silicon()
|
||||||
assert info is not None
|
assert info is not None
|
||||||
assert info["backend"] == "metal"
|
assert info["backend"] == "metal"
|
||||||
assert info["gpu_name"] == "Apple M2 Pro"
|
assert info["gpu_name"] == "Apple M2 Pro"
|
||||||
assert info["unified_memory"] is True
|
assert info["unified_memory"] is True
|
||||||
|
assert info["gpu_cores"] == 19
|
||||||
assert info["gpu_vram_gb"] == 24.0 # 32GB * 0.75
|
assert info["gpu_vram_gb"] == 24.0 # 32GB * 0.75
|
||||||
|
|
||||||
|
|
||||||
|
def test_apple_silicon_gpu_cores_fall_back_to_plain_text(monkeypatch):
|
||||||
|
monkeypatch.setattr(hardware, "_remote_host", None)
|
||||||
|
monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
|
||||||
|
monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
|
||||||
|
monkeypatch.setattr(hardware, "_run", _fake_sysctl(
|
||||||
|
brand="Apple M4 Max",
|
||||||
|
memsize_gb=64,
|
||||||
|
display_json="{not-json",
|
||||||
|
display_text="Graphics/Displays:\n\nApple M4 Max:\n Total Number of Cores: 32\n",
|
||||||
|
))
|
||||||
|
|
||||||
|
info = hardware._detect_apple_silicon()
|
||||||
|
assert info is not None
|
||||||
|
assert info["gpu_cores"] == 32
|
||||||
|
|
||||||
|
|
||||||
|
def test_apple_silicon_gpu_cores_are_optional(monkeypatch):
|
||||||
|
monkeypatch.setattr(hardware, "_remote_host", None)
|
||||||
|
monkeypatch.setattr(hardware.platform, "system", lambda: "Darwin")
|
||||||
|
monkeypatch.setattr(hardware.platform, "machine", lambda: "arm64")
|
||||||
|
monkeypatch.setattr(hardware, "_run", _fake_sysctl(memsize_gb=32))
|
||||||
|
|
||||||
|
info = hardware._detect_apple_silicon()
|
||||||
|
assert info is not None
|
||||||
|
assert "gpu_cores" not in info
|
||||||
|
|
||||||
|
|
||||||
def test_apple_silicon_skipped_on_linux(monkeypatch):
|
def test_apple_silicon_skipped_on_linux(monkeypatch):
|
||||||
"""Guarantee Linux detection is untouched: the Metal probe bails immediately."""
|
"""Guarantee Linux detection is untouched: the Metal probe bails immediately."""
|
||||||
monkeypatch.setattr(hardware, "_remote_host", None)
|
monkeypatch.setattr(hardware, "_remote_host", None)
|
||||||
@@ -132,7 +171,7 @@ def test_detect_system_propagates_unified_memory(monkeypatch):
|
|||||||
monkeypatch.setattr(hardware, "_detect_apple_silicon", lambda: {
|
monkeypatch.setattr(hardware, "_detect_apple_silicon", lambda: {
|
||||||
"gpu_name": "Apple M4", "gpu_vram_gb": 10.7, "gpu_count": 1,
|
"gpu_name": "Apple M4", "gpu_vram_gb": 10.7, "gpu_count": 1,
|
||||||
"gpus": [], "gpu_groups": [], "homogeneous": True,
|
"gpus": [], "gpu_groups": [], "homogeneous": True,
|
||||||
"backend": "metal", "unified_memory": True,
|
"backend": "metal", "unified_memory": True, "gpu_cores": 10,
|
||||||
})
|
})
|
||||||
monkeypatch.setattr(hardware, "_get_ram_gb", lambda: 16.0)
|
monkeypatch.setattr(hardware, "_get_ram_gb", lambda: 16.0)
|
||||||
monkeypatch.setattr(hardware, "_get_available_ram_gb", lambda: 11.0)
|
monkeypatch.setattr(hardware, "_get_available_ram_gb", lambda: 11.0)
|
||||||
@@ -142,3 +181,4 @@ def test_detect_system_propagates_unified_memory(monkeypatch):
|
|||||||
s = hardware.detect_system(fresh=True)
|
s = hardware.detect_system(fresh=True)
|
||||||
assert s["backend"] == "metal"
|
assert s["backend"] == "metal"
|
||||||
assert s.get("unified_memory") is True
|
assert s.get("unified_memory") is True
|
||||||
|
assert s["gpu_cores"] == 10
|
||||||
|
|||||||
@@ -107,6 +107,7 @@ class TestBuildersRejectLookalikeHosts:
|
|||||||
assert build_chat_url("https://notanthropic.com") == "https://notanthropic.com/chat/completions"
|
assert build_chat_url("https://notanthropic.com") == "https://notanthropic.com/chat/completions"
|
||||||
|
|
||||||
def test_lookalike_anthropic_models_is_openai(self):
|
def test_lookalike_anthropic_models_is_openai(self):
|
||||||
|
assert llm_core._detect_provider("https://anthropic.com.evil.com") == "openai"
|
||||||
assert build_models_url("https://anthropic.com.evil.com") == "https://anthropic.com.evil.com/models"
|
assert build_models_url("https://anthropic.com.evil.com") == "https://anthropic.com.evil.com/models"
|
||||||
|
|
||||||
def test_anthropic_domain_in_path_is_openai(self):
|
def test_anthropic_domain_in_path_is_openai(self):
|
||||||
@@ -119,6 +120,7 @@ class TestBuildersRejectLookalikeHosts:
|
|||||||
assert build_chat_url("https://notollama.com") == "https://notollama.com/chat/completions"
|
assert build_chat_url("https://notollama.com") == "https://notollama.com/chat/completions"
|
||||||
|
|
||||||
def test_lookalike_ollama_models_is_openai(self):
|
def test_lookalike_ollama_models_is_openai(self):
|
||||||
|
assert llm_core._detect_provider("https://notollama.com") == "openai"
|
||||||
assert build_models_url("https://notollama.com") == "https://notollama.com/models"
|
assert build_models_url("https://notollama.com") == "https://notollama.com/models"
|
||||||
|
|
||||||
|
|
||||||
|
|||||||