docker: add NVIDIA/AMD GPU overlays via COMPOSE_FILE (#254)

Opt-in overlays under docker/ that pass the host GPU into the odysseus
container. Pick one in .env:

    COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
    COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml

Non-GPU users are unaffected (no default merge). README now points at
the overlays instead of the old ad-hoc `gpus: all` suggestion.

Each overlay header notes that it only exposes the GPU devices: the slim
image still needs vLLM / llama-cpp-python / etc. installed via
Cookbook -> Dependencies before models can serve on GPU.

Tested on Arch + Docker 29.5.1 + RTX 4090:

    docker compose exec odysseus nvidia-smi -L
    GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-...)

Cookbook hardware scan reports the 24 GB GPU and recommends GPU-fit
models. `docker compose config` validates cleanly for all three
COMPOSE_FILE variants (base, +nvidia, +amd).

Builds on the structure proposed in #91 by @krllus with the path / docs
fixes from the review on that PR. Closes #163.

Co-authored-by: krllus <krllus@users.noreply.github.com>
2026-06-17 02:05:22 -04:00 · 2026-06-01 15:00:09 +10:00
parent 2537b80f88
commit 4c0aadbb5e
4 changed files with 78 additions and 3 deletions
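The three COMPOSE_FILE variants named in the commit message (base, +nvidia, +amd) can be spelled out with a small helper. This is only an illustrative sketch: the function name `compose_file_for` and the `cpu` keyword are hypothetical and not part of this commit; the file paths are the ones the commit adds.

```bash
#!/bin/sh
# Hypothetical helper (not part of this commit): map a GPU vendor name to
# the COMPOSE_FILE value described in the commit message.
compose_file_for() {
  case "$1" in
    nvidia) echo "docker-compose.yml:docker/gpu.nvidia.yml" ;;
    amd)    echo "docker-compose.yml:docker/gpu.amd.yml" ;;
    cpu|"") echo "docker-compose.yml" ;;  # base file only, no overlay merged
    *)      echo "unknown GPU vendor: $1" >&2; return 1 ;;
  esac
}

# Print the value for each variant; nothing here invokes Docker.
for vendor in cpu nvidia amd; do
  printf '%s -> COMPOSE_FILE=%s\n' "$vendor" "$(compose_file_for "$vendor")"
done
```

In a checkout that contains these files, each printed value could then be checked with `COMPOSE_FILE=<value> docker compose config --quiet`, which validates the merged configuration without printing it.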
@@ -123,3 +123,22 @@ SEARXNG_INSTANCE=http://localhost:8080
 # Empty/local/localhost runs scripts on the app host. Set to an SSH host alias
 # if you intentionally want scheduled scripts to run remotely.
 # ODYSSEUS_SCRIPT_HOST=localhost
 # ============================================================
 # GPU support (Docker Compose)
 # ============================================================
 # Pass the host GPU into the odysseus container. Default (unset) = CPU.
 # COMPOSE_FILE is a native `docker compose` feature: a colon-separated
 # list of files merged left-to-right. Pick ONE GPU line below, or leave
 # all commented for CPU.
 #
 # NVIDIA (requires nvidia-container-toolkit + `nvidia-ctk runtime
 # configure --runtime=docker` on the host):
 # COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
 #
 # AMD ROCm (requires ROCm drivers on the host):
 # COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
 #
 # These overlays only expose the GPU devices. The slim Odysseus image
 # still needs CUDA/ROCm userspace and an engine (vLLM, llama-cpp-python,
 # etc.) installed via Cookbook -> Dependencies before models can serve on GPU.
@@ -73,9 +73,18 @@ serve engines and Python CLIs are stored in `./data/local`, mounted as
 After downloading a model, open **Cookbook -> Serve**, pick the cached model,
 and launch it. When the server answers `/v1/models`, Odysseus adds it to the
-chat model picker automatically. For NVIDIA GPUs in Docker, install the NVIDIA
+chat model picker automatically. For NVIDIA / AMD GPUs in Docker, install
-Container Toolkit and add `gpus: all` to the `odysseus` service if `nvidia-smi`
+the host runtime (NVIDIA Container Toolkit or ROCm drivers) and enable the
-is not visible inside the container.
+matching overlay via `COMPOSE_FILE` in `.env`:
 ```bash
 # NVIDIA
 COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
 # AMD ROCm
 COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
 ```
 Verify with `docker compose exec odysseus nvidia-smi -L` (NVIDIA) or `docker compose exec odysseus rocm-smi` (AMD).
 The default Docker image is intentionally slim. For Python-based serve engines,
 use **Cookbook -> Dependencies** to install vLLM, SGLang, llama-cpp-python, or
@@ -0,0 +1,18 @@
 # AMD ROCm GPU overlay. Enable by setting COMPOSE_FILE in .env:
 #   COMPOSE_FILE=docker-compose.yml:docker/gpu.amd.yml
 #
 # Requires ROCm drivers on the host (kfd + DRI devices). The host user
 # running Docker must be in the `video` and `render` groups.
 #
 # This overlay only passes the host GPU through to the container.
 # The slim Odysseus image does not bundle ROCm userspace or inference
 # engines — install ROCm-compatible builds of vLLM / llama-cpp-python
 # via Cookbook -> Dependencies (or pip) before serving GPU models.
 services:
  odysseus:
    devices:
      - /dev/kfd
      - /dev/dri
    group_add:
      - video
      - render
@@ -0,0 +1,29 @@
 # NVIDIA GPU overlay. Enable by setting COMPOSE_FILE in .env:
 #   COMPOSE_FILE=docker-compose.yml:docker/gpu.nvidia.yml
 #
 # Requires the NVIDIA Container Toolkit on the host.
 #   Arch:    sudo pacman -S nvidia-container-toolkit
 #   Debian:  sudo apt install nvidia-container-toolkit
 #   Fedora:  sudo dnf install nvidia-container-toolkit
 # Then:
 #   sudo nvidia-ctk runtime configure --runtime=docker
 #   sudo systemctl restart docker
 # Verify with:
 #   docker info | grep -i nvidia
 #
 # This overlay only passes the host GPU through to the container.
 # The slim Odysseus image does not bundle CUDA userspace or inference
 # engines — install vLLM / llama-cpp-python / SGLang via
 # Cookbook -> Dependencies (or pip) before serving GPU models.
 services:
  odysseus:
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
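
Choosing between the two overlays depends on which GPU stack the host exposes. The sketch below is a rough heuristic, not something this commit ships: the function name `detect_gpu_vendor` is hypothetical, and treating `/dev/nvidiactl` as the NVIDIA indicator is an assumption (the AMD check uses `/dev/kfd`, the device the AMD overlay passes through). The optional root-directory argument exists only so the check can be exercised against a scratch directory.

```bash
#!/bin/sh
# Hypothetical heuristic (not part of this commit): guess which overlay
# fits the host by looking for well-known device nodes.
# $1 is an optional root prefix, used only for testing against a temp dir.
detect_gpu_vendor() {
  root="${1:-}"
  if [ -e "$root/dev/nvidiactl" ]; then
    echo nvidia   # assumption: NVIDIA kernel driver is loaded
  elif [ -e "$root/dev/kfd" ]; then
    echo amd      # /dev/kfd is the device the AMD overlay maps in
  else
    echo cpu      # no GPU device found; leave COMPOSE_FILE unset
  fi
}

echo "suggested overlay: $(detect_gpu_vendor)"
```

A device node being present does not mean the container runtime is configured; for NVIDIA, the `docker info | grep -i nvidia` check from the overlay header still applies.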