mirror of https://github.com/pewdiepie-archdaemon/odysseus.git, synced 2026-06-17 10:15:27 -04:00
docker: set CUDA_HOME for pip-installed vllm in Cookbook (#228)

When Cookbook installs vllm via `pip install --user vllm`, pip pulls in
nvidia-cuda-* wheels under /app/.local but doesn't set CUDA_HOME or
create /usr/local/cuda. vllm 0.22+ then crashes during engine init:

    RuntimeError: Could not find nvcc and default cuda_home='/usr/local/cuda' doesn't exist

After that, the mixed cuda-nvcc 13.3 / cuda-runtime 13.0 wheel combo
fails FlashInfer's JIT sampler with:

    error: "CUDA compiler and CUDA toolkit headers are incompatible"

Detect the pip-installed nvcc on startup, point CUDA_HOME at it, and
default VLLM_USE_FLASHINFER_SAMPLER=0 (sampler only, no attention
impact) so the engine boots. No-op when vllm isn't installed.

Fixes #214.

Co-authored-by: sirs <sirs@local>
@@ -46,6 +46,24 @@ for dir in /app /app/data /app/logs; do
     fi
 done
 
+# Cookbook installs vllm/etc. via `pip install --user`, which pulls
+# nvidia-cuda-* wheels into /app/.local but does not set CUDA_HOME or
+# symlink /usr/local/cuda. vllm 0.22+ then crashes during engine init
+# when FlashInfer tries to JIT a sampler kernel ("Could not find nvcc",
+# then "CUDA compiler and toolkit headers are incompatible" on the
+# mixed cuda-nvcc 13.3 / cuda-runtime 13.0 wheel combo).
+#
+# Auto-set CUDA_HOME if a pip-installed nvcc is present, and disable the
+# FlashInfer JIT sampler — sampler only, no impact on attention path.
+# No-op when vllm isn't installed.
+for cu in /app/.local/lib/python*/site-packages/nvidia/cu13; do
+    if [ -x "$cu/bin/nvcc" ]; then
+        export CUDA_HOME="$cu"
+        export VLLM_USE_FLASHINFER_SAMPLER="${VLLM_USE_FLASHINFER_SAMPLER:-0}"
+        break
+    fi
+done
+
 # Drop root and run the actual app. `gosu` is preferred over `su` /
 # `sudo` because it cleans up the process tree (no extra shell layer)
 # so signals (SIGTERM from `docker stop`) reach uvicorn directly.
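
The detection loop in this hunk can be tried outside the container with a small sketch like the one below. It is not part of the commit: the function name `detect_cuda_home` and its root-directory argument are hypothetical, introduced only so the same glob-and-test logic can be pointed at a scratch directory instead of /app/.local.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): print the first nvidia/cu13
# directory under ROOT that contains an executable bin/nvcc.
# Returns 0 and prints the path on success, returns 1 otherwise.
detect_cuda_home() {
    root="$1"
    for cu in "$root"/lib/python*/site-packages/nvidia/cu13; do
        # If the glob matches nothing, POSIX sh leaves the pattern as a
        # literal string, so this test fails and the loop is a no-op.
        if [ -x "$cu/bin/nvcc" ]; then
            printf '%s\n' "$cu"
            return 0
        fi
    done
    return 1
}
```

Under those assumptions, the entrypoint behaviour would correspond roughly to `CUDA_HOME="$(detect_cuda_home /app/.local)" && export CUDA_HOME`. The `${VLLM_USE_FLASHINFER_SAMPLER:-0}` expansion in the diff only supplies `0` when the variable is unset or empty, so a value passed through `docker run -e` still takes precedence.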