Add a 'Rebuild llama.cpp' Cookbook action to force a fresh GPU build (#1787)

The serve bootstrap builds llama-server from source only when it is missing from PATH, so a host that first compiled CPU-only (no nvcc present at build time) reuses that CPU-only binary on every later serve and never gets a GPU build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to force a rebuild.

Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally or on the selected remote server) so the next serve recompiles and picks up CUDA/HIP if a toolchain is now present. It installs and downloads nothing.

- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME

Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to re-trigger the build).

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
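For orientation, the cleanup described above could be sketched roughly as follows. This is a minimal illustration, not the helper shipped in routes/cookbook_helpers.py: the function name `sketch_llama_cpp_rebuild_cmd` and the exact echo text are assumptions, chosen only so that the command satisfies the shape the tests in this commit assert on (the three shell steps and the "Cleared the cached llama.cpp build" message).

```python
def sketch_llama_cpp_rebuild_cmd() -> str:
    """Hypothetical stand-in for _llama_cpp_rebuild_cmd(); the real helper may differ."""
    steps = [
        # Ensure ~/bin exists so a host that has never served does not fail.
        'mkdir -p "$HOME/bin"',
        # Drop the cached symlink; -f keeps this a no-op when it is absent.
        'rm -f "$HOME/bin/llama-server"',
        # Drop the build directory so the next serve recompiles from source.
        'rm -rf "$HOME/llama.cpp/build"',
        # Assumed wording; only the leading phrase is checked by the tests.
        'echo "Cleared the cached llama.cpp build; the next serve will recompile."',
    ]
    # Chain with && so a failing step stops the sequence and surfaces a
    # non-zero exit code to the caller.
    return " && ".join(steps)
```

Because every step tolerates missing paths, the command should also be safe to run repeatedly or on a fresh HOME, and it contains no install or download step.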
2026-06-16 17:55:26 -04:00 · 2026-06-02 23:28:19 -05:00
parent 51857c9008
commit 6f001af2a3
4 changed files with 135 additions and 0 deletions
@@ -10,6 +10,7 @@ from routes.cookbook_helpers import (
    _append_llama_cpp_linux_accel_build_lines,
    _append_serve_exit_code_lines,
    _append_serve_preflight_exit_lines,
+    _llama_cpp_rebuild_cmd,
    _local_tooling_path_export,
    _pip_install_attempt,
    _pip_install_fallback_chain,
@@ -338,6 +339,38 @@ def test_llama_cpp_linux_bootstrap_keeps_cpu_fallback_when_no_gpu_toolchain():
    assert 'WARNING: no HIP/CUDA toolchain found — building llama-server for CPU only.' in script
    assert 'Install ROCm for AMD GPUs or vLLM/CUDA tooling for NVIDIA' in script

+
+def test_llama_cpp_rebuild_cmd_clears_cached_build_paths():
+    cmd = _llama_cpp_rebuild_cmd()
+
+    # Must remove both the cached symlink and the build dir the serve bootstrap
+    # links/creates, so the next serve recompiles from source.
+    assert 'rm -f "$HOME/bin/llama-server"' in cmd
+    assert 'rm -rf "$HOME/llama.cpp/build"' in cmd
+    # Recreates ~/bin so a never-served host does not error on a missing dir.
+    assert 'mkdir -p "$HOME/bin"' in cmd
+    # Cleanup only: beyond deleting the cached build, it must not install or fetch.
+    assert 'pip install' not in cmd
+    assert 'git clone' not in cmd
+    assert 'curl' not in cmd and 'wget' not in cmd
+
+
+def test_llama_cpp_rebuild_cmd_runs_clean_on_a_fresh_home(tmp_path):
+    """The command should succeed even when neither path exists yet."""
+    import os
+
+    env = dict(os.environ)
+    env["HOME"] = str(tmp_path)
+    result = subprocess.run(
+        ["bash", "-c", _llama_cpp_rebuild_cmd()],
+        capture_output=True, text=True, env=env, timeout=10,
+    )
+
+    assert result.returncode == 0, result.stderr
+    assert (tmp_path / "bin").is_dir()
+    assert "Cleared the cached llama.cpp build" in result.stdout
+
+
 def test_cached_model_scan_reports_plain_dir_gguf(tmp_path):
    """Custom download dirs may sit inside the HF hub cache and contain plain
    per-model folders. They must show up in Serve and keep the GGUF signal."""