Commit Graph

38 Commits

Author SHA1 Message Date
Dividesbyzer0 627a52ac44 fix(cookbook): shim Windows Store python3 alias (#2610) 2026-06-15 20:25:30 +09:00
cyq aac589ee49 fix(cookbook): diagnose sglang native deps (#4112) 2026-06-15 15:14:37 +09:00
Dividesbyzer0 ec4f91afdd fix(cookbook): normalize llama-cpp-python cache types
Map llama-cpp-python --type_k/--type_v cache names to integer enum values after serve-command validation while preserving native llama-server flags.
2026-06-15 15:12:18 +09:00
Dividesbyzer0 ece6cebc03 fix(cookbook): create bin dir before llama-server link
Ensure ~/bin exists before the llama.cpp accelerated build script creates the llama-server link.
2026-06-15 15:03:55 +09:00
muhamed hamed 3b3c0d6254 fix: detect HuggingFace token when downloading cookbook models (#3459)
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
2026-06-11 21:53:16 +01:00
RaresKeY d1a5a7d680 fix(hwfit): validate remote SSH detection targets (#3718) 2026-06-11 00:43:49 +02:00
Disorder AA d9141c6e56 fix(cookbook): allow spaces and non-ASCII characters in model directory paths (#3473)
* fix(cookbook): allow spaces in model directory paths

Allow POSIX external-drive paths and Windows drive paths with spaces while keeping shell metacharacters rejected.

* fix(cookbook): also allow non-ASCII (Unicode) characters in model dir paths

The ASCII-only allowlist that rejected spaces also rejected Cyrillic,
accented Latin and CJK folder names (e.g. /Volumes/Модели,
D:\AI Models\Модели) with 400 Invalid local_dir. Switch the path
character class from [A-Za-z0-9._ -] to [\w. -] (\w is Unicode-aware on
Python 3 str patterns) so localized folder names validate, while shell
metacharacters (; & | ` $ quotes newlines) stay rejected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(cookbook): reject local_dir path segments starting with '-'

The local_dir allowlist includes '-', so a directory like /models/-rf
(or D:\models\-rf) could be parsed as a CLI flag by hf/etc. (option
injection) — and quoting does not stop a value from being read as an
option. Guard against it inside the validator so the safety stays fully
self-contained there rather than depending on consumers' quoting.

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 07:58:38 +02:00
pewdiepie-archdaemon fa8c93ec0a Cookbook UI: Ollama browser, advanced serve fold, API tokens form, diagnosis toolbar, polish
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.

Highlights:
- Ollama as a first-class backend in the Cookbook:
  * Download input accepts ollama-style names (name:tag) → backend=ollama
  * /api/cookbook/ollama/library (cached scrape of ollama.com + curated
    fallback so classic models like qwen2.5 stay reachable)
  * "Browse Ollama library" toggle below Download with size chips
  * Engine=Ollama in hwfit toolbar merges the Ollama library into the
    main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
    columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
  loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
  checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
  pulled up 8px to align with the row's checkboxes, GGUF File dropdown
  exposed for Ollama backend, GPU re-render on Edit serve restore,
  _forceBackend flag so saved serveState wins over backend detection,
  cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
  Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
  Clear-offline buttons in Added Models toolbar, offline-pill removed
  (opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
  reachable (vLLM returns 404 on bare /v1, fully working endpoints
  were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
  serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
  Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
  serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
  ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
  auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
2026-06-09 09:46:19 +09:00
Cookiejunky 4e497f4878 fix(cookbook): guard break-system-packages pip flag (#3510) 2026-06-08 23:10:20 +02:00
horribleCodes 9c90f62657 fix(platform): Improve WSL SSH remote compatibility (#3316)
* fix(platform): add WSL compatibility functions and path translation
fix(cookbook): enhance model scan script to support additional HuggingFace cache paths
fix(hardware): improve cache key generation for remote SSH context
test(tests): add tests for WSL detection and path translation functionality

* fix(cookbook): prefer prebuilt wheels for llama-cpp-python and normalize package aliases

* fix: enable StrictHostKeyChecking in nvidia probe
refactor: consolidate ssh & powershell command execution to utility functions in core module
refactor: consolidate nvidia path candidates in to single variables in core module
tests: add tests for new utility functions

* fix: correct wrong variable name
2026-06-08 00:33:50 +02:00
ooovenenoso 681a2a3f2a fix(cookbook): scan persisted HF cache paths (#3189) 2026-06-07 18:19:47 +02:00
Zen0-99 bec594904d Fix/windows llama cpp serve and test upstream (#2669)
* fix: code runner base64, Windows serve paths, endpoint cache clear, copy-log guards, model-picker remove-recent

* Revert model-picker 'remove from recent' feature and remove stray PR_DRAFT.md
2026-06-05 14:53:33 +02:00
the_peaceful b5c45326e4 Fix Windows Cookbook background tasks, exit statuses, and empty SSH logs wrapper (#1389)
This commit consolidates all Windows Cookbook background fixes into a single comprehensive commit based on the latest main branch.

Key fixes included:
1. React looksSuccessful Mismatch: Append 'DOWNLOAD_OK' for pip install commands in routes/cookbook_routes.py.
2. Local Windows SSH Wrapper & Log Directory Mismatch: Bypassed ssh wrappers and dynamically selected odysseus-tmux logs for local tasks in static/js/cookbookRunning.js.
3. WSL Bash Filtration: Filtered out the WSL bash stub at C:\Windows\System32\bash.exe in core/platform_compat.py.
4. Drive-Colon Path Normalization: Replaced .as_posix() with git_bash_path() in routes/shell_routes.py and src/bg_jobs.py.
5. GGUF-Only Hardware Fitting: Restructured local Windows recommendations to rank GGUF only in services/hwfit/fit.py.
6. Safe Win32 Process Liveness Probe: Replaced os.kill(pid, 0) with a safe Win32 API probe using GetExitCodeProcess in core/platform_compat.py.
7. Prebuilt llama-cpp-python Wheels: Supply the CPU extra index during compilation failure fallback.
8. Enforce UTF-8 log encoding: Set PYTHONIOENCODING=utf-8 on Windows bootstrap runners.
9. Fix Linux Llama.cpp Build script syntax error in routes/cookbook_helpers.py.
10. Page Reload Status Check: Run sys.executable instead of 'python3' to bypass Microsoft Store execution stubs on local Windows hosts.
11. Llama.cpp serve build bypass: Bypassed cmake compilation checks on local Windows and verified python bindings directly.
12. Serve Command Path Validation: Masked safe GGUF path printf subshells '' inside the serve command validator.
13. CPU Mismatch Diagnostics: Intercepted AVX2-lacking '0xc000001d' (Illegal Instruction) crashes in static/js/cookbook-diagnosis.js and guided users to Ollama.
14. Windows Pytest stability: Fixed stub import leakage in test files.
2026-06-05 14:41:07 +02:00
spooky f9e1d38cc2 fix: diagnose vllm serve runtime issues (#1198) 2026-06-05 11:03:04 +01:00
Lucas Daniel f5d834b0c5 fix(cookbook): surface backend diagnosis when serve fails in background (#1636)
* refactor(cookbook): move _diagnose_serve_output to module level in cookbook_helpers

Extracts the nested _diagnose_serve_output function from inside
setup_cookbook_routes() and moves it to module level in cookbook_helpers.py,
alongside the other helper functions it logically belongs with.

No behaviour change — the function is now importable directly for testing
and by other callers without going through the route factory closure.

* fix(cookbook): surface backend diagnosis when serve fails in background

The background poll (_pollBackgroundStatus) already received `diagnosis`
and `cmd` from /api/cookbook/tasks/status but discarded both. When a serve
job died while the Cookbook modal was closed, reopening it showed only a
red error badge with no context.

- Persist live.diagnosis into task._backendDiagnosis in localStorage so it
  survives modal close/reopen and page refresh
- Persist live.cmd into task.payload._cmd for agent-spawned tasks so the
  crash report includes the actual command
- After _renderRunningTab(), walk rendered cards and call _showDiagnosis()
  for any that have a stored _backendDiagnosis but no panel yet
- In _renderTaskCard(), use _backendDiagnosis as a fallback when the
  client-side _terminalServeDiagnosis() finds nothing

* test(cookbook): add coverage for _diagnose_serve_output error patterns

10 tests verifying the 16 serve-failure patterns:
- CUDA OOM, port-in-use, vLLM missing, gated model
- Traceback fallback fires without startup success marker
- Traceback suppressed when server actually started
- Clean/empty output returns None
- trust-remote-code and no-GGUF patterns
2026-06-05 09:52:07 +01:00
pewdiepie-archdaemon 9112861d8e cookbook agent debug loop: persistent log files, auto-adopt orphan tmux, Codex/Claude skill parity
Three converging fixes so the chat agent + external Codex/Claude skills can actually debug a crashed serve instead of staring at a post-crash neofetch banner:

* Serves now `tee` to /tmp/odysseus-tmux/SESSION.log on the host running them. Runner saves fds 3/4 before the tee and restores them right before `exec ${SHELL}`, so the post-crash interactive zsh banner does NOT pollute the log file.
* `tail_serve_output` (chat agent) and `/api/codex/cookbook/output/{sid}` (Codex+Claude skills) both prefer the persistent log file over the tmux pane. Pane is fallback for sessions predating the tee runner. Default tail bumped 150 -> 400.
* `list_served_models` "recent log" snippet seeks to the Traceback line instead of showing the last 6 lines (which was always the bash prompt).

Cookbook auto-adoption sweep on `/api/cookbook/tasks/status`: every 20s (rate-limited) the cookbook SSHes each configured server, finds `serve-*` / `cookbook-*` tmux sessions running an actual model process (vllm/python/llama-server/etc., filtered via `pane_current_command`), and writes them into state.tasks. So when the agent falls back to raw ssh+tmux, the session appears in the Cookbook UI on the next poll.

`serve_model` error path now reads `data["detail"]` in addition to `data["error"]` so the FastAPI HTTPException message ("Invalid characters in cmd") actually reaches the agent instead of being swallowed as a generic "Serve failed". Tool description updated to warn against `cd …`/`source …`/`&&` prefixes.

Intent-without-action supervisor in agent_loop: when the model writes "Let me tail the output" / "I'll check the logs" / "Let me investigate" and ends the turn without emitting a tool call, the loop injects a sharp system nudge ("You said you would X — DO IT NOW") and continues. Capped at 2 nudges per chat so a model that genuinely cannot use the tool does not pin the loop.

Codex/Claude skill parity: adds `/cookbook/cached`, `/cookbook/presets`, `/cookbook/preset/{name}`, `/cookbook/adopt` so external agents have the same surface as the chat agent. SKILL.md docs + odysseus_api.py wrapper updated for both bundles.

`adopt_served_model` promoted to the always-on tool set so the agent has a documented fallback when serve_model rejects a cmd.

Also various cookbook UI tweaks accumulated alongside the above (cookbook.js, cookbookRunning.js, cookbookServe.js, cookbook-diagnosis.js, settings.js, style.css).
2026-06-04 23:27:18 +09:00
hawktuahs 3d8c364689 [Bash] Fix Windows cookbook background tasks (#676)
* Fix Windows cookbook background tasks

* Add Windows Cookbook reliability follow-ups
2026-06-04 04:30:01 +01:00
Shaw b10e6bc870 fix(cookbook): install llama-cpp-python[server] so llama.cpp serving works (#730) (#1338)
The llama.cpp serve auto-install built a bare `llama-cpp-python` in the Linux
source-build fallback and the Termux path, but the serve command runs
`python3 -m llama_cpp.server`, which needs the `[server]` extra. Because the
"already installed?" guard only checks `import llama_cpp` (a bare install
satisfies it), the missing extra was never added, so serving crashed with
`ModuleNotFoundError: No module named 'starlette_context'` (issue #730).

- Request the `[server]` extra in both the Termux direct install and the Linux
  Python-bindings fallback (the Windows path already used `[server]`).
- Shell-quote the package spec in `_pip_install_fallback_chain` via `shlex.quote`
  so the `[server]` brackets aren't treated as a bash glob; plain names unaffected.

Tests: tests/test_cookbook_helpers.py gains extras-quoting coverage and a
serve-runner regression guard.
2026-06-03 14:24:26 +09:00
Alexandre Teixeira 1c2ec288dd Check cudart before llama.cpp CUDA build (#1466) 2026-06-03 14:23:55 +09:00
lekt8 ffb8fd16bc Disable pip cache for Cookbook dependency installs (off the home disk) (#1477)
Cookbook dependency installs (vLLM and friends) build large wheels; pip's
default cache lives under $HOME/.cache/pip, so on a small home filesystem the
build dies mid-way with "[Errno 28] No space left on device" (issue #1219) and
the dependency ends up "installed" but unusable (issue #1459).

Add `--no-cache-dir` to the dependency pip-install command (the maintainer's
suggested PIP_CACHE_DIR= workaround, made the default) via a small
_pip_install_no_cache() helper applied at the install chokepoint. Consistent
with the existing --no-cache-dir on the llama-cpp-python build. Idempotent;
non-pip-install serve commands are untouched.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:23:49 +09:00
ghreprimand 6f001af2a3 Add a 'Rebuild llama.cpp' Cookbook action to force a fresh GPU build (#1787)
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.

Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.

- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
  the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME

Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).

Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 13:28:19 +09:00
Michael Gerber e392be0d65 fix: Cookbook local GGUF serving inside Docker (#1264)
* fix: Cookbook local GGUF serving inside Docker

Cookbook’s in-container GGUF serve flow had multiple Docker-specific breakages that made local llama.cpp models fail or register against the wrong endpoint.

Fixes included here:

use the scanned model cache root when generating GGUF serve commands instead of hardcoding $HOME/.cache/huggingface/hub
fix malformed llama.cpp preflight build lines that generated invalid bash in serve runner scripts
preserve loopback model URLs inside Docker when the target port is already reachable from the Odysseus container, instead of rewriting them unconditionally to host.docker.internal
Before this change, Docker local serves could fail in several ways:

Cookbook pointed llama.cpp at the wrong GGUF path
generated serve runner scripts crashed before launch with a shell syntax error
successfully started in-container model servers were auto-registered as host.docker.internal: instead of localhost/127.0.0.1
This makes the Docker Cookbook path work as expected for: downloaded GGUF -> local llama.cpp serve -> endpoint registration

* test: add test for docker-local endpoint rewrites
2026-06-03 02:08:09 +09:00
red person 028a39b42c Fix local Cookbook dependency installs in venvs (#1082) 2026-06-02 22:39:02 +09:00
ooovenenoso bd2fa82c1e Cookbook: prefer ROCm for native llama.cpp bootstrap
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-02 20:59:44 +09:00
Ernest Hysa 996a2027dd Cookbook: surface pip install failures in logs
_pip_install_fallback_chain silently discarded pip stderr via
2>/dev/null on every attempt. When pip failed (network error, venv
mismatch, disk full), the wrapper exited 0 and the Cookbook UI showed
the download as running — the silent-failure mode from #354.

Extract _pip_install_attempt() which wraps each pip invocation in a
bash -c subshell that captures output to a temp file, prints tail -5
on failure, cleans up, and exits with pip's real exit code. This
avoids the | tail pipefail masking (the first blocker on #363) while
surfacing the last 5 lines of pip output in the tmux log so users
can see what went wrong.

Both local wrapper and remote SSH runner use the same helper through
_pip_install_fallback_chain, so the fix is symmetric.
2026-06-02 20:34:52 +09:00
spooky 8b3c0d8ad4 feat: select cached gguf artifacts for serve (#891) 2026-06-02 12:32:40 +09:00
lolwuttav c99193041a fix(cookbook): default Ollama serve to loopback (#872) 2026-06-02 12:27:04 +09:00
Tatlatat 9a1893760d fix(cookbook): skip pip --user fallback inside virtualenvs (#388) (#889)
The dependency-install fallback chain unconditionally ran
'pip install --user', which fails inside a virtualenv (and as root in
LXC/containers) with 'Can not perform a --user install. User site-packages
are not visible in this virtualenv.' — even though the function's docstring
already noted --user is invalid in venvs.

Guard the --user fallback with a venv check so it only runs outside a venv
(where --user is actually valid for PEP-668 system Pythons). Derive the venv
probe interpreter from the install command (python for 'pip', python3 for
'pip3'/'python3 -m pip') so the check runs in pip's own environment. System
PEP-668 installs keep the --user fallback; venv/LXC-root installs no longer
hit the --user error. Updated the unit test for the new chain.

Closes #388
2026-06-02 12:23:20 +09:00
hawktuahs a2f6183c4a Fix cookbook pip installs in venvs (#723) 2026-06-02 11:31:59 +09:00
pewdiepie-archdaemon 96618b01c0 Polish task UI slash commands and Ollama serving 2026-06-02 09:36:03 +09:00
pewdiepie-archdaemon ab0a480f30 Show Ollama models in Cookbook Serve 2026-06-02 07:38:45 +09:00
ooovenenoso 5e47e69e99 Allow serving cached local llama.cpp models
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-01 23:10:08 +09:00
Yizreel Schwartz Sipahutar 42380a8693 Keep Cookbook POSIX paths stable on Windows hosts 2026-06-01 23:08:39 +09:00
pewdiepie-archdaemon f2d55f8726 Fix cached GGUF model metadata in Cookbook Serve 2026-06-01 22:46:54 +09:00
pewdiepie-archdaemon e5b927597e Fix Cookbook serve exit code reporting 2026-06-01 22:41:25 +09:00
spooky 15822e91ff fix: keep serve preflight errors visible (#398) 2026-06-01 22:40:06 +09:00
John Chaplin f1817fd560 Add macOS Apple Silicon Cookbook support
* Add Apple Silicon (Metal) GPU detection and unified-memory fit tuning

hardware.py detects Apple Silicon locally and over SSH, reporting
backend=metal, the chip name, and a RAM-scaled fraction of unified
memory as the usable GPU budget. fit.py gains an M1-M4 memory-bandwidth
table for realistic tok/s and drops vLLM-only formats (AWQ/GPTQ/FP8)
that can't be served on Metal.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 32ac81dbc6)

* Generate macOS/Metal serve commands and surface the Metal GPU

cookbook_routes.py adds a macOS serve path (Ollama, Metal-aware
llama.cpp build using `sysctl hw.ncpu` instead of `nproc`, and a clear
error if vLLM is attempted). The frontend defaults Metal serving to
llama.cpp and offers llama.cpp/Ollama instead of vLLM/SGLang. The
odysseus-cookbook CLI's `gpus` command reports the Metal GPU via
sysctl/vm_stat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 4ba01ce25d)

* Add launchd LaunchAgent for macOS (systemd equivalent)

com.odysseus.ui.plist + install-service-macos.sh run Odysseus at login
and restart on crash, the macOS counterpart to odysseus-ui.service. The
installer auto-fills paths from the venv, so there's no hand-editing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 3d4b6b2c7b)

* Document macOS install (brew, Ollama, AirPlay port, launchd)

README + setup.py cover the Homebrew / Apple Silicon path: brew install
python@3.11 tmux ollama, Metal serving via Ollama/llama.cpp, the launchd
service, and the macOS AirPlay Receiver conflict on ports 7000/5000.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 8dc9a3578a)

* Add downloadable macOS launcher app builder

build-macos-app.sh generates dist/Odysseus.app and a drag-to-Applications
dist/Odysseus.dmg. The app starts the local server from this repo's venv and
opens the UI in a chrome-less app window (Chromium --app mode, falling back to
the default browser). It's a launcher wrapper — it drives the venv rather than
bundling Python — so the install path is baked in at build time.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit 7927940c38)

* Harden macOS Cookbook support: hide MLX, fix Metal build cache

Builds on the adopted PR #213 macOS/Metal work with two fixes and tests:

- fit.py: always drop MLX-quantized models. Odysseus only generates serve
  commands for llama.cpp/Ollama (Metal) and vLLM/SGLang (CUDA); MLX needs the
  mlx_lm runtime and the catalog's MLX repos ship no GGUF alternative, so they
  were surfaced on Apple Silicon but could never be served.
- cookbook_routes.py (macOS branch only): `rm -rf build` before configure so a
  poisoned CMakeCache from a prior failed CUDA attempt can't make every later
  build fail; explicit -DCMAKE_BUILD_TYPE=Release; a clear "brew install cmake"
  hint if cmake is missing. Linux/CUDA path unchanged.
- tests/test_hwfit_macos.py: MLX hidden on metal, MLX still hidden on CUDA
  (regression guard), Metal detection on Apple Silicon, and skipped on
  Linux/Intel (proves non-macOS detection is untouched).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Propagate unified_memory flag and document macOS GPU/Docker caveat

- hardware.py: detect_system now carries the unified_memory flag from GPU
  detection into the system dict (it was set by _detect_apple_silicon / AMD-APU
  detection but dropped during result assembly, so the API always reported
  null). Lets callers distinguish unified from discrete VRAM.
- README: prominent warning that Docker on Apple Silicon can't reach the Metal
  GPU (runs a Linux VM) — Cookbook must run natively for GPU serving; fix stale
  text that said Cookbook recommends MLX models (now hidden as unservable).
- test: detect_system propagates unified_memory.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Put Odysseus's venv bin on PATH for cookbook runners

Native (non-Docker) installs run from a virtualenv whose bin holds the `hf` CLI
and `python3` the cookbook download/serve tmux scripts shell out to. Those
scripts start in a fresh login shell with the venv NOT activated, so on a native
macOS install `hf download` failed with "hf: command not found" — and the
`pip --user` self-heal missed because macOS has no bare `pip` command.

- cookbook_helpers.py: _local_tooling_path_export() — pure helper returning a
  PATH export for the running interpreter's bin dir (escaped for double quotes).
- cookbook_routes.py: download + serve runners prepend that dir on local runs
  (gated off SSH/Windows); swap the `pip` install fallbacks to `python3 -m pip`.
- tests: helper output for normal and spaced paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Document macOS llama.cpp serving prerequisites

Clarify the two serving paths on Apple Silicon: the recommended zero-build
route (brew install llama.cpp ships a Metal llama-server Cookbook finds on PATH),
and the from-source fallback, which requires cmake + Xcode Command Line Tools.
Without those the build is skipped and serving silently degrades to a slow CPU
build, so new users now know to install them (or use the prebuilt) up front.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Recommend only GGUF-servable models on Metal

Apple Silicon's only serving engines are llama.cpp and Ollama, both GGUF-only
(vLLM/SGLang are CUDA/ROCm and don't run on macOS). The catalog tags raw
safetensors repos with a default Q4_K_M quant, so the fit-ranking was
recommending ~397/501 models that have no GGUF and fail to serve on Metal with
"No GGUF found" (e.g. microsoft/Phi-mini-MoE-instruct).

Drop any model without a real GGUF (is_gguf/gguf_sources) on Apple Silicon —
subsumes the previous AWQ/GPTQ/FP8 special-case into one rule. On CUDA these
stay visible since vLLM serves safetensors directly. Metal recommendations go
501 -> 104, all actually servable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Remove macOS launchd LaunchAgent (cherry-picked extra)

Drop the launchd service from the PR #213 cherry-picks: the
install-service-macos.sh installer, the com.odysseus.ui.plist template, and the
README section documenting them. Tangential to the core Cookbook/Metal support
and not wanted. The build-macos-app.sh launcher is kept.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Add one-command macOS quick start (start-macos.sh)

Running Odysseus natively on a Mac previously meant ~7 manual terminal steps
(brew deps, venv, activate, pip, setup.py, uvicorn with the right port) — not
friendly for a generic macOS user, and the native run is required because Docker
on macOS can't reach the Metal GPU.

- start-macos.sh: installs Homebrew deps (python@3.11, tmux, prebuilt Metal
  llama.cpp), creates the venv, installs requirements, runs setup, and launches
  on a non-AirPlay port (7860). Idempotent; re-run to start again.
- README: the Apple Silicon section now leads with this one-command quick start
  and the clickable .app, with engine/port/manual details folded into a
  collapsible block. Added a pointer at the top of the manual-install section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* macOS quick start: auto-open browser when ready

The "open this URL" line scrolled out of view as uvicorn kept logging after it,
so users missed it. Now start-macos.sh waits (in the background) until the
server accepts connections, prints a boxed "ready" banner at that point (i.e.
after the startup burst, not before), and opens the URL in the default browser
automatically. Skippable with ODYSSEUS_NO_OPEN=1 for headless/SSH use.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Don't assume/force a specific Python version on macOS

The README claimed "system Python is 3.9" — a machine-specific generalization
that's often wrong (macOS ships no recent Python by default; many users already
have 3.11+). Make it generic, and make start-macos.sh detect an existing
Python 3.11+ and use it, only installing python@3.11 when none is found instead
of forcing it on top of the user's Python.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Align start-macos.sh venv path with build-macos-app.sh

start-macos.sh created the environment in .venv/, but build-macos-app.sh and
the manual install steps use venv/ — so the clickable .app wouldn't reuse the
quick-start's environment and would rebuild a second one. Use venv/ everywhere.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* README: state clearly that MLX is unsupported on Apple Silicon

Odysseus has no mlx_lm runtime; it serves GGUF (llama.cpp/Ollama) and CUDA
(vLLM/SGLang) only. MLX-only models can't run on a Mac and are hidden from
Cookbook — make that explicit in both the quick start and the details.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* start-macos.sh: build the venv with an arm64 Python on Apple Silicon

A clean-room run surfaced this: with a universal2/x86 Python (e.g. the
python.org installer under /usr/local), the venv's compiled extensions install
as arm64 but get loaded as x86_64 when launched from the .app bundle, so it
crashes with "incompatible architecture (have arm64, need x86_64)". The terminal
run happened to work only because a universal binary defaults to arm64 there.

On Apple Silicon, look only under /opt/homebrew (arm64-only) for the build
Python, and install Homebrew's python@3.11 if none is present — so the venv is
arm64-only and launches correctly from both the terminal and the .app. Intel
and non-mac paths are unchanged. Verified end-to-end in a clean clone: .app now
boots on Metal with no arch error.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Address dev-exp review: macOS setup robustness + doc/UX fixes

From the voltagent dev-exp review of the branch:
- README: fix broken anchor links (the em-dash heading produced a slug the links
  didn't match); simplify the heading to a stable slug.
- cookbook_routes.py: add /opt/homebrew/bin and /usr/local/bin to the serve PATH
  so a brew-installed llama-server/ollama is found instead of falling back to a
  slow source build.
- start-macos.sh: guard against an empty Python path; fail fast with a clear
  message on port-in-use; ERR trap with a "safe to re-run" message; show pip
  progress (drop --quiet on the slow requirements install); stop the background
  browser-opener cleanly on exit/Ctrl+C (no orphaned poller).
- setup.py: bind hint to 127.0.0.1; suppress the manual run-hint when launched
  by start-macos.sh (ODYSSEUS_SKIP_RUN_HINT) so the URL isn't contradictory.
- build-macos-app.sh: the .app only opens the browser once the server is
  actually ready (not after the readiness timeout).
- cookbookServe.js: drop "Diffusers" from the Metal backend picker —
  diffusion_server.py is CUDA-only, so it was an unservable option on macOS.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: yunggilja <yunggilja@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-01 14:59:19 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00