Commit Graph

18 Commits

Author SHA1 Message Date
pewdiepie-archdaemon 3706d756f3 Merge remote-tracking branch 'origin/main' into visual-pr-playground
# Conflicts:
#	routes/cookbook_routes.py
#	routes/hwfit_routes.py
#	services/hwfit/fit.py
#	services/hwfit/models.py
#	static/js/cookbook-diagnosis.js
#	static/js/cookbook-hwfit.js
#	static/js/cookbook.js
#	static/js/cookbookRunning.js
2026-06-03 16:49:10 +09:00
pewdiepie-archdaemon eb79b76432 Cookbook: scoring fixes, UI polish, false-finished + stale-state bug fixes
Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
  sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
  cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
  FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
  M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
  the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
  killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
  "stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
  DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
  shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
  Q4_K_M -> BF16 (it is the native base, not a quant).

Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
  50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
  on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
  column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
  state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
  task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
  cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
  required; bare "Download complete" no longer marks the task done after
  the first config file. Clear button now sends tmux kill-session too.
  True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
  hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
  Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.

CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
2026-06-03 16:32:20 +09:00
lekt8 bf2a1365f6 Don't falsely declare a dependency build stale (#1568) (#1768)
Installing a heavy dependency like vllm crashes in a "stale — restarting" loop:
it restarts mid-install, reuses the cached wheels, then stalls again.

The download/install watchdog (cookbookRunning.js) keyed its stall signal purely
off the downloaded-byte counter ("1.81G/2.49G"). A dependency install spends long
stretches with NO byte counter — pip dependency resolution and the native CUDA
build/compile — so the signal froze and after STALE_PROGRESS_MS the watchdog
declared it stale and auto-restarted it mid-build, looping forever.

Extract the signal into a pure computeProgressSignal (cookbookProgressSignal.js):
keep the byte counter for the download phase (so a genuinely stuck download is
still caught, and an animating-but-frozen ETA frame is NOT mistaken for progress),
and when there's no byte counter fall back to a fingerprint of the output tail so
resolver/compile lines count as progress. Only a truly frozen tail now reads as
stalled.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:23:35 +09:00
Lucas Daniel 1d99429ba0 fix(cookbook): prevent auto-retry from restarting user-stopped downloads (#1778)
Two related bugs in the Cookbook task lifecycle:

1. "Stop all" fired kills via .click() inside a synchronous forEach but
   showed the success toast immediately after — the toast appeared before
   any of the async kill requests had been sent, giving the user false
   confidence the tasks were stopped.

2. The download auto-retry logic (triggered when DOWNLOAD_FAILED appears
   in the task output) had no way to distinguish a network interruption
   from a deliberate user stop. A download stopped via "Stop all" or the
   individual Stop button could be silently restarted up to two times by
   the background monitor.

Fix: persist _userStopped: true to localStorage at the moment the user
clicks Stop (individually) or Stop all. The auto-retry guard checks this
flag before relaunching the download. The flag is written BEFORE the
kill requests fire so there is no window where the monitor can race.

Fixes #1458
2026-06-03 13:22:39 +09:00
ghreprimand 1fda906407 Fix Cookbook container-local model endpoints (#1223)
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
2026-06-03 00:09:48 +09:00
spooky 5b87e69221 feat: add vllm kv cache dtype option (#1185) 2026-06-02 23:17:16 +09:00
pewdiepie-archdaemon ff93a6c63b Polish email and cookbook flows 2026-06-02 22:42:07 +09:00
Juan Pablo Jiménez eda99360d1 Fix Cookbook dependency install completion state
* Fix Cookbook dependency install completion state

Mark Cookbook dependency installs as complete when the background runner
exits successfully, even when HuggingFace-specific download markers are
absent.

* Add focused regression coverage for cookbook dependency completion.

Keep the fix narrowly scoped while carrying env_path through dependency tasks and locking the completion reconciliation behavior with targeted tests.
2026-06-02 12:59:29 +09:00
spooky 0f3280ee05 Expose advanced llama.cpp serve controls 2026-06-02 12:46:16 +09:00
pewdiepie-archdaemon 966b53df77 Improve Cookbook serve diagnostics and recommendations 2026-06-02 12:15:47 +09:00
ooovenenoso 15c7cb58e7 fix(cookbook): retry 0% HF download stalls sooner (#691)
Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-02 11:42:59 +09:00
pewdiepie-archdaemon 96618b01c0 Polish task UI slash commands and Ollama serving 2026-06-02 09:36:03 +09:00
pewdiepie-archdaemon 0ae30211d8 Merge branch 'pr-550' into visual-pr-playground 2026-06-02 06:26:32 +09:00
Collin Osborne 471ee494f0 fix: make transient dropdown/popup menus close on Escape
The global Escape arbiter in ui.js only sees `.modal` elements, so the many
ad-hoc dropdowns and context popups that are built on the fly and appended to
<body> ignored Escape entirely: document-library card/chat menus, chat
context/stats/overflow popups, cookbook serve & running menus, calendar event
menus, and compare pane menus.

Add a small DOM-free dismissal registry (static/js/escMenuStack.js). Menus
register a dismiss callback while open, and the arbiter closes the
most-recently-opened one first, so a menu opened over a modal closes before the
modal. bindMenuDismiss() wires the ubiquitous "append-to-body, close on outside
click" idiom to both the outside-click listener and the Escape stack in one
call, and dismissOrRemove() lets the pre-existing bulk removers (scroll/swipe/
modal-dismiss cleanup, reopen sweeps) tear a menu down through its real teardown
instead of orphaning its stack entry.

Covers ~14 menus across documentLibrary, chatRenderer, cookbookServe,
cookbookRunning, calendar, and compare/panes. Every teardown path — item click,
outside click, swipe, toggle, rebuild, bulk cleanup — routes through the
registry so no entry is ever stranded.

tests/test_esc_menu_stack_js.py pins the registry's LIFO and
exactly-one-per-press guarantees (node-driven; skips when node is absent).
2026-06-01 14:23:22 -04:00
ghreprimand 0aea1736ba Add Cookbook crash report copy action 2026-06-01 09:12:35 -05:00
Sanjay Davis 508fabcb3b Restore dependency refresh after install AND persist safe download mode on retries. (#499) 2026-06-01 22:28:06 +09:00
pewdiepie-archdaemon c953c078e5 Improve Cookbook serve reliability 2026-06-01 11:43:08 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00