The dashboard background status reconciler (_pollBackgroundStatus) only
recovered "done" for dependency installs when the backend reported a
finished task as "stopped". A real model download whose tmux pane is
gone after DOWNLOAD_OK (so the dead-session check misses the landed
snapshot) fell through to `task.type === 'download' ? 'crashed'`, so a
completed download was shown as crashed (and stalled on the Serve tab).
Recover "done" from the terminal DOWNLOAD_OK sentinel, mirroring the
dep-install recovery already present. The background poll runs blind, so
it keys off the conclusive exit-0 sentinel only — not the `/snapshots/`
path, which can be printed mid-stream for multi-file downloads and would
risk marking an incomplete download done.
Fixes #3897
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
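A minimal sketch of the sentinel-based recovery described in the download-status note above. The function name, the shape of `task`, and the assumption that the sentinel appears verbatim in the captured output tail are all hypothetical; only the `DOWNLOAD_OK` sentinel and the done/crashed outcome come from the note.

```javascript
// Hypothetical helper: decides the status of a download task that the
// backend reported as "stopped". Not the actual _pollBackgroundStatus code.
const DOWNLOAD_OK = 'DOWNLOAD_OK';

function reconcileStoppedDownload(task, outputTail) {
  // Other task types are assumed to be handled elsewhere.
  if (task.type !== 'download') return null;
  // Key off the terminal sentinel only. A `/snapshots/` path is ignored
  // on purpose, since it can be printed before all files have landed.
  const text = String(outputTail || '');
  return text.includes(DOWNLOAD_OK) ? 'done' : 'crashed';
}
```

Ignoring the snapshot path means a download that finished without printing the sentinel is still reported as crashed, which is the safer failure mode for a poll that cannot inspect the filesystem.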
- Add check for mobile screen width (<= 768px) to prevent accidental submissions via the Enter key.
- Update event listeners in static/app.js and static/js/chat.js to respect this constraint.
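The Enter-key guard above could look roughly like the following. The function name is hypothetical, and treating Shift+Enter as a newline is an assumption that the note does not state; only the 768px threshold comes from the note.

```javascript
const MOBILE_MAX_WIDTH = 768; // px, threshold taken from the note above

// Hypothetical predicate: should this keydown submit the message?
function shouldSubmitOnEnter(event, viewportWidth) {
  if (event.key !== 'Enter') return false;
  if (event.shiftKey) return false; // assumption: Shift+Enter inserts a newline
  // On narrow (mobile) screens Enter never submits.
  return viewportWidth > MOBILE_MAX_WIDTH;
}
```

In a browser listener the second argument would typically be `window.innerWidth`.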
When the backend (vllm / sglang / llama_cpp / diffusers) is missing on
the chosen serve target, the runtime-readiness note already flips red
and reads '<backend> missing on <host>.' but offered no fix path.
Append an accent-coloured link that calls openCookbookDependencies with
expandRecipe + the model's repo id, so one click switches to the
Dependencies tab, expands the right backend row's recipe panel, and
pre-selects the model so the user just hits Run.
- Drop the 'Install via' label and the pill-tag variant buttons. The
toggle is now the same sliding-pill mode-toggle used by the
Agent/Chat selector in the chat input. Pip/uv on the left, Docker on
the right, default = Pip/uv. CSS: extended .mode-chat::before's
translateX(100%) rule to also fire on .mode-right so non-chat
callers can use the same animation without claiming the chat-only
class name.
- Copy button moves inside the <pre>: absolute-positioned at the
top-right corner, icon-only, padding-right on the pre makes room.
Matches the Setup-token copy pattern in the integrations form.
Each recipe catalog entry now carries two variants:
    variants.pip    → uv pip install …
    variants.docker → docker pull <image>
A small 'Install via' pill row in the panel toggles between them
(default = Pip/uv per the user's preference). Switching variant or
changing the model re-renders the <pre> via _refreshRecipePre(); the
display text drops the 'source venv/bin/activate' prefix for Docker
since docker pull doesn't need a venv. Run honours the active variant
so picking Docker queues 'docker pull …' as the tmux task.
Recipes now hold ONLY the install command(s). The rendered <pre>
prepends a 'source <envPath>/bin/activate' line so the user sees a
paste-ready sequence; Run uses env_prefix (same path the Install
button uses) to activate the configured venv before the install
command, so the install lands in the existing environment rather
than a fresh .venv in whatever CWD the tmux task happens to start in.
- cookbook-deps-recipes.js: trim each recipe to its single pip command
- cookbook.js: _recipeDisplayText() prepends the activate context for
display; pre's data-dep-recipe-install holds the raw install-only
command list so Run knows what to send; Run builds env_prefix the
same way _installDep does.
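A rough sketch of the display-text rule from the two recipe notes above: the raw install commands stay untouched, the activate line is added for display only, and the Docker variant skips it. The signature is an assumption and the real `_recipeDisplayText()` may differ.

```javascript
// Hypothetical stand-in for _recipeDisplayText(). `installCommands` is the
// raw install-only list that Run would send; it is never mutated here.
function recipeDisplayText(installCommands, variant, envPath) {
  const lines = variant === 'docker'
    ? installCommands // docker pull does not need a venv
    : [`source ${envPath}/bin/activate`, ...installCommands];
  return lines.join('\n');
}
```

Keeping the activate line out of the stored command list means Run can apply the configured environment itself instead of parsing it back out of the displayed text.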
The recipe dropdown was a static catalog (MiniMax / Any vLLM model). Now
it lists every model already downloaded on the active server (the same
_cachedModelIds set the Launch tab + dl-dots already drive), plus an
'Other (generic …)' fallback. The change handler uses pickRecipe(backend,
modelId) to find the best match — MiniMax ids land on the MiniMax recipe,
everything else falls back to the generic install.
cookbook-diagnosis.js: openCookbookDependencies's pre-select logic now
matches by full option value (model id) instead of label substring, since
the dropdown values are full repo ids now.
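One way the model-to-recipe lookup above might work, as a sketch. The catalog shape, the `match` field, and the ids are hypothetical; only the behaviour (MiniMax ids get the MiniMax recipe, everything else gets the generic one) is taken from the note.

```javascript
// Hypothetical catalog: one model-specific entry plus a generic fallback.
const RECIPES = {
  vllm: [
    { id: 'minimax', match: /minimax/i },
    { id: 'generic', match: null }, // fallback entry
  ],
};

// Sketch of pickRecipe(backend, modelId): first specific match wins,
// otherwise the generic entry, otherwise null for unknown backends.
function pickRecipe(backend, modelId) {
  const list = RECIPES[backend] || [];
  const specific = list.find((r) => r.match && r.match.test(String(modelId || '')));
  return specific || list.find((r) => r.match === null) || null;
}
```

Matching on the full repo id rather than a display label keeps the lookup stable if the labels are reworded later.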
Before the quickrun (Run) button fires /api/model/serve, ask the deps
API whether the chosen backend (vllm / sglang / llama_cpp) is actually
installed on the target server. If not:
- Toast: '<backend> not installed on <host>. Opening Dependencies …'
- Route the user into the Dependencies tab via the existing
_openCookbookDependencies helper (now exported as
openCookbookDependencies)
- Auto-expand the recipe panel for that backend
- Pre-select the user's model in the panel's picker so the right
recipe is highlighted out of the box
The serve task is suppressed; the Run button is re-enabled. Once the
install task finishes in Running, the user clicks Run again.
cookbook-diagnosis.js: openCookbookDependencies takes an opts object
that, when expandRecipe is set, finds the row's caret and clicks it,
then matches a recipe label by model (currently only MiniMax has a
specific entry; the generic fallback stays selected otherwise).
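The pre-flight flow above can be sketched with injected collaborators so it runs without a browser. Every name and signature here is an assumption (including the `backend` and `modelId` option fields); only the order of steps, the toast wording, and the `expandRecipe` option come from the notes.

```javascript
// Hypothetical pre-flight: check the deps API before serving.
async function preflightServe(request, deps) {
  const { backend, host, modelId } = request;
  const { isInstalled, toast, openDeps, serve } = deps;
  if (await isInstalled(backend, host)) {
    return serve(modelId); // backend present, serve as usual
  }
  toast(`${backend} not installed on ${host}. Opening Dependencies …`);
  // Route to the Dependencies tab, expand the recipe panel, pre-select the model.
  openDeps({ backend, expandRecipe: true, modelId });
  return null; // serve task suppressed; caller re-enables the Run button
}
```

Returning `null` instead of throwing lets the caller treat "redirected to Dependencies" as a normal outcome rather than an error.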
Each row for vllm, sglang, llama_cpp now carries an expand caret that
opens an inline recipe panel below the row. The panel has:
- 'Serving which model?' select populated from a new tiny catalog
- <pre> code block showing the exact shell sequence for that pair
- Copy: clipboard the commands
- Run: launch the joined 'cmd1 && cmd2 && …' as a tmux task on the
currently-selected deps server (same plumbing as Install)
New file: src/static/js/cookbook-deps-recipes.js — single source of
truth for the recipes. Seeded with MiniMax M2/M2.7 + a generic fallback
for each backend (all three use 'uv venv → source .venv/bin/activate
→ uv pip install ... --torch-backend auto', the recipe the user
pasted). Adding model-specific recipes is now a one-entry edit.
Next commit: Launch-tab pre-flight that intercepts the serve click
when the backend isn't installed and deep-links into this panel.
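For the Run action above, joining a recipe's commands might be as simple as the following sketch. The helper name is hypothetical.

```javascript
// Hypothetical helper: turn a recipe's command list into one shell line.
// With `&&`, the shell stops at the first command that exits non-zero.
function joinRecipeCommands(commands) {
  return commands
    .map((c) => c.trim())
    .filter(Boolean) // drop blank entries
    .join(' && ');
}
```

Using `&&` rather than `;` means a failed environment setup step prevents the install step from running against the wrong environment.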
Was rendering as a separate body block below the Copy/× toolbar.
Now the diagnosis message and the suggested-action text sit inline
on the left of the toolbar, with Copy and × pinned to the right —
reads as one self-contained header strip instead of stacked rows.
These models OOM on --kv-cache-dtype auto (≈bf16) at any usable
context with current tensor-parallel layouts. _detectModelOptimizations
now seeds opts.kvCacheDtype='fp8' for them, and the serve panel's KV
Cache select picks that up as the default unless the user has a
saved override on this skill.
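The precedence described above (saved override first, detected default second) can be sketched as follows. The function name and the final `'auto'` fallback are assumptions; `opts.kvCacheDtype` is the field named in the note.

```javascript
// Hypothetical resolver for the KV Cache select's initial value.
function resolveKvCacheDtype(detectedOpts, savedOverride) {
  if (savedOverride) return savedOverride; // the user's saved choice wins
  // Otherwise use the detected default (e.g. 'fp8'), else assume 'auto'.
  return (detectedOpts && detectedOpts.kvCacheDtype) || 'auto';
}
```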
The DeepSeek branch in _detectModelOptimizations matched only V3
and R1 literally. DeepSeek-V4-Flash (and future Vx / Rx) didn't
hit any branch, so the Expert Parallel checkbox + Speculative
defaults never surfaced in the Run panel. Widened to a regex that
catches v3/v3.1/v4/v5/v10+ and r1/r2/… for both the expert-parallel
flag and the MTP speculative defaults.
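A regex along these lines would cover the versions listed above. This is an illustrative pattern, not necessarily the one committed; the helper name is hypothetical.

```javascript
// Matches DeepSeek V3 and later (v3, v3.1, v4, v10, ...) and any R-series
// (r1, r2, ...). Older versions such as V2 or V2.5 are not matched.
const DEEPSEEK_FAMILY = /deepseek[-_ ]?(?:v(?:[3-9]|\d{2,})(?:\.\d+)?|r\d+)\b/i;

function matchesDeepSeekFamily(modelId) {
  return DEEPSEEK_FAMILY.test(String(modelId || ''));
}
```

The trailing `\b` keeps a suffix such as `-Flash` from affecting the match, and the pattern has no global flag so repeated `test()` calls are stateless.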
The +/- step buttons next to the Speculative tokens count read as
clutter for a small 1-10 input; the native number-input spinner and
manual typing are enough. Reduced the input width to 44px so it sits
tight next to the method dropdown.
Reads the last hwfit scan's backend (window._hwfitSystemCache.backend)
and picks the right vLLM install path per vendor:
- NVIDIA/CUDA (default)
  - uv: uv pip install -U vllm --torch-backend auto
  - docker: docker pull vllm/vllm-openai:latest
- AMD/ROCm
  - uv: uv pip install -U vllm --torch-backend rocm
  - docker: docker pull rocm/vllm-dev:main
The <pre> previews are re-painted on render to match what Run will
actually launch, and the confirm dialog tags the backend so the user
knows what they're committing to.
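The per-vendor selection above could be sketched like this. The command strings are copied verbatim from the note; the function name and the rule for recognising a ROCm backend from the cached `backend` string are assumptions.

```javascript
// Install commands as listed in the note above.
const VLLM_INSTALL = {
  cuda: {
    uv: 'uv pip install -U vllm --torch-backend auto',
    docker: 'docker pull vllm/vllm-openai:latest',
  },
  rocm: {
    uv: 'uv pip install -U vllm --torch-backend rocm',
    docker: 'docker pull rocm/vllm-dev:main',
  },
};

// Hypothetical selector: assumes the hwfit backend string mentions
// "rocm" or "amd" on AMD hosts; anything else gets the CUDA default.
function vllmInstallFor(backend) {
  const vendor = /rocm|amd/i.test(String(backend || '')) ? 'rocm' : 'cuda';
  return { vendor, ...VLLM_INSTALL[vendor] };
}
```

Defaulting to CUDA when no scan has run matches the "(default)" marker in the list above.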
Was just a copy-paste reference. Each row now has a Run button that
launches the command as a tmux task on the currently-selected deps
server (same path Reinstall already uses) — Odysseus does the work,
the user watches progress in the Active tab. Dropped the plain
pip option since the existing per-package Install button already
covers it; kept uv (recommended) and Docker pull as the two
alternatives.
- Moved "Extra args" out from above the vLLM advanced checks
(Reasoning Parser, Speculative, MoE Env) to AFTER them, so it
reads as "after the advanced toggles, anything else".
- Added a collapsed "Manual install (vLLM)" details block to the
Dependencies tab description with three copy-paste recipes:
uv venv + uv pip (recommended), plain pip, and docker pull
vllm/vllm-openai:latest. Useful when the in-app Install button
can't run (offline target, custom torch backend, etc).
Provider SVGs in providers.js declare only viewBox (no width/height),
so when injected into the 18×18 logo chips they fell back to the
browser default of 300×150 and blew out the row.
- CSS: SVGs inside settings logo chips (`span[id$="-logo"]`,
the 18px wrappers in fallback rows) now stretch to 100%/100% of
their container.
- Added matching `-logo` chip next to the Endpoint dropdowns in
Default Chat Model and Utility Model cards.
- New `_syncEndpointLogo` helper mirrors the selected endpoint
option's text label through providerLogo() (the select value is
a UUID and wouldn't match anything otherwise), and
`_fillEndpointSelect` calls it on each render.
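A sketch of the label-based lookup that `_syncEndpointLogo` is described as doing. The parameter list is an assumption (the real helper presumably finds its elements itself), and `providerLogo` is passed in here so the example stays self-contained.

```javascript
// Hypothetical version: resolve the logo from the selected option's text
// label, because the option value is a UUID that matches no provider.
function syncEndpointLogo(select, chip, providerLogo) {
  const option = select.options[select.selectedIndex];
  const label = option ? option.text : '';
  chip.innerHTML = providerLogo(label) || ''; // clear the chip if nothing matches
}
```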
- The API occasionally returns the same skill twice (built-in
shadow vs user copy, or a write/read race), which made the
duplicate-detector tag BOTH copies as the "recommended" keeper
(the find-skills card showed duplicate #1 twice). Loading now
filters out repeats by lowercased name before render.
- Reordered the per-skill kebab menu: Publish/Unpublish → Select
→ Edit → Test → Audit → Delete. Select previously sat at the
bottom; lifting it next to Publish puts the bulk-mode entry
point with the other bulk-style action.
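The name-based filter from the first bullet above can be sketched as follows. The function name is hypothetical, and keeping the first occurrence is an assumption; the note only says repeats are filtered by lowercased name.

```javascript
// Drop repeated skills, comparing names case-insensitively.
// The first occurrence is kept (assumption).
function dedupeSkillsByName(skills) {
  const seen = new Set();
  return skills.filter((skill) => {
    const key = String(skill.name || '').toLowerCase();
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```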
Auto handles 90%+ of cases — the row of category buttons was
visual noise on the main panel. Now:
- Removed the .research-category-row from above the textarea.
- Added a Format <select> inside Settings (next to Rounds) with
Auto / Product / Compare / How-to / Fact-check options. Default
is Auto, same as before.
- Updated all the JS that read .research-cat.active / data-cat to
read #research-category.value instead (_saveSettings, _readSettings,
_resetCategoryToAuto, _editJob, _restoreSavedSettings).
Same wire to the backend — settings.category still carries through.
Was rendering on its own row under "Multi-step web research with an
LLM-in-the-loop agent". Now appended to that same flex-wrap line as
"— past runs in Library, Research" so the header section stays one
visual block instead of two.
Was rendering on a second row below the "Past research" header,
inflating it to two rows. Now appended to the title span as a small
inline chip — "Past research — all in Library, Research" — keeping
the header at one row. Same click → close panel + open Library tab.
The capture-phase scroll listener was firing for scrolls anywhere
in the modal — including the Trending models list, which lives
inside the Direct Download fold body. Scrolling that list was
auto-folding the section that contains it.
Bail early if the scroll target is the fold body or a descendant —
the section only folds on scrolls in sibling scrollers (.cookbook-body,
.hwfit-list, .modal-content).
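The early bail-out above amounts to an ancestry check on the scroll target. A sketch with hypothetical names, written against plain `parentNode` links so it runs outside a browser:

```javascript
// True if `node` is `ancestor` itself or one of its descendants.
function isInside(node, ancestor) {
  for (let n = node; n; n = n.parentNode) {
    if (n === ancestor) return true;
  }
  return false;
}

// Hypothetical predicate: only scrolls outside the fold body fold the section.
function shouldFoldOnScroll(scrollTarget, foldBody) {
  return !isInside(scrollTarget, foldBody);
}
```

In the browser, `foldBody.contains(scrollTarget)` performs the same inclusive check.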
Each time the panel opens we pick a random entry from a list of 10
diverse research prompts (history, tech, food, science, fact-check,
how-to) so the textarea hint feels fresh and shows the breadth of
queries the tool handles instead of always nudging toward the same
Odysseus example.
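Picking the placeholder described above could look like this sketch. The function name is hypothetical, and the random source is injectable only to make the behaviour easy to check.

```javascript
// Return one entry of `prompts`, chosen uniformly at random.
function pickPlaceholder(prompts, random = Math.random) {
  if (prompts.length === 0) return '';
  // random() is in [0, 1), so the index is always within bounds.
  return prompts[Math.floor(random() * prompts.length)];
}
```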
- Removed standalone "Edit cmd & relaunch" — "Edit in serve panel"
renamed to "Edit & relaunch" and is now the single edit entry.
Tooltip notes that the raw cmd is still editable inside the panel.
- Tagged each item with a group (run / edit / endpoint / copy /
danger); the renderer inserts a thin divider whenever the group
changes, so the menu reads as visual blocks instead of one long
list.
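The divider rule from the menu-grouping bullet above, as a sketch. The item shape and the `{ divider: true }` marker are assumptions.

```javascript
// Insert a divider marker between consecutive items whose group differs.
function withGroupDividers(items) {
  const out = [];
  let previousGroup = null;
  for (const item of items) {
    if (previousGroup !== null && item.group !== previousGroup) {
      out.push({ divider: true });
    }
    out.push(item);
    previousGroup = item.group;
  }
  return out;
}
```

Because the divider depends only on adjacent groups, reordering or removing menu items never leaves a leading, trailing, or doubled divider.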
- Header h2 inside the Active group now says "Active" (matches
the renamed tab) instead of "Running".
- Both context-menu Reconnect entries (the normal one and the
recover-from-vanished-process fix) say "Reconnect tmux" so the
user knows what the action actually does.
- Sibling cookbook-server-section-* blocks inside the Active group
get a top divider + 14px gap so transitions between server
groups (local / remote-host / etc) read clearly.
Previously the global GPU-toggle total was set once and never
overridden, so a first scan on the local 1-GPU container left
the Run-panel GPU button row stuck on GPU 0 even after switching
to a 4-GPU remote host. Now any scan returning a positive total
updates the binding; zero/missing values still don't clobber a
known-good count (no flicker during in-flight re-probes).
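The update rule above reduces to a small pure function. A sketch with a hypothetical name:

```javascript
// A positive scanned total replaces the current one; zero, missing, or
// non-numeric values leave the known-good count untouched.
function nextGpuTotal(currentTotal, scannedTotal) {
  const n = Number(scannedTotal);
  return Number.isFinite(n) && n > 0 ? n : currentTotal;
}
```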
Mirrored the panel's runtime readiness note into a small chip
appended to the .memory-item-title at the top of the expanded
serve card. The in-panel note becomes a hidden source-of-truth.
This way the "vLLM ready on … : vLLM CLI: …; python package:
vllm 0.22.0" status sits inline with the model name where the
user is already looking, instead of buried below the toolbar row.