* feat(auth): add per-user admin promote/demote toggle
Admin-only API and Users-tab control to grant/revoke admin rights; refuses to demote the last admin.
* fix(auth): restore pre-admin privilege restrictions on demotion
Promoting now stashes the user's privilege map (privileges_before_admin)
and demoting restores it instead of resetting to defaults, so a
promote/demote round trip can no longer broaden a restricted user's
access. Users without a stash (created as admin, or promoted before this
fix) still demote to DEFAULT_PRIVILEGES so a born-admin's stored all-True
map — including can_use_bash — can't survive demotion.
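A minimal sketch of the stash/restore round trip described above, assuming
users are plain dicts. The contents of DEFAULT_PRIVILEGES, the
ADMIN_PRIVILEGES helper, and the promote/demote function names are
hypothetical; only privileges_before_admin, DEFAULT_PRIVILEGES and
can_use_bash come from this change.

```python
from copy import deepcopy

# Hypothetical defaults; the real DEFAULT_PRIVILEGES map is not shown here.
DEFAULT_PRIVILEGES = {"can_use_bash": False, "can_web_search": True}
# Hypothetical all-True map standing in for what an admin stores.
ADMIN_PRIVILEGES = {key: True for key in DEFAULT_PRIVILEGES}


def promote(user: dict) -> None:
    """Grant admin rights, stashing the pre-admin privilege map first."""
    if user.get("is_admin"):
        return  # already admin: leave any existing stash untouched
    user["privileges_before_admin"] = deepcopy(user["privileges"])
    user["privileges"] = dict(ADMIN_PRIVILEGES)
    user["is_admin"] = True


def demote(user: dict, admin_count: int) -> None:
    """Revoke admin rights, restoring the stash or falling back to defaults."""
    if not user.get("is_admin"):
        return
    if admin_count <= 1:
        raise ValueError("refusing to demote the last admin")
    stash = user.pop("privileges_before_admin", None)
    # No stash (born admin, or promoted before the fix): use defaults so an
    # all-True map cannot survive demotion.
    user["privileges"] = (
        deepcopy(stash) if stash is not None else dict(DEFAULT_PRIVILEGES)
    )
    user["is_admin"] = False
```

With this shape, a restricted user who is promoted and then demoted ends up
with exactly the map they started with, while a user without a stash lands
on the defaults.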
---------
Co-authored-by: K M Merajul Arefin <merajul.arefin@therapservices.net>
windowDrag.js ran its own top-edge fullscreen system (cy <= SNAP_PX →
_enterFs()) independently of the tileManager.js snap zones, causing
duplicate/unexpected fullscreen behavior when dragging window chips
toward the top of the screen.
Hardcode enableFullscreen to false. tileManager.js remains the single
source of truth for fullscreen/maximize snap behavior and is untouched.
The dashboard background status reconciler (_pollBackgroundStatus) only
recovered "done" for dependency installs when the backend reported a
finished task as "stopped". A real model download whose tmux pane is
gone after DOWNLOAD_OK (so the dead-session check misses the landed
snapshot) fell through to `task.type === 'download' ? 'crashed'`, so a
completed download was shown as crashed (and stalled on the Serve tab).
Recover "done" from the terminal DOWNLOAD_OK sentinel, mirroring the
dep-install recovery already present. The background poll runs blind, so
it keys off the conclusive exit-0 sentinel only — not the `/snapshots/`
path, which can be printed mid-stream for multi-file downloads and would
risk marking an incomplete download done.
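The recovery rule can be sketched as follows. This is written in Python for
illustration only (the real reconciler lives in the dashboard frontend), and
the function name and the assumption that the sentinel appears on its own
output line are hypothetical.

```python
DOWNLOAD_OK = "DOWNLOAD_OK"  # terminal sentinel named in this change


def reconcile_stopped_download(output: str) -> str:
    """Classify a download task that the backend reports as "stopped".

    Only the conclusive sentinel counts as success. A "/snapshots/" path is
    deliberately ignored because it can appear mid-stream.
    """
    for line in output.splitlines():
        if line.strip().startswith(DOWNLOAD_OK):
            return "done"
    return "crashed"
```

A log that only mentions a snapshot path therefore stays "crashed", while a
log that ends with the sentinel is recovered as "done".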
Fixes #3897
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add check for mobile screen width (<= 768px) to prevent accidental submissions via the Enter key.
- Update event listeners in static/app.js and static/js/chat.js to respect this constraint.
* fix: read allow_bash/allow_web_search from JSON body (#3229)
API callers using Content-Type: application/json had bash and web
tools silently disabled because allow_bash / allow_web_search were
only read from FormData (which is empty for JSON requests).
Changes:
- Fall back to JSON body for allow_bash and allow_web_search values
- Only add bash/web_search to disabled_tools when explicitly set to a
falsy value; when unset (None), defer to per-user privilege checks
- Admins with can_use_bash=True now get bash enabled by default
Fixes #3229
* fix: always send explicit allow_bash/allow_web_search from frontend
The backend 'is not None' guard (from prior commit) is correct for API
callers, but the frontend only sent allow_bash=true when the toggle was
ON — omission meant 'unspecified' which the backend treated as 'don't
disable'. Now the frontend always sends an explicit true/false value:
- allow_bash: sent on every request (checked ? 'true' : 'false')
- allow_web_search: explicit 'false' when toggle is off in agent mode
With explicit frontend values, the 'is not None' guard is safe:
- explicit true → tool enabled
- explicit false → tool disabled
- None (API caller omission) → defer to per-user privilege
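The three cases above can be sketched as a tri-state parse. This is an
illustration under assumptions: parse_flag, disabled_tools, the _FALSY set
and the can_web_search privilege name are hypothetical; allow_bash,
allow_web_search and can_use_bash come from the commits above.

```python
from typing import Optional

# Hypothetical set of string values treated as "off".
_FALSY = {"false", "0", "no", "off", ""}


def parse_flag(form: dict, body: dict, name: str) -> Optional[bool]:
    """Read a flag from form data first, then the JSON body; None if unset."""
    raw = form.get(name)
    if raw is None:
        raw = body.get(name)
    if raw is None:
        return None
    if isinstance(raw, bool):
        return raw
    return str(raw).strip().lower() not in _FALSY


def disabled_tools(form: dict, body: dict, privileges: dict) -> set:
    """Disable a tool on explicit false, or on omission without privilege."""
    disabled = set()
    for flag, tool, privilege in (
        ("allow_bash", "bash", "can_use_bash"),
        ("allow_web_search", "web_search", "can_web_search"),
    ):
        value = parse_flag(form, body, flag)
        if value is False:
            disabled.add(tool)
        elif value is None and not privileges.get(privilege, False):
            disabled.add(tool)
        # Explicit true: not disabled here; any further privilege
        # enforcement is outside this sketch.
    return disabled
```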
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* fix: expand cookbook error output tail from 12 to 50 lines
When a task reaches status 'error', the status endpoint was returning
only the last 12 lines of the subprocess log. The existing context-menu
'Copy last 50 lines' action was therefore copying the same 12 lines,
making it useless for diagnosing failures that produce long stack traces
or build output.
- Set _tail_lines = 50 when status == 'error', keep 12 for running tasks
- Initialise exit_code = None before the status-classification block so
it is always defined in the result dict (was only set inside the
is_alive branch, potential NameError in the dead-session path)
- Include exit_code in the task-status response dict
- JS poller captures exit_code from live data into local task state
The frontend output panel and 'Copy last 50 lines' now show the actual
error context without any UI changes.
* refactor: extract output-tail logic to testable helper + behavioral tests
Addresses review feedback on #1538: the previous tests were source-level
string guards. Extract the tail-slicing into a dependency-free helper
(routes/cookbook_output.error_aware_output_tail) and replace the guards
with behavioral tests that exercise the actual logic:
- error status with a 200-line snapshot -> exactly the last 50 lines
- running/ready/completed/stopped/unknown -> last 12 lines
- short snapshot -> all lines, no padding
- empty snapshot -> empty string
- error tail is a strict superset (suffix-compatible) of the non-error tail
The helper has no FastAPI/SQLAlchemy imports so it unit-tests without
standing up the app.
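A dependency-free sketch of what such a helper could look like. The
signature and parameter names are assumptions; only the helper name, the
50/12 line counts and the listed behaviors come from the commits above.

```python
def error_aware_output_tail(
    snapshot: str,
    status: str,
    error_lines: int = 50,
    default_lines: int = 12,
) -> str:
    """Return the last lines of a log snapshot, longer for failed tasks."""
    if not snapshot:
        return ""
    lines = snapshot.splitlines()
    limit = error_lines if status == "error" else default_lines
    # Slicing never pads: a short snapshot comes back whole.
    return "\n".join(lines[-limit:])
```

Because both tails are suffixes of the same line list, the error tail always
ends with the non-error tail.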
---------
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* feat(agent): workspace confinement via context-local binding + get_workspace tool
Bind the per-turn workspace once in execute_tool_block; the shared path
resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd
helper (agent_cwd) read it, so file tools + bash/python are confined centrally
and a new tool that uses the shared helpers cannot accidentally bypass it.
Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory
modal (reusing existing modal/button CSS), the /workspace slash command, and a
get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic
(realpath/normcase/commonpath) and docker-safe (container paths, no host
assumptions). Reopens #2023.
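The realpath/normcase/commonpath check mentioned above can be sketched like
this. is_inside_workspace is a hypothetical name, not the actual resolver
(_resolve_tool_path), and the sketch omits the context-local binding.

```python
import os


def is_inside_workspace(workspace: str, candidate: str) -> bool:
    """True if candidate, resolved against workspace, stays within it."""
    root = os.path.normcase(os.path.realpath(workspace))
    # join() keeps an absolute candidate as-is; realpath() collapses ".."
    # segments and follows symlinks before the comparison.
    target = os.path.normcase(os.path.realpath(os.path.join(root, candidate)))
    try:
        return os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Comparing with commonpath rather than a string prefix means a sibling such
as "project_old" is not mistaken for being inside "project".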
* ux(workspace): clarify workspace is not a sandbox
Picker modal note + pill tooltip + get_workspace tool/output wording now state
plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder,
but bash/python only start there (cwd) and are not sandboxed. Modal note reuses
the existing .muted class.
* fix(agent): treat an active workspace as file-work intent
A vague low-signal message (e.g. "look at the local project") matches no
domain keywords, so tool retrieval is skipped and only always-available tools
are offered — leaving the agent with no file access even though a workspace is
set. When a workspace is active, include the file/code tools (incl.
get_workspace) on low-signal turns so the agent can act on the folder.
Also requires the tool index (ChromaDB) to be reachable for normal retrieval;
that is an environment dependency, not part of this change.
* ux(workspace): hide pill + overflow entry in chat mode
Workspace only scopes the agent's file/shell tools, so the pill and the
overflow 'Workspace' entry are agent-only now — hidden in chat mode like the
bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is
called from the agent/chat setMode handler.
* prompt(tools): steer bash/python to defer to the dedicated file tools
bash/python schema descriptions (what native-tool-calling models read) were
bare and gave no steer, so models would do file ops via the shell (e.g. writing
SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python
in the schema + tool-index + prompt section to prefer read_file/write_file/
edit_file/grep/glob/ls and only be used for what those do not cover.
* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)
Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc.
by name, so the guidance does not go stale if those tools are renamed.
* style(workspace): drop em-dashes from added code comments/strings
* ux(workspace): terser non-sandbox note in picker (no tool-name list)
* ux(workspace): mirror terse non-sandbox wording in pill tooltip
* chore: untrack local venv symlink (run-only, not part of the feature)
* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)
* fix(agent): low-signal + workspace surfaces only read-only file tools
Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message
in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but
not write_file/edit_file/bash/python -- those wait for a request that actually
calls for them (RAG retrieval still adds them on a real ask).
* feat(workspace): cap browse listing at 500 dirs with a truncated hint
Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local
_MAX_BROWSE_DIRS so a directory with thousands of children does not dump every
row into the picker; the response carries a truncated flag and the modal tells
the user to type a path to jump in.
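A small sketch of the capping rule, assuming the listing is sorted before it
is cut. cap_browse_listing and the "dirs" response key are hypothetical;
_MAX_BROWSE_DIRS and the truncated flag come from this commit.

```python
_MAX_BROWSE_DIRS = 500


def cap_browse_listing(dir_names, limit: int = _MAX_BROWSE_DIRS) -> dict:
    """Return at most `limit` directory names plus a truncated flag."""
    ordered = sorted(dir_names)
    return {"dirs": ordered[:limit], "truncated": len(ordered) > limit}
```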
* chore: untrack local venv symlink (run-only artifact)
* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time
The in-workspace resolver deny-lists sensitive paths inside the workspace,
but the empty-path search root is the workspace itself, so a workspace of
~/.ssh could be listed via ls with no path. vet_workspace() (public, in
tool_execution next to the resolvers) rejects non-directories and sensitive
roots before the path is ever bound; chat_routes uses it instead of its
inline isdir check.
* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active
Review findings from #3665:
P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes
every absolute path 'inside' the workspace and collapses confinement into
host-wide file access. A root is its own dirname, so reject when
dirname(resolved) == resolved; the browse response now carries a selectable
flag and the picker disables 'Use this folder' on unselectable dirs.
P3: /workspace set stored any string client-side and the chat route silently
dropped rejected values, so the pill could claim a confinement that was not
in effect. New admin-gated /api/workspace/vet validates manual paths before
they persist (canonical path returned), and when a posted workspace is
rejected at send time the stream emits workspace_rejected so the client
clears the stored value and toasts instead of continuing silently.
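The bind-time checks can be sketched together as below. This is not the
real vet_workspace: the function name, the return convention and the
SENSITIVE_DIR_NAMES deny list are hypothetical stand-ins. The root test
(dirname(resolved) == resolved) is the one described in P2.

```python
import os
from typing import Optional

# Hypothetical deny list; the real sensitive-path list is not shown here.
SENSITIVE_DIR_NAMES = {".ssh", ".gnupg"}


def vet_workspace_sketch(path: str) -> Optional[str]:
    """Return the canonical workspace path, or None if it must be rejected."""
    resolved = os.path.realpath(os.path.expanduser(path))
    if not os.path.isdir(resolved):
        return None  # missing path or a regular file
    if os.path.dirname(resolved) == resolved:
        return None  # a filesystem root is its own dirname
    if SENSITIVE_DIR_NAMES.intersection(resolved.split(os.sep)):
        return None  # the root itself is (or sits under) a sensitive dir
    return resolved
```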
* fix(workspace): check caller privilege before vetting the posted workspace
Review finding: /api/chat_stream called vet_workspace() on the posted value
for every caller and emitted workspace_rejected on failure, so a non-admin
who can chat but cannot use file/shell tools could distinguish existing
directories from missing/file/sensitive/root paths by whether the event
appeared. The resolution now lives in _resolve_request_workspace, which
drops the submitted value uniformly for non-admin callers, with no vetting
and no event, before the path ever touches the filesystem. Admin and
single-user behavior is unchanged. Test pins that valid and invalid paths
are indistinguishable for a non-admin and that vet_workspace is never
invoked for them.
* fix: use correct element IDs for privilege-gated button hiding
The privilege-gated button hiding in initializeEventListeners() used
stale element IDs that no longer exist in the DOM:
- 'tool-bash-btn' -> 'bash-toggle-btn' (the actual shell button ID)
- 'tool-image-btn' -> 'set-imgEnabledToggle' (admin settings toggle,
since no standalone image button exists in the composer)
Without this fix, users without can_use_bash / can_generate_images
privileges still see buttons that appear to work but then fail.
* fix: remove incorrect image generation toggle targeting
The set-imgEnabledToggle is the global admin Image Generation master
switch, not a per-user composer control. Non-admins without
can_generate_images never render that toggle, so the lookup is null
and the branch no-ops. Admins without the privilege get the app-wide
toggle force-unchecked based on personal privilege, which is confusing.
There is no composer image button in the DOM, so nothing to hide here.
Drop the can_generate_images block entirely as vdmkenny requested.
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
The AI-message copy buttons copied dataset.raw, which is the full
accumulated model output — still containing the <think time="...">
reasoning block and any tool-call markup that the renderer strips for
display. Pasting therefore leaked the model's thinking, and the first
heading after </think> lost its markdown formatting because it was
glued to the closing tag.
Add chatRenderer.copyMessageText(), which mirrors the display pipeline
(stripToolBlocks then extractThinkingBlocks) and falls back to the raw
text when stripping leaves nothing (thinking-only turns), and route
both copy handlers — the message footer and the slash-reply footer —
through it. The interrupted-turn Continue flow intentionally keeps
reading dataset.raw.
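The strip-then-fall-back behavior can be sketched as follows. It is written
in Python for illustration only (the real helper is frontend code), covers
just the <think> block, and leaves tool-call markup out because its format
is not shown here. The regex and function name are assumptions.

```python
import re

# Matches a <think ...>...</think> block, including multi-line reasoning.
_THINK_RE = re.compile(r"<think\b[^>]*>.*?</think>", re.DOTALL)


def copy_message_text(raw: str) -> str:
    """Return the text a copy button should place on the clipboard."""
    visible = _THINK_RE.sub("", raw).strip()
    # Thinking-only turns would otherwise copy nothing, so keep the raw text.
    return visible if visible else raw
```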
Fixes #3722
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
* fix(ui): escaped SVG renders as raw markup during web_search tool label
The _toolLabels['web_search'] entry embedded an SVG HTML string
concatenated with label text. At render time the entire value was
passed through esc(), HTML-escaping <svg> tags so the icon
displayed as raw text instead of rendering visually.
Fix: separate icon from label text via a _toolIcons map. The SVG
is injected as raw innerHTML (unescaped) in .agent-thread-icon,
while the label text remains safely escaped.
* test: add behavioral test for web_search tool icon rendering
Co-authored-by: TheDragonTail <jakeoldfield2@gmail.com>
---------
Co-authored-by: TheDragonTail <jakeoldfield2@gmail.com>
* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com)
* feat: add NVIDIA option to provider settings dropdown and aliases
* test: add NVIDIA provider detection and endpoint tests
* Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering
- nvidia.com -> 'nvidia' curated key for proper provider routing
- _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed
- _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip,
kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse,
-embedqa, -nemoretriever
* Expand non-chat model filtering for NVIDIA embedding/guard/video models
Add _NON_CHAT_PREFIXES: embed, recurrent
Add _NON_CHAT_CONTAINS: topic-control, guard, calibration,
ai-synthetic-video, cosmos-reason2
Catches remaining unfiltered non-chat models from NVIDIA catalog:
embedding (llama-nemotron-embed, embed-qa), guard (llama-guard,
nemoguard-topic-control), calibration (ising-calibration),
video (ai-synthetic-video-detector, cosmos-reason2),
recurrent (recurrentgemma-2b)
* Filter non-chat models in _probe_endpoint via _is_chat_model()
Previously _is_chat_model() was only used in the per-model probe
and _first_chat_model(), so non-chat models still appeared in the
model picker even though they were filtered in those specific paths.
Applying the filter at _probe_endpoint() return ensures non-chat
models (embeddings, safety guards, reward, calibration, video
detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never
enter cached_models and never appear in the picker.
* Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models
Prefix checks (mid.startswith) miss models with org prefixes like
baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc.
Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught
regardless of the org prefix.
Adds: embed, bge, recurrent, starcoder, gemma-2b
* fix(model-routes): drop collision-prone substrings from global non-chat filter
The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES
and _NON_CHAT_CONTAINS tuples. These are intended to filter out
embedding, retrieval, safety, and vision models from NVIDIA's catalog
that are not chat-completions-capable. However, four of the added
substrings collide with legitimate chat models served by other providers:
- gemma-2b matches google/gemma-2b-it (instruct chat model)
- starcoder matches bigcode/starcoder2-15b (code completion model)
- recurrent matches google/recurrentgemma-2b (language model)
- guard matches meta-llama/Llama-Guard-3-8B (safety classifier)
Removing these four from the global tuples keeps the NVIDIA-specific
filtering intact (safety, embedding, retrieval, and vision models are
still caught by other tokens such as content-safety, -safety, -reward,
embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while
preventing false negatives for instruct/code models on other providers.
Tests added for gemma-2b-it, google/gemma-2b-it, and
bigcode/starcoder2-15b-instruct asserting they are recognized as chat
models.
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS
Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS
entries redundant since the prefix check runs first.
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
* style: fix indentation of groq and xai test cases in test_provider_endpoints.py
---------
Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
Surface a lot of accumulated cookbook + UI work as a single non-agent
commit so the agent rework lands cleanly.
Highlights:
- Ollama as a first-class backend in the Cookbook:
* Download input accepts ollama-style names (name:tag) → backend=ollama
* /api/cookbook/ollama/library (cached scrape of ollama.com + curated
fallback so classic models like qwen2.5 stay reachable)
* "Browse Ollama library" toggle below Download with size chips
* Engine=Ollama in hwfit toolbar merges the Ollama library into the
main scan list as per-tag rows with the same Fit/Param/Quant/VRAM
columns; click → fills Download input
- API Tokens form added to Integrations panel (matching wired
loadTokens()/initTokenForm() that had no HTML)
- Serve panel polish: Advanced fold tightening (-8px nudges on vLLM
checks, Extra args, Spec row), n_cpu_moe + Split Mode controls
pulled up 8px to align with the row's checkboxes, GGUF File dropdown
exposed for Ollama backend, GPU re-render on Edit serve restore,
_forceBackend flag so saved serveState wins over backend detection,
cookbook:servers-changed CustomEvent so panels don't need refresh
- Models page redesign: Add Models row (URL + hidden API key reveal +
Type select + Scan/Ollama/Key/Test/Add icon buttons), Probe All +
Clear-offline buttons in Added Models toolbar, offline-pill removed
(opacity already conveys state), Engine dropdown gains Ollama option
- _ping_endpoint probes /v1/models then base, accepts 4xx as
reachable (vLLM returns 404 on bare /v1, fully working endpoints
were showing offline)
- Diagnosis card: × dismiss + Copy bundle buttons restored on the
serve error feedback card
- Orphan tmux sweep re-enabled behind a 60s rate-limit + background
Thread (off the main event loop) so dead serves get discovered
- cookbook_routes auto-register watchdog: drops the endpoint if the
serve session exits non-zero within the first ~3min
- ollama-rocm sidecar awareness in download wrapper (`docker exec
ollama-rocm ollama pull` when host ollama isn't installed)
- Skill extractor sets initial_status="published" when
auto_approve_skills pref is on (audit demotes later)
- Skill list / model list / cookbook scan misc polish
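For the _ping_endpoint item above, a sketch of the "4xx counts as
reachable" rule. The HTTP call is injected as a callable so the logic stays
testable without a network. The function name, the assumption that the base
URL already ends in /v1, and the choice to treat 5xx as unreachable are all
assumptions, not taken from the source.

```python
from typing import Callable, Optional


def ping_endpoint_sketch(
    base_url: str,
    get_status: Callable[[str], Optional[int]],
) -> bool:
    """Probe <base>/models, then <base>; any non-5xx answer means reachable.

    get_status returns the HTTP status code, or None on a connection error.
    """
    base = base_url.rstrip("/")
    for url in (f"{base}/models", base):
        status = get_status(url)
        # A 404 still proves a server answered, e.g. on a bare /v1.
        if status is not None and status < 500:
            return True
    return False
```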
Three issues combined to make the per-user 'Allowed models' checklist
unreliable (#3032):
1. admin.js _loadModelsForUser fetched /api/models, which is backed by
cached_models — endpoints that haven't been probed yet (e.g. a
freshly-added DeepSeek API endpoint) simply didn't show up in the
checklist. Switched to /api/model-endpoints, which always reflects
every configured endpoint regardless of cache state.
2. _saveModels sent allowed_models: [] both when the admin clicked
[All] (no restriction) and [None] (block everything) — the backend
had no way to distinguish the two.
3. _enforce_chat_privileges treated an empty allowed_models list as
'no restriction' (falsy -> skip the check), so [None] had no effect.
Added an explicit block_all_models privilege flag (defaulting to False,
and forced to False for admins) that admin.js now sets when zero models
are checked. _enforce_chat_privileges checks it first and 403s
regardless of allowed_models contents.
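A sketch of the resulting check order. PermissionError stands in for the
HTTP 403, and the function name and argument shape are hypothetical. How
allowed_models applies to admins is not stated here, so the sketch only
forces block_all_models off for them.

```python
def enforce_model_access(privileges: dict, is_admin: bool, model: str) -> None:
    """Raise PermissionError (standing in for a 403) when access is denied."""
    # The flag is forced to False for admins.
    block_all = False if is_admin else bool(privileges.get("block_all_models", False))
    if block_all:
        raise PermissionError("all models are blocked for this user")
    allowed = privileges.get("allowed_models") or []
    # An empty list still means "no restriction"; [None] is now expressed
    # through block_all_models instead.
    if allowed and model not in allowed:
        raise PermissionError(f"model {model!r} is not allowed")
```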