Commit Graph

27 Commits

Author SHA1 Message Date
pewdiepie-archdaemon e0af7bd8a0 Chat: show Chat/Agent tag next to message timestamp
Sometimes the user lands in chat mode without realizing — surface the
mode the message went out on as a small uppercase pill right after the
timestamp in the role header.

- roleTimestamp(when, mode) gains an optional mode arg. Agent renders
  in accent; Chat renders in muted/neutral. Other values render
  nothing (back-compat for older history without the field).
- The three roleTimestamp call sites pass metadata?.mode through.
- chat.js writes mode into the user-message metadata at send time and
  into the assistant metadata when the active-stream render lands,
  reading toggleState.mode so research/agent overrides upstream still
  flow through correctly.

Historical messages from before this change just don't show the pill —
graceful fallback, no migration needed.
2026-06-11 20:44:18 +09:00
pewdiepie-archdaemon e6b1009b89 Remove non-merge-ready workspace and terminal agent hooks 2026-06-09 09:48:59 +09:00
pewdiepie-archdaemon 646f8bd2a9 Remove remaining plan mode frontend code 2026-06-09 09:44:22 +09:00
Mostafa Eid d6882a895e feat(chat): recall last user message on empty composer ArrowUp (#1175)
Pressing ArrowUp on an empty #message composer restores the last sent user text, matching common chat-app UX (Slack, Discord, ChatGPT).

- Read from #chat-history .msg-user dataset.raw (same path as resend/regenerate), not session sidebar metadata

- Literal empty check (whitespace-only drafts are preserved); ignore Shift/Alt/Ctrl/Meta and IME composition

- Extract wiring to composerArrowUpRecall.js; rAF + 250ms retry only (no global MutationObserver)

- Add tests/test_composer_arrow_up_recall_js.py

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-08 13:06:05 +02:00
Mohammed Riaz 6ccd4500d7 fix(chat): show requested and actual reply models
Show requested and actual reply models in chat labels when fallback or provider routing changes the responding model.
2026-06-06 04:30:16 -06:00
Merajul Arefin 2e37d72155 fix(chat): stop code-block button flicker during streaming (#3023)
Render streamed markdown incrementally (freeze finalized blocks,
re-render only the growing tail) instead of re-rendering the whole
message every token, which recreated every <pre> and dropped CSS :hover.
2026-06-06 04:08:54 -06:00
Kenny Van de Maele 8ce945d338 feat: Add plan mode to the chat agent (#638)
* feat: Add plan mode to the chat agent

Adds a plan mode: the agent investigates read-only, proposes a checklist, and
waits for approval before changing anything. On approval it runs with full
tools and checks items off as it goes. Enforcement reuses the existing
disabled_tools gate.

Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the
plan toggle from the chat input.

- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the
  plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list
  checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

Three improvements so a long plan survives a weak model and stays in reach:

1. Re-reference the plan (out-of-context fix). On the execution turn the frontend
   sends the approved checklist back (`approved_plan`); the backend pins it as a
   top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so
   the agent can always re-read the plan instead of losing the thread on a long
   run. New `build_active_plan_note()` (unit-tested).

2. Re-open / dock the plan anytime. The plan checklist is stored per-session
   (localStorage). When a plan exists, the plan-mode button opens a small menu
   ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan
   window — so it can stay docked while the agent works. The window live-refreshes
   as the plan changes.

3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps
   `- [x]` after finishing them, or to revise steps when the user asks. Marker
   tool (no I/O) → `plan_update` SSE event → the stored plan + docked window
   update live. The ACTIVE PLAN note instructs the agent to use it.

Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb),
src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse
`approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools
/ tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle
`plan_update`, capture restated checklists), static/app.js (plan-button menu),
static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

Shell can mutate (write files, hit the network) and can't be constrained to
read-only at the tool layer, so plan mode no longer relies on a prompt to keep
it well-behaved — bash/python are removed from the read-only allowlist and added
to the fail-closed block set. Discovery is covered by the dedicated read-only
tools (read_file, grep, glob, ls) instead.

Rewrites the plan-mode directive to state shell is disabled and lists the
available read-only tools positively. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

Clarifies that entries like "summar" are intentional stems matched via
startswith (covers summarise/summarize/summary), not typos. Addresses review
feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the
comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an
allowlist, so it returns the inverse (all known tool names minus the allowlist).
The static mutator set is a backstop for the schema-derived name list, which
misses XML-only tools and can fail to import. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

The model is already shown its available (read-only) tools by _assemble_prompt,
which removes every disabled tool. Enumerating them again in the directive only
duplicated that list and would drift as tools change. Point at the tools listed
below instead. Addresses review feedback on #638.
2026-06-05 16:32:25 +02:00
Kenny Van de Maele 0a2adc9c96 Add ask_user tool: agent-posed multiple-choice questions (#2111)
Let the agent pause and ask the user a multiple-choice question when a
task is genuinely ambiguous and the answer changes what it does next —
choosing between approaches, confirming an assumption, picking a target —
instead of guessing.

Modeled on the existing `ui_control` marker pattern: the `ask_user` tool
returns an `ask_user` payload that the agent loop emits as an SSE event
and then ends the turn. The frontend renders the question with clickable
option buttons, a free-text "Other" input, and an x to dismiss; the user's
choice is sent as the next message and the agent resumes with it in
context.

- src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O.
  Validates a non-empty question + 2..6 options, normalizes string/object
  options, returns the payload.
- src/agent_loop.py: emit the `ask_user` event and break the round loop so
  the turn ends and waits for the user's selection. Stream the question as
  assistant text so it persists/replays (prevents a re-ask loop).
- Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS,
  FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any
  user can be asked); the structured args serialize via the default
  json.dumps path.
- routes/chat_routes.py: relay the `ask_user` event to the client.
- static/js/chat.js + static/style.css: render the question card (options +
  free-text Other + dismiss x; removed once answered). Reuses CSS vars and
  the .modal-close button; emoji go through the monochrome-SVG pipeline.
  Bump chat.js cache pin.
- tests/test_ask_user_tool.py: payload, multi flag, string options, option
  cap, validation errors, serializer round-trip, registration.
2026-06-05 11:49:11 +02:00
Kenny Van de Maele 2be3779e6e feat: Add workspace: confine agent tools to a folder (#1103)
* feat: Add workspace: confine agent tools to a folder

Pick a server folder as the agent's workspace so its file/shell tools work
there and don't touch files outside it. File tools are hard-confined; bash/
python run with cwd set to the folder.

Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` /
`clear` / `pick` (open the directory browser).

- routes/workspace_routes.py: GET /api/workspace/browse (admin-only).
- src/tool_execution.py: hard path confinement for read_file/write_file;
  bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block.
- src/agent_loop.py: workspace note prepended to the system prompt.
- static/: overflow menu item, input-bar pill, directory-browser modal, and
  the /workspace slash command.
- tests/test_workspace_confine.py.

* Wire workspace confinement into tools that landed after this PR

edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was
written, so they bypassed the workspace boundary. Thread the workspace through:
  - edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace
  - grep/glob/ls: _resolve_search_root confines to the workspace (root + paths)
  - bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir
    default when no workspace is set)
Tests cover edit_file + grep/ls confinement (inside ok, outside rejected).

* Workspace picker: editable path bar + modal style cohesion + cross-platform hardening

- Make the current-folder strip an editable address bar: type/paste a full
  path and press Enter to navigate (also reaches other Windows drives and
  hidden dirs the up-only browser cannot).
- Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn*
  in favour of base .modal-content/.modal-body and the .confirm-btn button
  family; separators/hover use var(--border). Net -31 CSS lines.
- Fix the path field overflowing the modal right edge (flex stretch + margin
  vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin.
- Cross-platform confinement: normcase the workspace commonpath check so
  containment holds on case-insensitive filesystems (Windows/macOS).
- Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd()
  instead of pwd. 5 pass.
2026-06-05 00:06:37 +02:00
Kenny Van de Maele 64d65b73c1 feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)
* feat: round-limit handling — Continue affordance at the cap + configurable cap

When the agent loop runs out of rounds (per-message step cap, default 20)
while still actively using tools, it stopped silently mid-task. Now:

1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows
   a "Continue" pill at the bottom of the chat that resumes the task from where
   it left off. Repeated cap-hits each get a fresh Continue (multiple continues
   in a row).
2. The cap is configurable in Settings → Agent ("Max steps per message"),
   validated on the client, at the save endpoint, and at the read site.

- src/agent_loop.py: track `_exhausted_rounds` (set only when a full
  tool-executing round completes on the last allowed round — i.e. the agent
  wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
- routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as
  `max_rounds`; forward the new event through the SSE relay.
- routes/auth_routes.py: validate numeric settings on save (int + clamp;
  agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
- src/settings.py: default `agent_max_rounds = 20`.
- static/: Settings input + client-side clamp; the Continue pill (reuses the
  existing .stopped-indicator / .continue-btn classes and theme vars
  --border/--fg/--bg/--accent); appended to the chat container so it survives
  the message re-render at stream finalize. chat.js cache version bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings:
a tool block every round exhausts the cap and must emit rounds_exhausted; a
plain answer hits the done-break and must not. Guards the for/else logic.
2026-06-04 22:36:05 +02:00
Vykos b59bbe80ce Harden chat streaming DOM sinks (#2498) 2026-06-04 20:49:37 +02:00
RaresKeY c12c2aa233 fix: normalize Gemma 4 thought-channel output (#2224) 2026-06-04 19:26:58 +02:00
Kenny Van de Maele 7443c36bd9 feat: Add edit_file tool + file-change diffs (#1239)
* Add edit_file tool + file-change diffs

edit_file is an exact old_string -> new_string replacement on a file on disk
(fails if old_string is missing or non-unique unless replace_all); write_file
also returns a unified diff. Diffs render collapsed in the tool bubble
(filename + +adds/-dels, theme colors); the raw JSON command box is hidden.

Security: edit_file is a sensitive filesystem-write tool, treated everywhere
write_file is —
  - added to NON_ADMIN_BLOCKED_TOOLS (is_public_blocked_tool / blocked_tools_for_owner),
    so on auth-enabled deployments a non-admin cannot run it; execute_tool_block
    refuses it for non-admin owners.
  - confined by the same path policy as read_file/write_file (allowlist +
    sensitive-file deny) via _resolve_tool_path.

Disambiguation in tool descriptions + bash prompt: edit_file/write_file are the
only way to write files (they show a diff) — never edit_document (editor panel)
or a bash heredoc/redirect.

Tests (tests/test_edit_file.py): non-admin block (policy + execution gate),
successful edit, not-found old_string, non-unique old_string (+ replace_all),
and path outside the allowed roots.

Files: src/tool_execution.py, src/agent_loop.py, src/tool_schemas.py,
src/agent_tools.py, src/tool_index.py, static/js/chat.js, static/style.css,
tests/test_edit_file.py.

* Drop redundant import os in write_file closure

os is already imported at module top.
2026-06-04 18:29:10 +02:00
Kenny Van de Maele 66fba78011 fix: live-resume chat stream on session re-entry (#2539) (#2561)
* fix: live-resume chat stream on session re-entry (#2539)

When a session was re-entered after a page refresh or in a new tab while
its agent run was still streaming, the UI showed a frozen "Generating
response..." spinner, polled stream_status until the run finished, and
then did a full reload. The live tokens were never shown.

Add resumeStream() in chat.js: it consumes GET /api/chat/resume/{id}
(which replays the run's buffer then streams live), renders reply tokens
as they arrive, and reloads the session on completion for the canonical
final render. sessions.js _checkServerStream now calls it on re-entry and
falls back to the previous spinner+poll path if it is unavailable.

* Finalize plain-text resume in place instead of reloading

On stream completion, resumeStream() called selectSession(), forcing a full
history re-fetch and a visible flicker right as the stream finished.

For plain text replies (no tool calls, sources, doc streaming, or multi-round
output) the live tokens are already rendered, so finalize in place: replace the
live bubble with a canonical single message via chatRenderer.addMessage (markdown
+ footer actions + metrics, the same renderer history uses), captured from the
streamed metrics event. No history refetch, no extra round-trip, no flicker.

Rich responses still reload, since their canonical render (tool bubbles, sources,
multi-bubble) is rebuilt from the saved DB record.

* Use a dedicated set for the resume re-attach lock; fix stale docblock

resumeStream() marked its re-attach lock in _backgroundStreams, which
checkBackgroundStream() also reads. On a second re-entry of the same session
while a resume was still live, checkBackgroundStream() mistook that entry for a
same-tab POST stream and spawned its own spinner+poll bubble. Move the lock to a
dedicated _resumingStreams set (also covered by hasActiveStream) so the two paths
no longer collide. Also update the resumeStream docblock to describe the
in-place finalize vs reload split.
2026-06-04 17:56:15 +02:00
Alexander Kenley 7b45a94b6d Fix calendar routing and user-local time context (#408)
* fix(chat): add user-local time context

* fix(chat): route calendar follow-up phrasing

* refactor(chat): log tool intent routing reasons

* test(chat): align user time prompt shim

---------

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>
2026-06-04 13:20:04 +01:00
Wes Huber 2e34bde07a fix(chat): clear input field when no model is selected (#1702)
When submitting a message without a model/session configured, the
error path showed a help message but never cleared the textarea,
leaving the user's text stuck in the input field. Clear the input
and trigger autoResize on both the no-default-model and catch paths.

Fixes #1475

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-03 13:37:06 +09:00
lekt8 57abe69173 Let the output "x" delete work when no model/session exists (#1431)
deleteMessage() bailed at `if (!sessionId) return;`, so the "x" on an output
shown before a model/API was selected did nothing — there's no session yet
(issue #1428). The session id is only needed for the server-side delete; without
one (or with no persisted message ids) we now fall through to removing the DOM,
so the "x" always at least dismisses the bubble.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 04:20:48 +09:00
3ASiC 521848da75 fix(ui): don't submit chat message on Enter during IME composition (#1091)
CJK and other IME users confirm a candidate from the input-method popup by pressing Enter. The chat composer and the in-place message editor each bind a keydown handler that treats Enter (without Shift) as "submit", but they did not exclude the composition state. Pressing Enter to accept an IME candidate therefore sent the half-composed text (e.g. a stray "ce's") instead of just confirming the candidate.

These textareas intentionally hijack Enter to submit (Enter sends, Shift+Enter inserts a newline), which bypasses the browser's native form submission and the IME guard that comes with it, so the guard has to be re-added explicitly.

Add '&& !e.isComposing' to the three Enter-to-submit handlers: static/app.js (the main composer's button-submit path and its send/new-chat path) and static/js/chat.js (the editor for an already-sent message). Normal Enter (isComposing false) still submits; Shift+Enter still inserts a newline.

Tested: node --check on both files; manually verified with a Chinese IME that pressing Enter to pick a candidate no longer sends, and a message is sent only after composition ends.
2026-06-02 22:32:50 +09:00
Kenny Van de Maele cfb7ec1c71 Accessibility: add labels and toggle states
* Accessibility: ARIA labels and toggle states

Screen readers couldn't name several icon-only controls or tell whether the
tool toggles were on. This adds accessible names and exposes toggle state,
with no behavior or layout change.

- Icon-only buttons get aria-label: web/shell tool toggles, the "more tools"
  overflow button (+ aria-haspopup), and the color-reset buttons.
- Unlabeled inputs/selects get aria-label: memory + skills search, model-picker
  search, memory sort, theme font/density selects, and the new-memory / skill
  (title, when-to-use, how, tags) fields, which only had a visual floating label.
- Toggle state via aria-pressed, kept in sync at the existing .active write
  sites: web/shell toggles (setupToggle) and the Agent/Chat mode buttons
  (initModeToggle). Static aria-pressed added in the markup so the attribute
  exists before JS runs.

Scope: first slice of the ROADMAP accessibility pass. Focus-visible/contrast,
reduced-motion, and modal dialog roles/focus-trap are left for follow-ups.

Checks: node --check static/app.js. No Python touched.

* Accessibility: mark chat log busy while streaming

The chat log is an aria-live="polite" region, so streaming a response
token-by-token made screen readers announce every partial update — noisy and
unreadable. Set aria-busy="true" on #chat-history while a response streams and
back to "false" in the stream's finally block. Assistive tech then waits for
the settled message and announces it once.

Checks: node --check static/js/chat.js.
2026-06-02 20:55:05 +09:00
nsgds 5645cce6d0 Support vLLM 0.20.2 / NIM reasoning-parser output end-to-end (surface + agent context + render) (#602)
* fix(stream): read 'reasoning' SSE field for vLLM 0.20.2 / NIM

vLLM 0.20.2 / NVIDIA NIM emit reasoning-parser output in the `reasoning` delta field; older builds use `reasoning_content`. stream_llm() read only the latter, so reasoning from models like Nemotron-3-Nano (--reasoning-parser) was silently dropped and never rendered. Accept either field.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent): keep reasoning_content only on the latest assistant turn

The agent loop echoed each round's reasoning back as `reasoning_content` on every assistant turn, assuming vendors ignore it. Nemotron's chat template re-injects ALL prior reasoning_content as <think> blocks, and the loop is trimmed only once (before it starts) — so reasoning accumulated unbounded across rounds, bloating context and feeding the model its own prior reasoning, which reinforced repetition/looping. Strip reasoning_content from earlier assistant turns so only the most recent round carries it (still satisfies DeepSeek's thinking-mode follow-up requirement).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(agent-ui): wrap each round's reasoning in its own <think> block

The streamed think-tag wrapper gated on whole-message substring checks (accumulated.includes('<think>')), which only ever wrapped ONE reasoning block per message. A multi-round agent response has a reasoning phase per round, so once round 1 closed its <think>...</think>, rounds 2+ reasoning was emitted unwrapped and leaked into the visible answer. Replace the substring checks with a stateful open/close flag that toggles per think/answer cycle, so each round's reasoning gets its own collapsible block. Single-turn chat is unchanged (one open, one close).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(stream): reasoning/reasoning_content delta surfaces as thinking chunk

Covers @pewdiepie-archdaemon's requested regression: a streamed {reasoning: ...} delta emits a thinking chunk while {content: ...} streams as normal content; plus the older reasoning_content field for backward compat. Mirrors the #591 scenario.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:48:17 +09:00
James Arslan 6776c7d691 Surface silent model fallback instead of masking it (#868)
When the selected model fails before producing output, stream_llm_with_fallback
quietly switches to the next candidate and the reply is shown under the
originally selected model's name, so a misconfigured provider looks like it
works. (Concretely: a Bedrock gateway that 400s every Anthropic/Claude request
appears fine because another model silently answers under the Claude label.)

Emit a `fallback` SSE event ({selected_model, answered_by, reason}) the first
time a non-primary candidate produces output, forward it through the agent loop
and both chat-route paths, stamp the response metrics with the model that
actually answered, and show a notice + relabel the reply in the UI.

Tested: python -m pytest tests/test_llm_core_fallback.py (3 pass);
python -m py_compile src/llm_core.py src/agent_loop.py routes/chat_routes.py;
node --check static/js/chat.js.
2026-06-02 11:37:25 +09:00
pewdiepie-archdaemon 664acf73ee Merge branch 'pr-469' into visual-pr-playground 2026-06-02 06:26:31 +09:00
red person e1102585bf Fix chat stream recovery and PDF library indexing (#468) 2026-06-01 22:33:35 +09:00
Sirsyorrz 6a2f0d5904 Add slash command autocomplete popup
Typing / in the chat composer now shows a filtered popup listing all
available commands with their description. Arrow keys or Tab to select,
Enter/Tab to insert, Esc to close, click also works.

- New module: static/js/slashAutocomplete.js
  Reads the existing COMMANDS registry (and LEGACY_ALIASES) from
  slashCommands.js — no command logic added here, just discovery UI.
  Excludes easter-egg commands (flip, roll, 8ball, fortune, odyssey,
  ascii). Promotes short legacy aliases (/new, /clear, /web, /compact,
  /research, etc.) as first-class rows so users don't have to know the
  full /session new form.

- slashCommands.js: export COMMANDS and LEGACY_ALIASES so the new
  module can read the registry.

- chat.js: lazy-import slashAutocomplete on init, wire to #message
  textarea.

- style.css: popup + row styles using existing CSS variables.
2026-06-01 21:33:46 +10:00
pewdiepie-archdaemon be260f43e8 Handle incomplete detached agent streams 2026-06-01 16:54:11 +09:00
pewdiepie-archdaemon fc7f107b22 Improve Ollama setup and model endpoint handling 2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00