load_settings() already catches PermissionError, but load_features() caught only
FileNotFoundError/JSONDecodeError/ValueError. An existing-but-unreadable
data/features.json (e.g. root-owned after a deploy) therefore raised instead of
falling back to DEFAULT_FEATURES, taking down GET /api/auth/features and anything
that reads feature flags. Add PermissionError to the except tuple to match
load_settings().
Adds tests/test_load_features_permission_error.py.
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* ci: add security scanning suite and governance
Consolidates the security CI work into one reviewable change. Adds, as
separate workflow files under .github/workflows/:
- secret-scan.yml gitleaks (pinned + checksum-verified), full history
- workflow-security.yml actionlint + zizmor, audits the workflows themselves
- dependency-review.yml PR dependency gate + advisory pip-audit
- container-scan.yml hadolint (blocking) + Trivy image scan (advisory)
- codeql.yml CodeQL for Python and JS, main + weekly
Plus .github/dependabot.yml (pip/npm/actions/docker), .github/CODEOWNERS,
and docs/security-ci.md explaining each check and the one-time settings.
All additive: no existing files are modified. Actions are pinned to commit
SHAs, tokens default-deny (permissions: {}), advisory scans never block,
and SARIF upload is gated to push so fork PRs do not fail on a read-only
token. Composes with the correctness CI in #1015.
* ci(security): isolate Trivy from the Dockerfile lint gate
Address review on #1314 (points 2 and 3).
container-scan.yml now runs only hadolint (the blocking Dockerfile lint)
and keeps the broad pull_request + push:[main] trigger so the required
check always reports and never hangs a PR.
The advisory image scan moves to container-trivy.yml, split by event:
- pull_request / workflow_dispatch: build and scan under contents:read
only, no SARIF upload. The image build runs PR-supplied Dockerfile
instructions, so this path holds no write scope.
- push to main: build, scan, and upload SARIF with security-events:write.
Only this trusted path is granted write.
This stops PR jobs from requesting security-events:write they never use,
and a paths-ignore (matching docker-publish.yml) skips the image rebuild
on docs-only changes.
docs/security-ci.md: correct the trigger description to "every pull
request and every push to main", matching the workflows and the existing
ci.yml convention.
Verified locally: zizmor --offline --min-severity=low and actionlint are
clean on the changed and new workflow files.
---------
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* fix: read allow_bash/allow_web_search from JSON body (#3229)
API callers using Content-Type: application/json had bash and web
tools silently disabled because allow_bash / allow_web_search were
only read from FormData (which is empty for JSON requests).
Changes:
- Fall back to JSON body for allow_bash and allow_web_search values
- Only add bash/web_search to disabled_tools when explicitly set to a
falsy value; when unset (None), defer to per-user privilege checks
- Admins with can_use_bash=True now get bash enabled by default
Fixes#3229
* fix: always send explicit allow_bash/allow_web_search from frontend
The backend 'is not None' guard (from prior commit) is correct for API
callers, but the frontend only sent allow_bash=true when the toggle was
ON — omission meant 'unspecified' which the backend treated as 'don't
disable'. Now the frontend always sends an explicit true/false value:
- allow_bash: sent on every request (checked ? 'true' : 'false')
- allow_web_search: explicit 'false' when toggle is off in agent mode
With explicit frontend values, the 'is not None' guard is safe:
- explicit true → tool enabled
- explicit false → tool disabled
- None (API caller omission) → defer to per-user privilege
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* fix: expand cookbook error output tail from 12 to 50 lines
When a task reaches status 'error', the status endpoint was returning
only the last 12 lines of the subprocess log. The existing context-menu
'Copy last 50 lines' action was therefore copying the same 12 lines,
making it useless for diagnosing failures that produce long stack traces
or build output.
- Set _tail_lines = 50 when status == 'error', keep 12 for running tasks
- Initialise exit_code = None before the status-classification block so
it is always defined in the result dict (was only set inside the
is_alive branch, potential NameError in the dead-session path)
- Include exit_code in the task-status response dict
- JS poller captures exit_code from live data into local task state
The frontend output panel and 'Copy last 50 lines' now show the actual
error context without any UI changes.
* refactor: extract output-tail logic to testable helper + behavioral tests
Addresses review feedback on #1538: the previous tests were source-level
string guards. Extract the tail-slicing into a dependency-free helper
(routes/cookbook_output.error_aware_output_tail) and replace the guards
with behavioral tests that exercise the actual logic:
- error status with a 200-line snapshot -> exactly the last 50 lines
- running/ready/completed/stopped/unknown -> last 12 lines
- short snapshot -> all lines, no padding
- empty snapshot -> empty string
- error tail is a strict superset (suffix-compatible) of the non-error tail
The helper has no FastAPI/SQLAlchemy imports so it unit-tests without
standing up the app.
---------
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
Add a documentation-only test layout inventory for the first low-risk split of the flat tests directory.
Records the current 28-file area_cli set, including tests/test_research_cli_status.py, and documents validation/non-goals for the future mechanical move.
Closes#3712
Part of #2523
* feat(agent): workspace confinement via context-local binding + get_workspace tool
Bind the per-turn workspace once in execute_tool_block; the shared path
resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd
helper (agent_cwd) read it, so file tools + bash/python are confined centrally
and a new tool that uses the shared helpers cannot accidentally bypass it.
Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory
modal (reusing existing modal/button CSS), the /workspace slash command, and a
get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic
(realpath/normcase/commonpath) and docker-safe (container paths, no host
assumptions). Reopens#2023.
* ux(workspace): clarify workspace is not a sandbox
Picker modal note + pill tooltip + get_workspace tool/output wording now state
plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder,
but bash/python only start there (cwd) and are not sandboxed. Modal note reuses
the existing .muted class.
* fix(agent): treat an active workspace as file-work intent
A vague low-signal message (e.g. "look at the local project") matches no
domain keywords, so tool retrieval is skipped and only always-available tools
are offered — leaving the agent with no file access even though a workspace is
set. When a workspace is active, include the file/code tools (incl.
get_workspace) on low-signal turns so the agent can act on the folder.
Also requires the tool index (ChromaDB) to be reachable for normal retrieval;
that is an environment dependency, not part of this change.
* ux(workspace): hide pill + overflow entry in chat mode
Workspace only scopes the agent's file/shell tools, so the pill and the
overflow 'Workspace' entry are agent-only now — hidden in chat mode like the
bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is
called from the agent/chat setMode handler.
* prompt(tools): steer bash/python to defer to the dedicated file tools
bash/python schema descriptions (what native-tool-calling models read) were
bare and gave no steer, so models would do file ops via the shell (e.g. writing
SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python
in the schema + tool-index + prompt section to prefer read_file/write_file/
edit_file/grep/glob/ls and only be used for what those do not cover.
* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)
Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc.
by name, so the guidance does not go stale if those tools are renamed.
* style(workspace): drop em-dashes from added code comments/strings
* ux(workspace): terser non-sandbox note in picker (no tool-name list)
* ux(workspace): mirror terse non-sandbox wording in pill tooltip
* chore: untrack local venv symlink (run-only, not part of the feature)
* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)
* fix(agent): low-signal + workspace surfaces only read-only file tools
Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message
in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but
not write_file/edit_file/bash/python -- those wait for a request that actually
calls for them (RAG retrieval still adds them on a real ask).
* feat(workspace): cap browse listing at 500 dirs with a truncated hint
Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local
_MAX_BROWSE_DIRS so a directory with thousands of children does not dump every
row into the picker; the response carries a truncated flag and the modal tells
the user to type a path to jump in.
* chore: untrack local venv symlink (run-only artifact)
* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time
The in-workspace resolver deny-lists sensitive paths inside the workspace,
but the empty-path search root is the workspace itself, so a workspace of
~/.ssh could be listed via ls with no path. vet_workspace() (public, in
tool_execution next to the resolvers) rejects non-directories and sensitive
roots before the path is ever bound; chat_routes uses it instead of its
inline isdir check.
* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active
Review findings from #3665:
P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes
every absolute path 'inside' the workspace and collapses confinement into
host-wide file access. A root is its own dirname, so reject when
dirname(resolved) == resolved; the browse response now carries a selectable
flag and the picker disables 'Use this folder' on unselectable dirs.
P3: /workspace set stored any string client-side and the chat route silently
dropped rejected values, so the pill could claim a confinement that was not
in effect. New admin-gated /api/workspace/vet validates manual paths before
they persist (canonical path returned), and when a posted workspace is
rejected at send time the stream emits workspace_rejected so the client
clears the stored value and toasts instead of continuing silently.
* fix(workspace): check caller privilege before vetting the posted workspace
Review finding: /api/chat_stream called vet_workspace() on the posted value
for every caller and emitted workspace_rejected on failure, so a non-admin
who can chat but cannot use file/shell tools could distinguish existing
directories from missing/file/sensitive/root paths by whether the event
appeared. The resolution now lives in _resolve_request_workspace, which
drops the submitted value uniformly for non-admin callers, with no vetting
and no event, before the path ever touches the filesystem. Admin and
single-user behavior is unchanged. Test pins that valid and invalid paths
are indistinguishable for a non-admin and that vet_workspace is never
invoked for them.
Replace hardcoded [:2000] and [:4000] slicing with the shared _truncate
helper from tool_utils, which uses MAX_OUTPUT_CHARS and adds an explicit
truncation indicator when content is cut.
Scoped down from the original PR: only agent/tool-output display
behavior, no integrations.py changes.
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>
* fix(llm): stop sending llama.cpp slot-affinity fields to cloud providers
_apply_local_cache_affinity adds session_id + cache_prompt for llama.cpp
KV-cache slot affinity (#2927), gated on _is_self_hosted_openai_compatible,
which treated any unknown OpenAI-compatible host as self-hosted. Strict
cloud providers added as custom endpoints (Mistral at api.mistral.ai)
reject unknown body fields, so every request failed with 422
extra_forbidden. Self-hosted now also requires the endpoint to resolve as
local via model_context.is_local_endpoint: loopback/private/tailscale
host, or endpoint kind explicitly configured as "local" (the escape hatch
for tunneled self-hosted servers). is_local_endpoint is promoted to a
public name since llm_core now shares it.
Fixes#3793
* test(llm): sweep cloud OpenAI-compatible hosts in affinity gating
Parametrized cases adapted from #3839 (credit: Shabablinchikow): deepseek,
x.ai, together, fireworks, and the Gemini OpenAI-compat endpoint must all
stay free of the llama.cpp extras, not just the Mistral host from #3793.
* fix(llm): narrow the Tailscale range to 100.64.0.0/10 in is_local_endpoint
Review finding on #3945: _PRIVATE_PREFIXES carried a bare "100." prefix,
treating all of 100.0.0.0/8 as local while Tailscale only uses the CGNAT
block 100.64.0.0/10. Public 100.x hosts (e.g. AWS ranges outside the
block) were classified local and still received the llama.cpp extras
this PR exists to keep away from strict providers. Match the narrowed
classification routes/model_routes.py already uses, with boundary tests
just below, inside, and just above the range.
_search_fts ran the FTS MATCH query, then looked up each hit's full row with its
own db.query(...).filter(id == message_id).first() inside a loop, so a search
returning N hits issued N extra SELECTs. Fetch all hit rows in a single IN(...)
query via _fetch_messages_by_id and reassemble results in hit (relevance) order.
Adds tests/test_session_search_batch_fetch.py asserting a single batched query
(and no query for empty input). Existing session-search tests stay green.
raw.githubusercontent.com serves Markdown as text/plain, JSON APIs and raw
config files serve application/json, and a lot of code and tool documentation
lives in .md/.txt. fetch_webpage_content only handled PDF and HTML, so a
non-HTML body produced empty content and web_fetch reported 'no readable text
content'. Add a branch that returns the body verbatim for non-HTML text/*,
JSON (application/json and +json), and a .md/.txt/.text/.json URL-suffix
fallback for mislabeled octet-stream. HTML and PDF handling unchanged.
Fixes#3808
* fix: use correct element IDs for privilege-gated button hiding
The privilege-gated button hiding in initializeEventListeners() used
stale element IDs that no longer exist in the DOM:
- 'tool-bash-btn' -> 'bash-toggle-btn' (the actual shell button ID)
- 'tool-image-btn' -> 'set-imgEnabledToggle' (admin settings toggle,
since no standalone image button exists in the composer)
Without this fix, users without can_use_bash / can_generate_images
privileges still see buttons that appear to work but then fail.
* fix: remove incorrect image generation toggle targeting
The set-imgEnabledToggle is the global admin Image Generation master
switch, not a per-user composer control. Non-admins without
can_generate_images never render that toggle, so the lookup is null
and the branch no-ops. Admins without the privilege get the app-wide
toggle force-unchecked based on personal privilege, which is confusing.
There is no composer image button in the DOM, so nothing to hide here.
Drop the can_generate_images block entirely as vdmkenny requested.
---------
Co-authored-by: michaelxer <michaelxer@users.noreply.github.com>
* fix(email): keep FETCH attributes Gmail sends after the header literal
imaplib returns a UID FETCH response as an interleaved list of
(meta, literal) tuples plus bare bytes elements. Which attributes land
where is server-specific: Dovecot sends FLAGS before the RFC822.HEADER
literal (inside the tuple meta), Gmail sends them after it, as a bare
` FLAGS (\Seen))` element. The email list grouping loop and the search
loop only inspected tuples, so on Gmail every message lost its FLAGS and
the whole mailbox rendered as unread/unflagged, with mark-read appearing
to have no effect.
Extract the grouping into _group_uid_fetch_records(), fold bare bytes
parts into the current message meta there, and reuse it in both the
batched list fetch and the per-UID search fetch. Covered by unit tests
with captured Gmail-shaped and Dovecot-shaped responses.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* test(email): use raw byte literals for IMAP backslash escapes
---------
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Replaced the grid layout (which made From row height depend on
cluster height, causing To/Cc to shoot up or down at the wrap
breakpoint) with a plain block stack:
- meta = position:relative block
- From row + details = natural block flow with padding-right
reserving space for the absolute cluster on the right
- cluster = position:absolute top-right, width changes per
container query (308px wide / 158px narrow / 180px overlay)
- padding-right tightens from 320px → 170px → 0 as the cluster
shrinks and finally goes overlay
- details margin-top dropped from -10px to 0 since there's no
grid row gap to compensate for
To/Cc now hugs From with no jumps when the cluster wraps or
overlays.
fire() and fire_and_forget() scheduled delivery with bare create_task()/
loop.create_task() and kept no reference. asyncio holds only a weak reference to
a task, so the GC could collect a delivery (or the fire() coroutine itself)
before it completed, silently dropping the webhook.
Track in-flight tasks in a set on the manager via a _spawn_tracked() helper that
holds a strong reference for the task's lifetime and discards it on completion
(add_done_callback), and route both schedule sites through it.
Adds tests/test_webhook_task_refs.py.
Starlette 1.2.0 prefers httpx2 in the test client and emits a
StarletteDeprecationWarning on TestClient import when only classic httpx
is installed. Adding httpx2 silences the suite-wide warning; runtime code
keeps importing httpx directly and is unaffected.
Fixes#3942
Removed the medium-mode -12px details margin compensation — it
under/over-shot depending on grid row sizing. Replaced with a
:has() rule: when the user expands To/Cc, the From row gets
min-height 92px (matching the cluster's 2-row max height). Row 1
becomes the same size whether the cluster is 1 row (wide) or 2
rows (narrow), so resizing across the 600px wrap breakpoint no
longer makes To/Cc shoot up 4px.
When the cluster snaps to absolute overlay at <380px, it stops
contributing to grid row sizing — row 1 was collapsing to the From
row's natural height, which made the To/Cc details slide upward and
left the floating cluster visually misaligned against them. Setting
min-height:88px on the From row inside the same container query
holds row 1 at the cluster's two-row height so nothing jumps.
PATCH and DELETE /api/tokens/{id} both called require_admin but never
checked that the token belonged to the requesting admin. Any admin could
rename, re-scope, or delete another admin's token by ID.
create_token already stamps owner on every token — update and delete
just never read it. Fixed by comparing token.owner against
get_current_user(request) after the 404 guard, same pattern the rest of
the auth routes use. Check is skipped when current_user is falsy
(AUTH_ENABLED=false / single-user mode).
Fixes#3898
Was fanning out to 3 rows because the 152px max-width (3 icons +
2 gaps exact) had no slack — subpixel rounding could push the
third icon over and trigger another wrap. Bumped to 158px in the
in-grid mode (600px breakpoint) and 180px in the absolute-overlay
mode (380px breakpoint, where the 22px padding-left from the
gradient fade was also eating into the 3-icon row width).
Was wrapping into 4+ rows at narrow widths because the cluster's
grid column could shrink below the 3-icon cap. Set both min-width
and max-width to the 3-icon row width and justify-self:end on the
cluster so the icons stay glued to the right edge instead of
sliding toward the middle when the cluster is wider than its
content.
Anthropic removed the sampling parameters (temperature, top_p, top_k)
starting with Claude Opus 4.7 — sending temperature at all, even 0.0,
returns HTTP 400. _build_anthropic_payload sent it unconditionally, so
every native-Anthropic request to Opus 4.7/4.8 failed: the research probe
(ResearchHandler._probe_endpoint, temperature=0) aborted runs before they
started, and all DeepResearcher._llm calls 400'd.
Add _anthropic_rejects_temperature (version-gates opus-N-M >= (4,7)) and
omit temperature in the Anthropic builder for those models. Older Claude
models (Opus 4.6 and below, Sonnet/Haiku) keep temperature and the
existing [0,1] clamp.
The version gate is hardened against real-world model id shapes:
- a word-boundary anchor so a substring like `octopus-4-8` is not read
as Opus and stripped of temperature;
- a 1-2 digit minor cap so a dated id such as `claude-opus-4-20250514`
(Opus 4.0, listed in ANTHROPIC_MODELS) parses as major-only and keeps
temperature, while dated 4.7+ snapshots still match;
- a non-string guard so a non-string model can't raise AttributeError
(the previous builder never called .lower() on it).
Adds regression tests covering 4.7/4.8 omission, older/dated/legacy
retention, the substring overmatch, and non-string inputs.
Fixes#3065
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 600px / 380px breakpoints were @container docpane queries but
the email reader isn't inside a docpane container — they never
fired and the cluster wrapped to 3+ rows at narrow widths. Added
container-type:inline-size + container-name:emailreader on
.email-reader-header and switched the queries to that container,
so the 2-row cap now actually applies.
Three-step shrink:
1. > 600px pane: cluster sits in col 2 as 1 row of 6
2. 380-600px pane: cluster capped at 3-icon width so wrapping
stops at 3 + 3 (max 2 rows) — chips share width with the 2-row
cluster instead of multiplying into 3+ rows
3. < 380px pane: cluster snaps to absolute overlay with left-edge
box-shadow, still capped at 3-icon width so it's the same 2-row
shape but floating above the truncated chips
Grid tracks now:
- col 1: minmax(60px, 250px) — chip natural width capped at 250px,
with the 60px (4 char) floor enforced on From / To / Cc alike
- col 2: minmax(48px, 1fr) — takes the rest, shrinks first when
the pane narrows
Removed the hard max-width on the action cluster so on wide panes
it stays as one row of 6. Once col 2 shrinks below the 1-row width,
flex-wrap kicks in and the icons re-stack to 3+3. Chips only start
to shrink past that point.
- Action cluster's max-width is calc(48*3 + 4*2) so the 6 icons
always lay out as 3 top / 3 bottom by default.
- When the pane narrows the chips in col 1 shrink first (with 60px
min so 4 chars + ellipsis stay visible).
- At <380px the cluster snaps to absolute overlay with a left-edge
box-shadow so it reads as floating above the truncated chip.
Two-step shrink behavior:
1. As the pane narrows, the action cluster (max-width:50% of meta)
wraps to a 2-row icon stack first
2. Then the recipient chip span starts overflow-scrolling, but
keeps a 60px min-width (~4 chars) so the first chars of the
sender/recipient name stay visible
Previously only the From row affected the action cluster's column
width — To/Cc detail rows spanned both columns and ignored the
cluster. Now:
- meta-details lives in col 1 only so the To/Cc chips shrink
together with the From chip when the pane narrows
- action cluster spans rows 1 and 2 so its width is set by the
widest col-1 content; a long To/Cc list triggers the wrap to a
2-row icon stack just like a long From sender does
Meta switched to CSS grid in undocked mode:
- row 1, col 1: From row (label + chip + chevron)
- row 1, col 2: action cluster
- row 2, span: To/Cc details
The cluster shrinks alongside the chip and flex-wraps into a 2-row
icon stack before crowding the chip. At very narrow pane widths
(< 380px via @container docpane) it snaps back to absolute overlay
so From: still fits.
Docked mode overrides meta back to flex column so the cluster
flows naturally last — under From, and under To/Cc when expanded.
Was rendering as a transparent ghost — From chip / sender text bled
through the gaps between icons. Added a left-fading gradient
backed by var(--bg) so the cluster reads as an opaque overlay
while chips poking out from underneath blend smoothly into its
left edge.
- Window-level recipient-chip click handler now bails if the chip
is inside .email-reader-meta — the per-reader handler still
toggles the expanded-address view on click.
- The from-sender (magnifying glass) search button SVG is now
tinted with var(--accent-primary) so it stands out as a deliberate
search action against the neutral Reply / Forward / etc icons.
Moved the action cluster out of the From row to a sibling of meta
inside .email-reader-meta. Undocked: cluster is absolute-positioned
top-right of the header so it overlays the From line as before.
Docked: cluster is in-flow at the bottom of the meta column, so it
sits below the From row when collapsed and below the To/Cc rows
when the user expands the recipient details via the chevron.
The previous commit read toggleState.mode before it was declared
(send-time site near line 632) and outside its closure (assistant
finalize site near line 3426). Both threw ReferenceError / TDZ on
first send, which crashed the chat send + render pipeline.
Read fresh via Storage.loadToggleState() at each site, defaulting to
'chat' on any error. Mode-tag rendering otherwise unchanged.
Cluster is now in-flow with margin-left:auto and flex-wrap:wrap so
when the chip text grows long enough to crowd it, the buttons split
to a second row of icons before they have to cover the chip. The
absolute-overlay behavior kicks back in at very narrow pane widths
(<380px via @container docpane) so From: still fits on one row when
the pane is truly cramped.
Pulled the From row's negative margin from -8 to -4 so the whole
row (From: label AND chip) sits 4px lower together. Action cluster
below now justifies flex-end so the icons sit at the right edge
of the row instead of left-aligned.
Find-GitBash accepted the Microsoft Store / WSL bash.exe alias and only probed <root>\Git, so it never detected per-user Git for Windows installs under %LocalAppData%\Programs\Git and could skip the launcher's "install Git Bash" note even when no usable Git Bash was present.
Reject the WSL stub (system32/sysnative/windowsapps) and also probe %LocalAppData%\Programs\Git, mirroring core/platform_compat.find_bash.
Refs #3740
Sometimes the user lands in chat mode without realizing — surface the
mode the message went out on as a small uppercase pill right after the
timestamp in the role header.
- roleTimestamp(when, mode) gains an optional mode arg. Agent renders
in accent; Chat renders in muted/neutral. Other values render
nothing (back-compat for older history without the field).
- The three roleTimestamp call sites pass metadata?.mode through.
- chat.js writes mode into the user-message metadata at send time and
into the assistant metadata when the active-stream render lands,
reading toggleState.mode so research/agent overrides upstream still
flow through correctly.
Historical messages from before this change just don't show the pill —
graceful fallback, no migration needed.