* fix(research): preserve Discuss spin-off primer during context trimming
trim_for_context() kept only system_msgs[:1] as essential and dropped the
rest under budget pressure. A research "Discuss" spin-off seeds the report
as a system message that sits after the preface system messages, so it
landed in extra_system and was the first thing evicted once the chat grew
— the conversation then lost its grounding and drifted off task.
Treat any system message carrying research_spinoff_from metadata as
essential, alongside the leading system prompt, so the seeded report
survives trimming. maybe_compact already retains all system messages.
Tests: tests/test_context_compactor.py::TestResearchPrimerPreserved
* fix(research): ground Discuss spin-off chats on the seeded report
build_chat_context injected global memory (pinned + hybrid-retrieved) and
personal-doc RAG every turn, keyed off the user-level memory_enabled pref
and a request-scoped use_rag flag — never the session. A research spin-off,
whose primer declares the report the sole knowledge base, thus had
unrelated keyword-matched facts pulled in ("wrong data") competing with the
report; its rag=False flag was also ignored (use_rag defaulted on).
Add _session_is_research_spinoff(sess) (detects the primer research_spinoff_from
metadata; handles ChatMessage and dict forms) and, for such sessions,
disable memory injection and force RAG off.
Tests: tests/test_chat_helpers.py spin-off detection cases
---------
Co-authored-by: Dan (cirim) <claude@cirim.org>
Clicking the card body outside the edit <textarea> bubbled to the card's
click handler and collapsed the card, silently discarding unsaved skill
edits (issue #4002). The textarea's own stopPropagation only shields
clicks landing on it. Bail out of the card click handler while a
.skill-md-editor is present so the card only leaves edit mode via Save
(Cancel button is handled separately by #3580). Mirrors the same guard
into the built-in capability card, which shared the bug.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(auth): add per-user admin promote/demote toggle
Admin-only API and Users-tab control to grant/revoke admin rights; refuses to demote the last admin.
* fix(auth): restore pre-admin privilege restrictions on demotion
Promoting now stashes the user's privilege map (privileges_before_admin)
and demoting restores it instead of resetting to defaults, so a
promote/demote round trip can no longer broaden a restricted user's
access. Users without a stash (created as admin, or promoted before this
fix) still demote to DEFAULT_PRIVILEGES so a born-admin's stored all-True
map — including can_use_bash — can't survive demotion.
---------
Co-authored-by: K M Merajul Arefin <merajul.arefin@therapservices.net>
delete_gallery_image() deleted the on-disk file before setting
is_active=False and committing. If that commit failed and rolled back,
the record stayed active but its file was already gone — a broken,
unviewable image (data loss).
Soft-delete and commit first, then remove the file best-effort, so a
missing or locked file can no longer 500 a delete that already succeeded
logically.
Adds tests/test_gallery_delete_file_ordering.py covering the
commit-failure (file kept) and success (file removed) paths.
* fix(security): encrypt CardDAV password at rest in settings.json
CardDAV password was stored in plaintext in data/settings.json, while
other secrets (email, CalDAV) are encrypted using src.secret_storage.
On read (_get_carddav_config): decrypt the password via decrypt().
On write (update_config): encrypt the password via encrypt() before
saving to settings.json.
decrypt() is a no-op on plaintext, so existing deployments upgrade
transparently on the first read after the next config save.
* test: add coverage for CardDAV password encryption
Nine tests covering:
- encrypt-on-save and decrypt-on-read round-trip
- encrypted value is stored with enc: prefix (plaintext absent from file)
- legacy plaintext passthrough
- CARDDAV_PASSWORD env var passthrough (not decrypted)
- empty password / no settings file
- double-save does not corrupt
- encrypt() idempotent on already-encrypted value
* fix(kimi): resolve Kimi Code API 403 errors and User-Agent restrictions
Kimi Code subscription keys require a whitelisted coding-agent User-Agent to avoid access_terminated_error 403s. This adds User-Agent probing and caching for Kimi Code endpoints.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix(kimi): omit temperature for kimi-for-coding API calls
Kimi Code rejects any non-default temperature with HTTP 400, which broke deep research probes and low-temp LLM rounds.
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
The dashboard background status reconciler (_pollBackgroundStatus) only
recovered "done" for dependency installs when the backend reported a
finished task as "stopped". A real model download whose tmux pane is
gone after DOWNLOAD_OK (so the dead-session check misses the landed
snapshot) fell through to `task.type === 'download' ? 'crashed'`, so a
completed download was shown as crashed (and stalled on the Serve tab).
Recover "done" from the terminal DOWNLOAD_OK sentinel, mirroring the
dep-install recovery already present. The background poll runs blind, so
it keys off the conclusive exit-0 sentinel only — not the `/snapshots/`
path, which can be printed mid-stream for multi-file downloads and would
risk marking an incomplete download done.
Fixes#3897
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When a download's tmux pane is gone, the status endpoint trusted only the
HF-cache probe to tell completed from stopped. The probe derives its cache
root from its own environment, but the download runner exports
HF_HOME=<local_dir> (the #2722 fix), so custom-dir downloads land in
<local_dir>/hub where the probe never looks - and ollama pulls don't touch
the HF cache at all. Finished downloads were reported as stopped forever,
and tasks already persisted as completed were demoted back to stopped on
the next poll. This is the backend half of #3897, deliberately left out of
the frontend fix in #4000.
- honor the conclusive runner markers first: DOWNLOAD_OK -> completed
(keeping the "Fetching 0 files" error guard), DOWNLOAD_FAILED -> error
- pass the task's local_dir through to the cache probes so they check the
cache the download actually wrote to, keeping the env-var fallback for
default-cache downloads
- move the probe scripts and marker classification into
routes/cookbook_output.py (dependency-free) with behavioral tests
Fixes#4017
* fix(agent): don't let a materialized default budget defeat context scaling
#1230 scales agent_input_token_budget to the model's context window unless
the user explicitly set a budget, detected via is_setting_overridden(). But
the settings-save path materializes every DEFAULT_SETTINGS key into
settings.json (load_settings merges defaults; handlers persist the merged
dict), so the persisted default 6000 reads as "overridden" and the budget
code takes the min(6000, ctx) branch — silently re-capping long-context
models at 6000 for anyone who has ever saved a setting. This reintroduces
the exact regression #1170/#1230 set out to fix.
Add is_setting_customized() (saved value != default) and gate the scaling
on it instead of mere presence. A persisted default is not a user choice.
is_setting_overridden has exactly one consumer (this budget path), so the
change is contained. Tests cover the materialized-default regression, a
deliberately-chosen budget still being honoured, and the absent-key case.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(agent): rework context-budget fix per review (#4122)
Address RaresKeY's review:
P2 (explicitness): is_setting_customized treated a saved value equal to the
default as "not explicit", which ALSO blocked a user from deliberately pinning
the default budget. Reframe the default value itself as the AUTO sentinel —
agent_input_token_budget == DEFAULT_BUDGET means "scale to the model's context
window", any other value is an explicit cap. A materialized default still reads
as auto (fixing the original regression), and any non-default value the user
chooses is now honoured. Drop the now-unused is_setting_customized helper.
P2 (fallback context): auto-scaling trusted get_context_length() even when it
returned only the bare DEFAULT_CONTEXT fallback (no endpoint-reported / known
window), over-allocating on self-hosted/proxy setups. Add get_context_length_known()
(also returns whether the window was actually discovered); the budget block
passes 0 when unknown so auto-scaling stays conservative instead of inflating to
an unproven window.
hard_max stays auto-only — a deliberate explicit budget wins (#1190); kept that
contract and answered the reviewer's question rather than silently reversing it.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test(agent): lock the materialized-default budget regression (review on #4121)
Per WGlynn's review on the issue: add an end-to-end regression that saves an
UNRELATED setting (which makes the settings-save path materialize the budget
default into settings.json) and asserts the budget still auto-scales rather than
re-reading as an explicit 6000 cap — locking the exact reopening shut.
To make the test bite the production decision (not just re-derive it), extract
`budget_is_explicit()` into src/context_budget.py and use it from the agent loop.
It keys off value-vs-default (the default is the auto sentinel), NOT settings
presence — which is the whole point, since the save path materializes defaults.
Note: after this PR's rework, is_setting_overridden has ZERO production callers,
so the merged-dict materialization smell can't reach any setting through a
presence check today (WGlynn's durability concern).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(agent): bind the budget context window to its own provenance (review #4122)
RaresKeY caught a correctness bug in the fallback-context guard: stream_agent_loop
kept only the `known` flag from get_context_length_known() and budgeted off the
passed-in `context_length`, which can come from a *different* lookup. Two failures:
- local endpoints are re-queried, so the passed value can be a stale DEFAULT_CONTEXT
fallback while the fresh probe proves the real (smaller) served context — we'd
scale off the stale value;
- callers that don't pass context_length (scheduled tasks, teacher escalation,
skill test runs, bg_monitor) were capped at 6000 even when a long window is
discoverable.
Extract budget_context_for_model() which returns the freshly-probed window when
known else 0, binding the flag to the value it proves; the agent loop uses it.
Regression tests cover the stale-fallback, no-arg-caller, and probe-error paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(agent): fix stale budget comments + tighten to the contract (review #4122)
- settings.py: an explicit budget is clamped to the window only — hard_max is
auto-only (#1190); drop the incorrect "and to hard_max".
- is_setting_overridden docstring: drop the stale "adaptive budgets" example;
point value-sensitive callers at context_budget.budget_is_explicit.
- Tighten the budget-block comments to the contract (default = auto sentinel,
non-default = explicit cap, hard_max = auto-only ceiling).
Comment/docstring-only; no behaviour change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(agent): correct budget issue citations (#1190 → merged #1230/#1273)
The context-budget contract (auto-sentinel, explicit budgets honoured,
hard_max auto-only) merged via #1230 — #1190 was the earlier, closed,
superseded PR. Re-point the contract comments at #1230 (the live source,
already cited for the auto-sentinel two lines up in settings.py).
The configurable hard_max setting (`agent_input_token_hard_max`) was a
reviewer requirement first raised on #1190, omitted from the merged #1230,
and actually added in #1273 — credit #1273 for it and correct the test
comment's history (it previously implied this PR completed the requirement).
Comment/docstring-only; no behaviour change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make src.youtube_handler a compatibility wrapper around services.youtube.youtube_handler so transcript state, URL parsing, and timeout behavior no longer diverge.
Resolve remove_directory_from_rag paths through the same PERSONAL_DIR confinement helper used by add_directory_to_rag before removal sinks are reached.
Lock the API key encryption key file to owner-only permissions on creation and when reading existing keys, with regression coverage for permissions and encryption roundtrip.