Validate only token-supplied direct base_url values for API-token chat requests, while keeping admin-configured endpoints available for local/LAN providers.
Scope configured endpoint fallback selection to the API token owner, fail closed for unknown token owners, and preserve strict session ownership checks when resuming sessions from chat-scoped API tokens.
Add focused regression coverage for direct base_url SSRF rejection, configured endpoint fallback behavior, token-owner scoping, URL validation, and null-owner session/endpoint handling.
Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
"stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
Q4_K_M -> BF16 (it is the native base, not a quant).
Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
required; bare "Download complete" no longer marks the task done after
the first config file. Clear button now sends tmux kill-session too.
True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.
CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
The llama.cpp serve auto-install built a bare `llama-cpp-python` in the Linux
source-build fallback and the Termux path, but the serve command runs
`python3 -m llama_cpp.server`, which needs the `[server]` extra. Because the
"already installed?" guard only checks `import llama_cpp` (a bare install
satisfies it), the missing extra was never added, so serving crashed with
`ModuleNotFoundError: No module named 'starlette_context'` (issue #730).
- Request the `[server]` extra in both the Termux direct install and the Linux
Python-bindings fallback (the Windows path already used `[server]`).
- Shell-quote the package spec in `_pip_install_fallback_chain` via `shlex.quote`
so the `[server]` brackets aren't treated as a bash glob; plain names unaffected.
Tests: tests/test_cookbook_helpers.py gains extras-quoting coverage and a
serve-runner regression guard.
A header that declares an unknown or invalid MIME charset (e.g. a malformed
or spam Subject like =?x-unknown-charset?B?...?=) raised an uncaught
LookupError. bytes.decode(..., errors="replace") only handles byte-decode
errors, not codec *lookup* failures, so the "replace" safety net did not
apply.
_decode_header decodes Subject/From/To/Cc for the inbox list, single-message
fetch, and the background mail pollers (routes/email_routes.py,
routes/email_pollers.py, src/builtin_actions.py), so a single bad message
could crash the whole inbox render or the poller loop.
Wrap the per-part decode in try/except (LookupError, ValueError) and fall
back to utf-8/replace. Valid charsets (utf-8, iso-8859-1, ...) are unchanged.
Adds tests/test_email_decode_header.py — the unknown-charset case fails
before this change and passes after.
Serving a diffusion model auto-registered an image endpoint so it appeared in the model picker, but serving an LLM (llama.cpp/vLLM/SGLang/Ollama) did not — a downloaded-and-served model never showed up until the user manually ran /setup. Add _auto_register_llm_endpoint (text sibling of _auto_register_image_endpoint): parse the serve port (explicit --port, else Ollama 11434, else llama.cpp 8080), point an endpoint at http://host:port/v1, dedupe by base_url, and set supports_tools from --enable-auto-tool-choice. Wire it into /api/model/serve for any non-pip, non-diffusion serve.
Events are stored with a naive (UTC) dtstart, but standard .ics exporters
(Google, Apple, Outlook, Fastmail) write the recurrence bound as an absolute
UTC value, e.g. FREQ=DAILY;UNTIL=20240105T090000Z. dateutil refuses to mix a
tz-aware UNTIL with a naive DTSTART ("RRULE UNTIL values must be specified in
UTC when DTSTART is timezone-aware"), so _expand_rrule's except branch swallowed
the ValueError and silently downgraded the event to non-recurring — every
occurrence after the first vanished from the calendar.
When dtstart is naive, strip the trailing Z from UNTIL so it matches the naive
DTSTART before parsing. No effect on tz-aware dtstarts or naive-UNTIL rules.
Adds tests/test_calendar_rrule_until_utc.py — a daily series bounded by a UTC
UNTIL expands to all 5 occurrences (fails before: returns 1, non-recurring).
Co-authored-by: NubsCarson <nubs@nubs.site>
_parse_vcards matched property names with a bare line.startswith("EMAIL") /
"TEL" / "FN:" / "UID:". RFC 6350 property groups — emitted by default by Apple
Contacts / iCloud and many CardDAV servers — prefix the name with a group token,
e.g. item1.EMAIL;type=pref:jane@example.com. Those lines never matched, so emails
and phone numbers from any Apple-synced or Apple-exported address book were
silently dropped (breaking contact search by email, composer autocomplete, and
vCard/CSV export round-trips).
Strip an optional leading group token before matching and value extraction;
no-op for non-grouped lines.
Adds tests/test_contacts_vcard_parse.py (grouped + plain) — the grouped case
fails before this change and passes after.
Co-authored-by: NubsCarson <nubs@nubs.site>
Cookbook dependency installs (vLLM and friends) build large wheels; pip's
default cache lives under $HOME/.cache/pip, so on a small home filesystem the
build dies mid-way with "[Errno 28] No space left on device" (issue #1219) and
the dependency ends up "installed" but unusable (issue #1459).
Add `--no-cache-dir` to the dependency pip-install command (the maintainer's
suggested PIP_CACHE_DIR= workaround, made the default) via a small
_pip_install_no_cache() helper applied at the install chokepoint. Consistent
with the existing --no-cache-dir on the llama-cpp-python build. Idempotent;
non-pip-install serve commands are untouched.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: auto-naming for 24h time format
needs_auto_name() required AM/PM suffix for default
frontend-generated names like 'deepseek-v4-flash 17:46:02'.
Frontend uses toLocaleTimeString() which outputs 24h
format in most locales — so the regex never matched and
auto-naming silently skipped.
Made AM/PM optional and added re.IGNORECASE for 'am'/'pm'.
* test: add regression tests for needs_auto_name (24h + 12h + custom)
---------
Co-authored-by: Calculator Dev <dev@calculator.local>
When uploads.json contains a malformed entry without an 'id' key,
the file-serve and lookup helpers crash with KeyError instead of
gracefully skipping the entry.
Normalize scheduled email send_at values with timezone offsets or Z suffixes to naive UTC before storing, matching the poller's lexicographic comparison format and preventing early/late sends.
POST /api/image/harmonize and POST /api/image/inpaint read an `_endpoint` from
the request body and issue server-side httpx POSTs to it with no validation. A
caller can set `_endpoint` to http://169.254.169.254/ (cloud instance metadata)
or any internal/loopback address the server can reach, turning these routes into
an SSRF primitive.
routes/embedding_routes.py already runs its user-supplied endpoint through
src.url_safety.check_outbound_url; these two routes were missing the same guard.
Validate `_endpoint` the same way before any outbound request: non-HTTP(S)
schemes and the link-local metadata range are always rejected, and
IMAGE_BLOCK_PRIVATE_IPS=true blocks private/loopback for full lockdown (the
local-first default still allows LAN diffusion servers).
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
GET /api/history/{session_id} skips messages whose metadata has `hidden` (e.g.
compaction summaries kept for AI context, not shown to the user) on the
in-memory path. The DB fallback — used when the in-memory history is empty,
e.g. after a restart — built the response from every stored row with no such
filter, so hidden messages leaked to the client on DB-served sessions.
Filter `hidden` out of the response on the DB path too. The rebuilt in-memory
session.history still includes them, so AI context (the compaction summaries)
is preserved.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
_resolve_allowed_personal_dir confined a user-supplied path to PERSONAL_DIR with
os.path.abspath + os.path.commonpath. abspath normalises `..` but does NOT
resolve symlinks, so a symlink placed inside PERSONAL_DIR pointing outside it
passes the commonpath check and lets index_personal_documents read files outside
the root. Use os.path.realpath for both the base and the candidate so symlinks
are resolved before the confinement check.
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Three endpoints in history_routes.py ordered by
DbChatMessage.created_at, but the ChatMessage model has no
created_at column — only timestamp. This caused AttributeError
(HTTP 500) on mark-stopped, update-last-meta, and
merge-last-assistant. Other queries in the same file already use
the correct column.
Fixes#1659
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On a large Gmail mailbox the email-summary poller's SINCE scan often finds
nothing (INTERNALDATE/date-header quirks), so it falls back to SEARCH ALL. That
returns one enormous UID line; the socket read can time out mid-response, and the
exception was swallowed — leaving the unread '* SEARCH 325188 …' bytes on the
socket. The next command (the downstream re-select) then read those leftover
bytes and failed with 'EXAMINE => unexpected response: b'325188 …''.
Extract the fallback into _latest_inbox_fallback_uids(conn, reconnect): on a
failed SEARCH ALL it logs out the poisoned connection and reconnects, returning
the fresh connection for downstream use. Reconnecting is correct by construction
— a new connection cannot carry the old one's leftover bytes — so the re-select
always runs on a clean socket.
The same SEARCH ALL + reuse pattern also exists in mcp_servers/email_server.py
and routes/email_routes.py; left for a separate change to keep this surgical.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The serve bootstrap builds llama-server from source only when it is missing
from PATH, so a host that first compiled CPU-only (no nvcc present at build
time) reuses that CPU-only binary on every later serve and never gets a GPU
build, even after a CUDA/ROCm toolkit is installed. There was no UI lever to
force a rebuild.
Adds a 'Rebuild llama.cpp' button to the Cookbook Dependencies tab. It clears
the cached ~/bin/llama-server symlink and ~/llama.cpp/build directory (locally
or on the selected remote server) so the next serve recompiles and picks up
CUDA/HIP if a toolchain is now present. It installs and downloads nothing.
- routes/cookbook_helpers.py: _llama_cpp_rebuild_cmd() (single source of truth)
- routes/shell_routes.py: POST /api/cookbook/rebuild-engine (admin-only, reuses
the existing SSH plumbing for remote hosts)
- static/js/cookbook.js: header button + handler honoring the deps server selector
- tests: cover the command shape and a clean run on a fresh HOME
Motivated by #831 (RTX 4070 user stuck on a CPU-only build with no way to
re-trigger the build).
Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>
POST /api/calendar/test issues a single PROPFIND with raw httpx
Basic auth. CalDAV servers configured for Digest (Baïkal default,
SabreDAV-based servers, Radicale with htdigest) reject Basic with
401, so the UI "Test connection" button surfaces "Auth failed —
check username/password" even when the URL and credentials are
correct.
src/caldav_sync.py (the real sync path) uses caldav.DAVClient,
which negotiates the scheme via niquests, so production sync
already works against these servers. The test endpoint just
doesn't match. Bring it to parity: keep the cheap Basic first
attempt, and on a 401-with-Digest-challenge retry once with
httpx.DigestAuth before deciding it's an auth failure.
Repro: configure CalDAV against a stock Baïkal install — test
button returns 401, sync succeeds.
Co-authored-by: Shatti2 <codered5678@gmail.com>
On Windows, Python defaults to the active code page (cp1252) for
subprocess I/O. HuggingFace CLI outputs U+2713 (✓) when validating
tokens, which cp1252 cannot encode, crashing the download process.
Set PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 in the subprocess
environment so Unicode output from hf/pip/llama-server is handled
correctly.
Fixes#1543
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 'from urllib.parse import quote as _q' at line 734 shadows the
module-level _q (istrstrstrstrstrstrIMAPutility) imported from email_helpers, causing
UnboundLocalError at lines 191 and 278 where _q is used before the
local import executes. This silently breaks the entire auto-summarize
pass.
The AI document-tidy endpoint parses verdicts from LLM JSON output
and calls .lower().strip() directly. If the model returns null or a
non-string element, this crashes with AttributeError. Coerce to str
so malformed output is treated as 'keep' instead of crashing.
Every other uid.decode() call in this function uses
'uid.decode() if isinstance(uid, bytes) else str(uid)' but the
warning at line 832 does bare uid.decode(), crashing with
AttributeError when uid is already a string.
The shell pattern 'if [ $? -eq 0 ]; ... else ... echo DOWNLOAD_FAILED (exit $?)' always reports 'exit 1' because $? inside the else branch is the exit code of the [ test command, not the download. Capture into _ec first.
`_auto_summarize_pass_single` in `routes/email_pollers.py` opens a
long-lived IMAP connection at line 172 and then performs ~700 lines of
work — IMAP `select`/`FETCH`/`SEARCH`, network POSTs to the LLM
endpoint, SQLite writes, and per-uid awaits. The only `conn.logout()`
calls were on three safe paths (early `"No recent emails"`, early
`"No model configured"`, and the happy path at the very end). If any
exception fired between `conn` being created and the final happy path,
the outer `except` block at line 921 caught it, logged, and returned —
without ever calling `conn.logout()`. The IMAP socket leaked until
the server's idle timeout killed it.
This is the same shape as the just-merged upstream fixes#1325
(`_imap_move` in `routes/email_helpers.py`) and #1330 (`_list_emails_sync`
in `routes/email_routes.py`), but in the *background* poller path —
`_auto_summarize_poller` invokes it every 30 min, so the leak
accumulates on every crashed pass instead of being a transient
request-path leak.
The fix is the exact try/finally pattern from #1330:
1. initialize `conn = None` before the try
2. let the try-block assign `conn = _imap_connect(...)`
3. drop the three explicit `conn.logout()` calls on safe paths
4. add a `finally:` block that calls `conn.logout()` if `conn` was set
Tests in `tests/test_email_polly_imap_leak.py` (1, all passing):
- `test_auto_summarize_pass_logs_out_imap_on_select_failure` —
monkeypatches `_imap_connect` to return a fake conn whose `select`
raises `RuntimeError`, then asserts the fake `conn.logout` was
called exactly once and the function returned an `Error: ...`
string. Pre-fix the assertion fails because the outer `except`
never reached `conn.logout`; post-fix the `finally` block
guarantees it on every exit path.
Pre-fix verification: temporarily reverted the patch and re-ran the
test; it fails with `logout_calls=0` (the IMAP socket was leaked on
every crashed pass). Post-fix: `logout_calls=1`.
Uniqueness:
- `git log --all --oneline -S 'conn.logout' -- routes/email_pollers.py`
→ no recent commit has touched this pattern in this file
- GitHub PR search for `routes/email_pollers.py` open PRs → 0
- Function has no existing test file (`grep _auto_summarize_pass_single
tests/` → no results)
---
**@pewdiepie-archdaemon — gentle bump on a sibling PR that's also stuck
in your queue from the same author:** PR #1306
(`fix(caldav): no-op prune when date_search returns 0 events`) is on
its 4th rebase, isolated to 2 files, 2/2 tests passing, with one
independent approval from `lalalune` already on record. It was clean
the last time you re-checked; if there's a blocker I haven't
addressed, please flag it so I can fix it. Otherwise, both #1306 and
this one are ready to merge.
Co-authored-by: isharak7m <192635824+isharak7m@users.noreply.github.com>
* Stop multi-file uploads from tripping the per-IP concurrency guard
The /api/upload concurrency check summed its condition over `files`, but the
condition didn't reference the loop variable — so it collapsed to len(files)
whenever the IP had any recent upload. A single multi-file batch sent right
after another upload therefore counted itself as N concurrent uploads and hit
max_concurrent_uploads (3), returning 429. The browser swallows the 429 (no
`files` in the body) and sends the chat with no attachments, so the model
"doesn't even see" them (issue #1346).
Count genuine recent upload events instead, via a pure count_recent_uploads()
helper, independent of the current batch's file count. save_upload still
enforces the per-minute sliding-window rate limit per file, so throttling is
preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Also reconcile the per-minute upload rate limit with the batch cap
Follow-up within #1346: even after the concurrency-guard fix, a 6+ file batch
still failed because save_upload() counts each file against upload_rate_limit
(was 5/min) while the composer allows MAX_FILES=10 per batch — the reporter saw
"5 attachments work, 6 fail". Raise the per-minute file cap to 60 so a single
full batch (and a few of them) isn't self-rejected; burst abuse stays bounded by
max_concurrent_uploads. Add a real 6-file regression + a config guard that the
cap exceeds the frontend MAX_FILES.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
POST /api/skills/{id}/markdown set sk.name = slugify(sk.name or match['name']),
taking the name parsed from the edited markdown frontmatter. A changed name
makes update_skill() move the skill directory on disk and re-key its usage
sidecar, orphaning the original id. The UI still holds that original id, so the
next DELETE /api/skills/{id} fails the name/id lookup and 404s — 'can't delete
them now'.
The audit save path (_apply_skill_md) already guards against exactly this with
sk.name = name and an explicit 'must NEVER rename the skill' comment. Apply the
same pin here: keep the stored name on markdown save (content edits still take
effect; only the rename is suppressed). Drops the now-unused slugify import.
Adds tests/test_skill_save_no_rename.py: saving markdown whose frontmatter
renames the skill keeps the original name and applies the edit, and a
subsequent delete-by-original-id succeeds. Pure unit test — calls the route
handlers directly with a mock Request (no server/network), like
test_skills_delete_owner.py.
Co-authored-by: lalalune <shawgotbags@gmail.com>
If any exception occurred after conn was created but before the
explicit conn.logout() call, the IMAP connection leaked. Use
try/finally to guarantee cleanup on all exit paths.