* feat: Add ChatGPT Subscription support and related features
- Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI.
- Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status.
- Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance.
- Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints.
- Added new slash commands for managing skills, including listing, searching, and invoking skills.
- Implemented caching for skill catalog to optimize performance.
- Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing.
- Refactored existing code to accommodate new features and improve maintainability.
* refactor: share provider device-flow setup
- reuse one device-flow backend for Copilot and ChatGPT Subscription
- add one frontend device-flow helper for Settings and /setup
- put GitHub Copilot back into Add Models, now as a dropdown option
- make provider selection just select; clicking Add starts sign-in
- stop ChatGPT Subscription setup from opening auth tabs automatically
- make /setup copilot and /setup chatgpt-subscription work from chat
- show ChatGPT Subscription in the /setup suggestions
- show the real error message when setup fails
- add focused tests for the shared flow and setup UI
* feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX
Backend:
- Resolve runtime bearer for provider-auth endpoints at probe time via a
shared _resolve_probe_key() that delegates to resolve_endpoint_runtime,
applied across all probe/refresh call sites.
- Skip live completion probes and health pings for discovery-only providers
(centralized behind _is_discovery_only_provider) — the Codex/Responses API
has no such endpoints, so status is derived from cached models.
- Never persist the short lived ChatGPT bearer to the plaintext sessions
table; proactively clear any stale bearer left by an earlier code path.
- Revoke orphaned ProviderAuthSession credentials when the last endpoint
backing them is deleted (_delete_orphaned_provider_auth), surfaced via
cleared_provider_auth in the delete response.
Frontend (admin.js):
- Auto-start the device-auth flow on provider selection so the authorization
panel (code + Authorize) shows immediately instead of behind a "Sign in" click.
- Remove the redundant top button for device auth providers, move retry
into the panel via an inline "Try again".
- Drop the self-evident hint text and add an execCommand clipboard fallback so
Copy works in non-secure (HTTP/LAN) contexts.
* fix: harden chatgpt subscription provider
* chore: remove PR media from branch
* Fix chatgpt subscription recovery and token handling
---------
Co-authored-by: 5p00kyy <admin@5p00ky.dev>
Backend (services/hwfit + routes):
- rank_models picks visible set by REQUESTED column, not always score —
sorting by Param now shows highest-param models PERIOD (incl. too_tight).
- New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang
cannot serve them); default non-prequantized to BF16 on 2+ GPUs.
- AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so
FP8 wins when both fit.
- Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above
M2.5 on equal composite score; >=100B integers not misread as versions.
- /api/cookbook/hf-latest no longer drops models without an "NB" pattern in
the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered).
- Cached-model scan: atexit flushes models JSON even if the script is
killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s.
- KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New
"stalled" status for shells <1 MB with no .incomplete files.
- /api/cookbook/state POST guard: rejects "done" download tasks lacking
DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned
shard is N<total — stops stale tabs from poisoning persisted state.
- hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization
Q4_K_M -> BF16 (it is the native base, not a quant).
Frontend (static/js):
- Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/
50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score
on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit
column header tagged with active budget (RAM / GPU / N GPU).
- Foldable Download admin-card: the Download h2 is the chevron trigger;
state persists in localStorage.
- Download card surfaces destination dir (Dir: <path>). Same dir on running
task row, font/color matched to uptime (9px Fira Code muted, opacity .4).
- Serve panel ctx text input always resets to model max on open. Sub-MB
cached models show with red "download stalled" badge.
- Bulk-select Cancel + Delete reset the Select button label on exit.
- Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/
required; bare "Download complete" no longer marks the task done after
the first config file. Clear button now sends tmux kill-session too.
True overall % for multi-shard downloads: ((N-1)+frac)/total instead of
hf_transfer per-shard aggregate.
- Diagnosis card simplified: removed fold toggle, copy button, dismiss X.
Suggestion font matches message body (12px).
- HF token field flashes green check + "Saved" on save.
- Cached scan no longer counts stalled rows as downloaded in Scan/Download.
CSS:
- dep Install button width pinned to 76px to match Installed split.
- task-sub row +1px; task-status badge gets margin-right 8px.
- Ctx slider styled like gallery editor sliders (thin pill rail, red thumb).
- Bulk-select cancel button top -3px -> -5px.
Typing / in the chat composer now shows a filtered popup listing all
available commands with their description. Arrow keys or Tab to select,
Enter/Tab to insert, Esc to close, click also works.
- New module: static/js/slashAutocomplete.js
Reads the existing COMMANDS registry (and LEGACY_ALIASES) from
slashCommands.js — no command logic added here, just discovery UI.
Excludes easter-egg commands (flip, roll, 8ball, fortune, odyssey,
ascii). Promotes short legacy aliases (/new, /clear, /web, /compact,
/research, etc.) as first-class rows so users don't have to know the
full /session new form.
- slashCommands.js: export COMMANDS and LEGACY_ALIASES so the new
module can read the registry.
- chat.js: lazy-import slashAutocomplete on init, wire to #message
textarea.
- style.css: popup + row styles using existing CSS variables.