Salastil/odysseus - odysseus - Gitea: Git with a cup of tea

Salastil/odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-16 09:45:24 -04:00

Author	SHA1	Message	Date
Maruf Hasan	c3fcaf15b7	feat(providers): add NVIDIA AI provider endpoint support (#3456 ) * feat: add NVIDIA as an AI provider (integrate.api.nvidia.com) * feat: add NVIDIA option to provider settings dropdown and aliases * test: add NVIDIA provider detection and endpoint tests * Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering - nvidia.com -> 'nvidia' curated key for proper provider routing - _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed - _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip, kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse, -embedqa, -nemoretriever * Expand non-chat model filtering for NVIDIA embedding/guard/video models Add _NON_CHAT_PREFIXES: embed, recurrent Add _NON_CHAT_CONTAINS: topic-control, guard, calibration, ai-synthetic-video, cosmos-reason2 Catches remaining unfiltered non-chat models from NVIDIA catalog: embedding (llama-nemotron-embed, embed-qa), guard (llama-guard, nemoguard-topic-control), calibration (ising-calibration), video (ai-synthetic-video-detector, cosmos-reason2), recurrent (recurrentgemma-2b) * Filter non-chat models in _probe_endpoint via _is_chat_model() Previously _is_chat_model() was only used in the per-model probe and _first_chat_model(), so non-chat models still appeared in the model picker even though they were filtered in those specific paths. Applying the filter at _probe_endpoint() return ensures non-chat models (embeddings, safety guards, reward, calibration, video detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never enter cached_models and never appear in the picker. * Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models Prefix checks (mid.startswith) miss models with org prefixes like baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc. Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught regardless of the org prefix. Adds: embed, bge, recurrent, starcoder, gemma-2b * fix(model-routes): drop collision-prone substrings from global non-chat filter The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES and _NON_CHAT_CONTAINS tuples. These are intended to filter out embedding, retrieval, safety, and vision models from NVIDIA's catalog that are not chat-completions-capable. However, four of the added substrings collide with legitimate chat models served by other providers: - gemma-2b matches google/gemma-2b-it (instruct chat model) - starcoder matches bigcode/starcoder2-15b (code completion model) - recurrent matches google/recurrentgemma-2b (language model) - guard matches meta-llama/Llama-Guard-3-8B (safety classifier) Removing these four from the global tuples keeps the NVIDIA-specific filtering intact (safety, embedding, retrieval, and vision models are still caught by other tokens such as content-safety, -safety, -reward, embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while preventing false negatives for instruct/code models on other providers. Tests added for gemma-2b-it, google/gemma-2b-it, and bigcode/starcoder2-15b-instruct asserting they are recognized as chat models. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS entries redundant since the prefix check runs first. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * style: fix indentation of groq and xai test cases in test_provider_endpoints.py --------- Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>	2026-06-09 11:06:12 +02:00
pewdiepie-archdaemon	6f29b287f6	Remove stale plan slash toggle	2026-06-09 09:54:46 +09:00
pewdiepie-archdaemon	e6b1009b89	Remove non-merge-ready workspace and terminal agent hooks	2026-06-09 09:48:59 +09:00
pewdiepie-archdaemon	2a2a93d845	Remove plan mode from merge-ready UI	2026-06-09 09:40:20 +09:00
stocky789	1e0d9b92af	feat: add ChatGPT Subscription provider (#2876 ) * feat: Add ChatGPT Subscription support and related features - Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI. - Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status. - Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance. - Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints. - Added new slash commands for managing skills, including listing, searching, and invoking skills. - Implemented caching for skill catalog to optimize performance. - Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing. - Refactored existing code to accommodate new features and improve maintainability. * refactor: share provider device-flow setup - reuse one device-flow backend for Copilot and ChatGPT Subscription - add one frontend device-flow helper for Settings and /setup - put GitHub Copilot back into Add Models, now as a dropdown option - make provider selection just select; clicking Add starts sign-in - stop ChatGPT Subscription setup from opening auth tabs automatically - make /setup copilot and /setup chatgpt-subscription work from chat - show ChatGPT Subscription in the /setup suggestions - show the real error message when setup fails - add focused tests for the shared flow and setup UI * feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX Backend: - Resolve runtime bearer for provider-auth endpoints at probe time via a shared _resolve_probe_key() that delegates to resolve_endpoint_runtime, applied across all probe/refresh call sites. - Skip live completion probes and health pings for discovery-only providers (centralized behind _is_discovery_only_provider) — the Codex/Responses API has no such endpoints, so status is derived from cached models. - Never persist the short lived ChatGPT bearer to the plaintext sessions table; proactively clear any stale bearer left by an earlier code path. - Revoke orphaned ProviderAuthSession credentials when the last endpoint backing them is deleted (_delete_orphaned_provider_auth), surfaced via cleared_provider_auth in the delete response. Frontend (admin.js): - Auto-start the device-auth flow on provider selection so the authorization panel (code + Authorize) shows immediately instead of behind a "Sign in" click. - Remove the redundant top button for device auth providers, move retry into the panel via an inline "Try again". - Drop the self-evident hint text and add an execCommand clipboard fallback so Copy works in non-secure (HTTP/LAN) contexts. * fix: harden chatgpt subscription provider * chore: remove PR media from branch * Fix chatgpt subscription recovery and token handling --------- Co-authored-by: 5p00kyy <admin@5p00ky.dev>	2026-06-08 10:19:18 +02:00
M57	12cb39cbd9	feat: add OpenCode Zen and Go as provider options (#26 ) - Add OpenCode Zen (https://opencode.ai/zen/v1) and Go (https://opencode.ai/zen/go/v1) - Add provider detection via _host_match() in llm_core.py - Add curated model list entries in model_routes.py - Add webhook provider URLs - Add provider icon (providers.js) and dropdown options (index.html) - Add auto-detection patterns and setup URLs (slashCommands.js) - Whitelist opencode.ai in URL validation (admin.js) - Rebased on main to fix merge conflicts with _HOST_TO_CURATED refactor Co-authored-by: M57 <hy4ri@users.noreply.github.com>	2026-06-07 16:43:00 +02:00
Kenny Van de Maele	8ce945d338	feat: Add plan mode to the chat agent (#638 ) * feat: Add plan mode to the chat agent Adds a plan mode: the agent investigates read-only, proposes a checklist, and waits for approval before changing anything. On approval it runs with full tools and checks items off as it goes. Enforcement reuses the existing disabled_tools gate. Includes a slash command: `/plan [on\|off]` (and `/toggle plan`) to flip the plan toggle from the chat input. - src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP). - src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the plan directive, force agent mode. - static/: plan toggle pill, Approve & Run, dockable plan window, task-list checkboxes, and the /plan slash command. - tests/test_plan_mode.py. * Plan mode: persistent re-referenceable plan + agent write-back Three improvements so a long plan survives a weak model and stays in reach: 1. Re-reference the plan (out-of-context fix). On the execution turn the frontend sends the approved checklist back (`approved_plan`); the backend pins it as a top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so the agent can always re-read the plan instead of losing the thread on a long run. New `build_active_plan_note()` (unit-tested). 2. Re-open / dock the plan anytime. The plan checklist is stored per-session (localStorage). When a plan exists, the plan-mode button opens a small menu ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan window — so it can stay docked while the agent works. The window live-refreshes as the plan changes. 3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps `- [x]` after finishing them, or to revise steps when the user asks. Marker tool (no I/O) → `plan_update` SSE event → the stored plan + docked window update live. The ACTIVE PLAN note instructs the agent to use it. Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb), src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse `approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools / tool_index (always-available, not admin-gated). Frontend: static/js/chat.js (plan store, send `approved_plan`, handle `plan_update`, capture restated checklists), static/app.js (plan-button menu), static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key). Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py. * Plan mode: drop bash/python, rely on read-only discovery tools Shell can mutate (write files, hit the network) and can't be constrained to read-only at the tool layer, so plan mode no longer relies on a prompt to keep it well-behaved — bash/python are removed from the read-only allowlist and added to the fail-closed block set. Discovery is covered by the dedicated read-only tools (read_file, grep, glob, ls) instead. Rewrites the plan-mode directive to state shell is disabled and lists the available read-only tools positively. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Comment: note _MCP_READONLY_VERBS are prefixes not whole words Clarifies that entries like "summar" are intentional stems matched via startswith (covers summarise/summarize/summary), not typos. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Plan mode: clarify why gating inverts the allowlist into a denylist Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an allowlist, so it returns the inverse (all known tool names minus the allowlist). The static mutator set is a backstop for the schema-derived name list, which misses XML-only tools and can fail to import. Addresses review feedback on #638. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Plan mode: stop hardcoding the read-only tool list in the directive The model is already shown its available (read-only) tools by _assemble_prompt, which removes every disabled tool. Enumerating them again in the directive only duplicated that list and would drift as tools change. Point at the tools listed below instead. Addresses review feedback on #638.	2026-06-05 16:32:25 +02:00
pewdiepie-archdaemon	2ba77e3aa3	Settings polish: /setup provider subs, Add API defaults to api kind, picker shows offline endpoints, doc library tracks sub-tab - /setup gains explicit provider subcommands (deepseek, openai, anthropic, openrouter, groq, gemini, xai, ollama, copilot, local, endpoint) so the autocomplete popup surfaces "/setup de…" suggestions with format hints, and bare-provider invocations still prompt for the key. - Add API endpoint defaults to kind=api (auto-refresh /v1/models) instead of kind=proxy. Proxy was a frequent footgun for OpenAI- compatible endpoints that DO serve /v1/models — the user got an empty model list and had to flip the dropdown. - Model picker now includes offline endpoints with stale:true so a briefly-down local server doesn't vanish from the picker (it dims and shows the offline pill, clickable anyway). Dedup prefers the online entry when the same model is exposed by both. - Document library modal header reflects the active sub-tab via _TAB_HEADERS so it no longer shows the wrong section name when switching between Documents / Skills / Templates.	2026-06-05 14:41:54 +09:00
Kenny Van de Maele	2be3779e6e	feat: Add workspace: confine agent tools to a folder (#1103 ) * feat: Add workspace: confine agent tools to a folder Pick a server folder as the agent's workspace so its file/shell tools work there and don't touch files outside it. File tools are hard-confined; bash/ python run with cwd set to the folder. Includes a slash command: `/workspace` (alias `/ws`) — show / `set <path>` / `clear` / `pick` (open the directory browser). - routes/workspace_routes.py: GET /api/workspace/browse (admin-only). - src/tool_execution.py: hard path confinement for read_file/write_file; bash/python cwd. Threaded route → stream_agent_loop → execute_tool_block. - src/agent_loop.py: workspace note prepended to the system prompt. - static/: overflow menu item, input-bar pill, directory-browser modal, and the /workspace slash command. - tests/test_workspace_confine.py. * Wire workspace confinement into tools that landed after this PR edit_file (#1239) and grep/glob/ls (#1670) merged after workspace-confine was written, so they bypassed the workspace boundary. Thread the workspace through: - edit_file: _do_edit_file resolves via _resolve_tool_path_in_workspace - grep/glob/ls: _resolve_search_root confines to the workspace (root + paths) - bash/python/bg cwd: workspace or _AGENT_WORKDIR (keep the #2586 data-dir default when no workspace is set) Tests cover edit_file + grep/ls confinement (inside ok, outside rejected). * Workspace picker: editable path bar + modal style cohesion + cross-platform hardening - Make the current-folder strip an editable address bar: type/paste a full path and press Enter to navigate (also reaches other Windows drives and hidden dirs the up-only browser cannot). - Reuse shared modal CSS: drop bespoke .workspace-modal-content/.workspace-btn* in favour of base .modal-content/.modal-body and the .confirm-btn button family; separators/hover use var(--border). Net -31 CSS lines. - Fix the path field overflowing the modal right edge (flex stretch + margin vs an overflow:auto scrollbar-feedback loop): full-bleed, no h-margin. - Cross-platform confinement: normcase the workspace commonpath check so containment holds on case-insensitive filesystems (Windows/macOS). - Make tests OS-portable: sibling temp dirs instead of /etc, python os.getcwd() instead of pwd. 5 pass.	2026-06-05 00:06:37 +02:00
Kenny Van de Maele	67782e684e	fix: exclude slash-command/setup messages from LLM context (#2634 ) (#2640 ) Slash-command replies and the echoed /setup command are persisted to session history so they render in the transcript, but they are UI chatter the user never meant as conversation. They were sent to the model on the next turn, which then commented on '/setup ...' and exposed transient values (e.g. the Copilot device user_code) to the LLM. - get_context_messages() (the LLM-API view) now skips messages tagged metadata.source == 'slash'. Display/history-load paths use raw history and are unaffected. - slashCommands.js tags the echoed user command with source:'slash' too (the assistant replies already carried it); the user line was the one untagged path that still reached context. Fixes #2634.	2026-06-04 21:42:23 +02:00
Kenny Van de Maele	1cd0aa2b8c	feat(provider): add GitHub Copilot provider with device-flow auth (#1480 ) * feat(provider): add GitHub Copilot provider with device-flow auth Adds GitHub Copilot as a model provider, so Copilot models (gpt-4o/4.1/5, Claude, Gemini, …) work through the normal chat + agent loop, incl. native tool calling and vision. Auth is one-click via the GitHub OAuth device flow; the access token is stored as the endpoint's (encrypted) api_key and sent directly as `Authorization: Bearer` (no Copilot-token exchange, no refresh — matching how editors talk to the Copilot API). Copilot is a normal ModelEndpoint detected by host; the only provider-specific behaviour is a small set of required request headers, injected centrally. Sign-in is available from Settings → model endpoints ("Connect GitHub Copilot") and from chat via `/setup copilot`. - src/copilot.py (new), routes/copilot_routes.py (new): constants, header builders, device-flow start/poll, model discovery, owner-scoped endpoint provisioning. - src/llm_core.py, src/endpoint_resolver.py: detect `copilot`, inject headers, per-request x-initiator/vision. - src/agent_loop.py: allowlist api.githubcopilot.com for native tool schemas. - src/model_context.py: known context windows for Copilot (no unauthenticated /models probe). - static/, README, tests/test_copilot.py. Tidy copilot_routes: clarify supports_tools, note _PENDING is per-process	2026-06-04 21:13:14 +02:00
pewdiepie-archdaemon	eb79b76432	Cookbook: scoring fixes, UI polish, false-finished + stale-state bug fixes Backend (services/hwfit + routes): - rank_models picks visible set by REQUESTED column, not always score — sorting by Param now shows highest-param models PERIOD (incl. too_tight). - New fit_only param. Multi-GPU rigs filter GGUF Q*/IQ quants (vLLM/SGLang cannot serve them); default non-prequantized to BF16 on 2+ GPUs. - AWQ / GPTQ-8bit get a -1.0 quality penalty (was 0.0, tied with FP8), so FP8 wins when both fit. - Version-aware tiebreaker (parse Mn.n / Vn) — MiniMax-M2.7 ranks above M2.5 on equal composite score; >=100B integers not misread as versions. - /api/cookbook/hf-latest no longer drops models without an "NB" pattern in the repo id (MiniMax-M2.7, DeepSeek-V4-Pro etc. were silently filtered). - Cached-model scan: atexit flushes models JSON even if the script is killed mid-walk; each scan_dir wrapped in try/except; timeout 60s -> 180s. - KB granularity for sub-MB sizes (was "0 MB" for 12 KB shells). New "stalled" status for shells <1 MB with no .incomplete files. - /api/cookbook/state POST guard: rejects "done" download tasks lacking DOWNLOAD_OK / DOWNLOAD_FAILED / /snapshots/ when the last-mentioned shard is N<total — stops stale tabs from poisoning persisted state. - hf_models.json: add zai-org/GLM-5.1; flip zai-org/GLM-5 quantization Q4_K_M -> BF16 (it is the native base, not a quant). Frontend (static/js): - Scan/Download toolbar: quant defaults to All; ctx slider (8k/16k/32k/ 50k/128k/Max) ported from origin/main with sort=fit on drag, sort=score on Max. GPU toggle commits _activeCount to maxGpu on initial render. Fit column header tagged with active budget (RAM / GPU / N GPU). - Foldable Download admin-card: the Download h2 is the chevron trigger; state persists in localStorage. - Download card surfaces destination dir (Dir: <path>). Same dir on running task row, font/color matched to uptime (9px Fira Code muted, opacity .4). - Serve panel ctx text input always resets to model max on open. Sub-MB cached models show with red "download stalled" badge. - Bulk-select Cancel + Delete reset the Select button label on exit. - Cookbook running: false-finished bug fixed — DOWNLOAD_OK or /snapshots/ required; bare "Download complete" no longer marks the task done after the first config file. Clear button now sends tmux kill-session too. True overall % for multi-shard downloads: ((N-1)+frac)/total instead of hf_transfer per-shard aggregate. - Diagnosis card simplified: removed fold toggle, copy button, dismiss X. Suggestion font matches message body (12px). - HF token field flashes green check + "Saved" on save. - Cached scan no longer counts stalled rows as downloaded in Scan/Download. CSS: - dep Install button width pinned to 76px to match Installed split. - task-sub row +1px; task-status badge gets margin-right 8px. - Ctx slider styled like gallery editor sliders (thin pill rail, red thumb). - Bulk-select cancel button top -3px -> -5px.	2026-06-03 16:32:20 +09:00
pewdiepie-archdaemon	4a112175e2	Remove broken remind slash command	2026-06-02 06:48:41 +09:00
pewdiepie-archdaemon	d5c7e3d3e4	Add direct tool slash commands	2026-06-02 06:44:29 +09:00
pewdiepie-archdaemon	3959eec602	Refresh slash command hints	2026-06-02 06:40:23 +09:00
pewdiepie-archdaemon	e5cae37d15	Merge branch 'pr-673' into visual-pr-playground	2026-06-02 06:26:32 +09:00
pewdiepie-archdaemon	664acf73ee	Merge branch 'pr-469' into visual-pr-playground	2026-06-02 06:26:31 +09:00
Afonso Coutinho	5da662441c	Validate slash command time minutes * fix: reject hour > 23 in 'today/tomorrow' reminder time parsing * fix: reject minute > 59 in reminder time parsing	2026-06-02 05:50:19 +09:00
k.greyZ	7a3871fc95	feat(onboarding): improve setup UX with clickable triggers and auto-fill buttons - Turn the "/setup" text on the welcome screen and fallback state into a clickable link that automatically runs the setup command. - Add an interactive down-arrow "Use in Chat" button next to copy button on typewriter-generated setup code blocks. - Programmatically trim the "..." placeholder when inserting API keys, focusing the cursor right after "sk-". - Implement click-delegation for supported provider spans and raw code elements inside the setup guide to instantly pre-populate the input bar.	2026-06-01 21:11:47 +03:00
Sirsyorrz	6a2f0d5904	Add slash command autocomplete popup Typing / in the chat composer now shows a filtered popup listing all available commands with their description. Arrow keys or Tab to select, Enter/Tab to insert, Esc to close, click also works. - New module: static/js/slashAutocomplete.js Reads the existing COMMANDS registry (and LEGACY_ALIASES) from slashCommands.js — no command logic added here, just discovery UI. Excludes easter-egg commands (flip, roll, 8ball, fortune, odyssey, ascii). Promotes short legacy aliases (/new, /clear, /web, /compact, /research, etc.) as first-class rows so users don't have to know the full /session new form. - slashCommands.js: export COMMANDS and LEGACY_ALIASES so the new module can read the registry. - chat.js: lazy-import slashAutocomplete on init, wire to #message textarea. - style.css: popup + row styles using existing CSS variables.	2026-06-01 21:33:46 +10:00
Alexander Kenley	2c4b8b57dd	feat(ai): add OpenRouter and Ollama Cloud providers (#231 ) Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>	2026-06-01 14:26:10 +09:00
pewdiepie-archdaemon	d026e13a5a	Fix provider setup and strip message metadata	2026-06-01 10:20:18 +09:00
pewdiepie-archdaemon	fc7f107b22	Improve Ollama setup and model endpoint handling	2026-06-01 10:00:15 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00