feat(providers): add NVIDIA AI provider endpoint support (#3456)

* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com)

* feat: add NVIDIA option to provider settings dropdown and aliases

* test: add NVIDIA provider detection and endpoint tests

* Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering

  - nvidia.com -> 'nvidia' curated key for proper provider routing
  - _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed
  - _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip, kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse, -embedqa, -nemoretriever

* Expand non-chat model filtering for NVIDIA embedding/guard/video models

  Add _NON_CHAT_PREFIXES: embed, recurrent
  Add _NON_CHAT_CONTAINS: topic-control, guard, calibration, ai-synthetic-video, cosmos-reason2

  Catches remaining unfiltered non-chat models from NVIDIA catalog:
  - embedding (llama-nemotron-embed, embed-qa)
  - guard (llama-guard, nemoguard-topic-control)
  - calibration (ising-calibration)
  - video (ai-synthetic-video-detector, cosmos-reason2)
  - recurrent (recurrentgemma-2b)

* Filter non-chat models in _probe_endpoint via _is_chat_model()

  Previously _is_chat_model() was only used in the per-model probe and _first_chat_model(), so non-chat models still appeared in the model picker even though they were filtered in those specific paths. Applying the filter at _probe_endpoint() return ensures non-chat models (embeddings, safety guards, reward, calibration, video detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never enter cached_models and never appear in the picker.

* Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models

  Prefix checks (mid.startswith) miss models with org prefixes like baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc. Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught regardless of the org prefix.

  Adds: embed, bge, recurrent, starcoder, gemma-2b

* fix(model-routes): drop collision-prone substrings from global non-chat filter

  The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES and _NON_CHAT_CONTAINS tuples. These are intended to filter out embedding, retrieval, safety, and vision models from NVIDIA's catalog that are not chat-completions-capable. However, four of the added substrings collide with legitimate chat models served by other providers:

  - gemma-2b matches google/gemma-2b-it (instruct chat model)
  - starcoder matches bigcode/starcoder2-15b (code completion model)
  - recurrent matches google/recurrentgemma-2b (language model)
  - guard matches meta-llama/Llama-Guard-3-8B (safety classifier)

  Removing these four from the global tuples keeps the NVIDIA-specific filtering intact (safety, embedding, retrieval, and vision models are still caught by other tokens such as content-safety, -safety, -reward, embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while preventing false negatives for instruct/code models on other providers.

  Tests added for gemma-2b-it, google/gemma-2b-it, and bigcode/starcoder2-15b-instruct asserting they are recognized as chat models.

  Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS

  Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS entries redundant since the prefix check runs first.

  Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks

  Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>

* style: fix indentation of groq and xai test cases in test_provider_endpoints.py

---------

Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
fix(embeddings): survive numpy embeddings when restoring a reset lane (#3410)
2026-06-15 17:25:26 -04:00 · 2026-06-09 11:06:12 +02:00 · 2026-06-09 10:40:17 +02:00 · 2026-06-09 10:19:45 +02:00 · 2026-06-09 09:51:29 +02:00 · 2026-06-09 08:30:50 +02:00
24 changed files with 1092 additions and 252 deletions
@@ -472,6 +472,7 @@ components = initialize_managers(BASE_DIR, rag_manager)
 session_manager   = components["session_manager"]
 from src.assistant_log import set_session_manager as _set_asst_sm
 _set_asst_sm(session_manager)
+app.state.session_manager = session_manager
 memory_manager    = components["memory_manager"]
 memory_vector     = components.get("memory_vector")
 upload_handler    = components["upload_handler"]
@@ -7,7 +7,13 @@ import asyncio
 import logging
 import os

+import json
+import re
+from pathlib import Path
+
+from core.atomic_io import atomic_write_json, atomic_write_text
 from core.auth import AuthManager
+from src.constants import DEEP_RESEARCH_DIR, MEMORY_FILE, SKILLS_DIR
 from src.rate_limiter import RateLimiter
 from src.settings_scrub import scrub_settings
 from src.settings import (
@@ -291,9 +297,17 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
        if new_username in auth_manager.users:
            raise HTTPException(409, "Username already taken")

+        # Gate on auth first. Every mutation below is contingent on this
+        # succeeding — doing it last meant a rejected rename (e.g. reserved
+        # username) left file-backed owner fields already rewritten with no
+        # way to roll them back.
+        ok = auth_manager.rename_user(old_username, new_username, user)
+        if not ok:
+            raise HTTPException(400, "Cannot rename user")
+
        # Usernames are ownership keys for user data. Rename the common
-        # owner-scoped DB rows before changing auth so the account keeps
-        # access to its sessions, docs, email accounts, tasks, etc.
+        # owner-scoped DB rows so the account keeps access to its sessions,
+        # docs, email accounts, tasks, etc.
        try:
            from sqlalchemy import func
            from core.database import Base, SessionLocal
@@ -335,9 +349,90 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
        except Exception as e:
            logger.warning("Failed to rename user prefs %s -> %s: %s", old_username, new_username, e)

-        ok = auth_manager.rename_user(old_username, new_username, user)
-        if not ok:
-            raise HTTPException(400, "Cannot rename user")
+        # deep_research: each completed report is a standalone JSON file with
+        # an `owner` field. research_routes filters by d.get("owner") == user,
+        # so a stale owner makes every report invisible to the renamed user.
+        try:
+            dr_dir = Path(DEEP_RESEARCH_DIR)
+            if dr_dir.is_dir():
+                for p in dr_dir.glob("*.json"):
+                    try:
+                        d = json.loads(p.read_text(encoding="utf-8"))
+                        if str(d.get("owner", "")).strip().lower() == old_username:
+                            d["owner"] = new_username
+                            atomic_write_json(str(p), d)
+                    except Exception as err:
+                        logger.warning("Failed to update research owner in %s: %s", p.name, err)
+        except Exception as e:
+            logger.warning("Failed to rename research owner references %s -> %s: %s", old_username, new_username, e)
+
+        # memory.json: a flat JSON array where each entry carries an `owner`
+        # field. memory_manager.load(owner=user) filters on it, so stale
+        # entries disappear from the memory panel.
+        try:
+            if os.path.isfile(MEMORY_FILE):
+                with open(MEMORY_FILE, encoding="utf-8") as fh:
+                    entries = json.loads(fh.read())
+                if isinstance(entries, list):
+                    changed = False
+                    for entry in entries:
+                        if isinstance(entry, dict) and str(entry.get("owner", "")).strip().lower() == old_username:
+                            entry["owner"] = new_username
+                            changed = True
+                    if changed:
+                        atomic_write_json(MEMORY_FILE, entries)
+        except Exception as e:
+            logger.warning("Failed to rename memory.json owner references %s -> %s: %s", old_username, new_username, e)
+
+        # skills: SKILL.md frontmatter carries owner: <username>; the usage
+        # sidecar (_usage.json) keys entries as owner::skill-name. Both must
+        # be updated or the renamed user's Skills panel goes empty.
+        try:
+            skills_root = Path(SKILLS_DIR)
+            if skills_root.is_dir():
+                _owner_re = re.compile(
+                    r'(?m)^(owner:\s*)' + re.escape(old_username) + r'\s*$'
+                )
+                for p in skills_root.rglob("SKILL.md"):
+                    try:
+                        text = p.read_text(encoding="utf-8")
+                        new_text = _owner_re.sub(r'\g<1>' + new_username, text)
+                        if new_text != text:
+                            atomic_write_text(str(p), new_text)
+                    except Exception as err:
+                        logger.warning("Failed to update skill owner in %s: %s", p, err)
+                usage_path = skills_root / "_usage.json"
+                if usage_path.is_file():
+                    try:
+                        usage = json.loads(usage_path.read_text(encoding="utf-8"))
+                        if isinstance(usage, dict):
+                            prefix = old_username + "::"
+                            new_usage = {}
+                            changed = False
+                            for k, v in usage.items():
+                                if k.startswith(prefix):
+                                    new_usage[new_username + "::" + k[len(prefix):]] = v
+                                    changed = True
+                                else:
+                                    new_usage[k] = v
+                            if changed:
+                                atomic_write_json(str(usage_path), new_usage)
+                    except Exception as err:
+                        logger.warning("Failed to update skills usage keys %s -> %s: %s", old_username, new_username, err)
+        except Exception as e:
+            logger.warning("Failed to rename skills owner references %s -> %s: %s", old_username, new_username, e)
+
+        # The in-memory session cache (session_manager.sessions) stores each
+        # session's owner at load time. Without this patch the renamed user's
+        # sessions are invisible on the next /api/sessions call because
+        # get_sessions_for_user does an exact `s.owner == username` comparison
+        # against stale in-memory values.
+        sm = getattr(request.app.state, "session_manager", None)
+        if sm is not None:
+            for sess in list(getattr(sm, "sessions", {}).values()):
+                if str(getattr(sess, "owner", None) or "").strip().lower() == old_username:
+                    sess.owner = new_username
+
        # The owner-rename loop above updated ApiToken.owner in the DB, but the
        # bearer-token cache still maps each token to the OLD owner. Without
        # refreshing it, the renamed user's API tokens resolve to the old (now
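Review note: the two skills rewrites in this hunk (the `owner:` frontmatter line and the `owner::skill-name` usage keys) can be exercised in isolation with a sketch like the one below. The function names `rewrite_owner_line` and `rename_usage_keys` are hypothetical and not part of the codebase. The sketch deliberately differs from the hunk in two places, both of which are suggestions rather than confirmed defects: it uses `[ \t]*` instead of `\s*`, because `\s` also matches newlines and `\s*$` can absorb a blank line that follows the `owner:` line, and it passes a function as the replacement so that a backslash in the new name is inserted literally.

```python
import re


def rewrite_owner_line(text: str, old: str, new: str) -> str:
    """Replace an exact `owner: <old>` frontmatter line with `owner: <new>`.

    Hypothetical helper for illustration only.
    """
    # [ \t]* keeps the match on a single line; \s* could also eat newlines.
    pattern = re.compile(r"(?m)^(owner:[ \t]*)" + re.escape(old) + r"[ \t]*$")
    # A function replacement avoids backslash/group-reference processing
    # of the new username.
    return pattern.sub(lambda m: m.group(1) + new, text)


def rename_usage_keys(usage: dict, old: str, new: str) -> dict:
    """Rewrite `old::skill` keys to `new::skill`, leaving other keys alone."""
    prefix = old + "::"
    renamed = {}
    for key, value in usage.items():
        if key.startswith(prefix):
            renamed[new + "::" + key[len(prefix):]] = value
        else:
            renamed[key] = value
    return renamed
```

Matching on the full `old::` prefix (rather than on `old` alone) is what keeps a user such as `alice2` untouched when `alice` is renamed.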
@@ -101,11 +101,17 @@ def setup_backup_routes(memory_manager, preset_manager, skills_manager) -> APIRo
        # ── Skills ──
        if "skills" in body and isinstance(body["skills"], list):
            existing = skills_manager.load_all()
-            existing_names = {s.get("name") for s in existing if s.get("name")}
-            existing_ids = {s.get("id") for s in existing if s.get("id")}
+            # Dedup against THIS user's own skills only. Using every tenant's
+            # rows (load_all) meant a skill whose id/name/title matched any
+            # other user's was silently skipped, so the importing user lost
+            # their own data — same cross-tenant bug fixed for memories above.
+            # The full store is still saved back below.
+            own = [s for s in existing if s.get("owner") == user]
+            existing_names = {s.get("name") for s in own if s.get("name")}
+            existing_ids = {s.get("id") for s in own if s.get("id")}
            existing_titles = {
                (s.get("title") or s.get("description") or "").strip().lower()
-                for s in existing
+                for s in own
            }
            added = 0
            for skill in body["skills"]:
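Review note: a minimal sketch of the per-owner dedup this hunk introduces, assuming skills are plain dicts with `name` and `owner` keys. `import_skills` is a hypothetical name, and the sketch only dedups on `name`, whereas the real route also checks `id` and title.

```python
def import_skills(existing: list, incoming: list, user: str) -> list:
    """Append incoming skills for `user`, skipping only the user's own duplicates.

    Hypothetical helper for illustration; returns the entries that were added.
    """
    # Dedup set is built from this user's rows only, not from every tenant's.
    own_names = {
        s.get("name") for s in existing
        if s.get("owner") == user and s.get("name")
    }
    added = []
    for skill in incoming:
        name = skill.get("name")
        if name in own_names:
            continue  # the importing user already has this one
        entry = dict(skill, owner=user)
        existing.append(entry)  # the full store is what gets saved back
        own_names.add(name)
        added.append(entry)
    return added
```

With a store that already holds another user's skill named `deploy`, importing a skill of the same name for a different user adds it once, and a repeated import adds nothing.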
@@ -456,7 +456,6 @@ def setup_chat_routes(
        # manual form posts that still send plan_mode=true.
        plan_mode = False
        chat_mode = str(form_data.get("mode", "")).lower()  # 'chat' or 'agent'
-        workspace = ""
        # Plan mode is a modifier on agent mode — it only makes sense with tools.
        if plan_mode:
            chat_mode = "agent"
@@ -1135,7 +1134,6 @@ def setup_chat_routes(
                        tool_policy=tool_policy,
                        owner=_user,
                        fallbacks=_fallback_candidates,
-                        workspace=None,
                        plan_mode=plan_mode,
                        approved_plan=approved_plan or None,
                    ):
@@ -42,9 +42,16 @@ _SESSION_ID_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")
 _SSH_PORT_RE = re.compile(r"^\d{1,5}$")
 _GPU_LIST_RE = re.compile(r"^\d+(?:,\d+)*$")
 # A download target directory. Absolute or ~-relative path; safe path glyphs
-# only (no quotes, shell metacharacters, or spaces) since it lands in a shell
-# command. A leading ~ is expanded to $HOME at command-build time.
-_LOCAL_DIR_RE = re.compile(r"^~?/[A-Za-z0-9._/-]*$|^~$")
+# only (no quotes or shell metacharacters). Spaces are allowed because command
+# builders pass the value through quoted shell/Python contexts. The character
+# class uses ``\w`` — Unicode word characters under Python 3's default str
+# matching — so non-ASCII folder names pass validation too: Cyrillic, accented
+# Latin, CJK, e.g. ``/Volumes/Модели`` or ``D:\AI Models\Модели``. This stays
+# shell-safe: none of ``; & | ` $ '' "" () {}`` newlines etc. are in ``[\w. -]``,
+# so injection vectors remain rejected. A leading ~ is expanded to $HOME at
+# command-build time. (Drive letters stay ASCII: ``[A-Za-z]:``.)
+_LOCAL_DIR_RE = re.compile(r"^~?(?:/[\w. -]*)+$|^~$")
+_WINDOWS_LOCAL_DIR_RE = re.compile(r"^[A-Za-z]:[\\/](?:[\w. -]+(?:[\\/][\w. -]+)*[\\/]?)?$")
 _WINDOWS_DRIVE_PATH_RE = re.compile(r"^[A-Za-z]:[\\/]")


@@ -97,9 +104,19 @@ def _validate_token(v: str | None) -> str | None:
 def _validate_local_dir(v: str | None) -> str | None:
    if v is None or v == "":
        return None
+    if len(v) >= 2 and v[0] == v[-1] and v[0] in {"'", '"'}:
+        v = v[1:-1]
    v = v.rstrip("/") or "/"
-    if not _LOCAL_DIR_RE.match(v):
-        raise HTTPException(400, "Invalid local_dir — must be an absolute or ~ path with no spaces or shell metacharacters")
+    if not (_LOCAL_DIR_RE.match(v) or _WINDOWS_LOCAL_DIR_RE.match(v)):
+        raise HTTPException(400, "Invalid local_dir — must be an absolute or ~ path with no shell metacharacters")
+    # Reject path segments that start with '-' (option injection). '-' is in the
+    # allowlist, so a dir like ``/models/-rf`` or ``D:\models\-rf`` could be read
+    # as a CLI flag by hf/etc. — and quoting does NOT stop a value from being
+    # parsed as an option. This is the one residual that command-build-time
+    # quoting can't cover, so the guard lives here, keeping the safety wholly
+    # inside the validator rather than relying on consumers.
+    if any(seg.startswith("-") for seg in re.split(r"[\\/]", v) if seg):
+        raise HTTPException(400, "Invalid local_dir — path segments cannot start with '-'")
    return v
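Review note: the validator can be tried outside the web app with the sketch below. It copies the two patterns from the hunk verbatim but raises `ValueError` instead of `HTTPException`, and the name `validate_local_dir` is a stand-in. One deliberate deviation, offered as a suggestion: it calls `fullmatch` rather than `match`, because in Python `$` also matches just before a trailing newline, so `match` would accept a value such as `/models` followed by a newline.

```python
import re
from typing import Optional

# Patterns copied from the hunk above.
_LOCAL_DIR_RE = re.compile(r"^~?(?:/[\w. -]*)+$|^~$")
_WINDOWS_LOCAL_DIR_RE = re.compile(
    r"^[A-Za-z]:[\\/](?:[\w. -]+(?:[\\/][\w. -]+)*[\\/]?)?$"
)


def validate_local_dir(v: Optional[str]) -> Optional[str]:
    """Standalone stand-in for the route validator (raises ValueError)."""
    if v is None or v == "":
        return None
    # Strip one pair of matching surrounding quotes.
    if len(v) >= 2 and v[0] == v[-1] and v[0] in {"'", '"'}:
        v = v[1:-1]
    v = v.rstrip("/") or "/"
    # fullmatch (an assumption of this sketch) also rejects a trailing newline.
    if not (_LOCAL_DIR_RE.fullmatch(v) or _WINDOWS_LOCAL_DIR_RE.fullmatch(v)):
        raise ValueError("invalid local_dir")
    # Option-injection guard: no path segment may start with '-'.
    if any(seg.startswith("-") for seg in re.split(r"[\\/]", v) if seg):
        raise ValueError("path segments cannot start with '-'")
    return v
```

Paths with spaces or non-ASCII folder names pass, while shell metacharacters and `-`-prefixed segments are rejected.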


@@ -125,7 +142,7 @@ def _validate_gpus(v: str | None) -> str | None:
 def _shell_path(p: str) -> str:
    """Render a validated path for a double-quoted shell context, expanding a
    leading ~ to $HOME (single quotes wouldn't expand it). Safe because
-    _validate_local_dir already restricts the charset."""
+    _validate_local_dir already rejects quotes and shell metacharacters."""
    if p == "~":
        return '"$HOME"'
    if p.startswith("~/"):
@@ -386,6 +403,7 @@ def _cached_model_scan_script(model_dirs: list[str] | None = None, add_hf_cache:
        "    for root, dirs, fns in safe_walk(base):",
        "        for fn in sorted(fns):",
        "            if not fn.lower().endswith('.gguf'): continue",
+        "            if fn.startswith('._'): continue  # macOS AppleDouble sidecar, not a real GGUF",
        "            fp = os.path.join(root, fn)",
        "            try: size = os.path.getsize(fp)",
        "            except Exception: size = 0",
@@ -283,6 +283,7 @@ _HOST_TO_CURATED = (
    ("fireworks.ai", "fireworks"),
    ("googleapis.com", "google"),
    ("x.ai", "xai"),
+    ("nvidia.com", "nvidia"),
    ("openrouter.ai", "openrouter"),
    ("ollama.com", "ollama"),
 )
@@ -477,10 +478,17 @@ _NON_CHAT_PREFIXES = (
    "dall-e", "tts-", "whisper", "text-embedding", "embedding",
    "davinci", "babbage", "moderation", "omni-moderation",
    "sora", "gpt-image", "chatgpt-image",
+    # embedding / retrieval / non-chat models (common across providers)
+    "snowflake/arctic-embed", "nvidia/nv-embed", "embed",
 )
 _NON_CHAT_CONTAINS = (
    "-realtime", "-transcribe", "-tts", "-codex",
-    "codex-",
+    "codex-", "content-safety", "-safety", "-reward", "nvclip",
+    "kosmos", "fuyu", "deplot", "vila", "neva",
+    "gliner", "riva", "-parse", "-embedqa", "-nemoretriever",
+    "topic-control", "calibration",
+    "ai-synthetic-video", "cosmos-reason2",
+    "bge", "llama-guard",
 )
 _NON_CHAT_EXACT_PREFIXES = (
    "gpt-audio",  # gpt-audio, gpt-audio-mini etc. (not gpt-4o-audio-preview which is chat)
@@ -731,7 +739,7 @@ def _probe_endpoint(base_url: str, api_key: str = None, timeout: int = 5) -> Lis
                for _e in _PROVIDER_CURATED.get(_ck, []):
                    if _e not in set(models) and not any(m.startswith(_e) for m in models):
                        models.append(_e)
-            return models
+            return [m for m in models if _is_chat_model(m)]
    except httpx.HTTPStatusError as e:
        if api_key:
            status = e.response.status_code if e.response is not None else "unknown"
@@ -755,7 +763,7 @@ def _probe_endpoint(base_url: str, api_key: str = None, timeout: int = 5) -> Lis
            data = r.json()
            models = [m.get("name") or m.get("model") for m in (data.get("models") or []) if m.get("name") or m.get("model")]
            if models:
-                return models
+                return [m for m in models if _is_chat_model(m)]
    except Exception as e:
        logger.debug(f"Ollama /api/tags probe failed for {base}: {e}")
    # Fall back to curated list if the provider has a URL-based match (e.g. z.ai has no /models endpoint)
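Review note: the commit message explains why prefix checks alone miss org-prefixed model ids and why some substrings were dropped again. The sketch below illustrates that trade-off with a small subset of the tokens from `_NON_CHAT_PREFIXES` and `_NON_CHAT_CONTAINS`. `is_chat_model_sketch` is a hypothetical function and is not the project's `_is_chat_model()`, whose full logic is not shown in this diff.

```python
# Subset of the tokens from the diff, for illustration only.
NON_CHAT_PREFIXES = ("text-embedding", "embedding", "embed", "whisper")
NON_CHAT_CONTAINS = ("content-safety", "-reward", "bge", "llama-guard")


def is_chat_model_sketch(model_id: str) -> bool:
    """Return False for ids that look like embedding/safety/reward models."""
    mid = model_id.lower()
    # Prefix check: only catches ids that START with the token.
    if mid.startswith(NON_CHAT_PREFIXES):
        return False
    # Substring check: also catches org-prefixed ids such as baai/bge-m3.
    if any(token in mid for token in NON_CHAT_CONTAINS):
        return False
    return True


def filter_chat_models(models: list) -> list:
    """Apply the filter at the point where a probed model list is returned."""
    return [m for m in models if is_chat_model_sketch(m)]
```

Because substring tokens match anywhere in the id, each one needs to be specific enough not to hit chat models from other providers, which is why `llama-guard` is used here rather than the broader `guard`.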
@@ -855,7 +855,7 @@ def _build_system_prompt(
        _ov_sig = _hl.sha256(_json.dumps(get_builtin_overrides() or {}, sort_keys=True).encode()).hexdigest()
    except Exception:
        _ov_sig = ""
-    cache_key = (frozenset(disabled_tools or []), bool(mcp_mgr), needs_admin, _rt_key, compact, _ov_sig, suppress_local_context)
+    cache_key = (frozenset(disabled_tools or []), bool(mcp_mgr), needs_admin, _rt_key, compact, _ov_sig, owner, suppress_local_context)
    if _cached_base_prompt and _cached_base_prompt_key == cache_key and not active_document:
        agent_prompt = _cached_base_prompt
        # Skill index is user-editable (name + description), so it must never
@@ -863,7 +863,7 @@ def _build_system_prompt(
        # when the cache hits.
        _, _skill_index_block = _build_base_prompt(
            disabled_tools, mcp_mgr, needs_admin, relevant_tools,
-            mcp_disabled_map=mcp_disabled_map, compact=compact,
+            mcp_disabled_map=mcp_disabled_map, compact=compact, owner=owner,
            suppress_local_context=suppress_local_context,
        )
    else:
@@ -874,6 +874,7 @@ def _build_system_prompt(
            relevant_tools,
            mcp_disabled_map=mcp_disabled_map,
            compact=compact,
+            owner=owner,
            suppress_local_context=suppress_local_context,
        )
        if not active_document:
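Review note: a simplified sketch of why `owner` belongs in the cache key once the built prompt can depend on the user. It models a single-slot cache similar in shape to `_cached_base_prompt` / `_cached_base_prompt_key`; the function `cached_prompt` and its `build` callback are hypothetical, and the real key has more components.

```python
from typing import Callable, Iterable, Optional, Tuple

_cached_prompt: Optional[str] = None
_cached_key: Optional[Tuple] = None


def cached_prompt(
    disabled_tools: Optional[Iterable[str]],
    compact: bool,
    owner: Optional[str],
    build: Callable[[Optional[str]], str],
) -> str:
    """Return the cached prompt only when every key component matches."""
    global _cached_prompt, _cached_key
    # owner is part of the key, so one user's prompt is never reused for another.
    key = (frozenset(disabled_tools or []), compact, owner)
    if _cached_prompt is not None and _cached_key == key:
        return _cached_prompt
    _cached_prompt = build(owner)
    _cached_key = key
    return _cached_prompt
```

If `owner` were left out of the key, a second call for a different user with otherwise identical arguments would return the first user's cached text.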
@@ -1246,6 +1247,7 @@ def _build_base_prompt(
    relevant_tools=None,
    mcp_disabled_map=None,
    compact: bool = False,
+    owner: Optional[str] = None,
    suppress_local_context: bool = False,
 ):
    """Build the agent prompt with only relevant tools included.
@@ -1299,7 +1301,7 @@ def _build_base_prompt(
            from src.constants import DATA_DIR
            _sm = SkillsManager(DATA_DIR)
            active_tools = list(set(TOOL_SECTIONS.keys()) - set(disabled or []))
-            skill_idx = _sm.index_for(owner=None, active_toolsets=active_tools)
+            skill_idx = _sm.index_for(owner=owner, active_toolsets=active_tools)
            if skill_idx:
                lines = ["## Available skills",
                         "Procedures the assistant should consult before doing domain work. "
@@ -1707,7 +1709,6 @@ async def stream_agent_loop(
    owner: Optional[str] = None,
    relevant_tools: Optional[Set[str]] = None,
    fallbacks: Optional[List[tuple]] = None,
-    workspace: Optional[str] = None,
    plan_mode: bool = False,
    approved_plan: Optional[str] = None,
    tool_policy: Optional[ToolPolicy] = None,
@@ -1935,27 +1936,6 @@ async def stream_agent_loop(
        owner=owner,
        suppress_local_context=guide_only,
    )
-    if workspace and not guide_only:
-        # PREPEND (not append) so it dominates the large base prompt — appended
-        # at the end, small models ignored it and asked the user for code. The
-        # folder IS the project; the agent must explore it, not ask.
-        _ws_note = (
-            f"## ACTIVE WORKSPACE — READ FIRST\n"
-            f"The user is working in this folder: {workspace}\n"
-            f"It IS the project. bash/python run with cwd set here and "
-            f"read_file/write_file are confined to it (paths outside are rejected).\n"
-            f"When the user says \"the code\" / \"this project\" / \"the workspace\" "
-            f"or asks to review/find/edit something WITHOUT a path, they mean THIS "
-            f"folder. Do NOT ask the user for code or a path, and do NOT read a file "
-            f"literally named \"workspace\". ALWAYS start by exploring it yourself: "
-            f"run `bash` → `git ls-files` (or `ls -R`) to see the files, then "
-            f"read_file the relevant ones by path RELATIVE to the workspace."
-        )
-        if messages and messages[0].get("role") == "system":
-            messages[0]["content"] = _ws_note + "\n\n" + (messages[0].get("content") or "")
-        else:
-            messages.insert(0, {"role": "system", "content": _ws_note})
-        logger.info("[workspace] active for this turn: %s", workspace)
    if plan_mode and not guide_only:
        # Steer the model to investigate-then-propose. Hard tool gating handles
        # every write path except shell; this directive is what keeps the
@@ -2649,7 +2629,6 @@ async def stream_agent_loop(
                            tool_policy=tool_policy,
                            owner=owner,
                            progress_cb=_push_progress,
-                            workspace=workspace,
                        )
                    finally:
                        # Sentinel so the drainer knows to stop.
@@ -196,13 +196,22 @@ def _get_or_reset_collection(chroma_client, name: str, metadata: Dict[str, Any],
        try:
            chroma_client.delete_collection(name)
            restored = chroma_client.get_or_create_collection(name=name, metadata=current)
-            old_embeddings = preserved.get("embeddings") or []
-            if ids and docs and old_embeddings:
+            # chromadb returns embeddings as a numpy ndarray, whose truth value
+            # is ambiguous — `preserved.get("embeddings") or []` and a bare
+            # `if ... and old_embeddings:` both raise ValueError, which aborts
+            # the restore and loses the rows the reset was supposed to keep.
+            # Use explicit None/len checks instead.
+            old_embeddings = preserved.get("embeddings")
+            if old_embeddings is None:
+                old_embeddings = []
+            if ids and docs and len(old_embeddings):
                for start in range(0, len(ids), 100):
                    batch_ids = ids[start:start + 100]
                    batch_docs = docs[start:start + 100]
                    batch_metas = metas[start:start + 100]
                    batch_embeddings = old_embeddings[start:start + 100]
+                    if hasattr(batch_embeddings, "tolist"):
+                        batch_embeddings = batch_embeddings.tolist()
                    if len(batch_metas) < len(batch_ids):
                        batch_metas += [{}] * (len(batch_ids) - len(batch_metas))
                    restored.add(
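Review note: the failure mode described in the comment can be reproduced without chromadb or numpy. In the sketch below, `ArrayLike` is a hypothetical stand-in that imitates the relevant ndarray behavior in simplified form (truth-testing always raises `ValueError`, slicing works, `tolist()` exists), and `batch_embeddings` is an illustrative helper, not the project's function.

```python
class ArrayLike:
    """Hypothetical stand-in for an ndarray: truth-testing raises ValueError."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __bool__(self):
        raise ValueError("truth value of an array is ambiguous")

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, key):
        if isinstance(key, slice):
            return ArrayLike(self._rows[key])
        return self._rows[key]

    def tolist(self):
        return list(self._rows)


def batch_embeddings(ids, embeddings, size=100):
    """Yield-free batching that never truth-tests the embeddings object."""
    if embeddings is None:  # explicit None check instead of `or []`
        embeddings = []
    if not ids or not len(embeddings):  # len() instead of bare truthiness
        return []
    batches = []
    for start in range(0, len(ids), size):
        chunk = embeddings[start:start + size]
        if hasattr(chunk, "tolist"):
            chunk = chunk.tolist()  # hand plain lists to the consumer
        batches.append((ids[start:start + size], chunk))
    return batches
```

Writing `embeddings or []` with such an object raises before any batch is produced, which is the abort the hunk's comment describes.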
@@ -276,6 +276,24 @@ def _is_ollama_native_url(url: str) -> bool:
    return local_ollama_host and (path == "" or path == "/api" or path.startswith("/api/"))


+def _is_ollama_openai_compat_url(url: str) -> bool:
+    """Return True for local Ollama's OpenAI-compatible /v1 surface.
+
+    Mirrors the host detection used by ``_is_ollama_native_url`` so that the
+    two helpers stay in lockstep: a localhost Ollama on a non-default port
+    (custom ``OLLAMA_HOST``, reverse proxy, container port remap) is treated
+    the same way here as it is on the native ``/api`` path.
+    """
+    try:
+        parsed = urlparse(url or "")
+    except Exception:
+        return False
+    host = parsed.hostname or ""
+    path = (parsed.path or "").rstrip("/")
+    local_ollama_host = host in {"localhost", "127.0.0.1", "0.0.0.0", "::1"} or parsed.port == 11434
+    return local_ollama_host and (path == "/v1" or path.startswith("/v1/"))
+
+
 def _ollama_api_root(url: str) -> str:
    """Return a native Ollama API root such as https://ollama.com/api."""
    url = (url or "").strip().rstrip("/")
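Review note: a standalone sketch of the `/v1` detection added in this hunk, useful for checking edge cases by hand. `is_ollama_openai_compat_url` mirrors the helper's logic, with one assumption of this sketch: the `parsed.port` access sits inside the `try`, because `urllib.parse` raises `ValueError` for a non-numeric port when `.port` is read, not when the URL is parsed.

```python
from urllib.parse import urlparse

_LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}


def is_ollama_openai_compat_url(url: str) -> bool:
    """True for a local (or port-11434) Ollama OpenAI-compatible /v1 URL."""
    try:
        parsed = urlparse(url or "")
        host = parsed.hostname or ""
        port = parsed.port  # may raise ValueError for a malformed port
    except ValueError:
        return False
    path = (parsed.path or "").rstrip("/")
    local_ollama_host = host in _LOCAL_HOSTS or port == 11434
    return local_ollama_host and (path == "/v1" or path.startswith("/v1/"))
```

Note that the port rule means any host serving on 11434 is treated as Ollama, while a localhost URL on another port still qualifies through the host rule.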
@@ -426,6 +444,8 @@ def _detect_provider(url: str) -> str:
        return "openrouter"
    if _host_match(url, "groq.com"):
        return "groq"
+    if _host_match(url, "nvidia.com"):
+        return "nvidia"
    from src.chatgpt_subscription import is_chatgpt_subscription_base
    if is_chatgpt_subscription_base(url):
        return "chatgpt-subscription"
@@ -471,6 +491,7 @@ def _provider_label(url: str) -> str:
    if is_copilot_base(url): return "GitHub Copilot"
    if _host_match(url, "mistral.ai"): return "Mistral"
    if _host_match(url, "deepseek.com"): return "DeepSeek"
+    if _host_match(url, "nvidia.com"): return "NVIDIA"
    if _host_match(url, "googleapis.com"): return "Google"
    if _host_match(url, "together.xyz", "together.ai"): return "Together"
    if _host_match(url, "fireworks.ai"): return "Fireworks"
@@ -1344,6 +1365,9 @@ async def llm_call_async(
        if max_tokens and max_tokens > 0:
            tok_key = "max_completion_tokens" if _uses_max_completion_tokens(model) else "max_tokens"
            payload[tok_key] = max_tokens
+        # Suppress thinking for qwen3/gemma4 on Ollama /v1 — same as stream_llm.
+        if _is_ollama_openai_compat_url(url) and _supports_thinking(model):
+            payload["think"] = False

    if _is_host_dead(target_url):
        raise HTTPException(503, f"Upstream {_host_key(target_url)} marked unreachable (cooldown active)")
@@ -1461,6 +1485,11 @@ async def stream_llm(url: str, model: str, messages: List[Dict], temperature: fl
            payload[tok_key] = max_tokens
        if tools:
            payload["tools"] = tools
+        # For Ollama's OpenAI-compat /v1 endpoint with thinking models (qwen3,
+        # gemma4, etc.), suppress thinking so tool calls aren't swallowed inside
+        # <think> blocks. Ollama /v1 accepts "think": false as a top-level param.
+        if _is_ollama_openai_compat_url(url) and _supports_thinking(model):
+            payload["think"] = False
        h = _provider_headers(provider, headers)
        if provider == "copilot":
            from src.copilot import apply_request_headers
@@ -67,13 +67,12 @@ def _unified_diff(old: str, new: str, path: str) -> Optional[Dict[str, Any]]:
    }


-async def _do_edit_file(content: str, workspace: Optional[str] = None) -> Dict[str, Any]:
+async def _do_edit_file(content: str) -> Dict[str, Any]:
    """Exact string-replacement edit of an on-disk file.

    content is JSON: {"path", "old_string", "new_string", "replace_all"?}.
    Fails if old_string is missing or non-unique (unless replace_all) so the
    model can't silently edit the wrong place. Returns a unified diff for the UI.
-    Confined to the workspace when one is set (same policy as write_file).
    """
    try:
        args = json.loads(content) if content.strip().startswith("{") else {}
@@ -85,11 +84,9 @@ async def _do_edit_file(content: str, workspace: Optional[str] = None) -> Dict[s
    replace_all = bool(args.get("replace_all", False))
    if not raw_path:
        return {"error": "edit_file: path required", "exit_code": 1}
-    # Confine to the workspace when set, else the same allowlist + sensitive-file
-    # policy as read/write_file.
+    # Allowlist + sensitive-file policy as read/write_file.
    try:
-        path = (_resolve_tool_path_in_workspace(workspace, raw_path)
-                if workspace else _resolve_tool_path(raw_path))
+        path = _resolve_tool_path(raw_path)
    except ValueError as e:
        return {"error": f"edit_file: {e}", "exit_code": 1}
    if old == "":
@@ -272,39 +269,6 @@ def _resolve_tool_path(raw_path: str) -> str:
    )


-def _resolve_tool_path_in_workspace(workspace: str, raw_path: str) -> str:
-    """Confine a model-supplied path to the active workspace.
-
-    Layered on top of upstream's path policy: the workspace is the allowed
-    root (relative paths resolve under it; paths that escape it are rejected),
-    and the sensitive-file deny list (.ssh, .gnupg, id_rsa, …) still applies
-    inside it. When no workspace is set, callers use _resolve_tool_path (the
-    default data/tmp allowlist) instead.
-    """
-    if raw_path is None or not str(raw_path).strip():
-        raise ValueError("path is required")
-    base = os.path.realpath(workspace)
-    expanded = os.path.expanduser(str(raw_path).strip())
-    candidate = expanded if os.path.isabs(expanded) else os.path.join(base, expanded)
-    resolved = os.path.realpath(candidate)
-    if _is_sensitive_path(resolved):
-        raise ValueError(
-            f"path '{raw_path}' is inside a sensitive directory "
-            f"(e.g. .ssh, .gnupg) or matches a sensitive filename"
-        )
-    if resolved != base:
-        # normcase so containment holds on case-insensitive filesystems
-        # (Windows, default macOS): it lowercases on Windows and is a no-op on
-        # POSIX. commonpath raises ValueError across Windows drives (C: vs D:)
-        # or mixed abs/rel — both mean "outside", so the except rejects them.
-        nbase = os.path.normcase(base)
-        try:
-            if os.path.commonpath([os.path.normcase(resolved), nbase]) != nbase:
-                raise ValueError
-        except ValueError:
-            raise ValueError(f"path '{raw_path}' is outside the workspace ({workspace})")
-    return resolved
-
 # Bash + python tools used to share a single 60s timeout. That's
 # enough for one-shot commands but starves real workloads (pip
 # install, ffmpeg conversions, etc.) — and worse, the agent saw the
@@ -341,19 +305,13 @@ _CODENAV_MAX_HITS = 200
 _CODENAV_MAX_LINE = 400


-def _resolve_search_root(raw_path: str, workspace: Optional[str] = None) -> str:
+def _resolve_search_root(raw_path: str) -> str:
    """Resolve + confine a code-nav path (grep/glob/ls).

-    With a workspace set, the workspace folder is the root and supplied paths are
-    confined inside it (same policy as read_file). Without one, an empty path
-    defaults to the agent's primary root (project data dir) and a supplied path
-    is confined by the global allowlist + sensitive-file policy.
+    An empty path defaults to the agent's primary root (project data dir) and a
+    supplied path is confined by the global allowlist + sensitive-file policy.
    """
    raw = (raw_path or "").strip()
-    if workspace:
-        if not raw:
-            return os.path.realpath(workspace)
-        return _resolve_tool_path_in_workspace(workspace, raw)
    if not raw:
        roots = _tool_path_roots()
        return roots[0] if roots else os.path.realpath(".")
@@ -564,12 +522,11 @@ async def _call_mcp_tool(
    tool: str,
    content: str,
    progress_cb: Optional[Callable[[Dict], Awaitable[None]]] = None,
-    workspace: Optional[str] = None,
 ) -> Dict:
    """Route a legacy tool call through the MCP manager, with direct fallbacks."""
    mcp = get_mcp_manager()
    if not mcp:
-        return await _direct_fallback(tool, content, progress_cb=progress_cb, workspace=workspace) or {"error": f"MCP manager not available for tool '{tool}'", "exit_code": 1}
+        return await _direct_fallback(tool, content, progress_cb=progress_cb) or {"error": f"MCP manager not available for tool '{tool}'", "exit_code": 1}

    server_id, tool_name = _MCP_TOOL_MAP[tool]
    qualified = f"mcp__{server_id}__{tool_name}"
@@ -578,7 +535,7 @@ async def _call_mcp_tool(

    # If MCP server not connected, try direct fallback
    if isinstance(result, dict) and result.get("exit_code") == 1 and "not connected" in result.get("error", ""):
-        fallback = await _direct_fallback(tool, content, progress_cb=progress_cb, workspace=workspace)
+        fallback = await _direct_fallback(tool, content, progress_cb=progress_cb)
        if fallback:
            return fallback

@@ -636,7 +593,6 @@ async def _direct_fallback(
    tool: str,
    content: str,
    progress_cb: Optional[Callable[[Dict], Awaitable[None]]] = None,
-    workspace: Optional[str] = None,
 ) -> Optional[Dict]:
    """In-process execution path for the eight tools that used to live as
    stdio MCP servers under mcp_servers/. Those servers were deleted in
@@ -670,7 +626,7 @@ async def _direct_fallback(
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
                env=_subproc_env,
-                cwd=workspace or _AGENT_WORKDIR,
+                cwd=_AGENT_WORKDIR,
            )
            stdout, stderr, rc, timed_out = await _run_subprocess_streaming(
                proc,
@@ -697,7 +653,7 @@ async def _direct_fallback(
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
                env=_subproc_env,
-                cwd=workspace or _AGENT_WORKDIR,
+                cwd=_AGENT_WORKDIR,
            )
            stdout, stderr, rc, timed_out = await _run_subprocess_streaming(
                proc,
@@ -727,8 +683,7 @@ async def _direct_fallback(
                except (json.JSONDecodeError, TypeError, ValueError):
                    pass
            try:
-                path = (_resolve_tool_path_in_workspace(workspace, raw_path)
-                        if workspace else _resolve_tool_path(raw_path))
+                path = _resolve_tool_path(raw_path)
            except ValueError as e:
                return {"error": f"read_file: {e}", "exit_code": 1}
            try:
@@ -771,8 +726,7 @@ async def _direct_fallback(
            raw_path = lines[0].strip()
            body = lines[1] if len(lines) > 1 else ""
            try:
-                path = (_resolve_tool_path_in_workspace(workspace, raw_path)
-                        if workspace else _resolve_tool_path(raw_path))
+                path = _resolve_tool_path(raw_path)
            except ValueError as e:
                return {"error": f"write_file: {e}", "exit_code": 1}
            try:
@@ -825,7 +779,7 @@ async def _direct_fallback(
                max_hits = _CODENAV_MAX_HITS
            max_hits = max(1, min(max_hits, _CODENAV_MAX_HITS))
            try:
-                root = _resolve_search_root(str(args.get("path", "")), workspace)
+                root = _resolve_search_root(str(args.get("path", "")))
            except ValueError as e:
                return {"error": f"grep: {e}", "exit_code": 1}

@@ -909,7 +863,7 @@ async def _direct_fallback(
            if not pattern:
                return {"error": "glob: pattern is required", "exit_code": 1}
            try:
-                root = _resolve_search_root(str(args.get("path", "")), workspace)
+                root = _resolve_search_root(str(args.get("path", "")))
            except ValueError as e:
                return {"error": f"glob: {e}", "exit_code": 1}

@@ -956,7 +910,7 @@ async def _direct_fallback(
            else:
                raw_path = _s.split("\n", 1)[0].strip()
            try:
-                root = _resolve_search_root(raw_path, workspace)
+                root = _resolve_search_root(raw_path)
            except ValueError as e:
                return {"error": f"ls: {e}", "exit_code": 1}

@@ -1121,7 +1075,6 @@ async def execute_tool_block(
    tool_policy: Optional[ToolPolicy] = None,
    owner: Optional[str] = None,
    progress_cb: Optional[Callable[[Dict], Awaitable[None]]] = None,
-    workspace: Optional[str] = None,
 ) -> Tuple[str, Dict]:
    """Execute a single tool block. Returns (description, result_dict).

@@ -1296,7 +1249,7 @@ async def execute_tool_block(
        _is_bg, _bg_cmd = _split_bg_marker(content)
        if _is_bg and _bg_cmd:
            from src import bg_jobs
-            rec = bg_jobs.launch(_bg_cmd, session_id=session_id, cwd=workspace or _AGENT_WORKDIR)
+            rec = bg_jobs.launch(_bg_cmd, session_id=session_id, cwd=_AGENT_WORKDIR)
            short = _bg_cmd.strip().split(chr(10))[0][:80]
            desc = f"bash (background): {short}"
            result = {
@@ -1318,13 +1271,12 @@ async def execute_tool_block(
    if tool in _MCP_TOOL_MAP:
        first_line = content.split(chr(10))[0][:80]
        desc = f"{tool}: {first_line}"
-        result = await _call_mcp_tool(tool, content, progress_cb=progress_cb, workspace=workspace)
+        result = await _call_mcp_tool(tool, content, progress_cb=progress_cb)
    elif tool in ("grep", "glob", "ls"):
        # Code-navigation tools — no MCP server; run the direct implementation.
-        # Confined to the workspace when one is set (same policy as read_file).
        first_line = content.split(chr(10))[0][:80]
        desc = f"{tool}: {first_line}"
-        result = await _direct_fallback(tool, content, progress_cb=progress_cb, workspace=workspace) \
+        result = await _direct_fallback(tool, content, progress_cb=progress_cb) \
            or {"error": f"{tool}: execution failed", "exit_code": 1}
    elif tool == "create_document":
        title = content.split("\n")[0].strip()[:60]
@@ -1429,7 +1381,7 @@ async def execute_tool_block(
        desc = "edit_image"
        result = await do_edit_image(content, owner=owner)
    elif tool == "edit_file":
-        result = await _do_edit_file(content, workspace=workspace)
+        result = await _do_edit_file(content)
        desc = result.get("output") or result.get("error") or "edit_file"
    elif tool == "trigger_research":
        desc = "trigger_research"
@@ -2095,6 +2095,7 @@
                  <option value="https://opencode.ai/zen/v1" data-logo="opencode">OpenCode Zen</option>
                  <option value="https://opencode.ai/zen/go/v1" data-logo="opencode">OpenCode Go</option>
                  <option value="https://api.z.ai/api/coding/paas/v4" data-logo="zhipu">Z.AI Coding Plan</option>
+                  <option value="https://integrate.api.nvidia.com/v1" data-logo="nvidia">NVIDIA</option>
                </select>
                <!-- API key row stays in DOM, hidden until Key button is
                     clicked. Mirrors the Local section pattern: most users
@@ -118,6 +118,7 @@ const _ENDPOINT_LABELS = [
  [/(^|\.)together\.(ai|xyz)$/i, "Together"],
  [/(^|\.)fireworks\.ai$/i, "Fireworks"],
  [/(^|\.)perplexity\.ai$/i, "Perplexity"],
+  [/(^|\.)nvidia\.com$/i, "NVIDIA"],
  [/(^|\.)x\.ai$/i, "xAI"],
 ];

@@ -43,6 +43,7 @@ const PROVIDER_PATTERNS = [
  { re: /^gsk_/,             name: 'Groq',       url: 'https://api.groq.com/openai/v1' },
  { re: /^AIza/,             name: 'Gemini',     url: 'https://generativelanguage.googleapis.com/v1beta/openai' },
  { re: /^xai-/,             name: 'xAI',        url: 'https://api.x.ai/v1' },
+  { re: /^nvapi-/,           name: 'NVIDIA',     url: 'https://integrate.api.nvidia.com/v1' },
 ];
 const SETUP_PROVIDER_URLS = {
  deepseek: { name: 'DeepSeek', url: 'https://api.deepseek.com/v1' },
@@ -56,8 +57,9 @@ const SETUP_PROVIDER_URLS = {
  google: { name: 'Gemini', url: 'https://generativelanguage.googleapis.com/v1beta/openai' },
  'opencode-zen': { name: 'OpenCode Zen', url: 'https://opencode.ai/zen/v1' },
  'opencode-go': { name: 'OpenCode Go', url: 'https://opencode.ai/zen/go/v1' },
+  nvidia: { name: 'NVIDIA', url: 'https://integrate.api.nvidia.com/v1' },
 };
-const SETUP_PROVIDER_NAMES = ['deepseek', 'openai', 'openrouter', 'ollama', 'xai', 'anthropic', 'groq', 'gemini', 'opencode-zen', 'opencode-go'];
+const SETUP_PROVIDER_NAMES = ['deepseek', 'openai', 'openrouter', 'ollama', 'xai', 'anthropic', 'groq', 'gemini', 'opencode-zen', 'opencode-go', 'nvidia'];
 const SETUP_DEVICE_AUTH_PROVIDERS = [
  { key: 'copilot', name: 'GitHub Copilot', aliases: ['github'], command: '/setup copilot' },
  { key: 'chatgpt-subscription', name: 'ChatGPT Subscription', aliases: ['chatgptsubscription', 'chatgpt-sub', 'codex'], command: '/setup chatgpt-subscription' },
@@ -97,6 +99,7 @@ function _setupProviderFromInput(input) {
    google: 'gemini',
    xai: 'xai',
    grok: 'xai',
+    nvidia: 'nvidia',
  };
  return SETUP_PROVIDER_URLS[aliases[raw] || raw] || null;
 }
@@ -124,6 +127,7 @@ function _extractSetupProviderCredential(input) {
    ['groq', 'groq'],
    ['google', 'gemini'], ['gemini', 'gemini'],
    ['x ai', 'xai'], ['xai', 'xai'], ['grok', 'xai'],
+    ['nvidia', 'nvidia'],
  ];
  for (const [alias, key] of providerAliases) {
    const re = new RegExp('(^|\\s|[,;:])(' + alias.replace(/\s+/g, '\\s+') + ')(?=$|\\s|[,;:])', 'i');
@@ -0,0 +1,112 @@
+"""Regression test for routes/backup_routes.py import_data skills dedup.
+
+BUG: the skills import block deduplicates against EVERY tenant's skills
+(skills_manager.load_all()) instead of the importing user's own skills.
+So importing your own backup silently drops any skill whose title (or id)
+collides with ANOTHER user's skill — the same cross-tenant data-loss bug
+that was already fixed for memories in the block just above.
+"""
+import pytest
+
+from fastapi import FastAPI, Request
+from fastapi.testclient import TestClient
+import routes.backup_routes as backup_routes
+from routes.backup_routes import setup_backup_routes
+
+# require_admin / get_current_user are bound into routes.backup_routes at import
+# time (`from x import name`). We patch them on that module directly per-test
+# via monkeypatch — robust to import order and reverted at teardown. (Stubbing
+# them through sys.modules only works if backup_routes has not been imported
+# yet, which is not guaranteed in a full-suite run.)
+
+
+class FakeMemoryManager:
+    def __init__(self):
+        self.rows = []
+
+    def load(self, owner=None):
+        return [r for r in self.rows if r.get("owner") == owner]
+
+    def load_all(self):
+        return list(self.rows)
+
+    def save(self, rows):
+        self.rows = list(rows)
+
+
+class FakePresetManager:
+    def get_all(self):
+        return {}
+
+    def save(self, d):
+        pass
+
+
+class FakeSkillsManager:
+    """Mimics services.memory.skills: load_all() = all owners,
+    load(owner) = that owner's skills only."""
+
+    def __init__(self, rows):
+        self.rows = list(rows)
+
+    def load(self, owner=None):
+        return [s for s in self.rows if s.get("owner") == owner]
+
+    def load_all(self):
+        return list(self.rows)
+
+    def save(self, rows):
+        self.rows = list(rows)
+
+    def add_skill(self, title=None, name=None, owner=None, **kwargs):
+        # Mirrors services.memory.skills.add_skill: persists a SKILL.md row and
+        # returns its identity. source="user" skips auto-dedup, so no _deduped.
+        entry = {"id": f"new-{len(self.rows)}", "title": title, "name": name, "owner": owner}
+        self.rows.append(entry)
+        return {"name": name, "id": entry["id"]}
+
+
+def _make_client(skills_mgr, monkeypatch):
+    # Bypass the admin gate and read the importer straight off request.state.
+    monkeypatch.setattr(backup_routes, "require_admin", lambda *a, **k: None)
+    monkeypatch.setattr(backup_routes, "get_current_user",
+                        lambda req: getattr(req.state, "user", None))
+    app = FastAPI()
+
+    @app.middleware("http")
+    async def _set_user(request: Request, call_next):
+        request.state.user = "alice"
+        return await call_next(request)
+
+    router = setup_backup_routes(FakeMemoryManager(), FakePresetManager(), skills_mgr)
+    app.include_router(router)
+    return TestClient(app)
+
+
+def test_import_skill_not_dropped_by_other_users_title_collision(monkeypatch):
+    # Bob already owns a skill titled "Deploy". Alice (the importer) has none.
+    skills_mgr = FakeSkillsManager([
+        {"id": "bob-1", "title": "Deploy", "name": "Deploy", "owner": "bob"},
+    ])
+    client = _make_client(skills_mgr, monkeypatch)
+
+    # Alice imports HER OWN backup containing a skill also titled "Deploy".
+    payload = {
+        "skills": [
+            {"id": "alice-1", "title": "Deploy", "name": "Deploy"},
+        ],
+    }
+    resp = client.post("/api/import", json=payload)
+    assert resp.status_code == 200, resp.text
+
+    # Alice's skill must have been imported and assigned to her.
+    alice_skills = skills_mgr.load(owner="alice")
+    titles = {s["title"] for s in alice_skills}
+    assert "Deploy" in titles, (
+        "Alice's own 'Deploy' skill was silently dropped because Bob owns a "
+        "skill with the same title (cross-tenant dedup bug)."
+    )
+
+
+if __name__ == "__main__":
+    raise SystemExit(pytest.main([__file__, "-v"]))
@@ -22,10 +22,12 @@ from routes.cookbook_helpers import (
    _user_shell_path_bootstrap,
    _venv_safe_local_pip_install_cmd,
    _validate_gpus,
+    _validate_local_dir,
    _validate_repo_id,
    _validate_serve_cmd,
    _validate_serve_model_id,
    _validate_ssh_port,
+    _shell_path,
    run_ssh_command_async,
 )

@@ -110,6 +112,89 @@ def test_validate_ssh_port_rejects_shell_payload():
    assert _validate_ssh_port("2222") == "2222"


+def test_validate_local_dir_accepts_external_drive_paths_with_spaces():
+    path = "/Volumes/T7 2TB/AI Models/llamacpp"
+
+    assert _validate_local_dir(path) == path
+    assert _validate_local_dir(f'"{path}"') == path
+    assert _shell_path(f"{path}/Qwen3-8B") == '"/Volumes/T7 2TB/AI Models/llamacpp/Qwen3-8B"'
+
+
+def test_validate_local_dir_accepts_windows_drive_paths_with_spaces():
+    backslash_path = r"D:\AI Models\llamacpp"
+    slash_path = "D:/AI Models/llamacpp"
+
+    assert _validate_local_dir(backslash_path) == backslash_path
+    assert _validate_local_dir(f"'{backslash_path}'") == backslash_path
+    assert _validate_local_dir(slash_path) == slash_path
+    assert _shell_path(backslash_path + r"\Qwen3-8B") == '"D:\\AI Models\\llamacpp\\Qwen3-8B"'
+
+
+def test_validate_local_dir_still_rejects_shell_metacharacters():
+    for path in [
+        "/Volumes/T7 2TB/AI Models; touch /tmp/pwned",
+        "/Volumes/T7 2TB/AI Models/$(touch pwned)",
+        "/Volumes/T7 2TB/AI Models/`touch pwned`",
+        "/Volumes/T7 2TB/AI Models/model\nnext",
+    ]:
+        with pytest.raises(HTTPException):
+            _validate_local_dir(path)
+
+
+def test_validate_local_dir_rejects_windows_shell_metacharacters():
+    for path in [
+        r"D:\AI Models\llamacpp; touch C:\pwned",
+        r"D:\AI Models\llamacpp\$(touch pwned)",
+        r"D:\AI Models\llamacpp\`touch pwned`",
+        "D:\\AI Models\\llamacpp\nnext",
+    ]:
+        with pytest.raises(HTTPException):
+            _validate_local_dir(path)
+
+
+def test_validate_local_dir_accepts_non_ascii_unicode_paths():
+    # Folder names are routinely non-ASCII on localized systems; the validator
+    # must accept them the same way it accepts spaces (the old ASCII-only
+    # allowlist rejected both spaces and non-ASCII characters).
+    for path in [
+        "/Volumes/Модели/llamacpp",   # Cyrillic (POSIX / external drive)
+        "/home/josé/models",          # accented Latin
+        "/Volumes/モデル/llm",         # CJK
+        r"D:\AI Models\Модели",       # Cyrillic (Windows drive path)
+    ]:
+        assert _validate_local_dir(path) == path
+
+
+def test_validate_local_dir_rejects_metacharacters_in_unicode_paths():
+    # Widening the allowlist to Unicode must not reopen the injection surface:
+    # shell metacharacters stay rejected even alongside non-ASCII segments.
+    for path in [
+        "/Volumes/Модели; touch /tmp/pwned",
+        "/Volumes/Модели/$(touch pwned)",
+        "/Volumes/Модели/`touch pwned`",
+        "/Volumes/Модели/a|b",
+        "/Volumes/Модели\nnext",
+        r"D:\Модели\llamacpp & calc.exe",
+    ]:
+        with pytest.raises(HTTPException):
+            _validate_local_dir(path)
+
+
+def test_validate_local_dir_rejects_leading_dash_segments():
+    # A path segment starting with '-' could be parsed as a CLI option by hf/etc.
+    # (option injection) even when quoted, since quoting doesn't stop a value from
+    # being read as a flag. The validator must reject it on every platform.
+    for path in [
+        "/models/-rf",
+        "/models/-rf/llamacpp",
+        "/-oStrictHostKeyChecking=no",
+        r"D:\models\-rf",
+        "D:/models/-rf",
+    ]:
+        with pytest.raises(HTTPException):
+            _validate_local_dir(path)
+
+
 def test_validate_gpus_accepts_indexes_only():
    assert _validate_gpus("0,1,2") == "0,1,2"
    with pytest.raises(HTTPException):
@@ -0,0 +1,68 @@
+"""Embedding-lane reset must restore rows even when chromadb returns the
+preserved embeddings as a numpy ndarray.
+
+In real chromadb, collection.get(include=["embeddings"]) returns the embeddings
+as a numpy ndarray. The restore-after-failed-rewrite path used `embeddings or []`
+and a bare `if ... and embeddings:`, both of which raise ValueError ("truth value
+of an array ... is ambiguous") on a multi-element ndarray, aborting the restore
+and wiping the collection the reset was meant to preserve.
+
+This mirrors test_lane_reset_restores_existing_collection_when_rewrite_fails
+in test_embedding_lanes.py, but the preserved embeddings come back as ndarray.
+"""
+import numpy as np
+
+from src.embedding_lanes import build_embedding_lanes
+from tests.test_embedding_lanes import FakeChroma, FakeEmbedder, _patch_chroma
+
+
+def test_lane_reset_restores_when_chroma_returns_numpy_embeddings(monkeypatch):
+    fake = FakeChroma()
+    old_custom = fake.get_or_create_collection(
+        "odysseus_memories_custom",
+        metadata={
+            "embedding_lane": "custom",
+            "embedding_dimension": 384,
+            "embedding_fingerprint": "old",
+        },
+    )
+    old_custom.add(
+        ids=["existing-memory"],
+        embeddings=[[0.0] * 384],
+        documents=["existing custom memory"],
+        metadatas=[{"source": "memory"}],
+    )
+
+    # Make the preserved embeddings come back as a numpy ndarray, like real
+    # chromadb does.
+    real_get = old_custom.get
+
+    def ndarray_get(*args, **kwargs):
+        result = real_get(*args, **kwargs)
+        result["embeddings"] = np.array(result["embeddings"])
+        return result
+
+    old_custom.get = ndarray_get
+
+    # Force the post-reset rewrite to fail so the restore branch runs.
+    fake.fail_next_add_for["odysseus_memories_custom"] = 1
+    _patch_chroma(monkeypatch, fake)
+
+    import src.embedding_lanes as lanes
+
+    monkeypatch.setattr(lanes, "_build_custom_client", lambda: FakeEmbedder(768, "nomic", "http://embeddings/v1"))
+
+    def fail_fastembed():
+        raise RuntimeError("fastembed missing")
+
+    monkeypatch.setattr(lanes, "_build_fastembed_client", fail_fastembed)
+
+    built = build_embedding_lanes("odysseus_memories")
+
+    # Both lanes are unavailable, but the existing row must survive — not be
+    # wiped by an ndarray-truthiness crash in the restore path.
+    assert built == []
+    restored = fake.collections["odysseus_memories_custom"]
+    assert restored.count() == 1
+    assert restored.get()["ids"] == ["existing-memory"]
+    assert len(restored.rows["existing-memory"]["embedding"]) == 384
@@ -0,0 +1,165 @@
+"""Tests for Ollama /v1 thinking-suppression helpers.
+
+Covers:
+- _is_ollama_openai_compat_url: URL classification (local host + /v1 path)
+- think: false is injected into the payload for Ollama /v1 thinking models
+- think: false is NOT injected for non-thinking models or non-Ollama /v1 endpoints
+"""
+import asyncio
+import json
+
+from src import llm_core
+
+
+# ---------------------------------------------------------------------------
+# Fake HTTP client — captures the outgoing payload without network I/O
+# ---------------------------------------------------------------------------
+
+class _FakeResp:
+    status_code = 200
+
+    async def aiter_lines(self):
+        # Yield a minimal done event so stream_llm exits cleanly
+        yield json.dumps({"choices": [{"delta": {"content": "ok"}, "finish_reason": "stop"}]})
+        yield "data: [DONE]"
+
+    async def aread(self):
+        return b""
+
+
+class _FakeStreamCtx:
+    def __init__(self, captured):
+        self._captured = captured
+
+    async def __aenter__(self):
+        return _FakeResp()
+
+    async def __aexit__(self, *a):
+        return False
+
+
+class _FakeClient:
+    """Minimal stand-in for httpx.AsyncClient that captures request payload."""
+
+    def __init__(self):
+        self.captured_payload = {}
+
+    def stream(self, method, url, **kw):
+        self.captured_payload = kw.get("json") or {}
+        return _FakeStreamCtx(self.captured_payload)
+
+
+def _capture_payload(monkeypatch, url, model):
+    """Run stream_llm, intercept the HTTP payload, and return it."""
+    client = _FakeClient()
+    monkeypatch.setattr(llm_core, "_get_http_client", lambda: client)
+    monkeypatch.setattr(llm_core, "_is_host_dead", lambda u: False)
+    monkeypatch.setattr(llm_core, "note_model_activity", lambda *a, **k: None)
+    monkeypatch.setattr(llm_core, "_clear_host_dead", lambda *a, **k: None)
+    monkeypatch.setattr(llm_core, "get_context_length", lambda u, m: 32768)
+
+    async def run():
+        return [c async for c in llm_core.stream_llm(
+            url, model, [{"role": "user", "content": "hi"}],
+        )]
+
+    asyncio.run(run())
+    return client.captured_payload
+
+
+# ---------------------------------------------------------------------------
+# _is_ollama_openai_compat_url — pure function, no I/O
+# ---------------------------------------------------------------------------
+
+class TestIsOllamaOpenAICompatUrl:
+    """Unit tests for the URL classifier that gates think-suppression."""
+
+    # Positive cases — should be True
+    def test_default_port_v1_root(self):
+        assert llm_core._is_ollama_openai_compat_url("http://127.0.0.1:11434/v1")
+
+    def test_default_port_chat_completions(self):
+        assert llm_core._is_ollama_openai_compat_url("http://127.0.0.1:11434/v1/chat/completions")
+
+    def test_localhost_default_port(self):
+        assert llm_core._is_ollama_openai_compat_url("http://localhost:11434/v1")
+
+    def test_localhost_default_port_with_path(self):
+        assert llm_core._is_ollama_openai_compat_url("http://localhost:11434/v1/chat/completions")
+
+    def test_loopback_ipv6(self):
+        # IPv6 addresses in URLs require square brackets per RFC 3986
+        assert llm_core._is_ollama_openai_compat_url("http://[::1]:11434/v1")
+
+    def test_any_local_non_default_port(self):
+        """Localhost on a non-default port (custom OLLAMA_HOST) must also match."""
+        assert llm_core._is_ollama_openai_compat_url("http://127.0.0.1:11435/v1")
+
+    def test_localhost_non_default_port(self):
+        assert llm_core._is_ollama_openai_compat_url("http://localhost:8080/v1/chat/completions")
+
+    def test_zero_dot_zero_host(self):
+        assert llm_core._is_ollama_openai_compat_url("http://0.0.0.0:11434/v1")
+
+    # Negative cases — should be False
+    def test_openai_api_v1(self):
+        """Real OpenAI endpoint must never match, even though path is /v1."""
+        assert not llm_core._is_ollama_openai_compat_url("https://api.openai.com/v1")
+
+    def test_openai_chat_completions(self):
+        assert not llm_core._is_ollama_openai_compat_url("https://api.openai.com/v1/chat/completions")
+
+    def test_ollama_native_api_path(self):
+        """The native /api path is a different surface and must not match /v1."""
+        assert not llm_core._is_ollama_openai_compat_url("http://localhost:11434/api")
+
+    def test_ollama_native_api_chat(self):
+        assert not llm_core._is_ollama_openai_compat_url("http://localhost:11434/api/chat")
+
+    def test_remote_openrouter(self):
+        assert not llm_core._is_ollama_openai_compat_url("https://openrouter.ai/api/v1")
+
+    def test_empty_string(self):
+        assert not llm_core._is_ollama_openai_compat_url("")
+
+    def test_none_like_empty(self):
+        assert not llm_core._is_ollama_openai_compat_url(None)  # type: ignore[arg-type]
+
+
+# ---------------------------------------------------------------------------
+# Payload injection — think: false only when both conditions hold
+# ---------------------------------------------------------------------------
+
+class TestThinkSuppression:
+    """Assert think:false is present/absent in the outgoing HTTP payload."""
+
+    def test_think_false_for_ollama_v1_thinking_model(self, monkeypatch):
+        """think:false must be set for qwen3 on Ollama /v1."""
+        payload = _capture_payload(
+            monkeypatch, "http://127.0.0.1:11434/v1/chat/completions", "qwen3:14b"
+        )
+        assert payload.get("think") is False
+
+    def test_no_think_for_ollama_v1_non_thinking_model(self, monkeypatch):
+        """think must NOT be set for a plain (non-thinking) model on Ollama /v1."""
+        payload = _capture_payload(
+            monkeypatch, "http://127.0.0.1:11434/v1/chat/completions", "llama3.2:3b"
+        )
+        assert "think" not in payload
+
+    def test_no_think_for_openai_endpoint_with_thinking_model_name(self, monkeypatch):
+        """think must NOT leak to a real OpenAI endpoint even if the model name
+        matches a thinking pattern — the URL guard is what matters."""
+        payload = _capture_payload(
+            monkeypatch, "https://api.openai.com/v1/chat/completions", "qwen3:14b"
+        )
+        assert "think" not in payload
+
+    def test_think_false_for_non_default_port_thinking_model(self, monkeypatch):
+        """Custom-port localhost Ollama (e.g. OLLAMA_HOST=0.0.0.0:11435) must
+        also receive think:false — this is the regression guarded by the
+        host-set check added in this fix."""
+        payload = _capture_payload(
+            monkeypatch, "http://127.0.0.1:11435/v1/chat/completions", "qwen3:14b"
+        )
+        assert payload.get("think") is False
@@ -347,6 +347,8 @@ class TestIsChatModel:
        "gpt-4o", "gpt-4o-mini", "claude-sonnet-4", "llama-3.3-70b",
        "deepseek-chat", "gemini-2.0-flash", "o3",
        "llama-4-scout-17b-16e-instruct",
+        "gemma-2b-it", "google/gemma-2b-it",
+        "bigcode/starcoder2-15b-instruct",
    ])
    def test_chat_models(self, model_id):
        assert _is_chat_model(model_id) is True
@@ -40,6 +40,7 @@ class TestDetectProvider:
        ("https://anthropic.com/v1", "anthropic"),
        ("https://openrouter.ai/api/v1", "openrouter"),
        ("https://api.groq.com/openai/v1", "groq"),
+        ("https://integrate.api.nvidia.com/v1", "nvidia"),
        ("http://localhost:11434/api", "ollama"),
        ("https://ollama.com", "ollama"),
        # xAI, DeepSeek and Gemini's OpenAI-compatible surface are NOT
@@ -84,6 +85,7 @@ class TestProviderLabel:
        ("https://api.openai.com/v1", "OpenAI"),
        ("https://openrouter.ai/api/v1", "OpenRouter"),
        ("https://api.groq.com/openai/v1", "Groq"),
+        ("https://integrate.api.nvidia.com/v1", "NVIDIA"),
        ("https://api.mistral.ai/v1", "Mistral"),
        ("https://api.deepseek.com", "DeepSeek"),
        ("https://generativelanguage.googleapis.com/v1beta/openai", "Google"),
@@ -50,6 +50,9 @@ PROVIDER_CASES = [
    ("groq", "https://api.groq.com/openai/v1",
     "https://api.groq.com/openai/v1/chat/completions",
     "https://api.groq.com/openai/v1/models"),
+    ("nvidia", "https://integrate.api.nvidia.com/v1",
+     "https://integrate.api.nvidia.com/v1/chat/completions",
+     "https://integrate.api.nvidia.com/v1/models"),
    ("xai", "https://api.x.ai/v1",
     "https://api.x.ai/v1/chat/completions",
     "https://api.x.ai/v1/models"),
@@ -112,6 +115,7 @@ def test_headers_anthropic_without_key_still_sends_version():
    "https://api.x.ai/v1",
    "https://api.deepseek.com",
    "https://api.groq.com/openai/v1",
+    "https://integrate.api.nvidia.com/v1",
    "https://generativelanguage.googleapis.com/v1beta/openai",
 ])
 def test_headers_openai_style_use_bearer(base):
@@ -0,0 +1,384 @@
+"""Renaming a user must update all three owner caches, not just the SQL DB.
+
+The DB owner-rename loop in the rename_user route updates every SQL-backed
+owner column, but three file-backed / in-memory stores are left stale:
+
+1. session_manager.sessions  — in-memory session objects carry s.owner set at
+   load time; get_sessions_for_user does an exact `s.owner == username` check,
+   so the renamed user's sidebar empties until a server restart.
+
+2. data/deep_research/*.json  — each report JSON has an `owner` field;
+   research_routes filters by `d.get("owner") == user`, making every report
+   invisible after rename.
+
+3. data/memory.json  — a flat array where every entry has an `owner` field;
+   memory_manager.load(owner=user) filters on it, so all memories vanish.
+
+Regression coverage: tests that mock the DB loop without exercising the route's
+file and cache updates cannot catch these bugs.
+"""
+import asyncio
+import json
+import sys
+import types
+from pathlib import Path
+from types import SimpleNamespace
+from unittest.mock import MagicMock
+
+import pytest
+
+
+def _route(router, name):
+    for r in router.routes:
+        if getattr(getattr(r, "endpoint", None), "__name__", "") == name:
+            return r.endpoint
+    raise AssertionError(name)
+
+
+@pytest.fixture
+def rename_endpoint(monkeypatch, tmp_path):
+    import routes.auth_routes as ar
+    import core.database as cdb
+
+    # Neutralize the DB owner-rename loop.
+    monkeypatch.setattr(cdb, "SessionLocal", lambda: MagicMock())
+    monkeypatch.setattr(cdb, "Base", SimpleNamespace(registry=SimpleNamespace(mappers=[])), raising=False)
+    # Neutralize the JSON-prefs rename.
+    pr = types.ModuleType("routes.prefs_routes")
+    pr._load = lambda: {}
+    pr._save = lambda d: None
+    monkeypatch.setitem(sys.modules, "routes.prefs_routes", pr)
+    # Patch the module-level constants so file-update steps write to tmp_path.
+    # (Patching sc.DATA_DIR wouldn't work — auth_routes binds DEEP_RESEARCH_DIR
+    # and MEMORY_FILE at import time, so we must patch those names on the module.)
+    monkeypatch.setattr(ar, "DEEP_RESEARCH_DIR", str(tmp_path / "deep_research"))
+    monkeypatch.setattr(ar, "MEMORY_FILE", str(tmp_path / "memory.json"))
+    monkeypatch.setattr(ar, "SKILLS_DIR", str(tmp_path / "skills"))
+
+    am = MagicMock()
+    am.is_admin.return_value = True
+    am.get_username_for_token.return_value = "admin"
+    am.users = {"alice": {}}
+    am.rename_user.return_value = True
+    return _route(ar.setup_auth_routes(am), "rename_user"), am, tmp_path
+
+
+def _request(tmp_path, session_manager=None):
+    state = SimpleNamespace(
+        invalidate_token_cache=lambda: None,
+        session_manager=session_manager,
+    )
+    return SimpleNamespace(
+        cookies={"odysseus_session": "t"},
+        app=SimpleNamespace(state=state),
+        state=SimpleNamespace(current_user="admin"),
+    )
+
+
+# ---------------------------------------------------------------------------
+# 1. In-memory session cache
+# ---------------------------------------------------------------------------
+
+def test_rename_updates_in_memory_session_owner(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    # Build a fake session_manager with one session owned by alice.
+    sess = SimpleNamespace(owner="alice")
+    sm = SimpleNamespace(sessions={"s1": sess})
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path, sm)))
+
+    assert sess.owner == "alice2", "in-memory session owner was not updated on rename"
+
+
+def test_rename_session_owner_case_insensitive(rename_endpoint):
+    """Stored owner 'Alice' (mixed case) must match rename of 'alice'."""
+    endpoint, _am, tmp_path = rename_endpoint
+
+    sess = SimpleNamespace(owner="Alice")
+    sm = SimpleNamespace(sessions={"s1": sess})
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="bob"), _request(tmp_path, sm)))
+
+    assert sess.owner == "bob"
+
+
+def test_rename_leaves_other_sessions_untouched(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    sess_alice = SimpleNamespace(owner="alice")
+    sess_other = SimpleNamespace(owner="carol")
+    sm = SimpleNamespace(sessions={"s1": sess_alice, "s2": sess_other})
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path, sm)))
+
+    assert sess_alice.owner == "alice2"
+    assert sess_other.owner == "carol", "unrelated session owner was modified"
+
+
+def test_rename_no_session_manager_does_not_crash(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+    # app.state without a session_manager must not raise.
+    req = SimpleNamespace(
+        cookies={"odysseus_session": "t"},
+        app=SimpleNamespace(state=SimpleNamespace(invalidate_token_cache=lambda: None)),
+        state=SimpleNamespace(current_user="admin"),
+    )
+    res = asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), req))
+    assert res["ok"] is True
+
+
+# ---------------------------------------------------------------------------
+# 2. deep_research JSON files
+# ---------------------------------------------------------------------------
+
+def test_rename_updates_research_json_owner(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    dr_dir = tmp_path / "deep_research"
+    dr_dir.mkdir()
+    report = {"query": "test", "owner": "alice", "status": "done"}
+    p = dr_dir / "abc123.json"
+    p.write_text(json.dumps(report), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    updated = json.loads(p.read_text(encoding="utf-8"))
+    assert updated["owner"] == "alice2", "deep_research JSON owner was not updated on rename"
+
+
+def test_rename_research_json_case_insensitive(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    dr_dir = tmp_path / "deep_research"
+    dr_dir.mkdir()
+    p = (dr_dir / "r1.json")
+    p.write_text(json.dumps({"owner": "Alice"}), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="bob"), _request(tmp_path)))
+
+    assert json.loads(p.read_text())["owner"] == "bob"
+
+
+def test_rename_leaves_other_research_untouched(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    dr_dir = tmp_path / "deep_research"
+    dr_dir.mkdir()
+    p_alice = dr_dir / "a.json"
+    p_carol = dr_dir / "c.json"
+    p_alice.write_text(json.dumps({"owner": "alice"}), encoding="utf-8")
+    p_carol.write_text(json.dumps({"owner": "carol"}), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    assert json.loads(p_alice.read_text())["owner"] == "alice2"
+    assert json.loads(p_carol.read_text())["owner"] == "carol"
+
+
+def test_rename_no_deep_research_dir_does_not_crash(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+    # No deep_research dir — must not crash.
+    res = asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+    assert res["ok"] is True
+
+
+def test_rename_research_respects_custom_data_dir(monkeypatch, tmp_path):
+    """DEEP_RESEARCH_DIR (which honours ODYSSEUS_DATA_DIR) is used, not a
+    hardcoded relative path. Before the fix, setting ODYSSEUS_DATA_DIR made
+    the rename silently patch a different directory from where research files
+    actually live, so reports still disappeared after rename."""
+    import routes.auth_routes as ar
+    import core.database as cdb
+
+    custom_dr = tmp_path / "custom_data" / "deep_research"
+    custom_dr.mkdir(parents=True)
+    p = custom_dr / "rp-abc.json"
+    p.write_text(json.dumps({"query": "q", "owner": "alice", "status": "done"}), encoding="utf-8")
+
+    monkeypatch.setattr(cdb, "SessionLocal", lambda: MagicMock())
+    monkeypatch.setattr(cdb, "Base", SimpleNamespace(registry=SimpleNamespace(mappers=[])), raising=False)
+    pr = types.ModuleType("routes.prefs_routes")
+    pr._load = lambda: {}
+    pr._save = lambda d: None
+    monkeypatch.setitem(sys.modules, "routes.prefs_routes", pr)
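+    monkeypatch.setattr(ar, "DEEP_RESEARCH_DIR", str(custom_dr))
+    monkeypatch.setattr(ar, "MEMORY_FILE", str(tmp_path / "memory.json"))
+    # Keep the skills step inside tmp_path too, as the rename_endpoint fixture does.
+    monkeypatch.setattr(ar, "SKILLS_DIR", str(tmp_path / "skills"))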
+
+    am = MagicMock()
+    am.is_admin.return_value = True
+    am.get_username_for_token.return_value = "admin"
+    am.users = {"alice": {}}
+    am.rename_user.return_value = True
+    endpoint = _route(ar.setup_auth_routes(am), "rename_user")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    assert json.loads(p.read_text(encoding="utf-8"))["owner"] == "alice2", (
+        "research JSON at custom DATA_DIR was not patched — DEEP_RESEARCH_DIR constant not used"
+    )
+
+
+# ---------------------------------------------------------------------------
+# 3. memory.json
+# ---------------------------------------------------------------------------
+
+def test_rename_updates_memory_json_owner(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    entries = [
+        {"id": "1", "text": "Lives in Berlin", "owner": "alice"},
+        {"id": "2", "text": "Likes Python",    "owner": "carol"},
+    ]
+    (tmp_path / "memory.json").write_text(json.dumps(entries), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    updated = json.loads((tmp_path / "memory.json").read_text(encoding="utf-8"))
+    assert updated[0]["owner"] == "alice2", "memory.json entry owner was not updated on rename"
+    assert updated[1]["owner"] == "carol",  "unrelated memory entry was modified"
+
+
+def test_rename_memory_json_case_insensitive(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    entries = [{"id": "1", "text": "x", "owner": "Alice"}]
+    (tmp_path / "memory.json").write_text(json.dumps(entries), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="bob"), _request(tmp_path)))
+
+    assert json.loads((tmp_path / "memory.json").read_text())[0]["owner"] == "bob"
+
+
+def test_rename_no_memory_json_does_not_crash(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+    # No memory.json — must not crash.
+    res = asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+    assert res["ok"] is True
+
+
+# ---------------------------------------------------------------------------
+# 4. Skills (SKILL.md frontmatter + _usage.json sidecar)
+# ---------------------------------------------------------------------------
+
+_SKILL_MD = """\
+---
+name: test-skill
+description: A test skill.
+version: 1.0.0
+category: general
+status: published
+confidence: 0.9
+source: learned
+owner: {owner}
+---
+
+## When to Use
+When testing.
+"""
+
+
+def test_rename_updates_skill_md_owner(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    skill_dir = tmp_path / "skills" / "general" / "test-skill"
+    skill_dir.mkdir(parents=True)
+    (skill_dir / "SKILL.md").write_text(_SKILL_MD.format(owner="alice"), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    content = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
+    assert "owner: alice2" in content
+    assert "owner: alice\n" not in content
+
+
+def test_rename_leaves_other_skill_owners_untouched(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    for owner, name in [("alice", "alice-skill"), ("carol", "carol-skill")]:
+        d = tmp_path / "skills" / "general" / name
+        d.mkdir(parents=True)
+        (d / "SKILL.md").write_text(_SKILL_MD.format(owner=owner).replace("test-skill", name), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    assert "owner: alice2" in (tmp_path / "skills" / "general" / "alice-skill" / "SKILL.md").read_text()
+    assert "owner: carol" in (tmp_path / "skills" / "general" / "carol-skill" / "SKILL.md").read_text()
+
+
+def test_rename_updates_usage_sidecar_keys(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+
+    skills_root = tmp_path / "skills"
+    skills_root.mkdir(parents=True)
+    usage = {
+        "alice::test-skill": {"uses": 3, "last_used": 1000},
+        "carol::other-skill": {"uses": 1, "last_used": 500},
+        "unscoped-skill": {"uses": 2, "last_used": 200},
+    }
+    (skills_root / "_usage.json").write_text(json.dumps(usage), encoding="utf-8")
+
+    asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+
+    updated = json.loads((skills_root / "_usage.json").read_text(encoding="utf-8"))
+    assert "alice2::test-skill" in updated
+    assert "alice::test-skill" not in updated
+    assert "carol::other-skill" in updated
+    assert "unscoped-skill" in updated
+
+
+def test_rename_no_skills_dir_does_not_crash(rename_endpoint):
+    endpoint, _am, tmp_path = rename_endpoint
+    res = asyncio.run(endpoint("alice", SimpleNamespace(username="alice2"), _request(tmp_path)))
+    assert res["ok"] is True
+
+
+# ---------------------------------------------------------------------------
+# 5. P1 regression: rejected auth rename must not mutate file-backed stores
+# ---------------------------------------------------------------------------
+
+def test_rejected_rename_does_not_mutate_files(monkeypatch, tmp_path):
+    """If auth_manager.rename_user() returns False, no file-backed store
+    should be touched. Before the fix the deep_research and memory writes
+    ran before the auth check, so a rejected rename (e.g. reserved username)
+    silently moved owner fields to the new name."""
+    import routes.auth_routes as ar
+    import core.database as cdb
+
+    monkeypatch.setattr(cdb, "SessionLocal", lambda: MagicMock())
+    monkeypatch.setattr(cdb, "Base", SimpleNamespace(registry=SimpleNamespace(mappers=[])), raising=False)
+    pr = types.ModuleType("routes.prefs_routes")
+    pr._load = lambda: {}
+    pr._save = lambda d: None
+    monkeypatch.setitem(sys.modules, "routes.prefs_routes", pr)
+    monkeypatch.setattr(ar, "DEEP_RESEARCH_DIR", str(tmp_path / "deep_research"))
+    monkeypatch.setattr(ar, "MEMORY_FILE", str(tmp_path / "memory.json"))
+    monkeypatch.setattr(ar, "SKILLS_DIR", str(tmp_path / "skills"))
+
+    # Seed files for alice.
+    dr = tmp_path / "deep_research"
+    dr.mkdir()
+    rp = dr / "rp-abc.json"
+    rp.write_text(json.dumps({"owner": "alice", "query": "q"}), encoding="utf-8")
+
+    mem = tmp_path / "memory.json"
+    mem.write_text(json.dumps([{"owner": "alice", "text": "x"}]), encoding="utf-8")
+
+    skill_dir = tmp_path / "skills" / "general" / "s"
+    skill_dir.mkdir(parents=True)
+    (skill_dir / "SKILL.md").write_text(_SKILL_MD.format(owner="alice"), encoding="utf-8")
+
+    # Auth rejects the rename (reserved name, race, etc.).
+    am = MagicMock()
+    am.is_admin.return_value = True
+    am.get_username_for_token.return_value = "admin"
+    am.users = {"alice": {}}
+    am.rename_user.return_value = False
+    endpoint = _route(ar.setup_auth_routes(am), "rename_user")
+
+    with pytest.raises(Exception):
+        asyncio.run(endpoint("alice", SimpleNamespace(username="api"), _request(tmp_path)))
+
+    assert json.loads(rp.read_text())["owner"] == "alice", "research owner mutated after rejected rename"
+    assert json.loads(mem.read_text())[0]["owner"] == "alice", "memory owner mutated after rejected rename"
+    assert "owner: alice" in (skill_dir / "SKILL.md").read_text(), "skill owner mutated after rejected rename"
@@ -76,6 +76,23 @@ def _seed_index_skill(tmp_path: Path) -> Path:
    return data_dir


+def _write_index_skill(data_dir: Path, name: str, description: str, owner: str) -> None:
+    skill_dir = data_dir / "skills" / owner / name
+    skill_dir.mkdir(parents=True, exist_ok=True)
+    (skill_dir / "SKILL.md").write_text(
+        "---\n"
+        f"name: {name}\n"
+        f"description: {description}\n"
+        "when_to_use: when this owner needs a private workflow\n"
+        "category: private\n"
+        "status: published\n"
+        f"owner: {owner}\n"
+        "---\n\n"
+        f"# {name}\n",
+        encoding="utf-8",
+    )
+
+
 def _patch_prefs(monkeypatch, data_dir):
    """Mirror the helpers from test_skill_prompt_injection.py: point
    `src.constants.DATA_DIR` at our tmp, and patch the prefs loader so
@@ -152,3 +169,40 @@ def test_skill_index_lands_in_untrusted_user_message(tmp_path, monkeypatch):
    )
    assert untrusted[0]["role"] == "user"
    assert "Source: skills" in untrusted[0]["content"]
+
+
+def test_skill_index_is_owner_scoped_across_prompt_cache_hits(tmp_path, monkeypatch):
+    """Authenticated users must not receive another user's skill index.
+
+    This calls the prompt builder twice without clearing the base-prompt cache,
+    so the second call exercises the cache-hit path as well as owner scoping.
+    """
+    data_dir = tmp_path / "data"
+    _write_index_skill(data_dir, "alice-only", "Alice private procedure", "alice")
+    _write_index_skill(data_dir, "bob-only", "Bob private procedure", "bob")
+    _patch_prefs(monkeypatch, data_dir)
+
+    from src.agent_loop import _build_system_prompt  # noqa: WPS433
+
+    messages = [{"role": "user", "content": "use my workflow"}]
+    alice_out, _ = _build_system_prompt(
+        messages=messages, model="test-model",
+        active_document=None, mcp_mgr=None, owner="alice",
+    )
+    bob_out, _ = _build_system_prompt(
+        messages=messages, model="test-model",
+        active_document=None, mcp_mgr=None, owner="bob",
+    )
+
+    alice_text = "\n".join(m.get("content", "") or "" for m in alice_out)
+    bob_text = "\n".join(m.get("content", "") or "" for m in bob_out)
+
+    assert "alice-only" in alice_text
+    assert "Alice private procedure" in alice_text
+    assert "bob-only" not in alice_text
+    assert "Bob private procedure" not in alice_text
+
+    assert "bob-only" in bob_text
+    assert "Bob private procedure" in bob_text
+    assert "alice-only" not in bob_text
+    assert "Alice private procedure" not in bob_text
@@ -238,36 +238,6 @@ def test_guide_only_blocks_later_round_document_streaming(monkeypatch):
    assert not any(event.get("type") == "doc_stream_delta" for event in events)


-def test_guide_only_directive_dominates_workspace_prompt(monkeypatch):
-    _patch_loop_basics(monkeypatch)
-    system_prompts = []
-
-    async def _fake_stream(_candidates, messages, **kwargs):
-        system_prompts.append(messages[0]["content"])
-        yield _delta_chunk("ok")
-        yield "data: [DONE]\n\n"
-
-    monkeypatch.setattr(al, "stream_llm_with_fallback", _fake_stream, raising=False)
-    policy = build_effective_tool_policy(last_user_message="Do not use tools.")
-
-    _collect(
-        al.stream_agent_loop(
-            "http://local.test/v1",
-            "local-model",
-            [{"role": "user", "content": "Do not use tools."}],
-            max_rounds=1,
-            relevant_tools={"bash"},
-            tool_policy=policy,
-            workspace="/tmp/project",
-        )
-    )
-
-    assert system_prompts
-    assert system_prompts[0].startswith("## GUIDE-ONLY MODE")
-    assert "ACTIVE WORKSPACE" not in system_prompts[0]
-    assert "ALWAYS start by exploring" not in system_prompts[0]
-
-
 def test_guide_only_skips_intent_without_action_nudge(monkeypatch):
    _patch_loop_basics(monkeypatch)

@@ -1,107 +0,0 @@
-"""Workspace confinement: file tools are hard-bounded to the workspace folder
-(layered on upstream's sensitive-path policy); bash runs with cwd there."""
-import os
-import tempfile
-
-import pytest
-
-from src.tool_execution import _resolve_tool_path_in_workspace, _direct_fallback
-
-
-def test_workspace_resolver_confines():
-    ws = tempfile.mkdtemp()
-    open(os.path.join(ws, "a.txt"), "w").write("x")
-    real = os.path.realpath(os.path.join(ws, "a.txt"))
-    # relative path resolves under the workspace
-    assert _resolve_tool_path_in_workspace(ws, "a.txt") == real
-    # absolute path inside the workspace is allowed
-    assert _resolve_tool_path_in_workspace(ws, os.path.join(ws, "a.txt")) == real
-    # absolute path outside is rejected (sibling temp dir, portable across OSes)
-    outside = tempfile.mkdtemp()
-    with pytest.raises(ValueError):
-        _resolve_tool_path_in_workspace(ws, os.path.join(outside, "x.txt"))
-    # parent-escape is rejected
-    with pytest.raises(ValueError):
-        _resolve_tool_path_in_workspace(ws, os.path.join("..", "..", "escape.txt"))
-
-
-def test_workspace_resolver_blocks_sensitive():
-    """Upstream's sensitive-file deny list still applies inside the workspace."""
-    ws = tempfile.mkdtemp()
-    os.makedirs(os.path.join(ws, ".ssh"), exist_ok=True)
-    with pytest.raises(ValueError):
-        _resolve_tool_path_in_workspace(ws, ".ssh/authorized_keys")
-
-
-@pytest.mark.asyncio
-async def test_read_write_confined_in_workspace():
-    ws = tempfile.mkdtemp()
-    # Write inside the workspace (relative path) succeeds.
-    res = await _direct_fallback("write_file", "note.txt\nhello", workspace=ws)
-    assert res["exit_code"] == 0
-    assert os.path.isfile(os.path.join(ws, "note.txt"))
-    # Read it back.
-    res = await _direct_fallback("read_file", "note.txt", workspace=ws)
-    assert res["exit_code"] == 0 and res["output"] == "hello"
-    # Reading outside the workspace is rejected (sibling temp dir, portable).
-    outside = tempfile.mkdtemp()
-    outside_file = os.path.join(outside, "secret.txt")
-    open(outside_file, "w").write("nope")
-    res = await _direct_fallback("read_file", outside_file, workspace=ws)
-    assert res["exit_code"] == 1 and "outside the workspace" in res["error"]
-    # Writing outside is rejected (file must not be created).
-    escape = os.path.join(outside, "_ws_escape.txt")
-    res = await _direct_fallback("write_file", f"{escape}\nx", workspace=ws)
-    assert res["exit_code"] == 1 and "outside the workspace" in res["error"]
-    assert not os.path.exists(escape)
-
-
-@pytest.mark.asyncio
-async def test_subprocess_runs_with_workspace_cwd():
-    """bash/python subprocesses run with cwd set to the workspace. Use the
-    python tool for an OS-agnostic cwd probe (Windows cmd has no `pwd`)."""
-    ws = tempfile.mkdtemp()
-    res = await _direct_fallback("python", "import os; print(os.getcwd())", workspace=ws)
-    assert res["exit_code"] == 0
-    assert os.path.realpath(res["output"].strip()) == os.path.realpath(ws)
-
-
-# --- Tools that landed after this PR, now wired into the workspace -----------
-
-@pytest.mark.asyncio
-async def test_edit_file_confined_in_workspace():
-    import json
-    from src.tool_execution import _do_edit_file
-    ws = tempfile.mkdtemp()
-    open(os.path.join(ws, "f.txt"), "w").write("foo bar")
-    # Edit inside the workspace succeeds.
-    res = await _do_edit_file(json.dumps(
-        {"path": "f.txt", "old_string": "foo", "new_string": "baz"}), workspace=ws)
-    assert res["exit_code"] == 0
-    assert open(os.path.join(ws, "f.txt")).read() == "baz bar"
-    # Editing outside the workspace is rejected (sibling temp dir, portable).
-    outside = tempfile.mkdtemp()
-    outside_file = os.path.join(outside, "f.txt")
-    open(outside_file, "w").write("a")
-    res = await _do_edit_file(json.dumps(
-        {"path": outside_file, "old_string": "a", "new_string": "b"}), workspace=ws)
-    assert res["exit_code"] == 1 and "outside the workspace" in res["error"]
-
-
-@pytest.mark.asyncio
-async def test_grep_and_ls_confined_in_workspace():
-    import json
-    ws = tempfile.mkdtemp()
-    open(os.path.join(ws, "doc.txt"), "w").write("hello workspace\n")
-    # grep with no path searches the workspace root and finds the match.
-    res = await _direct_fallback("grep", json.dumps({"pattern": "hello"}), workspace=ws)
-    assert res["exit_code"] == 0 and "doc.txt" in res["output"]
-    # grep pointed outside the workspace is rejected (sibling temp dir, portable).
-    outside = tempfile.mkdtemp()
-    res = await _direct_fallback("grep", json.dumps({"pattern": "x", "path": outside}), workspace=ws)
-    assert res["exit_code"] == 1 and "outside the workspace" in res["error"]
-    # ls of the workspace lists its files; ls outside is rejected.
-    res = await _direct_fallback("ls", "", workspace=ws)
-    assert res["exit_code"] == 0 and "doc.txt" in res["output"]
-    res = await _direct_fallback("ls", outside, workspace=ws)
-    assert res["exit_code"] == 1 and "outside the workspace" in res["error"]
Author	SHA1	Message	Date
Maruf Hasan	c3fcaf15b7	feat(providers): add NVIDIA AI provider endpoint support (#3456 ) * feat: add NVIDIA as an AI provider (integrate.api.nvidia.com) * feat: add NVIDIA option to provider settings dropdown and aliases * test: add NVIDIA provider detection and endpoint tests * Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering - nvidia.com -> 'nvidia' curated key for proper provider routing - _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed - _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip, kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse, -embedqa, -nemoretriever * Expand non-chat model filtering for NVIDIA embedding/guard/video models Add _NON_CHAT_PREFIXES: embed, recurrent Add _NON_CHAT_CONTAINS: topic-control, guard, calibration, ai-synthetic-video, cosmos-reason2 Catches remaining unfiltered non-chat models from NVIDIA catalog: embedding (llama-nemotron-embed, embed-qa), guard (llama-guard, nemoguard-topic-control), calibration (ising-calibration), video (ai-synthetic-video-detector, cosmos-reason2), recurrent (recurrentgemma-2b) * Filter non-chat models in _probe_endpoint via _is_chat_model() Previously _is_chat_model() was only used in the per-model probe and _first_chat_model(), so non-chat models still appeared in the model picker even though they were filtered in those specific paths. Applying the filter at _probe_endpoint() return ensures non-chat models (embeddings, safety guards, reward, calibration, video detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never enter cached_models and never appear in the picker. * Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models Prefix checks (mid.startswith) miss models with org prefixes like baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc. Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught regardless of the org prefix. 
Adds: embed, bge, recurrent, starcoder, gemma-2b * fix(model-routes): drop collision-prone substrings from global non-chat filter The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES and _NON_CHAT_CONTAINS tuples. These are intended to filter out embedding, retrieval, safety, and vision models from NVIDIA's catalog that are not chat-completions-capable. However, four of the added substrings collide with legitimate chat models served by other providers: - gemma-2b matches google/gemma-2b-it (instruct chat model) - starcoder matches bigcode/starcoder2-15b (code completion model) - recurrent matches google/recurrentgemma-2b (language model) - guard matches meta-llama/Llama-Guard-3-8B (safety classifier) Removing these four from the global tuples keeps the NVIDIA-specific filtering intact (safety, embedding, retrieval, and vision models are still caught by other tokens such as content-safety, -safety, -reward, embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while preventing false negatives for instruct/code models on other providers. Tests added for gemma-2b-it, google/gemma-2b-it, and bigcode/starcoder2-15b-instruct asserting they are recognized as chat models. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS entries redundant since the prefix check runs first. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * style: fix indentation of groq and xai test cases in test_provider_endpoints.py --------- Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>	2026-06-09 11:06:12 +02:00
Mazen Tamer Salah	3c4ec8828b	fix(embeddings): survive numpy embeddings when restoring a reset lane (#3410 ) When a lane reset fails to rewrite the recreated collection, the recovery path re-adds the preserved rows. It read the embeddings with `preserved.get("embeddings") or []` and gated the loop with `if ids and docs and old_embeddings:`. chromadb returns embeddings as a numpy ndarray, whose truth value is ambiguous, so both expressions raise ValueError inside the except block — the restore is abandoned and every preserved row is lost (the collection was already deleted), exactly when the code is trying to avoid data loss. Use an explicit `is None` check and `len(...)`, and convert ndarray batches to lists before re-adding. Adds tests/test_embedding_lane_ndarray_restore.py (preserved embeddings come back as np.ndarray); existing test_embedding_lanes.py still passes.	2026-06-09 10:40:17 +02:00
Ashvin	2fdb4813db	fix(auth): sync file-backed and in-memory owner caches on user rename (#3397 ) The DB owner-rename loop in rename_user patched every SQL column named owner, but three non-SQL stores were left behind: 1. session_manager.sessions -- in-memory Session objects carry s.owner set at server-boot time. get_sessions_for_user() does an exact s.owner == username check, so the renamed user's chat sidebar goes empty until a server restart. 2. data/deep_research/*.json -- each completed research report is a standalone JSON file with an owner field. research_routes filters by d.get("owner") == user, making every report invisible to the renamed user. 3. data/memory.json -- a flat JSON array; each entry carries an owner field. memory_manager.load(owner=user) filters on it, so all memories vanish from the memory panel. Fix: after the SQL loop, patch all three: - iterate sm.sessions and update owner in-place (exposed via app.state) - walk data/deep_research/*.json and rewrite owner with atomic_write_json - update matching entries in memory.json with atomic_write_json All three use the same case-insensitive lower() comparison the SQL loop already uses. Each step is independently wrapped so a single failure does not abort the others or the rename itself. Fixes #3362	2026-06-09 10:19:45 +02:00
nubs	f1cda91683	fix(agent): scope skill index to owner (#2404 ) Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>	2026-06-09 09:51:29 +02:00
Kenny Van de Maele	0aba00f4cf	refactor(tools): remove dead workspace-confinement plumbing (#3590 ) Commit `e6b1009` removed the workspace feature's entry point (deleted routes/workspace_routes.py + static/js/workspace.js and dropped the workspace-param parsing in chat_routes), but left the downstream backend plumbing dangling: chat_routes passed a hardcoded workspace=None into stream_agent_loop, which forwarded it to execute_tool_block, so the workspace value was permanently None and every workspace-gated branch was unreachable. Remove the now-dead code (no behavior change, since workspace was always None): - src/tool_execution.py: drop _resolve_tool_path_in_workspace and the workspace params/branches on execute_tool_block, _direct_fallback, _call_mcp_tool, _do_edit_file, and _resolve_search_root; restore the bash/python/bg cwd to _AGENT_WORKDIR. - src/agent_loop.py: drop the workspace param on stream_agent_loop, the dead 'ACTIVE WORKSPACE' system-prompt block, and the workspace forward. - routes/chat_routes.py: drop the hardcoded workspace=None arg and var. - tests: delete test_workspace_confine.py (tested the removed feature) and the workspace assertion in test_tool_policy.py. Full suite: 2903 passed, 1 skipped.	2026-06-09 08:30:50 +02:00
Afonso Coutinho	fbed9027b0	fix: backup import dropping a user's skill on cross-tenant title/id collision (#2057 ) * Fix backup import dropping a user's skill on cross-tenant title/id collision The skills block of import_data deduped incoming skills against skills_manager.load_all(), which returns EVERY tenant's skills. So when a user imports their own backup, any skill whose id or title collides with another user's skill was silently skipped — the importing user lost their own data. This is the same cross-tenant bug already fixed for the memories block just above (#1743); the skills block was left with the old pattern. Filter the dedup sets to the importing user's own skills (owner == user); the full store is still saved back, preserving other users' skills. * Restore sys.modules after stubbing so backup test does not break collection of later src.* test modules * Patch backup_routes auth helpers via monkeypatch instead of sys.modules stubs so the test is import-order robust * Give FakeSkillsManager an add_skill method matching the disk-backed skills API	2026-06-09 08:04:22 +02:00
Disorder AA	d9141c6e56	fix(cookbook): allow spaces and non-ASCII characters in model directory paths (#3473 ) * fix(cookbook): allow spaces in model directory paths Allow POSIX external-drive paths and Windows drive paths with spaces while keeping shell metacharacters rejected. * fix(cookbook): also allow non-ASCII (Unicode) characters in model dir paths The ASCII-only allowlist that rejected spaces also rejected Cyrillic, accented Latin and CJK folder names (e.g. /Volumes/Модели, D:\AI Models\Модели) with 400 Invalid local_dir. Switch the path character class from [A-Za-z0-9._ -] to [\w. -] (\w is Unicode-aware on Python 3 str patterns) so localized folder names validate, while shell metacharacters (; & | ` $ quotes newlines) stay rejected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(cookbook): reject local_dir path segments starting with '-' The local_dir allowlist includes '-', so a directory like /models/-rf (or D:\models\-rf) could be parsed as a CLI flag by hf/etc. (option injection), and quoting does not stop a value from being read as an option. Guard against it inside the validator so the safety stays fully self-contained there rather than depending on consumers' quoting. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 07:58:38 +02:00
onemorethan0	8ae2b5f58c	fix(llm): suppress thinking mode for qwen3/gemma4 on Ollama /v1 endpoint (#3228 ) * fix(llm): suppress thinking for qwen3/gemma4 on Ollama /v1 compat endpoint When using qwen3, QwQ, gemma4, or other thinking models via Ollama's OpenAI-compatible /v1 endpoint, the model routes all output into its <think>...</think> reasoning block. Since Odysseus strips thinking content from round_response and only accumulates native tool_calls, this produces a round with 0 chars, 0 native calls, 0 tool blocks — the agent appears to silently do nothing. Root cause: Odysseus classifies the /v1 endpoint as provider="openai" (not "ollama"), so the payload is built as a standard OpenAI payload without any Ollama-specific options. Ollama's /v1 endpoint accepts "think": false as a top-level parameter to suppress extended thinking, but this was never sent. Fix: - Add _is_ollama_openai_compat_url() to detect local Ollama /v1 URLs - Inject "think": false in both stream_llm and llm_call_async for thinking models (qwen3, QwQ, gemma4, DeepSeek-R1, etc.) on this endpoint Verified with qwen3:14b on Ollama 0.24: with think=False the model correctly emits native tool_calls in a single streaming chunk and the agent executes bash/file/web tools as expected. * fix(llm): extend _is_ollama_openai_compat_url to match localhost on any port Per reviewer feedback on PR #3228: 1. Generalize host detection to mirror _is_ollama_native_url: match any localhost/127.0.0.1/0.0.0.0/::1 host (not just port 11434) so that custom OLLAMA_HOST ports and container remaps are also covered. 2. Add tests/test_llm_core_ollama_thinking.py covering: - _is_ollama_openai_compat_url for all positive/negative URL cases including IPv6, non-default port, native /api path, and real OpenAI - Payload injection: think:false set for Ollama /v1 thinking model, not set for non-thinking model, not set for real OpenAI endpoint, and set for localhost on a non-default port (the new case)	2026-06-09 07:35:15 +02:00
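
Note on c3fcaf15b7 (NVIDIA provider): the prefix-versus-substring distinction in its message can be sketched roughly as below. This is an illustration, not the project's code. The two tuples are a small hypothetical subset of the tokens the message names, `is_chat_model` and `filter_probe_results` are stand-in names for `_is_chat_model()` and the filtering step in `_probe_endpoint()`, and the model ids are only examples.

```python
# Illustrative subset of the tokens named in the commit message;
# the project's real tuples are longer and may differ.
NON_CHAT_PREFIXES = ("snowflake/arctic-embed", "nvidia/nv-embed")
NON_CHAT_CONTAINS = ("bge", "llama-guard", "content-safety", "-reward", "nvclip")


def is_chat_model(model_id: str) -> bool:
    """Return False for ids that look like embedding/safety/vision models."""
    mid = model_id.lower()
    # Prefix tokens only match at the start of the id, so an org prefix
    # such as "baai/" would hide a bare "bge" prefix token.
    if mid.startswith(NON_CHAT_PREFIXES):
        return False
    # Substring tokens match anywhere. That catches org-prefixed ids, but
    # short tokens can collide with legitimate chat models.
    return not any(token in mid for token in NON_CHAT_CONTAINS)


def filter_probe_results(model_ids):
    """Drop non-chat ids before they would be cached or shown in a picker."""
    return [m for m in model_ids if is_chat_model(m)]


catalog = [
    "baai/bge-m3",
    "nvidia/nv-embedqa-e5-v5",
    "meta/llama-guard-4-12b",
    "google/gemma-2b-it",
    "bigcode/starcoder2-15b-instruct",
]
print(filter_probe_results(catalog))
```

Filtering once where the model list is returned, rather than in each consumer, is what keeps non-chat ids out of every downstream path in this sketch.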
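
Note on 3c4ec8828b (ndarray restore): the failure mode and the explicit check its message describes can be sketched without numpy installed. `AmbiguousArray` is a hypothetical stand-in that mimics how an ndarray refuses truth-value tests, and `embeddings_for_restore` is an assumed helper name, not a function from the repository.

```python
class AmbiguousArray:
    """Stand-in for numpy.ndarray: bool() raises, len() and tolist() work."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def __len__(self):
        return len(self._rows)

    def __bool__(self):
        raise ValueError("truth value of an array is ambiguous")

    def tolist(self):
        return [list(r) for r in self._rows]


def embeddings_for_restore(preserved: dict) -> list:
    """Extract preserved embeddings without relying on truthiness."""
    raw = preserved.get("embeddings")
    # Explicit None and length checks replace `raw or []` / `if raw:`.
    if raw is None or len(raw) == 0:
        return []
    # ndarray-like batches are converted to plain lists before re-adding.
    return raw.tolist() if hasattr(raw, "tolist") else list(raw)


preserved = {"ids": ["a", "b"], "embeddings": AmbiguousArray([[0.1, 0.2], [0.3, 0.4]])}

try:
    preserved.get("embeddings") or []  # the pattern the commit replaces
except ValueError as exc:
    print("old pattern fails:", exc)

print(embeddings_for_restore(preserved))
```

The point of the sketch is that the old expression raises inside the recovery path itself, so the restore never runs, whereas the explicit checks work for lists and array-like objects.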
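
Note on 2fdb4813db (owner rename sync): the memory.json step could look roughly like this. It is a sketch under assumptions: the local `atomic_write_json` is a generic write-then-replace helper standing in for the project helper of the same name (whose real signature is not shown here), and `rename_owner_in_memory_file` is a hypothetical function name.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write JSON to a temp file in the same directory, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        os.replace(tmp, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def rename_owner_in_memory_file(path, old, new):
    """Rewrite matching `owner` fields; return how many entries changed."""
    if not os.path.isfile(path):
        return 0  # a missing file is not an error
    with open(path, encoding="utf-8") as fh:
        entries = json.load(fh)
    if not isinstance(entries, list):
        return 0
    changed = 0
    for entry in entries:
        # Case-insensitive match, mirroring the comparison described above.
        if isinstance(entry, dict) and str(entry.get("owner", "")).lower() == old.lower():
            entry["owner"] = new
            changed += 1
    if changed:
        atomic_write_json(path, entries)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "memory.json")
    with open(demo, "w", encoding="utf-8") as fh:
        json.dump([{"owner": "Alice", "text": "x"}, {"owner": "carol", "text": "y"}], fh)
    print(rename_owner_in_memory_file(demo, "alice", "alice2"))
```

In a route, a call like this would presumably sit after the auth rename has succeeded and inside its own try/except, so a failure here neither aborts the rename nor mutates files for a rejected one.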
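
Note on d9141c6e56 (cookbook local_dir): a validator combining the `[\w. -]` character class with the leading-dash guard might be sketched as below. `is_safe_local_dir` and its segment-by-segment structure are assumptions made for illustration; the repository's actual pattern and function are not reproduced here.

```python
import re

# Unicode-aware on str patterns: letters, digits, underscore, dot, space, hyphen.
_SEGMENT_RE = re.compile(r"[\w. -]+")
_DRIVE_RE = re.compile(r"[A-Za-z]:")


def is_safe_local_dir(path: str) -> bool:
    """Hypothetical allowlist check for a model directory path."""
    segments = [s for s in re.split(r"[\\/]", path) if s]
    if not segments:
        return False
    # Allow a leading Windows drive such as "D:".
    if _DRIVE_RE.fullmatch(segments[0]):
        segments = segments[1:]
    for seg in segments:
        # Anything outside the allowlist (; & | ` $ quotes newlines) fails.
        if not _SEGMENT_RE.fullmatch(seg):
            return False
        # A segment like "-rf" could be read as a CLI option downstream.
        if seg.startswith("-"):
            return False
    return True


for candidate in ["/Volumes/Модели", "D:\\AI Models\\Модели", "/models/-rf", "/models; rm -rf x"]:
    print(repr(candidate), is_safe_local_dir(candidate))
```

This sketch deliberately covers only the character allowlist and the option-injection guard; it does not address `..` traversal or whether the directory exists.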
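
Note on 8ae2b5f58c (Ollama /v1 thinking suppression): the host detection and payload injection described in its message can be approximated as follows. The names here are illustrative stand-ins, not the repository's functions, and `THINKING_MARKERS` is an assumed substring list built from the model families the message mentions.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}
# Hypothetical markers; the commit names qwen3, QwQ, gemma4 and DeepSeek-R1.
THINKING_MARKERS = ("qwen3", "qwq", "gemma4", "deepseek-r1")


def is_ollama_openai_compat_url(url: str) -> bool:
    """True for a local host on any port whose path is /v1 or below it."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()  # hostname strips IPv6 brackets
    path = parsed.path.rstrip("/")
    return host in LOCAL_HOSTS and (path == "/v1" or path.startswith("/v1/"))


def build_payload(url, model, messages):
    """Build an OpenAI-style payload, adding think=False only where needed."""
    payload = {"model": model, "messages": messages, "stream": True}
    is_thinking = any(marker in model.lower() for marker in THINKING_MARKERS)
    if is_thinking and is_ollama_openai_compat_url(url):
        payload["think"] = False
    return payload


msgs = [{"role": "user", "content": "hi"}]
print(build_payload("http://localhost:11434/v1", "qwen3:14b", msgs))
print(build_payload("https://api.openai.com/v1", "qwen3:14b", msgs))
```

Gating on both the URL and the model name keeps the extra `think` field away from remote OpenAI-compatible endpoints and from local models that do not emit a reasoning block.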