From 620fdd0859e3097e45c3ed4cec1b4ffbad05f962 Mon Sep 17 00:00:00 2001 From: Kenny Van de Maele Date: Thu, 11 Jun 2026 18:17:54 +0200 Subject: [PATCH] feat(agent): confine agent file/shell tools to a selectable workspace (#3665) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(agent): workspace confinement via context-local binding + get_workspace tool Bind the per-turn workspace once in execute_tool_block; the shared path resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so file tools + bash/python are confined centrally and a new tool that uses the shared helpers cannot accidentally bypass it. Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory modal (reusing existing modal/button CSS), the /workspace slash command, and a get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic (realpath/normcase/commonpath) and docker-safe (container paths, no host assumptions). Reopens #2023. * ux(workspace): clarify workspace is not a sandbox Picker modal note + pill tooltip + get_workspace tool/output wording now state plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder, but bash/python only start there (cwd) and are not sandboxed. Modal note reuses the existing .muted class. * fix(agent): treat an active workspace as file-work intent A vague low-signal message (e.g. "look at the local project") matches no domain keywords, so tool retrieval is skipped and only always-available tools are offered — leaving the agent with no file access even though a workspace is set. When a workspace is active, include the file/code tools (incl. get_workspace) on low-signal turns so the agent can act on the folder. Also requires the tool index (ChromaDB) to be reachable for normal retrieval; that is an environment dependency, not part of this change. * ux(workspace): hide pill + overflow entry in chat mode Workspace only scopes the agent's file/shell tools, so the pill and the overflow 'Workspace' entry are agent-only now — hidden in chat mode like the bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is called from the agent/chat setMode handler. * prompt(tools): steer bash/python to defer to the dedicated file tools bash/python schema descriptions (what native-tool-calling models read) were bare and gave no steer, so models would do file ops via the shell (e.g. writing SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python in the schema + tool-index + prompt section to prefer read_file/write_file/ edit_file/grep/glob/ls and only be used for what those do not cover. * prompt(tools): keep bash/python deferral generic (no hardcoded tool names) Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc. by name, so the guidance does not go stale if those tools are renamed. * style(workspace): drop em-dashes from added code comments/strings * ux(workspace): terser non-sandbox note in picker (no tool-name list) * ux(workspace): mirror terse non-sandbox wording in pill tooltip * chore: untrack local venv symlink (run-only, not part of the feature) * prompt(workspace): keep get_workspace text generic (no hardcoded tool names) * fix(agent): low-signal + workspace surfaces only read-only file tools Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but not write_file/edit_file/bash/python -- those wait for a request that actually calls for them (RAG retrieval still adds them on a real ask). * feat(workspace): cap browse listing at 500 dirs with a truncated hint Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local _MAX_BROWSE_DIRS so a directory with thousands of children does not dump every row into the picker; the response carries a truncated flag and the modal tells the user to type a path to jump in. * chore: untrack local venv symlink (run-only artifact) * fix(workspace): vet the workspace root against the sensitive-path deny list at bind time The in-workspace resolver deny-lists sensitive paths inside the workspace, but the empty-path search root is the workspace itself, so a workspace of ~/.ssh could be listed via ls with no path. vet_workspace() (public, in tool_execution next to the resolvers) rejects non-directories and sensitive roots before the path is ever bound; chat_routes uses it instead of its inline isdir check. * fix(workspace): reject filesystem roots and stop showing rejected workspaces as active Review findings from #3665: P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes every absolute path 'inside' the workspace and collapses confinement into host-wide file access. A root is its own dirname, so reject when dirname(resolved) == resolved; the browse response now carries a selectable flag and the picker disables 'Use this folder' on unselectable dirs. P3: /workspace set stored any string client-side and the chat route silently dropped rejected values, so the pill could claim a confinement that was not in effect. New admin-gated /api/workspace/vet validates manual paths before they persist (canonical path returned), and when a posted workspace is rejected at send time the stream emits workspace_rejected so the client clears the stored value and toasts instead of continuing silently. * fix(workspace): check caller privilege before vetting the posted workspace Review finding: /api/chat_stream called vet_workspace() on the posted value for every caller and emitted workspace_rejected on failure, so a non-admin who can chat but cannot use file/shell tools could distinguish existing directories from missing/file/sensitive/root paths by whether the event appeared. The resolution now lives in _resolve_request_workspace, which drops the submitted value uniformly for non-admin callers, with no vetting and no event, before the path ever touches the filesystem. Admin and single-user behavior is unchanged. Test pins that valid and invalid paths are indistinguishable for a non-admin and that vet_workspace is never invoked for them. --- app.py | 3 + routes/chat_routes.py | 39 ++++ routes/workspace_routes.py | 85 +++++++ src/agent_loop.py | 22 +- src/agent_tools/__init__.py | 5 +- src/agent_tools/filesystem_tools.py | 75 +++---- src/agent_tools/subprocess_tools.py | 10 +- src/tool_execution.py | 101 ++++++++- src/tool_index.py | 5 +- src/tool_schemas.py | 14 +- src/tool_security.py | 2 + static/app.js | 4 + static/index.html | 15 +- static/js/chat.js | 19 ++ static/js/slashCommands.js | 43 ++++ static/js/storage.js | 3 +- static/js/workspace.js | 208 ++++++++++++++++++ static/style.css | 45 ++++ tests/test_workspace_confine.py | 328 ++++++++++++++++++++++++++++ 19 files changed, 955 insertions(+), 71 deletions(-) create mode 100644 routes/workspace_routes.py create mode 100644 static/js/workspace.js create mode 100644 tests/test_workspace_confine.py diff --git a/app.py b/app.py index 755fc252e..6958ac347 100644 --- a/app.py +++ b/app.py @@ -676,6 +676,9 @@ app.include_router(setup_shell_routes()) from routes.cookbook_routes import setup_cookbook_routes app.include_router(setup_cookbook_routes()) +from routes.workspace_routes import setup_workspace_routes +app.include_router(setup_workspace_routes()) + # Hardware model fitting (cookbook "What Fits?" tab) from routes.hwfit_routes import setup_hwfit_routes app.include_router(setup_hwfit_routes()) diff --git a/routes/chat_routes.py b/routes/chat_routes.py index 3e18bf5c6..f06ca4dc7 100644 --- a/routes/chat_routes.py +++ b/routes/chat_routes.py @@ -62,6 +62,33 @@ def _stream_set(session_id: str, **fields) -> None: rec.update(fields) +def _resolve_request_workspace(request, raw_value) -> tuple: + """Resolve the posted workspace for this request: (workspace, rejected). + + Privilege is checked BEFORE the path ever touches the filesystem. Only + admin/single-user callers can use the workspace-backed file/shell tools, + so only they get vet_workspace() and the workspace_rejected signal. For + any other caller the submitted value is dropped uniformly, with no vetting + and no event: otherwise the presence/absence of workspace_rejected would + let a non-admin chat caller probe which host paths exist. + + vet_workspace rejects non-directories, sensitive roots (.ssh, .gnupg, + ...), and filesystem roots; on rejection there is no confinement and the + default tool-path allowlist applies. The rejected value is surfaced so the + stream can tell an admin client (which believes a workspace is active) + that it was dropped. + """ + requested = (raw_value or "").strip() + if not requested: + return "", "" + from src.tool_security import owner_is_admin_or_single_user + if not owner_is_admin_or_single_user(get_current_user(request)): + return "", "" + from src.tool_execution import vet_workspace + workspace = vet_workspace(requested) or "" + return workspace, (requested if not workspace else "") + + def _session_url_matches_endpoint(session_url: str, endpoint_base: str) -> bool: if not session_url or not endpoint_base: return False @@ -457,6 +484,10 @@ def setup_chat_routes( # manual form posts that still send plan_mode=true. plan_mode = False chat_mode = str(form_data.get("mode", "")).lower() # 'chat' or 'agent' + # Workspace: confine the agent's file/shell tools to this folder. + workspace, workspace_rejected = _resolve_request_workspace( + request, form_data.get("workspace") + ) # Plan mode is a modifier on agent mode — it only makes sense with tools. if plan_mode: chat_mode = "agent" @@ -761,6 +792,13 @@ def setup_chat_routes( # Register active stream for partial-save safety net _active_streams[session] = {"status": "streaming", "partial": "", "query": message, "is_research": effective_do_research, "mode": _effective_mode} + # The client sent a workspace the server refused to bind (deleted + # folder, file path, sensitive dir, filesystem root). Tell it up + # front so the UI can clear the pill instead of displaying a + # confinement that is not actually in effect. + if workspace_rejected: + yield f"data: {json.dumps({'type': 'workspace_rejected', 'data': {'path': workspace_rejected}})}\n\n" + if ctx.preprocessed.attachment_meta: yield f"data: {json.dumps({'type': 'attachments', 'data': ctx.preprocessed.attachment_meta})}\n\n" @@ -1138,6 +1176,7 @@ def setup_chat_routes( fallbacks=_fallback_candidates, plan_mode=plan_mode, approved_plan=approved_plan or None, + workspace=workspace or None, ): if chunk.startswith("data: ") and not chunk.startswith("data: [DONE]"): try: diff --git a/routes/workspace_routes.py b/routes/workspace_routes.py new file mode 100644 index 000000000..ef70e78c2 --- /dev/null +++ b/routes/workspace_routes.py @@ -0,0 +1,85 @@ +"""Workspace API - browse server directories to pick a tool workspace folder.""" +import os +from fastapi import APIRouter, Request, HTTPException, Query + +from src.auth_helpers import get_current_user +from src.tool_security import owner_is_admin_or_single_user + +# Cap entries returned per directory (mirrors filesystem_tools._CODENAV_MAX_HITS). +# A huge directory shouldn't dump thousands of rows into the picker; the user can +# type/paste a path to jump straight in instead. +_MAX_BROWSE_DIRS = 500 + + +def setup_workspace_routes(): + router = APIRouter(prefix="/api/workspace", tags=["workspace"]) + + @router.get("/browse") + def browse(request: Request, path: str = Query(default="")): + """List subdirectories of `path` (default: home) so the UI can navigate + the server filesystem and pick a workspace folder. Directories only. + + ADMIN-ONLY: this enumerates the server filesystem, so it is gated the + same way the file/shell tools are (read_file/write_file/bash are in + NON_ADMIN_BLOCKED_TOOLS). A non-admin who can't use those tools must not + be able to map the host's directory tree either. + """ + owner = get_current_user(request) + if not owner_is_admin_or_single_user(owner): + raise HTTPException(status_code=403, detail="Workspace browsing is admin-only") + + # Resolve symlinks so the reported path is canonical and the UI navigates + # real directories (defends against symlink games in displayed paths). + target = os.path.realpath(os.path.expanduser(path.strip() or "~")) + if not os.path.isdir(target): + target = os.path.realpath(os.path.expanduser("~")) + + dirs = [] + try: + with os.scandir(target) as it: + for entry in it: + try: + # Don't follow symlinks when classifying - a symlinked + # dir is skipped rather than letting the browser wander + # off via a link. Hidden entries are omitted. + if entry.is_dir(follow_symlinks=False) and not entry.name.startswith("."): + # Build the child path server-side with os.path.join + # so it's correct on Windows (backslashes) and Linux. + dirs.append({"name": entry.name, "path": os.path.join(target, entry.name)}) + except OSError: + continue + except (PermissionError, OSError): + dirs = [] + + dirs_sorted = sorted(dirs, key=lambda d: d["name"].lower()) + truncated = len(dirs_sorted) > _MAX_BROWSE_DIRS + parent = os.path.dirname(target) + from src.tool_execution import vet_workspace + return { + "path": target, + "parent": parent if parent and parent != target else None, + "dirs": dirs_sorted[:_MAX_BROWSE_DIRS], + "truncated": truncated, + # Whether this directory may be bound as a workspace (filesystem + # roots and sensitive dirs may be browsed through but not chosen). + "selectable": vet_workspace(target) is not None, + } + + @router.get("/vet") + def vet(request: Request, path: str = Query(default="")): + """Validate a workspace path without binding it. + + The UI calls this before persisting a manually typed path (/workspace + set) so a typo, file path, deleted folder, sensitive dir, or filesystem + root is rejected up front with the canonical path returned on success, + instead of being stored client-side and silently dropped at chat time. + Admin-gated like /browse: it confirms path existence on the host. + """ + owner = get_current_user(request) + if not owner_is_admin_or_single_user(owner): + raise HTTPException(status_code=403, detail="Workspace selection is admin-only") + from src.tool_execution import vet_workspace + resolved = vet_workspace(path) + return {"ok": resolved is not None, "path": resolved} + + return router diff --git a/src/agent_loop.py b/src/agent_loop.py index 4843f28a1..26938c429 100644 --- a/src/agent_loop.py +++ b/src/agent_loop.py @@ -272,7 +272,7 @@ _DOMAIN_TOOL_MAP = { "notes_calendar_tasks": {"manage_notes", "manage_calendar", "manage_tasks"}, "ui": {"ui_control"}, "sessions": {"create_session", "list_sessions", "manage_session", "send_to_session", "search_chats"}, - "files": {"bash", "python", "read_file", "write_file", "edit_file", "grep", "glob", "ls"}, + "files": {"bash", "python", "read_file", "write_file", "edit_file", "grep", "glob", "ls", "get_workspace"}, "settings": {"manage_settings", "manage_endpoints", "manage_mcp", "manage_webhooks", "manage_tokens", "app_api"}, } @@ -309,6 +309,7 @@ NEVER pipe multi-line Python through `python -c "..."` — shell quoting eats re ``` Execute Python code. Use for computation, data processing, scripting. NOT for writing code for the user (use create_document for that). Same sandbox limits as bash — no TTY, no GUI, no `input()`; for anything the user should interact with, generate a single HTML file with inline JS instead. +Prefer a dedicated tool whenever one fits the job (reading, searching, or writing files); use python only for computation/processing no dedicated tool covers - not for reading or writing files. Do NOT use Python/requests for web lookup/search/latest/current requests when `web_search` or `web_fetch` is available.""", "web_search": """\ @@ -347,6 +348,11 @@ Write content to a file. First line is the path, rest is the content.""", ``` Edit an EXISTING file by exact string replacement. PREFER this over bash (sed/echo/redirects) for changing files — it shows a before/after diff. `old_string` must match the file exactly and be unique unless `replace_all` is true. Use write_file to create a new file.""", + "get_workspace": """\ +```get_workspace +``` +Return the absolute path of the active workspace folder. File tools are CONFINED to it (paths can be RELATIVE to it); the shell starts there (cwd) but is NOT sandboxed. Call this first when the user says "the project"/"the code"/"this folder" without a path, instead of asking them. No arguments.""", + "create_document": """\ ```create_document @@ -1726,6 +1732,7 @@ async def stream_agent_loop( plan_mode: bool = False, approved_plan: Optional[str] = None, tool_policy: Optional[ToolPolicy] = None, + workspace: Optional[str] = None, _is_teacher_run: bool = False, ) -> AsyncGenerator[str, None]: """Streaming agent loop generator. @@ -1795,7 +1802,17 @@ async def stream_agent_loop( if not guide_only and not _relevant_tools and bool(_intent.get("low_signal")): from src.tool_index import ALWAYS_AVAILABLE _relevant_tools = set(ALWAYS_AVAILABLE) - logger.info("[tool-rag] Low-signal agent message; skipping retrieval and using always-available tools only") + if workspace: + # An active workspace IS the file-work signal: a vague "look at the + # project" means explore this folder. Surface only the READ-ONLY file + # tools (intersection with the plan-mode read-only allowlist) so the + # agent can investigate; write/shell tools stay out until the request + # actually calls for them (RAG retrieval adds those on a real ask). + from src.tool_security import PLAN_MODE_READONLY_TOOLS + _relevant_tools |= (_DOMAIN_TOOL_MAP["files"] & PLAN_MODE_READONLY_TOOLS) + logger.info("[tool-rag] Low-signal but workspace active; including read-only file tools") + else: + logger.info("[tool-rag] Low-signal agent message; skipping retrieval and using always-available tools only") if not guide_only and not _relevant_tools: try: from src.tool_index import get_tool_index, ALWAYS_AVAILABLE @@ -2644,6 +2661,7 @@ async def stream_agent_loop( tool_policy=tool_policy, owner=owner, progress_cb=_push_progress, + workspace=workspace, ) finally: # Sentinel so the drainer knows to stop. diff --git a/src/agent_tools/__init__.py b/src/agent_tools/__init__.py index 4db923a9a..52fe4a99c 100644 --- a/src/agent_tools/__init__.py +++ b/src/agent_tools/__init__.py @@ -20,7 +20,7 @@ logger = logging.getLogger(__name__) from .subprocess_tools import BashTool, PythonTool from .web_tools import WebSearchTool, WebFetchTool -from .filesystem_tools import ReadFileTool, WriteFileTool, EditFileTool, LsTool, GlobTool, GrepTool +from .filesystem_tools import ReadFileTool, WriteFileTool, EditFileTool, LsTool, GlobTool, GrepTool, GetWorkspaceTool from .document_tools import CreateDocumentTool, UpdateDocumentTool, EditDocumentTool, SuggestDocumentTool, ManageDocumentTool TOOL_HANDLERS = { @@ -39,6 +39,7 @@ TOOL_HANDLERS = { "edit_document": EditDocumentTool().execute, "suggest_document": SuggestDocumentTool().execute, "manage_documents": ManageDocumentTool().execute, + "get_workspace": GetWorkspaceTool().execute, } # --------------------------------------------------------------------------- @@ -51,7 +52,7 @@ PYTHON_TIMEOUT = 30 # Tool types that trigger execution TOOL_TAGS = {"bash", "python", "web_search", "web_fetch", "read_file", "write_file", "edit_file", - "grep", "glob", "ls", + "grep", "glob", "ls", "get_workspace", "create_document", "update_document", "edit_document", "search_chats", "chat_with_model", "create_session", "list_sessions", diff --git a/src/agent_tools/filesystem_tools.py b/src/agent_tools/filesystem_tools.py index 3b5425242..7ba22161c 100644 --- a/src/agent_tools/filesystem_tools.py +++ b/src/agent_tools/filesystem_tools.py @@ -46,13 +46,7 @@ def _unified_diff(old: str, new: str, path: str) -> Optional[Dict[str, Any]]: class EditFileTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate try: args = json.loads(content) if content.strip().startswith("{") else {} except (json.JSONDecodeError, TypeError): @@ -64,8 +58,7 @@ class EditFileTool: if not raw_path: return {"error": "edit_file: path required", "exit_code": 1} try: - path = (_resolve_tool_path_in_workspace(workspace, raw_path) - if workspace else _resolve_tool_path(raw_path)) + path = _resolve_tool_path(raw_path) except ValueError as e: return {"error": f"edit_file: {e}", "exit_code": 1} if old == "": @@ -113,13 +106,7 @@ class EditFileTool: class ReadFileTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate raw_path, offset, limit = content.split("\n", 1)[0].strip(), 0, 0 _stripped = content.strip() if _stripped.startswith("{"): @@ -131,8 +118,7 @@ class ReadFileTool: except (json.JSONDecodeError, TypeError, ValueError): pass try: - path = (_resolve_tool_path_in_workspace(workspace, raw_path) - if workspace else _resolve_tool_path(raw_path)) + path = _resolve_tool_path(raw_path) except ValueError as e: return {"error": f"read_file: {e}", "exit_code": 1} try: @@ -170,19 +156,12 @@ class ReadFileTool: class WriteFileTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate lines = content.split("\n", 1) raw_path = lines[0].strip() body = lines[1] if len(lines) > 1 else "" try: - path = (_resolve_tool_path_in_workspace(workspace, raw_path) - if workspace else _resolve_tool_path(raw_path)) + path = _resolve_tool_path(raw_path) except ValueError as e: return {"error": f"write_file: {e}", "exit_code": 1} try: @@ -212,13 +191,7 @@ class WriteFileTool: class LsTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate raw_path = "" _s = (content or "").strip() if _s.startswith("{"): @@ -267,13 +240,7 @@ class LsTool: class GlobTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate args = {} _s = (content or "").strip() if _s.startswith("{"): @@ -325,13 +292,7 @@ class GlobTool: class GrepTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import ( - _resolve_tool_path, - _resolve_tool_path_in_workspace, - _resolve_search_root, - _truncate - ) - workspace = ctx.get("workspace") + from src.tool_execution import _resolve_tool_path, _resolve_search_root, _truncate args: Dict[str, Any] = {} _s = (content or "").strip() if _s.startswith("{"): @@ -417,3 +378,21 @@ class GrepTool: if len(lines) >= max_hits: out += f"\n... [capped at {max_hits} matches]" return {"output": _truncate(out), "exit_code": 0} + +class GetWorkspaceTool: + """Report the active workspace folder (no args). File tools are confined to + it; the shell starts there (cwd) but is NOT sandboxed.""" + async def execute(self, content: str, ctx: dict) -> dict: + from src.tool_execution import get_active_workspace + ws = get_active_workspace() + if ws: + return { + "output": f"{ws}\n(File tools are confined to this folder; the shell starts " + f"here but is not sandboxed and can reach outside it.)", + "exit_code": 0, + } + return { + "output": "No workspace is set. File tools use the default allowed roots; " + "resolve paths from the user or use absolute paths.", + "exit_code": 0, + } diff --git a/src/agent_tools/subprocess_tools.py b/src/agent_tools/subprocess_tools.py index 6b5972030..8a0e2b5d5 100644 --- a/src/agent_tools/subprocess_tools.py +++ b/src/agent_tools/subprocess_tools.py @@ -102,16 +102,15 @@ async def _run_subprocess_streaming( class BashTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import _AGENT_WORKDIR, _truncate + from src.tool_execution import agent_cwd, _truncate progress_cb = ctx.get("progress_cb") - workspace = ctx.get("workspace") _subproc_env = ctx.get("subproc_env") proc = await asyncio.create_subprocess_shell( content, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE, env=_subproc_env, - cwd=workspace or _AGENT_WORKDIR, + cwd=agent_cwd(), ) stdout, stderr, rc, timed_out = await _run_subprocess_streaming( proc, @@ -129,16 +128,15 @@ class BashTool: class PythonTool: async def execute(self, content: str, ctx: dict) -> dict: - from src.tool_execution import _AGENT_WORKDIR, _truncate + from src.tool_execution import agent_cwd, _truncate progress_cb = ctx.get("progress_cb") - workspace = ctx.get("workspace") _subproc_env = ctx.get("subproc_env") proc = await asyncio.create_subprocess_exec( (sys.executable or "python"), "-I", "-c", content, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE, env=_subproc_env, - cwd=workspace or _AGENT_WORKDIR, + cwd=agent_cwd(), ) stdout, stderr, rc, timed_out = await _run_subprocess_streaming( proc, diff --git a/src/tool_execution.py b/src/tool_execution.py index 751bc13af..612364b66 100644 --- a/src/tool_execution.py +++ b/src/tool_execution.py @@ -9,6 +9,7 @@ Extracted from agent_tools.py. import asyncio import collections +import contextvars import json import logging import os @@ -146,7 +147,13 @@ def _resolve_tool_path(raw_path: str) -> str: Returns the realpath on success. Raises ValueError on rejection. Symlinks are resolved before comparison. + + When a workspace is active for this turn, paths are confined to it instead + of the default allowlist (see _resolve_tool_path_in_workspace). """ + ws = get_active_workspace() + if ws: + return _resolve_tool_path_in_workspace(ws, raw_path) if raw_path is None or not str(raw_path).strip(): raise ValueError("path is required") expanded = os.path.expanduser(str(raw_path).strip()) @@ -207,6 +214,55 @@ def _resolve_tool_path_in_workspace(workspace: str, raw_path: str) -> str: +# --------------------------------------------------------------------------- +# Active workspace (per-turn, context-local) +# --------------------------------------------------------------------------- +# Set ONCE in execute_tool_block from the request's `workspace`. The path +# resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd +# helper (agent_cwd) read it from here, so confinement is enforced in a single +# place: any tool that resolves paths through these helpers is confined +# automatically and cannot accidentally bypass the workspace. contextvars are +# task-local, so concurrent turns don't leak into each other. +_active_workspace: contextvars.ContextVar = contextvars.ContextVar( + "agent_active_workspace", default=None +) + + +def get_active_workspace() -> Optional[str]: + """The folder the agent is confined to this turn, or None.""" + return _active_workspace.get() + + +def vet_workspace(raw: str) -> Optional[str]: + """Validate a requested workspace path at bind time. + + Returns the canonical path, or None when it is unusable: not a real + directory, or itself a sensitive path (.ssh, .gnupg, ...). The in-workspace + resolver deny-lists sensitive paths *inside* the workspace, but the + empty-path search root is the workspace itself, so the root has to be + vetted before it is ever bound. + """ + raw = (raw or "").strip() + if not raw: + return None + resolved = os.path.realpath(os.path.expanduser(raw)) + if not os.path.isdir(resolved) or _is_sensitive_path(resolved): + return None + # Reject filesystem roots: binding / (or a Windows drive/UNC root) as the + # workspace would make every absolute path "inside" it, collapsing the + # confinement into host-wide file access. A root is its own dirname, which + # also covers C:\ and \\server\share without platform-specific lists. + if os.path.dirname(resolved) == resolved: + return None + return resolved + + +def agent_cwd() -> str: + """Working directory for agent subprocesses (bash/python/background jobs): + the active workspace when set, else the persistent data dir.""" + return get_active_workspace() or _AGENT_WORKDIR + + def get_mcp_manager(): from src import agent_tools return agent_tools.get_mcp_manager() @@ -217,10 +273,15 @@ def get_mcp_manager(): def _resolve_search_root(raw_path: str) -> str: """Resolve + confine a code-nav path (grep/glob/ls). - An empty path defaults to the agent's primary root (project data dir) and a - supplied path is confined by the global allowlist + sensitive-file policy. + With a workspace active, the workspace folder is the root and a supplied + path is confined inside it. Otherwise an empty path defaults to the agent's + primary root (project data dir) and a supplied path is confined by the + global allowlist + sensitive-file policy. """ raw = (raw_path or "").strip() + ws = get_active_workspace() + if ws: + return os.path.realpath(ws) if not raw else _resolve_tool_path_in_workspace(ws, raw) if not raw: roots = _tool_path_roots() return roots[0] if roots else os.path.realpath(".") @@ -392,7 +453,6 @@ async def _direct_fallback( tool: str, content: str, progress_cb: Optional[Callable[[Dict], Awaitable[None]]] = None, - workspace: Optional[str] = None, ) -> Optional[Dict]: _subproc_env = { **os.environ, @@ -405,7 +465,6 @@ async def _direct_fallback( try: ctx = { "progress_cb": progress_cb, - "workspace": workspace, "subproc_env": _subproc_env, } @@ -448,6 +507,34 @@ async def execute_tool_block( ) -> Tuple[str, Dict]: """Execute a single tool block. Returns (description, result_dict). + Thin wrapper: bind the per-turn workspace (so the path resolvers + subprocess + cwd confine to it) for the duration of this call, then delegate. Reset on the + way out so the binding never leaks to the next tool call. + """ + token = _active_workspace.set(workspace or None) + try: + return await _execute_tool_block_impl( + block, + session_id=session_id, + disabled_tools=disabled_tools, + owner=owner, + progress_cb=progress_cb, + tool_policy=tool_policy, + ) + finally: + _active_workspace.reset(token) + + +async def _execute_tool_block_impl( + block: Any, + session_id: Optional[str] = None, + disabled_tools: Optional[set] = None, + owner: Optional[str] = None, + progress_cb: Optional[Callable[[Dict], Awaitable[None]]] = None, + tool_policy: Optional[Any] = None, +) -> Tuple[str, Dict]: + """Execute a single tool block. Returns (description, result_dict). + `progress_cb` is forwarded to long-running subprocess tools (bash, python) so the agent loop can emit `tool_progress` SSE events while the command is in flight. Ignored by other tools. @@ -621,7 +708,7 @@ async def execute_tool_block( _is_bg, _bg_cmd = _split_bg_marker(content) if _is_bg and _bg_cmd: from src import bg_jobs - rec = bg_jobs.launch(_bg_cmd, session_id=session_id, cwd=_AGENT_WORKDIR) + rec = bg_jobs.launch(_bg_cmd, session_id=session_id, cwd=agent_cwd()) short = _bg_cmd.strip().split(chr(10))[0][:80] desc = f"bash (background): {short}" result = { @@ -644,7 +731,7 @@ async def execute_tool_block( first_line = content.split(chr(10))[0][:80] desc = f"{tool}: {first_line}" result = await _call_mcp_tool(tool, content, progress_cb=progress_cb) - elif tool in ("grep", "glob", "ls"): + elif tool in ("grep", "glob", "ls", "get_workspace"): # Code-navigation tools — no MCP server; run the direct implementation. first_line = content.split(chr(10))[0][:80] desc = f"{tool}: {first_line}" @@ -744,7 +831,7 @@ async def execute_tool_block( desc = "edit_image" result = await do_edit_image(content, owner=owner) elif tool == "edit_file": - result = await _direct_fallback(tool, content, workspace=workspace) or {"error": "edit failed", "exit_code": 1} + result = await _direct_fallback(tool, content) or {"error": "edit failed", "exit_code": 1} desc = result.get("output") or result.get("error") or "edit_file" elif tool == "trigger_research": desc = "trigger_research" diff --git a/src/tool_index.py b/src/tool_index.py index 4eb8a51ee..32c7bcf41 100644 --- a/src/tool_index.py +++ b/src/tool_index.py @@ -67,14 +67,15 @@ COLLECTION_NAME = "odysseus_tool_index" # Each tool gets a searchable description that helps retrieval. # These are richer than the system prompt one-liners — they're for embedding. BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = { - "bash": "Run shell commands on the server. Install packages, check files, git operations, system info, and process management. Do not use for web lookup/search; use web_search or web_fetch when web tools are available.", - "python": "Execute Python code for computation, data processing, math, scripting, and parsing. Not for writing code for the user. Do not use for web lookup/search; use web_search or web_fetch when web tools are available.", + "bash": "Run shell commands on the server. Install packages, git operations, builds, system info, process management. Prefer a dedicated tool whenever one fits the job (file read/write/edit, search, listing); use bash only for what no dedicated tool covers. Do not use for web lookup/search; use web_search or web_fetch when web tools are available.", + "python": "Execute Python code for computation, data processing, math, scripting, and parsing. Not for writing code for the user. Prefer a dedicated tool for reading, writing, or searching files; use python only for what no dedicated tool covers. Do not use for web lookup/search; use web_search or web_fetch when web tools are available.", "web_search": "Quick single web lookup for a fact, current event, latest/current information, or doc mid-task. Use this instead of bash/curl/python/requests for web searches. NOT for 'research X' / 'do research on X' requests — those are deep-research jobs (use trigger_research). web_search = one query; trigger_research = a full researched report in the sidebar.", "web_fetch": "Fetch and read the text content of a specific URL/website the user names (e.g. 'check example.com', 'open this link'). Use when you have a concrete URL; for open-ended lookups use web_search instead.", "read_file": "Read a file from disk and return its contents. View source code, config files, logs. Supports an optional line range (offset/limit) for large files.", "grep": "Search file CONTENTS for a regex across a directory tree (ripgrep-backed, honours .gitignore). Returns file:line:match. Use to find where code/symbols/strings live — prefer over bash grep.", "glob": "Find FILES by glob pattern (e.g. '**/*.py'), newest first. Use to locate files by name/extension — prefer over bash find/ls.", "ls": "List a directory's entries (folders then files with sizes). Use to see what's in a folder — prefer over bash ls.", + "get_workspace": "Return the absolute path of the active workspace folder the user is working in. File tools are confined to it; the shell starts there but is not sandboxed. Call this first when the user refers to 'the project'/'the code'/'this folder' without giving a path, instead of asking them.", "write_file": "Write/create or fully rewrite a file ON DISK (source code, configs, project files). Use for new files or full rewrites — NOT create_document (editor panel) and NOT a bash heredoc.", "edit_file": "Edit an existing file ON DISK by exact string replacement (fix a bug, change a function). Shows a diff. The tool for changing files on disk — NOT edit_document (editor panel) and NOT bash sed/heredoc.", "create_document": "Create a new document in the editor panel. For code, articles, text content longer than 15 lines, unless an already-open document/email draft is the obvious target. If an email compose draft is open, edit that draft instead of creating another document.", diff --git a/src/tool_schemas.py b/src/tool_schemas.py index e0d01f008..5735208ec 100644 --- a/src/tool_schemas.py +++ b/src/tool_schemas.py @@ -25,7 +25,7 @@ FUNCTION_TOOL_SCHEMAS = [ "type": "function", "function": { "name": "bash", - "description": "Run a shell command (full access)", + "description": "Run a shell command (full access). Prefer a dedicated tool whenever one fits the job (reading, writing, editing, searching, or listing files); use bash only for what no dedicated tool covers (installs, git, builds, running programs, system info). Do NOT create or edit files via bash redirects/heredocs/sed -- use the dedicated file tools.", "parameters": { "type": "object", "properties": { @@ -39,7 +39,7 @@ FUNCTION_TOOL_SCHEMAS = [ "type": "function", "function": { "name": "python", - "description": "Execute Python code to compute a result or test something", + "description": "Execute Python code to compute a result or test something. Prefer a dedicated tool whenever one fits the job (reading, writing, or searching files); use python only for computation, data processing, or scripting no dedicated tool covers.", "parameters": { "type": "object", "properties": { @@ -141,6 +141,14 @@ FUNCTION_TOOL_SCHEMAS = [ } } }, + { + "type": "function", + "function": { + "name": "get_workspace", + "description": "Return the absolute path of the active workspace folder the user is working in. File tools are confined to it; the shell starts there but is not sandboxed. Call this first when the user refers to 'the project'/'the code'/'this folder' without a path, instead of asking them. Takes no arguments.", + "parameters": {"type": "object", "properties": {}, "required": []} + } + }, { "type": "function", "function": { @@ -1246,6 +1254,8 @@ def function_call_to_tool_block(name: str, arguments: str) -> Optional[ToolBlock content = args.get("path", "") elif tool_type in ("grep", "glob", "ls"): content = json.dumps(args) if args else "{}" + elif tool_type == "get_workspace": + content = "" elif tool_type == "write_file": content = args.get("path", "") + "\n" + args.get("content", "") elif tool_type == "edit_file": diff --git a/src/tool_security.py b/src/tool_security.py index 6b7bc90df..6d29a6ab9 100644 --- a/src/tool_security.py +++ b/src/tool_security.py @@ -20,6 +20,7 @@ NON_ADMIN_BLOCKED_TOOLS = { "grep", "glob", "ls", + "get_workspace", "search_chats", "manage_memory", "manage_skills", @@ -66,6 +67,7 @@ PLAN_MODE_READONLY_TOOLS = { "grep", "glob", "ls", + "get_workspace", "web_search", "web_fetch", "search_chats", diff --git a/static/app.js b/static/app.js index e1ffcc612..ed8b6e49a 100644 --- a/static/app.js +++ b/static/app.js @@ -4,6 +4,7 @@ // ============================================ import Storage from './js/storage.js'; import uiModule from './js/ui.js'; +import workspaceModule from './js/workspace.js'; import fileHandlerModule from './js/fileHandler.js'; import modelsModule from './js/models.js'; import ragModule from './js/rag.js'; @@ -1622,6 +1623,8 @@ function initializeEventListeners() { // Slide the pill to the active button const toggle = agentBtn.closest('.mode-toggle'); if (toggle) toggle.classList.toggle('mode-chat', mode === 'chat'); + // Workspace pill + overflow entry are agent-only - hide immediately (no flash). + try { workspaceModule.applyMode(mode); } catch (_) {} // Delay tool glow-up for a staggered effect setTimeout(() => applyModeToToggles(mode), 500); } @@ -1697,6 +1700,7 @@ function initializeEventListeners() { } setupToggle('web-toggle-btn', 'web-toggle', 'web'); setupToggle('bash-toggle-btn', 'bash-toggle', 'bash'); + try { workspaceModule.initWorkspace(); } catch (_) {} // Document editor toggle (special: uses module panel, not a checkbox) const overflowDocBtn = el('overflow-doc-btn'); diff --git a/static/index.html b/static/index.html index 60a2764d9..b717cd3e6 100644 --- a/static/index.html +++ b/static/index.html @@ -1040,6 +1040,13 @@ <span>RAG</span> <span class="overflow-active-dot"></span> </button> + <button type="button" class="overflow-menu-item" id="overflow-workspace-btn"> + <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"> + <path d="M3 7a2 2 0 0 1 2-2h4l2 2h8a2 2 0 0 1 2 2v8a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"/> + </svg> + <span>Workspace</span> + <span class="overflow-active-dot"></span> + </button> <!-- Inline "deep research mode" toggle removed (superseded by the Deep Research sidebar / trigger_research). The hidden #research-toggle checkbox is kept inert so existing JS refs @@ -1071,6 +1078,12 @@ <polyline points="4 17 10 11 4 5"/><line x1="12" y1="19" x2="20" y2="19"/> </svg> </button> + <!-- Workspace indicator (hidden until a folder is set) --> + <button type="button" class="input-icon-btn tool-indicator" title="Workspace - click to clear" id="workspace-indicator-btn" aria-label="Clear workspace" style="display:none;"> + <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 7a2 2 0 0 1 2-2h4l2 2h8a2 2 0 0 1 2 2v8a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"/></svg> + <span style="font-size:11px;margin-left:2px;max-width:120px;overflow:hidden;text-overflow:ellipsis;white-space:nowrap;" id="workspace-indicator-name"></span> + <svg class="tool-indicator-x" width="10" height="10" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="3" stroke-linecap="round"><line x1="6" y1="6" x2="18" y2="18"/><line x1="18" y1="6" x2="6" y2="18"/></svg> + </button> <!-- RAG toolbar indicator (hidden until active) --> <button type="button" class="input-icon-btn tool-indicator" title="RAG active — click to deactivate" id="rag-indicator-btn" style="display:none;"> <svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"> @@ -2342,7 +2355,7 @@ <script type="module" src="/static/js/chatRenderer.js"></script> <script type="module" src="/static/js/codeRunner.js"></script> <script type="module" src="/static/js/chatStream.js"></script> -<script type="module" src="/static/js/chat.js?v=20260604s"></script> +<script type="module" src="/static/js/chat.js?v=20260609ws"></script> <script type="module" src="/static/js/cookbook.js"></script> <script src="/static/js/cookbookSchedule.js"></script> <script type="module" src="/static/js/search-chat.js"></script> diff --git a/static/js/chat.js b/static/js/chat.js index 7ecefdb7d..434976c65 100644 --- a/static/js/chat.js +++ b/static/js/chat.js @@ -819,6 +819,10 @@ import { wireArrowUpRecall, getLastUserMessageFromChatHistory } from './composer if (incognitoChk && incognitoChk.checked) { fd.append('incognito', 'true'); } + const _ws = (Storage.KEYS && Storage.get(Storage.KEYS.WORKSPACE, '')) || ''; + if (_ws) { + fd.append('workspace', _ws); + } if (presetsModule.getSelectedPreset()) { fd.append('preset_id', presetsModule.getSelectedPreset()); } @@ -1781,6 +1785,21 @@ import { wireArrowUpRecall, getLastUserMessageFromChatHistory } from './composer _sourcesData = json.data; _sourcesType = 'web'; _sourcesHtml = _buildSourcesBox(json.data, 'web'); } + } else if (json.type === 'workspace_rejected') { + // Server refused to bind the posted workspace (deleted folder, + // file path, sensitive dir, filesystem root). Clear the stored + // value so the pill stops claiming a confinement that is not in + // effect, and tell the user. + const _wsPath = (json.data && json.data.path) || ''; + import('./workspace.js').then((m) => { + const ws = m.default || m; + if (ws && ws.setWorkspace) ws.setWorkspace(''); + }); + uiModule.showToast( + `Workspace ${_wsPath || '(unknown)'} is no longer usable; running without confinement`, + 6000 + ); + continue; } else if (json.type === 'model_fallback') { // Model went offline — switched to fallback var _fbData = json.data || {}; diff --git a/static/js/slashCommands.js b/static/js/slashCommands.js index 79b037cf4..11165e93e 100644 --- a/static/js/slashCommands.js +++ b/static/js/slashCommands.js @@ -17,6 +17,7 @@ import chatRenderer from './chatRenderer.js'; import spinnerModule from './spinner.js'; import themeModule from './theme.js'; import documentModule from './document.js'; +import workspaceModule from './workspace.js'; import settingsModule from './settings.js'; import cookbookModule from './cookbook.js'; import { EVAL_PROMPTS } from './compare/index.js'; @@ -1229,6 +1230,40 @@ async function _cmdToggleDoc(args, ctx) { return true; } +// Workspace: confine the agent's file/shell tools to a folder. Not a boolean - +// show / set <path> / clear / pick (open the directory browser). +async function _cmdWorkspace(args, ctx) { + const sub = (args[0] || '').toLowerCase(); + const rest = args.slice(1).join(' ').trim(); + const cur = workspaceModule.getWorkspace(); + if (!sub || sub === 'show' || sub === 'status' || sub === 'info') { + slashReply(cur ? `Workspace: <code>${uiModule.esc(cur)}</code>` : 'No workspace set. <code>/workspace pick</code> or <code>/workspace set /path</code>.'); + return true; + } + if (sub === 'set' || sub === 'cd' || sub === 'use') { + if (!rest) { slashReply('Usage: <code>/workspace set /absolute/path</code>'); return true; } + // Validate server-side before persisting so the pill never claims a + // workspace the backend will refuse to bind (typo, file path, deleted + // folder, sensitive dir, filesystem root). + workspaceModule.vetAndSetWorkspace(rest).then(({ ok, path }) => { + if (ok) slashReply(`Workspace set: <code>${uiModule.esc(path)}</code>`); + else slashReply(`Not a usable workspace folder: <code>${uiModule.esc(rest)}</code>. It must be an existing directory, not a filesystem root or sensitive path.`); + }); + return true; + } + if (sub === 'clear' || sub === 'off' || sub === 'none' || sub === 'unset') { + workspaceModule.clearWorkspace(); + slashReply('Workspace cleared.'); + return true; + } + if (sub === 'pick' || sub === 'browse' || sub === 'open') { + workspaceModule.openWorkspaceBrowser(); + return true; + } + slashReply('Usage: <code>/workspace</code> · <code>set /path</code> · <code>clear</code> · <code>pick</code>'); + return true; +} + async function _cmdToggleShow(args, ctx) { const name = (args[0] || '').toLowerCase(); const val = (args[1] || '').toLowerCase(); @@ -5731,6 +5766,14 @@ const COMMANDS = { '_show': { handler: _cmdToggleShow, alias: [], help: 'Show all toggle states', usage: '/toggle' } } }, + workspace: { + alias: ['ws'], + category: 'Agent', + help: 'Set the folder the agent works in', + handler: _cmdWorkspace, + noUserBubble: true, + usage: '/workspace [set <path> | clear | pick]', + }, memory: { alias: ['m'], category: 'Memory', diff --git a/static/js/storage.js b/static/js/storage.js index c72a5dbb1..7ff9c6bd5 100644 --- a/static/js/storage.js +++ b/static/js/storage.js @@ -23,7 +23,8 @@ export const KEYS = { MCP_ACTIVE: 'odysseus-mcp-active', SECTION_ORDER: 'sidebar-section-order', ADMIN_LAST_TAB: 'admin-last-tab', - DENSITY: 'odysseus-density' + DENSITY: 'odysseus-density', + WORKSPACE: 'odysseus-workspace' }; /** diff --git a/static/js/workspace.js b/static/js/workspace.js new file mode 100644 index 000000000..fd6ab4184 --- /dev/null +++ b/static/js/workspace.js @@ -0,0 +1,208 @@ +// static/js/workspace.js +// +// Workspace picker: browse server directories in a draggable modal, choose a +// folder, and show it as a removable pill in the chat input bar. While set, the +// chat request sends `workspace` so the agent's file/shell tools are confined +// to that folder (see routes/chat_routes.py + src/tool_execution.py). + +import Storage, { KEYS } from './storage.js'; +import uiModule from './ui.js'; +import { makeWindowDraggable } from './windowDrag.js'; + +const API_BASE = window.location.origin; +// Same folder glyph as the overflow menu item + pill (not an emoji). +const _FOLDER_SVG = '<svg class="workspace-row-icon" width="15" height="15" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round"><path d="M3 7a2 2 0 0 1 2-2h4l2 2h8a2 2 0 0 1 2 2v8a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"/></svg>'; +let _modal = null; +let _curPath = ''; + +export function getWorkspace() { + return Storage.get(KEYS.WORKSPACE, '') || ''; +} + +function _basename(p) { + if (!p) return ''; + // Handle both POSIX (/) and Windows (\) separators. + const parts = p.replace(/[\\/]+$/, '').split(/[\\/]/); + return parts[parts.length - 1] || p; +} + +// Workspace only applies to agent mode (it scopes the file/shell tools), so the +// pill + overflow entry are hidden in chat mode, like the bash toggle. +function _isChatMode() { + const b = document.getElementById('mode-chat-btn'); + return !!(b && b.classList.contains('active')); +} + +export function syncWorkspaceIndicator(path) { + const chat = _isChatMode(); + const pill = document.getElementById('workspace-indicator-btn'); + const name = document.getElementById('workspace-indicator-name'); + const overflow = document.getElementById('overflow-workspace-btn'); + if (pill) { + pill.style.display = (path && !chat) ? '' : 'none'; + pill.classList.toggle('active', !!path); + if (path) pill.title = `Workspace: ${path}\nFile tools are confined here; shell commands start here but are not sandboxed and can reach outside it.\nClick to clear.`; + } + if (name) name.textContent = path ? _basename(path) : ''; + if (overflow) { + overflow.style.display = chat ? 'none' : ''; + overflow.classList.toggle('active', !!path); + } + // Recompute the "+" overflow dot (app.js owns updatePlusDot via this event). + try { document.dispatchEvent(new CustomEvent('overflow-state-change')); } catch (_) {} +} + +// Called by the agent/chat mode toggle so the pill + overflow entry follow mode. +export function applyMode(_mode) { + syncWorkspaceIndicator(getWorkspace()); +} + +export function setWorkspace(path) { + if (path) Storage.set(KEYS.WORKSPACE, path); + else Storage.remove(KEYS.WORKSPACE); + syncWorkspaceIndicator(path || ''); +} + +/** + * Validate a manually entered path server-side, then persist the canonical + * form. Returns {ok, path|null}. Without this, a typo / file path / deleted + * folder / filesystem root would be stored and shown as active while the + * backend silently refuses to bind it on every send. + */ +export async function vetAndSetWorkspace(path) { + try { + const res = await fetch(`${API_BASE}/api/workspace/vet?path=${encodeURIComponent(path)}`, { credentials: 'same-origin' }); + if (!res.ok) return { ok: false, path: null }; + const data = await res.json(); + if (data.ok && data.path) { + setWorkspace(data.path); + return { ok: true, path: data.path }; + } + return { ok: false, path: null }; + } catch (e) { + return { ok: false, path: null }; + } +} + +export function clearWorkspace() { + setWorkspace(''); + if (uiModule && uiModule.showToast) uiModule.showToast('Workspace cleared'); +} + +async function _load(path) { + const url = `${API_BASE}/api/workspace/browse${path ? `?path=${encodeURIComponent(path)}` : ''}`; + const res = await fetch(url, { credentials: 'same-origin' }); + if (!res.ok) throw new Error(`browse failed: ${res.status}`); + return res.json(); +} + +function _render(data) { + _curPath = data.path; + const body = _modal.querySelector('#workspace-body'); + const pathEl = _modal.querySelector('#workspace-cur-path'); + if (pathEl) { + // Reflect the resolved (realpath) location back into the editable field. + pathEl.value = data.path; + pathEl.title = data.path; + } + let rows = ''; + if (data.parent) { + rows += `<div class="workspace-row workspace-up" data-path="${encodeURIComponent(data.parent)}">↑ ..</div>`; + } + for (const d of data.dirs) { + // Backend supplies the full child path (os.path.join → cross-platform). + rows += `<div class="workspace-row" data-path="${encodeURIComponent(d.path)}">${_FOLDER_SVG}<span>${uiModule.esc(d.name)}</span></div>`; + } + if (data.truncated) { + rows += '<div class="workspace-empty">Too many folders to list. Type or paste a path above to jump in.</div>'; + } + if (!data.dirs.length && !data.parent) rows = '<div class="workspace-empty">No subfolders</div>'; + body.innerHTML = rows || '<div class="workspace-empty">No subfolders</div>'; + body.querySelectorAll('.workspace-row').forEach((row) => { + row.addEventListener('click', () => _navigate(decodeURIComponent(row.dataset.path))); + }); + // Filesystem roots (and sensitive dirs) can be browsed through but never + // bound as the workspace; the backend rejects them too. + const useBtn = _modal.querySelector('#workspace-use'); + if (useBtn) { + useBtn.disabled = data.selectable === false; + useBtn.title = data.selectable === false ? 'This folder cannot be used as a workspace' : ''; + } +} + +async function _navigate(path) { + try { + _render(await _load(path)); + } catch (e) { + if (uiModule && uiModule.showError) uiModule.showError('Could not open folder'); + } +} + +function _getModal() { + if (_modal) return _modal; + _modal = document.createElement('div'); + _modal.id = 'workspace-modal'; + _modal.className = 'modal'; + _modal.style.display = 'none'; + _modal.innerHTML = ` + <div class="modal-content"> + <div class="modal-header"> + <h4><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px"><path d="M3 7a2 2 0 0 1 2-2h4l2 2h8a2 2 0 0 1 2 2v8a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2z"/></svg>Select workspace</h4> + <button class="close-btn" id="workspace-close" aria-label="Close">✖</button> + </div> + <input type="text" class="styled-prompt-input workspace-cur" id="workspace-cur-path" + spellcheck="false" autocomplete="off" autocapitalize="off" autocorrect="off" + placeholder="Type or paste a folder path, then press Enter" /> + <p class="muted workspace-note">File tools are <strong>confined</strong> to this folder. Shell commands start here but are <strong>not sandboxed</strong> and can reach outside it. A workspace scopes the tools; it is not a security boundary.</p> + <div class="modal-body workspace-body" id="workspace-body"></div> + <div class="modal-footer workspace-footer"> + <button type="button" class="confirm-btn confirm-btn-secondary" id="workspace-cancel">Cancel</button> + <button type="button" class="confirm-btn confirm-btn-primary" id="workspace-use">Use this folder</button> + </div> + </div>`; + document.body.appendChild(_modal); + _modal.querySelector('#workspace-close').addEventListener('click', closeWorkspaceBrowser); + _modal.querySelector('#workspace-cancel').addEventListener('click', closeWorkspaceBrowser); + // Editable path bar: Enter navigates to a typed/pasted folder. + _modal.querySelector('#workspace-cur-path').addEventListener('keydown', (e) => { + if (e.key === 'Enter') { + e.preventDefault(); + const v = e.target.value.trim(); + if (v) _navigate(v); + } + }); + _modal.querySelector('#workspace-use').addEventListener('click', () => { + setWorkspace(_curPath); + if (uiModule && uiModule.showToast) uiModule.showToast(`Workspace set: ${_basename(_curPath)}`); + closeWorkspaceBrowser(); + }); + const content = _modal.querySelector('.modal-content'); + const header = _modal.querySelector('.modal-header'); + if (content && header) makeWindowDraggable(_modal, { content, header }); + return _modal; +} + +export async function openWorkspaceBrowser() { + const modal = _getModal(); + modal.style.display = 'flex'; + try { + _render(await _load(getWorkspace() || '')); + } catch (e) { + if (uiModule && uiModule.showError) uiModule.showError('Could not browse folders'); + } +} + +export function closeWorkspaceBrowser() { + if (_modal) _modal.style.display = 'none'; +} + +export function initWorkspace() { + // Restore persisted workspace into the pill on load. + syncWorkspaceIndicator(getWorkspace()); + const overflow = document.getElementById('overflow-workspace-btn'); + if (overflow) overflow.addEventListener('click', openWorkspaceBrowser); + const pill = document.getElementById('workspace-indicator-btn'); + if (pill) pill.addEventListener('click', clearWorkspace); +} + +export default { initWorkspace, openWorkspaceBrowser, getWorkspace, setWorkspace, vetAndSetWorkspace, clearWorkspace, syncWorkspaceIndicator, applyMode }; diff --git a/static/style.css b/static/style.css index ae5b68375..b93b470f7 100644 --- a/static/style.css +++ b/static/style.css @@ -36606,3 +36606,48 @@ body.theme-frosted .modal { the input beside it (.confirm-btn won't stretch on its own). */ .ask-user-other-send { flex-shrink: 0; white-space: nowrap; min-height: 39px; } .ask-user-other-send:disabled { opacity: 0.5; cursor: default; } + +/* ── Workspace picker ───────────────────────────────────────────── */ +/* Layout (width/flex column/max-height) inherited from base .modal-content. */ +/* Editable path/address bar: reuses .styled-prompt-input for border/bg/radius/ + focus ring (set in the element's class list). Overrides only the deltas: + mono font, and full-bleed via flex stretch with no horizontal margin (the + modal-content's 10px padding is the gutter) instead of the base width:100%, + which overflowed against the overflow:auto scrollbar. */ +.workspace-cur { + align-self: stretch; + width: auto; + min-width: 0; + margin: 4px 0 8px; + font-family: var(--mono, monospace); + font-size: 12px; +} +/* flex/overflow inherited from base .modal-body; only the padding differs. */ +.workspace-body { padding: 6px 0; } +.workspace-row { + padding: 7px 18px; + cursor: pointer; + font-size: 13px; + display: flex; + align-items: center; + gap: 8px; +} +.workspace-row > span { + white-space: nowrap; + overflow: hidden; + text-overflow: ellipsis; +} +.workspace-row-icon { flex-shrink: 0; opacity: 0.75; } +.workspace-row:hover { + background: color-mix(in srgb, var(--border) 20%, transparent); +} +.workspace-up { opacity: 0.7; } +.workspace-empty { padding: 14px 18px; opacity: 0.5; font-size: 13px; } +.workspace-footer { + display: flex; + justify-content: flex-end; + gap: 8px; + padding: 10px 18px; + border-top: 1px solid var(--border); +} +.workspace-note { margin: 0 0 8px; font-size: 11px; line-height: 1.4; } diff --git a/tests/test_workspace_confine.py b/tests/test_workspace_confine.py new file mode 100644 index 000000000..81bc7235c --- /dev/null +++ b/tests/test_workspace_confine.py @@ -0,0 +1,328 @@ +"""Workspace confinement. + +The agent's per-turn workspace is a single context-local binding set in +execute_tool_block. The shared path resolvers (_resolve_tool_path / +_resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so +confinement is enforced in ONE place: a tool that uses the shared helpers is +confined automatically and a new tool cannot accidentally bypass it. + +Covers: the resolver helper, the central binding (the safety net), end-to-end +confinement of read/write/edit/grep/ls + subprocess cwd via execute_tool_block, +the get_workspace tool, no-leak across calls, and the admin-gated browse route. +""" +import json +import os +import tempfile +from types import SimpleNamespace + +import pytest + +from src.tool_execution import ( + _AGENT_WORKDIR, + _active_workspace, + _resolve_search_root, + _resolve_tool_path, + _resolve_tool_path_in_workspace, + agent_cwd, + execute_tool_block, + get_active_workspace, +) + + +def _block(tool, content=""): + return SimpleNamespace(tool_type=tool, content=content) + + +@pytest.fixture +def ws(): + d = tempfile.mkdtemp() + with open(os.path.join(d, "a.txt"), "w") as f: + f.write("x") + return d + + +@pytest.fixture +def admin(monkeypatch): + """Pass the public-tool gate so file tools dispatch in tests.""" + monkeypatch.setattr( + "src.tool_execution.owner_is_admin_or_single_user", lambda owner: True + ) + + +# ── the resolver helper ──────────────────────────────────────────────── + +def test_resolver_confines(ws): + real = os.path.realpath(os.path.join(ws, "a.txt")) + assert _resolve_tool_path_in_workspace(ws, "a.txt") == real # relative + assert _resolve_tool_path_in_workspace(ws, os.path.join(ws, "a.txt")) == real # abs inside + outside = tempfile.mkdtemp() + with pytest.raises(ValueError): # abs outside + _resolve_tool_path_in_workspace(ws, os.path.join(outside, "x.txt")) + with pytest.raises(ValueError): # parent escape + _resolve_tool_path_in_workspace(ws, os.path.join("..", "..", "escape.txt")) + + +def test_resolver_blocks_sensitive_inside_workspace(ws): + os.makedirs(os.path.join(ws, ".ssh"), exist_ok=True) + with pytest.raises(ValueError): + _resolve_tool_path_in_workspace(ws, ".ssh/authorized_keys") + + +# ── the central binding: the safety net ───────────────────────────────── + +def test_active_binding_confines_shared_resolvers(ws): + """ANY tool resolving paths through the shared helpers is confined while the + binding is active, without doing anything workspace-specific itself. This is + what stops a newly added tool from accidentally ignoring the workspace.""" + token = _active_workspace.set(ws) + try: + assert get_active_workspace() == ws + assert agent_cwd() == ws + assert _resolve_tool_path("a.txt") == os.path.realpath(os.path.join(ws, "a.txt")) + with pytest.raises(ValueError): # normally-allowed root, now outside ws + _resolve_tool_path("/tmp/whatever.txt") + assert _resolve_search_root("") == os.path.realpath(ws) + finally: + _active_workspace.reset(token) + + +def test_no_binding_uses_default_roots(): + assert get_active_workspace() is None + assert agent_cwd() == _AGENT_WORKDIR + with pytest.raises(ValueError): + _resolve_tool_path("/etc/hosts") + + +# ── end-to-end via execute_tool_block (sets + resets the binding) ─────── + +@pytest.mark.asyncio +async def test_read_write_edit_confined_e2e(ws, admin): + _, r = await execute_tool_block(_block("write_file", "note.txt\nhello"), owner="a", workspace=ws) + assert r["exit_code"] == 0 and os.path.isfile(os.path.join(ws, "note.txt")) + _, r = await execute_tool_block(_block("read_file", "note.txt"), owner="a", workspace=ws) + assert r["exit_code"] == 0 and r["output"] == "hello" + + with open(os.path.join(ws, "f.txt"), "w") as f: + f.write("foo bar") + _, r = await execute_tool_block( + _block("edit_file", json.dumps({"path": "f.txt", "old_string": "foo", "new_string": "baz"})), + owner="a", workspace=ws, + ) + assert r["exit_code"] == 0 + with open(os.path.join(ws, "f.txt")) as f: + assert f.read() == "baz bar" + + # outside the workspace is rejected, and nothing is created + outside = tempfile.mkdtemp() + of = os.path.join(outside, "secret.txt") + with open(of, "w") as f: + f.write("nope") + _, r = await execute_tool_block(_block("read_file", of), owner="a", workspace=ws) + assert r["exit_code"] == 1 and "outside the workspace" in r["error"] + escape = os.path.join(outside, "_esc.txt") + _, r = await execute_tool_block(_block("write_file", f"{escape}\nx"), owner="a", workspace=ws) + assert r["exit_code"] == 1 and "outside the workspace" in r["error"] + assert not os.path.exists(escape) + + +@pytest.mark.asyncio +async def test_grep_and_ls_confined_e2e(ws, admin): + with open(os.path.join(ws, "doc.txt"), "w") as f: + f.write("hello workspace\n") + _, r = await execute_tool_block(_block("grep", json.dumps({"pattern": "hello"})), owner="a", workspace=ws) + assert r["exit_code"] == 0 and "doc.txt" in r["output"] + outside = tempfile.mkdtemp() + _, r = await execute_tool_block(_block("grep", json.dumps({"pattern": "x", "path": outside})), owner="a", workspace=ws) + assert r["exit_code"] == 1 and "outside the workspace" in r["error"] + _, r = await execute_tool_block(_block("ls", ""), owner="a", workspace=ws) + assert r["exit_code"] == 0 and "doc.txt" in r["output"] + _, r = await execute_tool_block(_block("ls", outside), owner="a", workspace=ws) + assert r["exit_code"] == 1 and "outside the workspace" in r["error"] + + +@pytest.mark.asyncio +async def test_subprocess_cwd_is_workspace_e2e(ws, admin): + """python tool runs with cwd = workspace (OS-agnostic probe).""" + _, r = await execute_tool_block(_block("python", "import os; print(os.getcwd())"), owner="a", workspace=ws) + assert r["exit_code"] == 0 + assert os.path.realpath(r["output"].strip()) == os.path.realpath(ws) + + +# ── get_workspace tool ────────────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_get_workspace_tool(ws, admin): + _, r = await execute_tool_block(_block("get_workspace", ""), owner="a", workspace=ws) + assert r["exit_code"] == 0 and r["output"].startswith(ws) and "not sandboxed" in r["output"] + _, r = await execute_tool_block(_block("get_workspace", ""), owner="a") # none active + assert r["exit_code"] == 0 and "No workspace" in r["output"] + + +# ── no leak across calls ──────────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_binding_does_not_leak(ws, admin): + await execute_tool_block(_block("ls", ""), owner="a", workspace=ws) + assert get_active_workspace() is None + + +# ── tool selection: an active workspace is the file-work signal ───────── +# A vague ("low-signal") message like "look at the local project" matches no +# domain keywords, so retrieval is normally skipped. When a workspace is set it +# must still surface the file tools, otherwise the agent says it has no file +# access (the bug this guards against). + +def _sent_tool_names(monkeypatch, *, workspace): + import asyncio + import src.agent_loop as al + + monkeypatch.setattr(al, "get_setting", lambda key, default=None: default, raising=False) + monkeypatch.setattr(al, "get_mcp_manager", lambda: None, raising=False) + monkeypatch.setattr(al, "estimate_tokens", lambda *a, **k: 10, raising=False) + # Isolate the selection logic from owner gating (tested separately). + monkeypatch.setattr(al, "blocked_tools_for_owner", lambda owner: set(), raising=False) + + captured = [] + + async def _fake_stream(_candidates, messages, **kwargs): + captured.append(kwargs.get("tools")) + yield "data: " + json.dumps({"delta": "ok"}) + "\n\n" + yield "data: [DONE]\n\n" + + monkeypatch.setattr(al, "stream_llm_with_fallback", _fake_stream, raising=False) + + async def _run(): + gen = al.stream_agent_loop( + "https://api.openai.com/v1", "gpt-test", + [{"role": "user", "content": "look at the local project"}], + max_rounds=1, relevant_tools=None, owner="admin", workspace=workspace, + ) + return [c async for c in gen] + + asyncio.run(_run()) + schemas = captured[0] or [] + return {t["function"]["name"] for t in schemas if isinstance(t, dict) and "function" in t} + + +def test_low_signal_with_workspace_surfaces_readonly_file_tools(monkeypatch): + names = _sent_tool_names(monkeypatch, workspace="/tmp") + # read-only nav tools surface so the agent can explore + assert "read_file" in names + assert "get_workspace" in names + assert "grep" in names + # write/shell tools do NOT surface on a vague message + assert "write_file" not in names + assert "edit_file" not in names + assert "bash" not in names + assert "python" not in names + + +def test_low_signal_without_workspace_excludes_file_tools(monkeypatch): + names = _sent_tool_names(monkeypatch, workspace=None) + assert "read_file" not in names + assert "get_workspace" not in names + + +# ── browse route is admin-gated ───────────────────────────────────────── + +def test_browse_is_admin_gated(monkeypatch): + from fastapi import HTTPException + import routes.workspace_routes as wr + + router = wr.setup_workspace_routes() + browse = next(r.endpoint for r in router.routes if r.path == "/api/workspace/browse") + + monkeypatch.setattr(wr, "get_current_user", lambda req: "bob") + monkeypatch.setattr(wr, "owner_is_admin_or_single_user", lambda owner: False) + with pytest.raises(HTTPException) as ei: + browse(request=object(), path="/") + assert ei.value.status_code == 403 + + monkeypatch.setattr(wr, "owner_is_admin_or_single_user", lambda owner: True) + out = browse(request=object(), path=os.path.expanduser("~")) + assert "dirs" in out and "path" in out + assert all("name" in d and "path" in d for d in out["dirs"]) + + +# ── bind-time vetting of the workspace root ───────────────────────────── + +def test_vet_workspace_accepts_normal_dir(ws): + from src.tool_execution import vet_workspace + assert vet_workspace(ws) == os.path.realpath(ws) + + +def test_vet_workspace_rejects_sensitive_root(tmp_path): + # The resolver deny-lists sensitive paths inside the workspace, but the + # empty-path search root is the workspace itself - a sensitive root must + # be rejected before it is bound or `ls` with no path would list it. + from src.tool_execution import vet_workspace + ssh_dir = tmp_path / ".ssh" + ssh_dir.mkdir() + assert vet_workspace(str(ssh_dir)) is None + + +def test_vet_workspace_rejects_nondir_and_empty(ws): + from src.tool_execution import vet_workspace + assert vet_workspace(os.path.join(ws, "a.txt")) is None # file, not dir + assert vet_workspace("/nonexistent/path/xyz") is None + assert vet_workspace("") is None + assert vet_workspace(" ") is None + + +def test_vet_workspace_rejects_filesystem_root(): + # Binding / would make every absolute path "inside" the workspace, + # collapsing confinement into host-wide file access. + from src.tool_execution import vet_workspace + assert vet_workspace("/") is None + + +def test_browse_marks_root_unselectable_and_vet_endpoint(monkeypatch): + import routes.workspace_routes as wr + + router = wr.setup_workspace_routes() + browse = next(r.endpoint for r in router.routes if r.path == "/api/workspace/browse") + vet = next(r.endpoint for r in router.routes if r.path == "/api/workspace/vet") + + monkeypatch.setattr(wr, "get_current_user", lambda req: "admin") + monkeypatch.setattr(wr, "owner_is_admin_or_single_user", lambda owner: True) + + out = browse(request=object(), path="/") + assert out["selectable"] is False + out = browse(request=object(), path=os.path.expanduser("~")) + assert out["selectable"] is True + + assert vet(request=object(), path="/") == {"ok": False, "path": None} + home = os.path.realpath(os.path.expanduser("~")) + assert vet(request=object(), path="~") == {"ok": True, "path": home} + + from fastapi import HTTPException + monkeypatch.setattr(wr, "owner_is_admin_or_single_user", lambda owner: False) + with pytest.raises(HTTPException) as ei: + vet(request=object(), path="/tmp") + assert ei.value.status_code == 403 + + +# ── send-time privilege gate (no path oracle for non-admins) ──────────── + +def test_request_workspace_gate(ws, monkeypatch): + """Non-admin chat callers must get a uniform drop with no vetting: the + workspace_rejected signal would otherwise reveal which host paths exist.""" + import routes.chat_routes as cr + + monkeypatch.setattr(cr, "get_current_user", lambda req: "bob") + vet_calls = [] + import src.tool_execution as te + real_vet = te.vet_workspace + monkeypatch.setattr(te, "vet_workspace", lambda p: vet_calls.append(p) or real_vet(p)) + + import src.tool_security as ts + monkeypatch.setattr(ts, "owner_is_admin_or_single_user", lambda owner: False) + # Valid and invalid paths are indistinguishable for a non-admin: both + # drop silently, and the path never reaches the filesystem. + assert cr._resolve_request_workspace(object(), ws) == ("", "") + assert cr._resolve_request_workspace(object(), "/nonexistent/xyz") == ("", "") + assert vet_calls == [] + + monkeypatch.setattr(ts, "owner_is_admin_or_single_user", lambda owner: True) + assert cr._resolve_request_workspace(object(), ws) == (os.path.realpath(ws), "") + assert cr._resolve_request_workspace(object(), "/nonexistent/xyz") == ("", "/nonexistent/xyz")