mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-19 19:25:27 -04:00
620fdd0859
* feat(agent): workspace confinement via context-local binding + get_workspace tool
  Bind the per-turn workspace once in execute_tool_block; the shared path resolvers (_resolve_tool_path / _resolve_search_root) and the subprocess cwd helper (agent_cwd) read it, so file tools + bash/python are confined centrally and a new tool that uses the shared helpers cannot accidentally bypass it. Adds the admin-gated /api/workspace/browse picker, a workspace pill + directory modal (reusing existing modal/button CSS), the /workspace slash command, and a get_workspace tool (replaces a system-prompt block). Confinement is OS-agnostic (realpath/normcase/commonpath) and docker-safe (container paths, no host assumptions). Reopens #2023.
* ux(workspace): clarify workspace is not a sandbox
  Picker modal note + pill tooltip + get_workspace tool/output wording now state plainly: read_file/write_file/edit_file/grep/glob/ls are confined to the folder, but bash/python only start there (cwd) and are not sandboxed. Modal note reuses the existing .muted class.
* fix(agent): treat an active workspace as file-work intent
  A vague low-signal message (e.g. "look at the local project") matches no domain keywords, so tool retrieval is skipped and only always-available tools are offered -- leaving the agent with no file access even though a workspace is set. When a workspace is active, include the file/code tools (incl. get_workspace) on low-signal turns so the agent can act on the folder. Also requires the tool index (ChromaDB) to be reachable for normal retrieval; that is an environment dependency, not part of this change.
* ux(workspace): hide pill + overflow entry in chat mode
  Workspace only scopes the agent's file/shell tools, so the pill and the overflow 'Workspace' entry are agent-only now -- hidden in chat mode like the bash toggle. Mode read from the DOM in syncWorkspaceIndicator; applyMode() is called from the agent/chat setMode handler.
* prompt(tools): steer bash/python to defer to the dedicated file tools
  bash/python schema descriptions (what native-tool-calling models read) were bare and gave no steer, so models would do file ops via the shell (e.g. writing SVG/HTML, which then dumps raw markup into the tool preview). Tell bash/python in the schema + tool-index + prompt section to prefer read_file/write_file/edit_file/grep/glob/ls and only be used for what those do not cover.
* prompt(tools): keep bash/python deferral generic (no hardcoded tool names)
  Reference 'a dedicated tool' rather than listing read_file/write_file/grep/etc. by name, so the guidance does not go stale if those tools are renamed.
* style(workspace): drop em-dashes from added code comments/strings
* ux(workspace): terser non-sandbox note in picker (no tool-name list)
* ux(workspace): mirror terse non-sandbox wording in pill tooltip
* chore: untrack local venv symlink (run-only, not part of the feature)
* prompt(workspace): keep get_workspace text generic (no hardcoded tool names)
* fix(agent): low-signal + workspace surfaces only read-only file tools
  Intersect the files tool group with PLAN_MODE_READONLY_TOOLS so a vague message in a workspace exposes read_file/grep/glob/ls/get_workspace for exploration, but not write_file/edit_file/bash/python -- those wait for a request that actually calls for them (RAG retrieval still adds them on a real ask).
* feat(workspace): cap browse listing at 500 dirs with a truncated hint
  Mirror the filesystem_tools._CODENAV_MAX_HITS pattern with a module-local _MAX_BROWSE_DIRS so a directory with thousands of children does not dump every row into the picker; the response carries a truncated flag and the modal tells the user to type a path to jump in.
* chore: untrack local venv symlink (run-only artifact)
* fix(workspace): vet the workspace root against the sensitive-path deny list at bind time
  The in-workspace resolver deny-lists sensitive paths inside the workspace, but the empty-path search root is the workspace itself, so a workspace of ~/.ssh could be listed via ls with no path. vet_workspace() (public, in tool_execution next to the resolvers) rejects non-directories and sensitive roots before the path is ever bound; chat_routes uses it instead of its inline isdir check.
* fix(workspace): reject filesystem roots and stop showing rejected workspaces as active
  Review findings from #3665:
  P2: vet_workspace accepted / (and would accept drive/UNC roots), which makes every absolute path 'inside' the workspace and collapses confinement into host-wide file access. A root is its own dirname, so reject when dirname(resolved) == resolved; the browse response now carries a selectable flag and the picker disables 'Use this folder' on unselectable dirs.
  P3: /workspace set stored any string client-side and the chat route silently dropped rejected values, so the pill could claim a confinement that was not in effect. New admin-gated /api/workspace/vet validates manual paths before they persist (canonical path returned), and when a posted workspace is rejected at send time the stream emits workspace_rejected so the client clears the stored value and toasts instead of continuing silently.
* fix(workspace): check caller privilege before vetting the posted workspace
  Review finding: /api/chat_stream called vet_workspace() on the posted value for every caller and emitted workspace_rejected on failure, so a non-admin who can chat but cannot use file/shell tools could distinguish existing directories from missing/file/sensitive/root paths by whether the event appeared.
  The resolution now lives in _resolve_request_workspace, which drops the submitted value uniformly for non-admin callers, with no vetting and no event, before the path ever touches the filesystem. Admin and single-user behavior is unchanged. Test pins that valid and invalid paths are indistinguishable for a non-admin and that vet_workspace is never invoked for them.
198 lines
7.4 KiB
Python
"""Server-side tool safety policy."""

from __future__ import annotations

import logging
from typing import Optional, Set

logger = logging.getLogger(__name__)


# Tools regular/public users must not execute directly. These either expose
# server/runtime access, sensitive user data, external messaging, persistent
# state changes, or generic loopback/integration surfaces.
NON_ADMIN_BLOCKED_TOOLS = {
    "bash",
    "python",
    "read_file",
    "write_file",
    "edit_file",
    "grep",
    "glob",
    "ls",
    "get_workspace",
    "search_chats",
    "manage_memory",
    "manage_skills",
    "manage_tasks",
    "manage_endpoints",
    "manage_mcp",
    "manage_webhooks",
    "manage_tokens",
    "manage_documents",
    "manage_settings",
    "api_call",
    "app_api",
    "send_email",
    "reply_to_email",
    "list_emails",
    "read_email",
    "resolve_contact",
    "manage_contact",
    "manage_calendar",
    "vault_search",
    "vault_get",
    "vault_unlock",
    "download_model",
    "serve_model",
    "serve_preset",
    "stop_served_model",
    "cancel_download",
    "adopt_served_model",
}


# Plan mode: the agent may investigate but must not mutate anything. Only these
# read-only/inspection tools stay enabled; everything else (writes, sends,
# manage_*, model serving, MCP, etc.) is blocked. Allowlist rather than blocklist
# so any newly added tool defaults to BLOCKED in plan mode -- fail safe.
#
# bash/python are deliberately NOT here: the shell can mutate (write files, hit
# the network) and can't be constrained to read-only at the tool layer, so plan
# mode blocks it outright rather than relying on a prompt to keep it well-behaved.
# Code/file discovery is covered by the dedicated read-only tools below
# (read_file, grep, glob, ls) instead of freestyle shell.
PLAN_MODE_READONLY_TOOLS = {
    "read_file",
    "grep",
    "glob",
    "ls",
    "get_workspace",
    "web_search",
    "web_fetch",
    "search_chats",
    "list_models",
    "list_sessions",
    "list_emails",
    "read_email",
    "list_served_models",
    "list_downloads",
    "list_cached_models",
    "search_hf_models",
    "list_serve_presets",
    "list_cookbook_servers",
    "resolve_contact",
    "chat_with_model",
    "ask_teacher",
}


# The agent's tool gate is a DENYLIST: execute_tool_block blocks any tool whose
# name is in `disabled_tools`. Plan mode's policy is the opposite -- an allowlist
# (PLAN_MODE_READONLY_TOOLS). To apply an allowlist through a denylist, plan mode
# returns the inverse: every known tool name minus the allowlist.
#
# Known tool names come from FUNCTION_TOOL_SCHEMAS, but that source is imperfect:
# some tools are only XML-invocable (e.g. manage_notes, generate_image) and never
# appear there, and the import can fail outright. Either gap would drop a mutating
# tool from the subtraction and silently leave it enabled. This set is the static
# backstop for both: union it in so known mutators are always subtracted, and so a
# failed import still blocks them (fail closed, never open). Only mutators belong
# here -- read-only tools are covered by the allowlist. Keep in sync when adding
# new mutating tools.
_PLAN_MODE_KNOWN_MUTATORS = {
    "write_file", "create_document", "edit_document", "update_document",
    "suggest_document", "manage_documents", "create_session", "manage_session",
    "send_to_session", "pipeline", "manage_memory", "manage_skills",
    "manage_tasks", "manage_notes", "manage_endpoints", "manage_mcp",
    "manage_webhooks", "manage_tokens", "manage_settings", "manage_contact",
    "manage_calendar", "api_call", "app_api", "ui_control",
    "send_email", "reply_to_email", "bulk_email", "delete_email",
    "archive_email", "mark_email_read", "download_model", "serve_model",
    "stop_served_model", "cancel_download", "adopt_served_model", "serve_preset",
    "generate_image", "edit_image", "trigger_research", "manage_research",
    # Shell is never read-only-safe; block it explicitly so it stays out of plan
    # mode even if the schema list fails to load.
    "bash", "python",
}


def plan_mode_disabled_tools() -> Set[str]:
    """Tool names to add to the denylist in plan mode.

    Plan mode allows only PLAN_MODE_READONLY_TOOLS. The gate is a denylist, so
    return the inverse: every known tool name minus the allowlist. Known names
    come from the function-tool schemas, backstopped by _PLAN_MODE_KNOWN_MUTATORS
    (see above) so XML-only tools and a failed schema import can't leave a mutator
    enabled. MCP tools are handled separately -- the loop drops the MCP manager
    entirely in plan mode."""
    try:
        # agent_tools / tool_parsing / tool_schemas form a mutually-circular
        # cluster that only resolves cleanly when entered via agent_tools.
        # Import it first so the lazy schema import works even from a cold
        # import (e.g. tests) -- not just after the app has wired everything up.
        import src.agent_tools  # noqa: F401
        from src.tool_schemas import FUNCTION_TOOL_SCHEMAS

        all_names = {
            (t.get("function") or {}).get("name")
            for t in FUNCTION_TOOL_SCHEMAS
        }
        all_names.discard(None)
    except Exception as exc:
        logger.warning("Unable to load tool schemas for plan-mode gating: %s", exc)
        all_names = set()
    # Subtract the allowlist from all known tool names (schema-derived plus the
    # static mutator backstop). Fail closed: if the schema import failed above,
    # the backstop alone still blocks known mutators.
    return (all_names | _PLAN_MODE_KNOWN_MUTATORS) - PLAN_MODE_READONLY_TOOLS
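

# Illustrative sketch, not part of the shipped policy: the allowlist-through-
# denylist inversion described above, reduced to plain sets so it can be
# checked in isolation. The helper name `_sketch_invert_allowlist` and its
# parameters are hypothetical; in the real function the inputs are the names
# from FUNCTION_TOOL_SCHEMAS, _PLAN_MODE_KNOWN_MUTATORS and
# PLAN_MODE_READONLY_TOOLS.
def _sketch_invert_allowlist(schema_names, backstop, allowlist):
    """Return the denylist that enforces `allowlist` through a denylist gate."""
    # Passing an empty `schema_names` models a failed schema import: the
    # backstop alone must still yield every known mutator (fail closed).
    return (set(schema_names) | set(backstop)) - set(allowlist)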


def is_public_blocked_tool(tool_name: Optional[str]) -> bool:
    """Return True when a non-admin/public user must not execute this tool.

    This is a security gate, so it fails CLOSED: a malformed non-string tool
    name can't be matched against the blocklist or the ``mcp__`` namespace, so
    it is treated as blocked rather than silently allowed through. ``None`` /
    empty string means there is no tool to gate.
    """
    if tool_name is None or tool_name == "":
        return False
    if not isinstance(tool_name, str):
        return True
    return tool_name in NON_ADMIN_BLOCKED_TOOLS or tool_name.startswith("mcp__")
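

# Illustrative sketch, not part of the shipped policy: one way a caller could
# apply a fail-closed gate such as is_public_blocked_tool to a batch of
# requested tool names. `_sketch_partition_tools` and its `is_blocked`
# parameter are hypothetical; the predicate is passed in so the helper makes
# no assumption about how the real agent loop is wired.
def _sketch_partition_tools(tool_names, is_blocked):
    """Split `tool_names` into (allowed, blocked) using the given predicate."""
    allowed, blocked = [], []
    for name in tool_names:
        # Anything the gate flags, including malformed names, goes to `blocked`.
        (blocked if is_blocked(name) else allowed).append(name)
    return allowed, blocked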


def owner_is_admin_or_single_user(owner: Optional[str]) -> bool:
    """Return True for admins, or in intentional single-user mode.

    Single-user mode means the operator explicitly disabled auth
    (``AUTH_ENABLED=false``) -- the local/self-host default where the owner has
    full access to their own box.

    The pre-setup window (auth ENABLED but no admin created yet) is treated as
    NON-admin: returning True there would hand server-execution tools
    (``bash``/``python``) to any caller before setup completes. The auth
    middleware already 401s ``/api/`` requests pre-setup, so this is
    defense-in-depth for callers that bypass it (e.g. trusted loopback).
    """
    try:
        from core.auth import AuthManager

        auth = AuthManager()
        if not auth.is_configured:
            from src.auth_helpers import _auth_disabled

            return _auth_disabled()
        return bool(owner and auth.is_admin(owner))
    except Exception as exc:
        logger.warning("Unable to evaluate owner admin status: %s", exc)
        return False


def blocked_tools_for_owner(owner: Optional[str]) -> Set[str]:
    """Tools to hide/disable for this owner under public-user policy."""
    if owner_is_admin_or_single_user(owner):
        return set()
    return set(NON_ADMIN_BLOCKED_TOOLS)
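

# Illustrative sketch, not part of the shipped policy: how the two denylists
# produced by this module might be combined for a single turn. It assumes the
# caller unions them before handing `disabled_tools` to the gate; that wiring
# is an assumption, as are the helper name and its parameters.
def _sketch_effective_denylist(owner_blocked, plan_blocked, plan_mode):
    """Union the per-owner denylist with the plan-mode denylist when active."""
    disabled = set(owner_blocked)
    if plan_mode:
        # Plan mode only ever adds restrictions; it never re-enables a tool
        # that the per-owner policy has already blocked.
        disabled |= set(plan_blocked)
    return disabled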