feat: Add plan mode to the chat agent (#638)

* feat: Add plan mode to the chat agent

Adds a plan mode: the agent investigates read-only, proposes a checklist, and waits for approval before changing anything. On approval it runs with full tools and checks items off as it goes. Enforcement reuses the existing disabled_tools gate.

Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the plan toggle from the chat input.

- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

Three improvements so a long plan survives a weak model and stays in reach:

1. Re-reference the plan (out-of-context fix). On the execution turn the frontend sends the approved checklist back (`approved_plan`); the backend pins it as a top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so the agent can always re-read the plan instead of losing the thread on a long run. New `build_active_plan_note()` (unit-tested).
2. Re-open / dock the plan anytime. The plan checklist is stored per-session (localStorage). When a plan exists, the plan-mode button opens a small menu ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan window, so it can stay docked while the agent works. The window live-refreshes as the plan changes.
3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps `- [x]` after finishing them, or to revise steps when the user asks. Marker tool (no I/O) → `plan_update` SSE event → the stored plan + docked window update live. The ACTIVE PLAN note instructs the agent to use it.

Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb), src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse `approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools / tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle `plan_update`, capture restated checklists), static/app.js (plan-button menu), static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

Shell can mutate (write files, hit the network) and can't be constrained to read-only at the tool layer, so plan mode no longer relies on a prompt to keep it well-behaved: bash/python are removed from the read-only allowlist and added to the fail-closed block set. Discovery is covered by the dedicated read-only tools (read_file, grep, glob, ls) instead. Rewrites the plan-mode directive to state shell is disabled and lists the available read-only tools positively.

Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

Clarifies that entries like "summar" are intentional stems matched via startswith (covers summarise/summarize/summary), not typos.

Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an allowlist, so it returns the inverse (all known tool names minus the allowlist). The static mutator set is a backstop for the schema-derived name list, which misses XML-only tools and can fail to import.

Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

The model is already shown its available (read-only) tools by _assemble_prompt, which removes every disabled tool. Enumerating them again in the directive only duplicated that list and would drift as tools change. Point at the tools listed below instead.

Addresses review feedback on #638.
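The allowlist-to-denylist inversion described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in src/tool_security.py: the constant names, the function name `plan_mode_disabled_tools`, and the sample tool `edit_file` are hypothetical. Only the idea comes from the message: the disabled set is every known tool name minus the allowlist, with a static mutator set as backstop.

```python
from typing import Iterable, Set

# Hypothetical stand-ins for the real sets in src/tool_security.py.
# The tool names themselves are the ones named in the commit message.
READONLY_ALLOWLIST: Set[str] = {"read_file", "grep", "glob", "ls"}
KNOWN_MUTATORS: Set[str] = {"bash", "python"}


def plan_mode_disabled_tools(schema_tool_names: Iterable[str]) -> Set[str]:
    """Turn the read-only allowlist into the denylist the tool gate expects.

    schema_tool_names is the schema-derived list of tool names. It may be
    incomplete (or empty if the import failed), so the static mutator set is
    always unioned in before the allowlist is subtracted.
    """
    universe = set(schema_tool_names) | KNOWN_MUTATORS
    return universe - READONLY_ALLOWLIST


# "edit_file" is a made-up tool name used only for this example.
disabled = plan_mode_disabled_tools(["read_file", "grep", "edit_file"])
print(sorted(disabled))
```

Even with an empty schema list the result still contains `bash` and `python`, which is the fail-closed behaviour the message describes.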
2026-06-27 23:25:22 -04:00 · 2026-06-05 16:32:25 +02:00
parent 2e207fc315
commit 8ce945d338
18 changed files with 891 additions and 8 deletions
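The MCP read-only classification added in the diff below can be exercised in isolation. The sketch restates `mcp_tool_is_readonly` from the diff in condensed form so that it runs standalone. The sample tool names are invented for illustration and do not come from any real MCP server.

```python
from typing import Any, Dict

_MCP_READONLY_VERBS = (
    "list", "get", "read", "search", "fetch", "query", "find", "describe",
    "show", "view", "lookup", "count", "status", "info", "inspect", "summar",
)


def mcp_tool_is_readonly(tool: Dict[str, Any]) -> bool:
    """Condensed restatement of the classifier in the diff."""
    ann = tool.get("annotations")
    read_hint = None
    destructive = None
    if ann is not None:
        if isinstance(ann, dict):
            read_hint = ann.get("readOnlyHint")
            destructive = ann.get("destructiveHint")
        else:
            read_hint = getattr(ann, "readOnlyHint", None)
            destructive = getattr(ann, "destructiveHint", None)
    if read_hint is True:
        return True
    if read_hint is False or destructive is True:
        return False
    # No usable hint: fall back to a prefix match on the tool name.
    name = (tool.get("name") or "").lower()
    return name.startswith(_MCP_READONLY_VERBS)


# Invented sample tools, one per decision path.
samples = [
    {"name": "summarize_doc"},                                   # prefix "summar"
    {"name": "delete_record"},                                   # no read prefix
    {"name": "run_report", "annotations": {"readOnlyHint": True}},
    {"name": "get_and_purge", "annotations": {"destructiveHint": True}},
    {"name": "status_update"},                                   # prefix "status"
]
for tool in samples:
    print(tool["name"], mcp_tool_is_readonly(tool))
```

Server annotations take precedence over the name, so `get_and_purge` is blocked despite its `get` prefix. Because the fallback is a prefix match, a name like the hypothetical `status_update` is classified as read-only when the server supplies no hint.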
@@ -9,7 +9,7 @@ import json
 import logging
 import os
 import re
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Set, Tuple

 logger = logging.getLogger(__name__)

@@ -90,6 +90,44 @@ def _format_mcp_params(input_schema: Any) -> str:
    return hint


+# Tool-name prefixes that denote a read-only/inspection operation. Used to
+# classify MCP tools for plan mode when the server provides no readOnlyHint.
+# These are PREFIXES, not whole words (matched via str.startswith below), so a
+# stem like "summar" intentionally covers "summarise"/"summarize"/"summary".
+_MCP_READONLY_VERBS = (
+    "list", "get", "read", "search", "fetch", "query", "find", "describe",
+    "show", "view", "lookup", "count", "status", "info", "inspect", "summar",
+)
+
+
+def mcp_tool_is_readonly(tool: Dict) -> bool:
+    """Classify an MCP tool as safe (non-mutating) for plan mode.
+
+    Prefer the server's own annotations (readOnlyHint / destructiveHint). When
+    absent, fall back to a tool-name verb heuristic, and FAIL CLOSED (treat as
+    write) for anything that doesn't clearly read — plan mode must not run a
+    write tool just because its intent is ambiguous.
+    """
+    ann = tool.get("annotations")
+    # annotations may be a dict or a pydantic model
+    read_hint = None
+    destructive = None
+    if ann is not None:
+        if isinstance(ann, dict):
+            read_hint = ann.get("readOnlyHint")
+            destructive = ann.get("destructiveHint")
+        else:
+            read_hint = getattr(ann, "readOnlyHint", None)
+            destructive = getattr(ann, "destructiveHint", None)
+    if read_hint is True:
+        return True
+    if read_hint is False or destructive is True:
+        return False
+    # No usable hint — heuristic on the tool name's leading verb.
+    name = (tool.get("name") or "").lower()
+    return name.startswith(_MCP_READONLY_VERBS)
+
+
 class McpManager:
    """Manages MCP server connections and tool routing."""

@@ -170,6 +208,10 @@ class McpManager:
                    "name": tool.name,
                    "description": tool.description or "",
                    "input_schema": tool.inputSchema if hasattr(tool, 'inputSchema') else {},
+                    # MCP tool annotations (readOnlyHint / destructiveHint) drive
+                    # plan-mode read-only gating. Absent on many servers, so we
+                    # fall back to a name heuristic in mcp_tool_is_readonly().
+                    "annotations": getattr(tool, 'annotations', None),
                })

            self._sessions[server_id] = session
@@ -227,6 +269,10 @@ class McpManager:
                    "name": tool.name,
                    "description": tool.description or "",
                    "input_schema": tool.inputSchema if hasattr(tool, 'inputSchema') else {},
+                    # MCP tool annotations (readOnlyHint / destructiveHint) drive
+                    # plan-mode read-only gating. Absent on many servers, so we
+                    # fall back to a name heuristic in mcp_tool_is_readonly().
+                    "annotations": getattr(tool, 'annotations', None),
                })

            self._sessions[server_id] = session
@@ -537,6 +583,24 @@ class McpManager:
                })
        return result

+    def plan_mode_blocked_mcp(self) -> Tuple[Dict[str, Set[str]], Set[str]]:
+        """Plan mode: block every MCP tool that isn't clearly read-only.
+
+        Returns (disabled_map, qualified_names):
+          - disabled_map: {server_id: {tool_name, ...}} to hide write tools from
+            the prompt/schemas (merged into the existing mcp_disabled_map).
+          - qualified_names: {"mcp__<server>__<tool>", ...} for runtime rejection
+            in execute_tool_block (which matches the qualified name).
+        """
+        disabled_map: Dict[str, Set[str]] = {}
+        qualified: Set[str] = set()
+        for server_id, tools in self._tools.items():
+            for tool in tools:
+                if not mcp_tool_is_readonly(tool):
+                    disabled_map.setdefault(server_id, set()).add(tool["name"])
+                    qualified.add(f"mcp__{server_id}__{tool['name']}")
+        return disabled_map, qualified
+
    def is_builtin(self, server_id: str) -> bool:
        """Check if a server is a built-in (auto-registered) server."""
        return server_id.startswith("builtin_") or server_id in {