feat: Add plan mode to the chat agent (#638)

* feat: Add plan mode to the chat agent

  Adds a plan mode: the agent investigates read-only, proposes a checklist, and waits for approval before changing anything. On approval it runs with full tools and checks items off as it goes. Enforcement reuses the existing disabled_tools gate.

  Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the plan toggle from the chat input.

  - src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
  - src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the plan directive, force agent mode.
  - static/: plan toggle pill, Approve & Run, dockable plan window, task-list checkboxes, and the /plan slash command.
  - tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

  Three improvements so a long plan survives a weak model and stays in reach:

  1. Re-reference the plan (out-of-context fix). On the execution turn the frontend sends the approved checklist back (`approved_plan`); the backend pins it as a top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so the agent can always re-read the plan instead of losing the thread on a long run. New `build_active_plan_note()` (unit-tested).
  2. Re-open / dock the plan anytime. The plan checklist is stored per-session (localStorage). When a plan exists, the plan-mode button opens a small menu ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan window, so it can stay docked while the agent works. The window live-refreshes as the plan changes.
  3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps `- [x]` after finishing them, or to revise steps when the user asks. Marker tool (no I/O) → `plan_update` SSE event → the stored plan + docked window update live. The ACTIVE PLAN note instructs the agent to use it.

  Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb), src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse `approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools / tool_index (always-available, not admin-gated).
  Frontend: static/js/chat.js (plan store, send `approved_plan`, handle `plan_update`, capture restated checklists), static/app.js (plan-button menu), static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
  Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

  Shell can mutate (write files, hit the network) and can't be constrained to read-only at the tool layer, so plan mode no longer relies on a prompt to keep it well-behaved: bash/python are removed from the read-only allowlist and added to the fail-closed block set. Discovery is covered by the dedicated read-only tools (read_file, grep, glob, ls) instead. Rewrites the plan-mode directive to state shell is disabled and lists the available read-only tools positively.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

  Clarifies that entries like "summar" are intentional stems matched via startswith (covers summarise/summarize/summary), not typos.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

  Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an allowlist, so it returns the inverse (all known tool names minus the allowlist). The static mutator set is a backstop for the schema-derived name list, which misses XML-only tools and can fail to import.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

  The model is already shown its available (read-only) tools by _assemble_prompt, which removes every disabled tool. Enumerating them again in the directive only duplicated that list and would drift as tools change. Point at the tools listed below instead.

  Addresses review feedback on #638.
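
The allowlist-to-denylist inversion described in the commit message can be sketched roughly as follows. This is an illustration only, not the code in src/tool_security.py: the function name, the constant names, and the `write_file` entry are assumptions, and the real tool lists may differ.

```python
from typing import Iterable, Set

# Hypothetical constants for illustration. The four read-only tools and
# bash/python are named in the commit message; "write_file" is an assumed example.
READ_ONLY_ALLOWLIST: Set[str] = {"read_file", "grep", "glob", "ls"}
KNOWN_MUTATORS: Set[str] = {"bash", "python", "write_file"}


def plan_mode_denylist(known_tool_names: Iterable[str]) -> Set[str]:
    """Invert the allowlist: block every known tool that is not allowlisted.

    The static mutator set is unioned in as a backstop, so tools missing from
    `known_tool_names` (or an empty list after a failed import) stay blocked.
    """
    inverse = set(known_tool_names) - READ_ONLY_ALLOWLIST
    return inverse | (KNOWN_MUTATORS - READ_ONLY_ALLOWLIST)
```

With an empty name list the result falls back to the static mutator set, which is the fail-closed behaviour the message describes.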
2026-06-16 01:35:36 -04:00 · 2026-06-05 16:32:25 +02:00
parent 2e207fc315
commit 8ce945d338
18 changed files with 891 additions and 8 deletions
@@ -19,7 +19,7 @@ from src.llm_core import stream_llm, stream_llm_with_fallback, _is_ollama_native
 from src.model_context import estimate_tokens
 from src.settings import get_setting
 from src.prompt_security import untrusted_context_message
-from src.tool_security import blocked_tools_for_owner
+from src.tool_security import blocked_tools_for_owner, plan_mode_disabled_tools
 from src.agent_tools import (
    parse_tool_blocks,
    strip_tool_blocks,
@@ -336,6 +336,7 @@ If the user asks for a reminder/alarm before the event, pass `reminder_minutes`
    "pipeline": "- ```pipeline``` — Run a multi-step AI pipeline. Args (JSON) with ordered steps, each specifying a model and prompt. Use for complex workflows.",
    "ui_control": "- ```ui_control``` — Control the UI: toggle tools on/off, OPEN PANELS, open email reply drafts, switch models, change themes. Commands: `toggle <name> on/off` (names: bash/shell, web/search, research, incognito, document_editor/documents), `open_panel <name>` (panels: documents, gallery, email, sessions, notes, memories/brain, skills, settings, cookbook), `open_email_reply <uid> <folder> <reply|reply-all|ai-reply>` (opens an email compose document, does NOT send), `set_mode agent/chat`, `switch_model <name>`, `set_theme <preset>`, `create_theme <name> <bg> <fg> <panel> <border> <accent>` (optional key=val for advanced colors AND background effects: bgPattern=<none|dots|synapse|rain|constellations|perlin-flow|petals|sparkles|embers>, bgEffectColor=#RRGGBB, bgEffectIntensity=<num>, bgEffectSize=<num>, frosted=true|false). \"open documents\" / \"open library\" / \"show gallery\" / \"open inbox\" / \"open notes\" / \"open cookbook\" all map to `open_panel <name>`. Theme presets: dark, light, midnight, paper, cyberpunk, retrowave, forest, ocean, ume, copper, terminal, organs, lavender, gpt, claude, cute.",
    "ask_user": "- ```ask_user``` — Ask the user a multiple-choice question when the task is genuinely ambiguous and the answer changes what you do next (pick an approach, confirm an assumption, choose a target). Args (JSON): {\"question\": \"...\", \"options\": [{\"label\": \"...\", \"description\": \"...\"?}, ...], \"multi\": false?}. 2-6 options. The user gets clickable buttons; calling this ENDS your turn and their choice comes back as your next message. Prefer sensible defaults — only ask when you truly can't proceed well without their input.",
+    "update_plan": "- ```update_plan``` — While executing an approved plan, write the plan back: tick steps done or revise them. Args (JSON): {\"plan\": \"- [x] done step\\n- [ ] next step\"}. Always pass the COMPLETE checklist, not a diff. Call it after finishing each step (mark it `- [x]`) and whenever the user asks to change the plan. The user's docked plan window updates live. Does nothing if there's no active plan.",
    "list_served_models": "- ```list_served_models``` — Show what the Cookbook (LLM-serving subsystem) is currently running. NO args. Use this for ANY 'what's running' / 'what's serving' / 'show my cookbook' / 'is anything up' query. DO NOT shell out (`ps aux`, `docker ps`, etc.) — this tool is the source of truth. Failed serve tasks include recent logs plus diagnosis/retry suggestions; use those suggestions to call `serve_model` again with an adjusted command when appropriate.",
    "stop_served_model": "- ```stop_served_model``` — Stop a running model server. Args (JSON): {\"session_id\": \"<from list_served_models>\"}. Use for 'kill my cookbook' / 'stop the model' / 'shut down vLLM'.",
    "tail_serve_output": "- ```tail_serve_output``` — Read the actual tmux stderr/traceback of a CURRENTLY failing cookbook task. Args (JSON): {\"session_id\": \"<from list_served_models>\", \"tail\": 150?}. **Use ONLY after** you just launched something via `serve_model` AND `list_served_models` reports YOUR new task as `crashed`/`error`. DO NOT use it on old stopped/completed download tasks (they're historical noise — won't predict whether a new launch succeeds). DO NOT call it before launching a fresh attempt. When you do call it, bump `tail` to 400+ only if the visible error references 'see root cause above'.",
@@ -1372,6 +1373,53 @@ def _empty_response_fallback(
    return _error_msg, f'data: {json.dumps({"delta": _error_msg})}\n\n'


+PLAN_MODE_DIRECTIVE = (
+    "## PLAN MODE — OVERRIDES EVERYTHING ELSE BELOW\n"
+    "You are in PLAN MODE. Your ONLY job this turn is to PROPOSE a plan. You have "
+    "NOT done anything yet. Do NOT claim you created, wrote, ran, sent, or changed "
+    "anything — that would be a lie.\n"
+    "\n"
+    "ABSOLUTE RULE — DO NOT MUTATE ANYTHING. Every write/state-changing tool, "
+    "including the shell (`bash`/`python`), is disabled this turn and will be "
+    "rejected — only read-only tools remain available. Use the read-only tools "
+    "listed below (read files, search code, browse the project, web lookups) to "
+    "ground the plan. If the task is 'write a file', your plan is to DESCRIBE "
+    "writing it — you do NOT write it now.\n"
+    "\n"
+    "OUTPUT: present the plan as a GitHub-style checklist, one concrete step per line:\n"
+    "- [ ] first action you will take once approved\n"
+    "- [ ] next action\n"
+    "Each item = one concrete action (file to create/edit, command to run, side "
+    "effect). Do not execute. Do not end with 'Done' or anything implying the work "
+    "is finished. End your turn with the checklist."
+)
+
+
+def build_active_plan_note(approved_plan: str) -> str:
+    """System note that pins an approved plan during execution.
+
+    Sent back by the frontend each turn so a long plan on a weak model survives
+    history truncation — the agent can always re-read it. Returns "" for empty
+    input.
+    """
+    if not approved_plan or not approved_plan.strip():
+        return ""
+    return (
+        "## ACTIVE PLAN (approved — execute this)\n"
+        "You are executing a plan the user already approved. THE FULL PLAN IS "
+        "BELOW — it is always provided here every turn. Do NOT say you lost it, "
+        "and do NOT look for it in tasks, notes, memory, files, or the API; just "
+        "read it below. Work through it IN ORDER. After finishing each step, call "
+        "the `update_plan` tool with the full checklist and that step marked "
+        "`- [x]` so progress stays visible in the user's plan window. If the user "
+        "asks to change the plan, call `update_plan` with the revised checklist. "
+        "Do the next unchecked item until all are done. Do not skip, reorder, or "
+        "invent steps; if a step is genuinely impossible, say so and stop.\n\n"
+        "Current plan:\n"
+        + approved_plan.strip()
+    )
+
+
 async def stream_agent_loop(
    endpoint_url: str,
    model: str,
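
The ACTIVE PLAN note above asks the agent to send back the complete checklist with the finished step marked `- [x]`. A minimal sketch of producing that payload is shown here; both helper names are hypothetical and are not part of this change, and only the `{"plan": ...}` argument shape comes from the `update_plan` tool description.

```python
import json
import re

# Matches an unchecked GitHub-style task item, keeping its indentation.
_UNCHECKED = re.compile(r"^(\s*)- \[ \] ")


def tick_first_open_step(plan: str) -> str:
    """Return the full checklist with the first unchecked step marked done."""
    lines = plan.splitlines()
    for i, line in enumerate(lines):
        if _UNCHECKED.match(line):
            lines[i] = _UNCHECKED.sub(r"\1- [x] ", line, count=1)
            break
    return "\n".join(lines)


def update_plan_args(plan: str) -> str:
    """Serialize the whole checklist (never a diff) as the tool's JSON args."""
    return json.dumps({"plan": plan})
```

Sending the whole checklist each time means the stored plan can simply be overwritten, with no merge step on the receiving side.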
@@ -1390,6 +1438,8 @@ async def stream_agent_loop(
    relevant_tools: Optional[Set[str]] = None,
    fallbacks: Optional[List[tuple]] = None,
    workspace: Optional[str] = None,
+    plan_mode: bool = False,
+    approved_plan: Optional[str] = None,
    _is_teacher_run: bool = False,
 ) -> AsyncGenerator[str, None]:
    """Streaming agent loop generator.
@@ -1413,6 +1463,13 @@ async def stream_agent_loop(
        # public/non-admin users rather than trying to enumerate every tool.
        mcp_mgr = None

+    if plan_mode:
+        # Plan mode: investigate read-only, propose a plan, don't execute. The
+        # route also unions the read-only-disabled set, but enforce here too so
+        # the loop is safe regardless of caller. MCP stays available but is
+        # filtered to read-only tools below (after the disabled map is loaded).
+        disabled_tools.update(plan_mode_disabled_tools())
+
    _t0 = time.time()
    _needs_admin = _detect_admin_intent(messages)
    _last_user = _extract_last_user_message(messages)
@@ -1420,6 +1477,13 @@ async def stream_agent_loop(
    # not just the latest message, so short follow-ups don't drop just-used tools.
    _retrieval_query = _recent_context_for_retrieval(messages) or _last_user
    _mcp_disabled_map = _load_mcp_disabled_map() if mcp_mgr else {}
+    if plan_mode and mcp_mgr:
+        # Allow read-only MCP tools to investigate, block write/unknown ones:
+        # hide them from the schemas AND reject them at runtime by qualified name.
+        _mcp_block_map, _mcp_block_q = mcp_mgr.plan_mode_blocked_mcp()
+        for _sid, _names in _mcp_block_map.items():
+            _mcp_disabled_map.setdefault(_sid, set()).update(_names)
+        disabled_tools.update(_mcp_block_q)
    prep_timings["request_setup"] = time.time() - _t0

    # RAG-based tool selection: retrieve relevant tools for this query.
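
The hunk above consumes a per-server block map and a set of qualified names from `mcp_mgr.plan_mode_blocked_mcp()`, whose body is not shown here. The sketch below is one way such a classifier could work, assuming verb-prefix matching as the commit message describes. The stem list (apart from "summar"), the function names, and the `server.tool` qualified-name format are all assumptions.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical stems; only "summar" is named in the commit message.
READONLY_VERB_PREFIXES: Tuple[str, ...] = ("get", "list", "read", "search", "summar")


def is_readonly_tool(tool_name: str) -> bool:
    """True when the tool's leading verb starts with a read-only stem."""
    verb = tool_name.lower().replace("-", "_").split("_", 1)[0]
    return verb.startswith(READONLY_VERB_PREFIXES)


def blocked_mcp(servers: Dict[str, Iterable[str]]) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Return (per-server blocked names, qualified blocked names).

    Anything not recognised as read-only is blocked, so unknown verbs fail closed.
    """
    block_map: Dict[str, Set[str]] = {}
    qualified: Set[str] = set()
    for server_id, names in servers.items():
        for name in names:
            if not is_readonly_tool(name):
                block_map.setdefault(server_id, set()).add(name)
                qualified.add(f"{server_id}.{name}")  # assumed qualified-name format
    return block_map, qualified
```

Matching on stems keeps spelling variants such as summarise and summarize on the same side of the gate.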
@@ -1577,6 +1641,27 @@ async def stream_agent_loop(
        else:
            messages.insert(0, {"role": "system", "content": _ws_note})
        logger.info("[workspace] active for this turn: %s", workspace)
+    if plan_mode:
+        # Steer the model to investigate-then-propose. Hard tool gating blocks
+        # every write path, including the shell (bash/python), so this directive
+        # only shapes the output (a checklist, no claims of work done). It must
+        # still DOMINATE: put it at the very TOP of the system prompt (the base
+        # prompt is large and action-oriented; appended, small models ignored it).
+        if messages and messages[0].get("role") == "system":
+            messages[0]["content"] = PLAN_MODE_DIRECTIVE + "\n\n" + (messages[0].get("content") or "")
+        else:
+            messages.insert(0, {"role": "system", "content": PLAN_MODE_DIRECTIVE})
+    elif approved_plan and approved_plan.strip():
+        # EXECUTING an approved plan. Pin the checklist as a top-of-context
+        # system note so a long plan on a weak model survives history
+        # truncation — the agent can always re-read the plan instead of losing
+        # the thread. (The first system message is kept by the context trimmer.)
+        _plan_note = build_active_plan_note(approved_plan)
+        if messages and messages[0].get("role") == "system":
+            messages[0]["content"] = _plan_note + "\n\n" + (messages[0].get("content") or "")
+        else:
+            messages.insert(0, {"role": "system", "content": _plan_note})
+        logger.info("[plan] pinned approved plan (%d chars) for execution turn", len(approved_plan))
    prep_timings["prompt_build"] = time.time() - _t2

    _t3 = time.time()
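
Both branches in the hunk above apply the same prepend-or-insert step to the first system message. As a sketch, that shared step could be factored into a small helper; `prepend_system_note` is a hypothetical name and does not exist in this change.

```python
from typing import Dict, List


def prepend_system_note(messages: List[Dict[str, str]], note: str) -> None:
    """Put `note` at the top of the first system message, in place.

    If the conversation has no leading system message, insert one.
    """
    if messages and messages[0].get("role") == "system":
        existing = messages[0].get("content") or ""
        messages[0]["content"] = note + "\n\n" + existing
    else:
        messages.insert(0, {"role": "system", "content": note})
```

Calling it with `PLAN_MODE_DIRECTIVE` in plan mode, or with the result of `build_active_plan_note(approved_plan)` on the execution turn, would reproduce the behaviour shown in the diff under that assumption.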
@@ -2287,6 +2372,14 @@ async def stream_agent_loop(
                )
                _awaiting_user = True

+            # update_plan: agent wrote back to the plan (ticked a step / revised).
+            # Push it to the frontend so the stored plan + docked window update
+            # live. Does NOT end the turn — the agent keeps working.
+            if "plan_update" in result:
+                yield (
+                    f'data: {json.dumps({"type": "plan_update", "data": result["plan_update"]})}\n\n'
+                )
+
            # Build output for frontend tool bubble.
            # Document tools get a short summary — content goes to the editor panel.
            output_text = ""