feat: Add plan mode to the chat agent (#638)

* feat: Add plan mode to the chat agent

  Adds a plan mode: the agent investigates read-only, proposes a checklist, and waits for approval before changing anything. On approval it runs with full tools and checks items off as it goes. Enforcement reuses the existing disabled_tools gate.

  Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the plan toggle from the chat input.

  - src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
  - src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the plan directive, force agent mode.
  - static/: plan toggle pill, Approve & Run, dockable plan window, task-list checkboxes, and the /plan slash command.
  - tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

  Three improvements so a long plan survives a weak model and stays in reach:

  1. Re-reference the plan (out-of-context fix). On the execution turn the frontend sends the approved checklist back (`approved_plan`); the backend pins it as a top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so the agent can always re-read the plan instead of losing the thread on a long run. New `build_active_plan_note()` (unit-tested).
  2. Re-open / dock the plan anytime. The plan checklist is stored per-session (localStorage). When a plan exists, the plan-mode button opens a small menu ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan window, so it can stay docked while the agent works. The window live-refreshes as the plan changes.
  3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps `- [x]` after finishing them, or to revise steps when the user asks. Marker tool (no I/O) → `plan_update` SSE event → the stored plan + docked window update live. The ACTIVE PLAN note instructs the agent to use it.

  Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb), src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse `approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools / tool_index (always-available, not admin-gated).

  Frontend: static/js/chat.js (plan store, send `approved_plan`, handle `plan_update`, capture restated checklists), static/app.js (plan-button menu), static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).

  Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

  Shell can mutate (write files, hit the network) and can't be constrained to read-only at the tool layer, so plan mode no longer relies on a prompt to keep it well-behaved: bash/python are removed from the read-only allowlist and added to the fail-closed block set. Discovery is covered by the dedicated read-only tools (read_file, grep, glob, ls) instead. Rewrites the plan-mode directive to state shell is disabled and lists the available read-only tools positively.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

  Clarifies that entries like "summar" are intentional stems matched via startswith (covers summarise/summarize/summary), not typos.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

  Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an allowlist, so it returns the inverse (all known tool names minus the allowlist). The static mutator set is a backstop for the schema-derived name list, which misses XML-only tools and can fail to import.

  Addresses review feedback on #638.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

  The model is already shown its available (read-only) tools by _assemble_prompt, which removes every disabled tool. Enumerating them again in the directive only duplicated that list and would drift as tools change. Point at the tools listed below instead.

  Addresses review feedback on #638.
2026-06-15 17:25:26 -04:00 · 2026-06-05 16:32:25 +02:00
parent 2e207fc315
commit 8ce945d338
18 changed files with 891 additions and 8 deletions
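
Note on the allowlist-to-denylist inversion described in the commit message: the backend that implements it (src/tool_security.py) is not among the hunks shown here, so the following is only a rough sketch of the set logic, written in JavaScript to match the visible diff. The function name `planModeDisabledTools`, the two constants, and the `write_file` / `edit_file` tool names are hypothetical; only `read_file`, `grep`, `glob`, `ls`, `bash` and `python` come from the message.

```javascript
// Hypothetical sketch of "allowlist policy, denylist gate". Not the project's code.
const READ_ONLY_ALLOWLIST = new Set(['read_file', 'grep', 'glob', 'ls']);

// Static backstop: blocked even when the schema-derived name list is
// incomplete or unavailable. 'write_file' is a made-up example entry.
const KNOWN_MUTATORS = new Set(['bash', 'python', 'write_file']);

function planModeDisabledTools(knownToolNames) {
  // Start from the backstop so an empty name list still fails closed.
  const disabled = new Set(KNOWN_MUTATORS);
  // Invert the allowlist: every known tool that is not allowlisted is disabled.
  for (const name of knownToolNames || []) {
    if (!READ_ONLY_ALLOWLIST.has(name)) disabled.add(name);
  }
  return disabled;
}
```

Under these assumptions, a tool that is newly registered but not yet classified ends up disabled in plan mode by default, which is the fail-closed behaviour the message asks for.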
@@ -13,6 +13,7 @@ import chatStream from './chatStream.js';
 import { addAITTSButton } from './tts-ai.js';
 import markdownModule from './markdown.js';
 import { svgifyEmoji } from './markdown.js';
+import planWindowModule from './planWindow.js';
 import spinnerModule from './spinner.js';
 import presetsModule from './presets.js';
 import fileHandlerModule from './fileHandler.js';
@@ -87,6 +88,35 @@ import createResearchSynapse from './researchSynapse.js';
  let _streamSessionId = null; // Session ID for the currently active reader loop
  let _lastReaderActivity = 0; // Timestamp of last reader.read() success — used to detect frozen streams
  let _webLockRelease = null;  // Function to release the Web Lock held during streaming
+  let _forcePlanOff = false;   // One-shot: suppress plan_mode for the next send (Approve & Run)
+
+  // ── Plan store: the latest proposed/approved checklist for the CURRENT chat ──
+  // Kept so (a) it can be sent back each turn and pinned in context (a long plan
+  // on a weak model survives history truncation), and (b) the plan window can be
+  // re-opened/docked at any time via the plan-button menu. Stored per session in
+  // localStorage so it survives a reload mid-execution.
+  function _setStoredPlan(text) {
+    const sid = sessionModule.getCurrentSessionId();
+    if (!sid || !text || !text.trim()) return;
+    Storage.setJSON(Storage.KEYS.PLAN, { sid, text });
+    // Live-refresh the plan window if it's open (shows progress as the agent
+    // restates the checklist with [x]).
+    try {
+      if (planWindowModule.isPlanWindowOpen && planWindowModule.isPlanWindowOpen()) {
+        planWindowModule.openPlanWindow(text, null);
+      }
+    } catch (_) {}
+  }
+  function _getStoredPlan() {
+    const sid = sessionModule.getCurrentSessionId();
+    const rec = Storage.getJSON(Storage.KEYS.PLAN, null);
+    return (rec && rec.sid === sid && rec.text) ? rec.text : '';
+  }
+  // A line like "- [ ] step" / "- [x] step" marks a GitHub-style checklist.
+  const _CHECKLIST_RE = /^\s*[-*]\s+\[[ xX]\]\s+/m;
+  // Exposed for app.js (plan-button menu) — re-open the stored plan window.
+  window._getStoredPlan = _getStoredPlan;
+  window.planWindowModule = planWindowModule;

  /** Check if an SSE reader is still actively connected for a session. */
  function hasActiveStream(sessionId) {
@@ -774,6 +804,22 @@ import createResearchSynapse from './researchSynapse.js';
      if (el('bash-toggle').checked) {
        fd.append('allow_bash', 'true');
      }
+      // Plan mode: agent investigates read-only and proposes a plan to approve.
+      // Only meaningful in agent mode, and never alongside deep research.
+      // _forcePlanOff is a one-shot set by "Approve & Run" so the execution turn
+      // runs with full tools even though the Plan toggle is still on.
+      const _planToggle = el('plan-toggle');
+      const planTurn = !_forcePlanOff && isAgentMode && _planToggle && _planToggle.checked && !el('research-toggle').checked;
+      _forcePlanOff = false;
+      if (planTurn) {
+        fd.append('plan_mode', 'true');
+        fd.set('mode', 'agent');
+      } else if (isAgentMode) {
+        // Executing (not proposing): send the stored plan back so the backend
+        // pins it in context and the agent can always re-reference it.
+        const _sp = _getStoredPlan();
+        if (_sp) fd.append('approved_plan', _sp);
+      }
      const ragChk = el('rag-toggle');
      if (ragChk && !ragChk.checked) {
        fd.append('use_rag', 'false');
@@ -2408,6 +2454,13 @@ import createResearchSynapse from './researchSynapse.js';
                  try { card.focus(); } catch (_) {}
                }

+              } else if (json.type === 'plan_update') {
+                if (_isBg) continue;
+                // Agent wrote back to the plan (ticked a step / revised). Update
+                // the stored plan + live-refresh the docked plan window.
+                const _pu = (json.data && json.data.plan) ? json.data.plan : '';
+                if (_pu) _setStoredPlan(_pu);
+
              } else if (json.type === 'agent_step') {
                if (_isBg) continue;
                _cancelThinkingTimer();
@@ -2708,6 +2761,61 @@ import createResearchSynapse from './researchSynapse.js';
        // Attach footer to the last visible bubble (roundHolder for multi-round agent, holder for single)
        const footerTarget = (roundHolder && roundHolder !== holder && roundHolder.style.display !== 'none') ? roundHolder : holder;
        footerTarget.appendChild(createMsgFooter(footerTarget));
+        // Capture any checklist this message produced as the current plan — both
+        // the initial proposal AND restated progress during execution. Keeps the
+        // stored plan (and the docked plan window) in sync with the latest state.
+        if (accumulated && _CHECKLIST_RE.test(accumulated)) {
+          _setStoredPlan(accumulated);
+        }
+        // Plan mode: the agent has proposed a plan — offer to approve & execute it.
+        // Approving re-sends with plan_mode suppressed (full tools) for one turn.
+        if (planTurn && accumulated.trim()) {
+          const _planText = accumulated;
+          const _runApproved = () => {
+            _approveWrap.remove();
+            _forcePlanOff = true;
+            // Persist the approved plan for THIS chat so it's (a) re-sent and
+            // pinned in context every execution turn, and (b) re-openable via the
+            // plan-button menu. Do this BEFORE flipping the toggle, since the menu
+            // intercept keys off a stored plan existing.
+            _setStoredPlan(_planText);
+            // Approving exits plan mode for good — turn it OFF directly (NOT via
+            // the button's click, which would now open the plan menu instead of
+            // toggling) so execution and every follow-up keep full write tools.
+            try { if (window._setPlanMode) window._setPlanMode(false); } catch (_) {}
+            const _inp = el('message');
+            if (_inp) {
+              _inp.value = 'Approved — execute the plan. The full approved checklist is pinned '
+                + 'for you under "## ACTIVE PLAN"; do NOT go looking for it in tasks, notes, or '
+                + 'memory. Work through it in order, and after each step call the update_plan tool '
+                + 'with the full checklist and that step marked `- [x]`. Do the next unchecked item '
+                + 'until all are done.';
+              _inp.dispatchEvent(new Event('input'));
+            }
+            // Show a clean bubble; the full instruction still goes to the model.
+            _displayOverride = 'Approved the plan.';
+            handleChatSubmit({ preventDefault() {} });
+          };
+          var _approveWrap = document.createElement('div');
+          _approveWrap.className = 'plan-approve-bar';
+          const _approveBtn = document.createElement('button');
+          _approveBtn.type = 'button';
+          _approveBtn.className = 'plan-approve-btn';
+          _approveBtn.textContent = 'Approve & Run';
+          _approveBtn.addEventListener('click', _runApproved);
+          // Open the plan in a draggable, side-dockable window (reuses the
+          // shared modal framework). Approving from the window runs it too.
+          const _openBtn = document.createElement('button');
+          _openBtn.type = 'button';
+          _openBtn.className = 'plan-open-btn';
+          _openBtn.textContent = 'Open in window';
+          _openBtn.addEventListener('click', () => {
+            planWindowModule.openPlanWindow(_planText, _runApproved);
+          });
+          _approveWrap.appendChild(_approveBtn);
+          _approveWrap.appendChild(_openBtn);
+          footerTarget.appendChild(_approveWrap);
+        }
        // Add "View Report" link for completed research
        if (_researchingStreamIds.has(streamSessionId)) {
          _appendViewReportLink(footerTarget, streamSessionId);
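
Note on the chat.js hunks above: the per-send choice between proposing (`plan_mode`) and executing (`approved_plan`) can be read as a small pure function. This is a sketch only; `planRequestFields` and its argument object are hypothetical stand-ins for the DOM toggles and module state that chat.js reads directly.

```javascript
// Sketch of the form-field decision made on each send. The argument object is
// a hypothetical stand-in for el('plan-toggle'), el('research-toggle'),
// _forcePlanOff and _getStoredPlan() in chat.js.
function planRequestFields({ forcePlanOff, isAgentMode, planChecked, researchChecked, storedPlan }) {
  // A plan turn needs agent mode, the toggle on, no deep research,
  // and no one-shot override from "Approve & Run".
  const planTurn = Boolean(!forcePlanOff && isAgentMode && planChecked && !researchChecked);
  if (planTurn) return { plan_mode: 'true', mode: 'agent' };
  // Executing: send the stored checklist back so the backend can pin it.
  if (isAgentMode && storedPlan) return { approved_plan: storedPlan };
  return {};
}
```

Seen this way, the two fields are mutually exclusive: a request either asks for a proposal or carries an approved plan, never both.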
@@ -655,10 +655,20 @@ export function mdToHtml(src, opts) {
  s = s.replace(/^(\d+)\. (.*)$/gm, '<oli>$2</oli>');
  s = s.replace(/(?:^|\n)(<oli>[\s\S]*?)(?=\n(?!<oli>)|$)/g, m => `<ol>${m.trim().replace(/<\/?oli>/g, (t) => t === '<oli>' ? '<li>' : '</li>')}</ol>`);

-  // Unordered lists
+  // GitHub-style task lists (- [ ] / - [x]) → checkbox items. Must run before
+  // the generic unordered-list rule so the "- " prefix isn't consumed first.
+  // Emits <uli> (with a class) so the unordered-list wrapper below treats it
+  // as a list item. Used by plan mode: plan + progress render as a checklist.
+  s = s.replace(/^(?:- |\* )\[([ xX])\] (.*)$/gm, (_m, mark, text) => {
+    const done = mark.toLowerCase() === 'x';
+    return `<uli class="task-item${done ? ' task-done' : ''}"><span class="task-check" aria-hidden="true"></span><span class="task-text">${text}</span></uli>`;
+  });
+
+  // Unordered lists. <uli> may carry attributes (task-item class), so the
+  // wrapper preserves them when converting <uli ...> → <li ...>.
  s = s.replace(/^(?:- |\* )(.*)$/gm, '<uli>$1</uli>');
-  s = s.replace(/(^|\n)((?:<uli>[^\n]*<\/uli>(?:\n|$))+)/g, (_, prefix, block) =>
-    `${prefix}<ul>${block.trim().replace(/<\/?uli>/g, (t) => t === '<uli>' ? '<li>' : '</li>')}</ul>`);
+  s = s.replace(/(^|\n)((?:<uli\b[^>]*>[^\n]*<\/uli>(?:\n|$))+)/g, (_, prefix, block) =>
+    `${prefix}<ul>${block.trim().replace(/<uli\b([^>]*)>/g, '<li$1>').replace(/<\/uli>/g, '</li>')}</ul>`);

  // Blockquotes
  s = s.replace(/^&gt; (.*)$/gm, '<bq>$1</bq>');
@@ -666,7 +676,7 @@ export function mdToHtml(src, opts) {
    `<blockquote>${m.trim().replace(/<\/?bq>/g, (t) => t === '<bq>' ? '<p>' : '</p>')}</blockquote>`);

  // Paragraphs - but NOT for code block placeholders or allowed HTML
-  s = s.replace(/^(?!<h\d|<ul>|<ol>|<li>|<oli>|<pre>|<blockquote>|<bq>|<hr>|___CODE_BLOCK_|___ALLOWED_HTML_|___MATH_BLOCK_|___MERMAID_BLOCK_)([^\n]+)$/gm, '<p>$1</p>');
+  s = s.replace(/^(?!<h\d|<ul>|<ol>|<li|<oli>|<\/li>|<pre>|<blockquote>|<bq>|<hr>|___CODE_BLOCK_|___ALLOWED_HTML_|___MATH_BLOCK_|___MERMAID_BLOCK_)([^\n]+)$/gm, '<p>$1</p>');

  // Line breaks within paragraphs
  s = s.replace(/<p>([\s\S]*?)<\/p>/g, (match, content) => {
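
Note on the task-list hunk above: the three list rules can be exercised in isolation. The sketch below copies them into a standalone function, `renderLists`, which is a hypothetical name and not part of markdown.js. It assumes the input text has already been HTML-escaped by earlier stages of `mdToHtml`, so it does no escaping of its own.

```javascript
// Standalone copy of the list rules for experimentation. `renderLists` is a
// hypothetical helper; input is assumed to be HTML-escaped already.
function renderLists(src) {
  let s = src;
  // 1. Task items first, so the generic "- " rule below cannot consume them.
  s = s.replace(/^(?:- |\* )\[([ xX])\] (.*)$/gm, (_m, mark, text) => {
    const done = mark.toLowerCase() === 'x';
    return `<uli class="task-item${done ? ' task-done' : ''}"><span class="task-check" aria-hidden="true"></span><span class="task-text">${text}</span></uli>`;
  });
  // 2. Plain bullets. Task lines now start with "<uli", so they are skipped.
  s = s.replace(/^(?:- |\* )(.*)$/gm, '<uli>$1</uli>');
  // 3. Wrap each run of <uli ...> lines in <ul>, carrying attributes over to <li>.
  s = s.replace(/(^|\n)((?:<uli\b[^>]*>[^\n]*<\/uli>(?:\n|$))+)/g, (_, prefix, block) =>
    `${prefix}<ul>${block.trim().replace(/<uli\b([^>]*)>/g, '<li$1>').replace(/<\/uli>/g, '</li>')}</ul>`);
  return s;
}

const html = renderLists('- [x] read config\n- [ ] patch loader\n- plain bullet');
console.log(html);
```

With this input, all three lines should land in a single `<ul>`, with the first item carrying `task-item task-done` and the last one rendered as a bare `<li>`.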
@@ -0,0 +1,79 @@
+// static/js/planWindow.js
+//
+// Plan mode: show a proposed plan in a draggable, side-dockable window —
+// reusing the same modal + makeWindowDraggable framework the calendar, email,
+// and document panels use. Approving from here runs the plan with full tools.
+
+import uiModule from './ui.js';
+import markdownModule from './markdown.js';
+import { makeWindowDraggable } from './windowDrag.js';
+
+let _modal = null;
+let _onApprove = null;
+
+function _getModal() {
+  if (_modal) return _modal;
+  _modal = document.createElement('div');
+  _modal.id = 'plan-window';
+  _modal.className = 'modal';
+  _modal.style.display = 'none';
+  _modal.innerHTML = `
+    <div class="modal-content plan-window-content">
+      <div class="modal-header">
+        <h4><svg width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" style="vertical-align:-2px;margin-right:6px"><path d="M9 11l3 3L22 4"/><path d="M21 12v7a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2V5a2 2 0 0 1 2-2h11"/></svg><span id="plan-window-title">Proposed plan</span></h4>
+        <button class="close-btn" id="plan-window-close">✖</button>
+      </div>
+      <div class="modal-body plan-window-body" id="plan-window-body"></div>
+      <div class="modal-footer plan-window-footer">
+        <button type="button" class="plan-approve-btn" id="plan-window-approve">Approve &amp; Run</button>
+      </div>
+    </div>`;
+  document.body.appendChild(_modal);
+  _modal.querySelector('#plan-window-close').addEventListener('click', closePlanWindow);
+  _modal.querySelector('#plan-window-approve').addEventListener('click', () => {
+    const cb = _onApprove;
+    closePlanWindow();
+    if (typeof cb === 'function') cb();
+  });
+  // Draggable + side-dockable, same one-call helper as the other windows.
+  const content = _modal.querySelector('.modal-content');
+  const header = _modal.querySelector('.modal-header');
+  if (content && header) makeWindowDraggable(_modal, { content, header });
+  return _modal;
+}
+
+/**
+ * Open the plan window with rendered markdown and an approve callback.
+ * @param {string} planMarkdown - the agent's proposed plan (raw markdown)
+ * @param {Function} onApprove - called when the user clicks Approve & Run
+ */
+export function openPlanWindow(planMarkdown, onApprove) {
+  const modal = _getModal();
+  _onApprove = onApprove || null;
+  const body = modal.querySelector('#plan-window-body');
+  if (body) {
+    body.innerHTML = markdownModule.processWithThinking(
+      markdownModule.squashOutsideCode(planMarkdown || '')
+    );
+    if (window.hljs) body.querySelectorAll('pre code').forEach((b) => window.hljs.highlightElement(b));
+  }
+  const approveBtn = modal.querySelector('#plan-window-approve');
+  if (approveBtn) approveBtn.style.display = onApprove ? '' : 'none';
+  // Title reflects state: still awaiting approval (approve callback present) vs
+  // already approved and being executed.
+  const title = modal.querySelector('#plan-window-title');
+  if (title) title.textContent = onApprove ? 'Proposed plan' : 'Approved plan';
+  modal.style.display = 'flex';
+  if (uiModule && uiModule.scrollHistory) { try { uiModule.scrollHistory(); } catch (_) {} }
+}
+
+export function closePlanWindow() {
+  if (_modal) _modal.style.display = 'none';
+}
+
+/** True when the plan window is currently visible (for live-refresh on progress). */
+export function isPlanWindowOpen() {
+  return !!(_modal && _modal.style.display !== 'none');
+}
+
+export default { openPlanWindow, closePlanWindow, isPlanWindowOpen };
@@ -1170,6 +1170,22 @@ async function _cmdWorkspace(args, ctx) {
  slashReply('Usage: <code>/workspace</code> · <code>set /path</code> · <code>clear</code> · <code>pick</code>');
  return true;
 }
+// Plan mode: drive the real toggle pill (#plan-toggle-btn) so its per-mode
+// persistence/UI logic runs. Only meaningful in agent mode.
+async function _cmdTogglePlan(args, ctx) {
+  const btn = document.getElementById('plan-toggle-btn');
+  const chk = document.getElementById('plan-toggle');
+  if (!btn || btn.style.display === 'none' || btn.offsetParent === null) {
+    slashReply('Plan mode is only available in agent mode — switch to Agent first.');
+    return true;
+  }
+  const cur = !!(chk && chk.checked);
+  const v = (args[0] || '').toLowerCase();
+  const target = v === 'on' ? true : v === 'off' ? false : !cur;
+  if (target !== cur) btn.click();
+  slashReply(`Plan mode: ${target ? 'on' : 'off'}`);
+  return true;
+}

 async function _cmdToggleShow(args, ctx) {
  const name = (args[0] || '').toLowerCase();
@@ -5489,6 +5505,7 @@ const COMMANDS = {
      'bash':      { handler: _cmdToggleBash,      alias: ['b','shell'],       help: 'Toggle bash/shell',       usage: '/toggle bash' },
      'research':  { handler: _cmdToggleResearch,  alias: ['r'],               help: 'Toggle deep research',    usage: '/toggle research' },
      'doc':       { handler: _cmdToggleDoc,       alias: [],     help: 'Toggle document editor',  usage: '/toggle doc' },
+      'plan':      { handler: _cmdTogglePlan,      alias: ['p'],  help: 'Toggle plan mode (agent)', usage: '/toggle plan' },
      'sidebar':   { handler: _cmdToggleSidebar,   alias: ['sb'], help: 'Cycle sidebar (full/mini/off)', usage: '/toggle sidebar [1|2|3]' },
      '_show':     { handler: _cmdToggleShow,      alias: [],     help: 'Show all toggle states',  usage: '/toggle' }
    }
@@ -5501,6 +5518,13 @@ const COMMANDS = {
    noUserBubble: true,
    usage: '/workspace [set <path> | clear | pick]',
  },
+  plan: {
+    alias: [],
+    category: 'Quick toggles',
+    help: 'Toggle plan mode (agent)',
+    handler: _cmdTogglePlan,
+    usage: '/plan [on|off]',
+  },
  memory: {
    alias: ['m'],
    category: 'Memory',
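
Note on `_cmdTogglePlan` above: the argument handling reduces to a small pure function, sketched here without the DOM. `resolvePlanTarget` is a hypothetical name introduced only for illustration.

```javascript
// Sketch of how `/plan [on|off]` picks its target state. `resolvePlanTarget`
// is hypothetical; the real handler reads the checkbox and clicks the pill.
function resolvePlanTarget(arg, current) {
  const v = (arg || '').toLowerCase();
  // Explicit on/off wins; anything else (including no argument) toggles.
  const target = v === 'on' ? true : v === 'off' ? false : !current;
  // The pill is only clicked when the state actually has to change.
  return { target, needsClick: target !== current };
}
```

One consequence worth noticing: an unrecognised argument such as `/plan maybe` behaves like a bare `/plan` and toggles the current state, rather than producing a usage error.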
@@ -24,7 +24,8 @@ export const KEYS = {
  SECTION_ORDER: 'sidebar-section-order',
  ADMIN_LAST_TAB: 'admin-last-tab',
  DENSITY: 'odysseus-density',
-  WORKSPACE: 'odysseus-workspace'
+  WORKSPACE: 'odysseus-workspace',
+  PLAN: 'odysseus-plan'
 };

 /**