Commit Graph

8 Commits

Author SHA1 Message Date
Kenny Van de Maele 8ce945d338 feat: Add plan mode to the chat agent (#638)
* feat: Add plan mode to the chat agent

Adds a plan mode: the agent investigates read-only, proposes a checklist, and
waits for approval before changing anything. On approval it runs with full
tools and checks items off as it goes. Enforcement reuses the existing
disabled_tools gate.

Includes a slash command: `/plan [on|off]` (and `/toggle plan`) to flip the
plan toggle from the chat input.

- src/tool_security.py, src/mcp_manager.py: read-only allowlist (tools + MCP).
- src/agent_loop.py, routes/chat_routes.py: union the disabled set, prepend the
  plan directive, force agent mode.
- static/: plan toggle pill, Approve & Run, dockable plan window, task-list
  checkboxes, and the /plan slash command.
- tests/test_plan_mode.py.

* Plan mode: persistent re-referenceable plan + agent write-back

Three improvements so a long plan survives a weak model and stays in reach:

1. Re-reference the plan (out-of-context fix). On the execution turn the frontend
   sends the approved checklist back (`approved_plan`); the backend pins it as a
   top-of-context `## ACTIVE PLAN` system note (kept by the context trimmer), so
   the agent can always re-read the plan instead of losing the thread on a long
   run. New `build_active_plan_note()` (unit-tested).

2. Re-open / dock the plan anytime. The plan checklist is stored per-session
   (localStorage). When a plan exists, the plan-mode button opens a small menu
   ("Show plan" / "Plan mode: On/Off") that re-opens the side-dockable plan
   window — so it can stay docked while the agent works. The window live-refreshes
   as the plan changes.

3. Agent write-back: new `update_plan` tool. The agent calls it to tick steps
   `- [x]` after finishing them, or to revise steps when the user asks. Marker
   tool (no I/O) → `plan_update` SSE event → the stored plan + docked window
   update live. The ACTIVE PLAN note instructs the agent to use it.

Backend: src/agent_loop.py (param + pin + note builder + emit + prompt blurb),
src/tool_execution.py (update_plan handler), routes/chat_routes.py (parse
`approved_plan`, relay `plan_update`), registration in tool_schemas / agent_tools
/ tool_index (always-available, not admin-gated).
Frontend: static/js/chat.js (plan store, send `approved_plan`, handle
`plan_update`, capture restated checklists), static/app.js (plan-button menu),
static/js/planWindow.js (`isPlanWindowOpen`), static/js/storage.js (PLAN key).
Tests: tests/test_plan_mode.py (plan-note), tests/test_update_plan_tool.py.

* Plan mode: drop bash/python, rely on read-only discovery tools

Shell can mutate (write files, hit the network) and can't be constrained to
read-only at the tool layer, so plan mode no longer relies on a prompt to keep
it well-behaved — bash/python are removed from the read-only allowlist and added
to the fail-closed block set. Discovery is covered by the dedicated read-only
tools (read_file, grep, glob, ls) instead.

Rewrites the plan-mode directive to state shell is disabled and lists the
available read-only tools positively. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Comment: note _MCP_READONLY_VERBS are prefixes not whole words

Clarifies that entries like "summar" are intentional stems matched via
startswith (covers summarise/summarize/summary), not typos. Addresses review
feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: clarify why gating inverts the allowlist into a denylist

Rename _PLAN_MODE_FALLBACK_BLOCK -> _PLAN_MODE_KNOWN_MUTATORS and rewrite the
comments. The tool gate is a denylist (disabled_tools); plan mode's policy is an
allowlist, so it returns the inverse (all known tool names minus the allowlist).
The static mutator set is a backstop for the schema-derived name list, which
misses XML-only tools and can fail to import. Addresses review feedback on #638.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Plan mode: stop hardcoding the read-only tool list in the directive

The model is already shown its available (read-only) tools by _assemble_prompt,
which removes every disabled tool. Enumerating them again in the directive only
duplicated that list and would drift as tools change. Point at the tools listed
below instead. Addresses review feedback on #638.
2026-06-05 16:32:25 +02:00
nubs ae48ea7064 fix(mcp): sanitize and cap rendered MCP tool param hints (#2682) 2026-06-05 03:00:22 +02:00
Abylaikhan Zulbukharov 1d80bf5e65 feat(mcp): add Streamable HTTP transport with OAuth 2.0 (#1033)
* feat(mcp): add Streamable HTTP transport with OAuth 2.0

  Odysseus could only reach MCP servers over stdio and SSE, so modern
  remote servers like https://mcp.higgsfield.ai/mcp (Streamable HTTP,
  gated behind OAuth) could not be connected.

  Add an `http` transport that connects via the SDK's
  streamablehttp_client and authenticates with the SDK's
  OAuthClientProvider: RFC 9728 protected-resource discovery, RFC 8414
  authorization-server metadata, Dynamic Client Registration,
  authorization-code + PKCE, and token refresh. A small bridge
  (src/mcp_oauth.py) connects the SDK's blocking callback to the existing
  web callback route via an asyncio.Future keyed by the OAuth `state`,
  and the dynamic client registration plus tokens persist per-server in a
  new encrypted `oauth_tokens` column.

  The connect runs as a bounded background task so the "Add server"
  request returns immediately; redirect_handler publishes needs_auth +
  auth_url to connection state as soon as discovery/DCR completes (which
  can exceed the bounded wait), and the UI polls until connected. Remote
  users finish via the existing paste-back flow. The Google OAuth path is
  left unchanged.

  - core/database.py: encrypted oauth_tokens column + migration
  - src/mcp_oauth.py: OAuth provider, DB-backed TokenStorage, state registry
  - src/mcp_manager.py: http dispatch, background connect, _connect_http
  - routes/mcp_routes.py: http validation, needs_auth/auth_url, callback bridge
  - static/js/settings.js: Streamable HTTP option + OAuth flow with polling
  - tests: 5 new unit tests (transport dispatch, registry, token storage)

  Verified against the live Higgsfield server: discovery, DCR (client_id
  issued), loopback redirect accepted, and a PKCE authorization URL with
  needs_auth status. No regressions (full suite delta is only the 5 added
  passing tests).

* fix(mcp): address PR #1033 review feedback

  - mcp_oauth: derive redirect URI from OAUTH_REDIRECT_BASE_URL/APP_PUBLIC_URL
    (default http://localhost:7000) instead of hardcoding the port
  - mcp_oauth: leave OAuth scope unset so the SDK derives it from the server's
    WWW-Authenticate/protected-resource metadata; hardcoding an OIDC scope broke
    non-OpenID MCP servers (verified: Higgsfield still gets its server-derived
    scope)
  - mcp_oauth: prune abandoned OAuth flows (_prune_stale + _pending_ts) so the
    module-level registries can't grow unbounded
  - mcp_oauth: persist tokens/client-info in a single DB session/commit
    (_update) instead of a load+save double round-trip
  - mcp_manager: cancel and drop the background connect task in
    disconnect_server so a deleted server stops publishing status
  - database: document why the oauth_tokens migration uses TEXT while the model
    declares EncryptedText (encryption is applied at the Python layer)
  - settings.js: surface persistent OAuth-poll failures and an explicit timeout
    message instead of silently swallowing errors
  - tests: cover the stale-flow pruning

* static/js/settings.js now shows an in-flight loading state on the buttons that fire requests:
2026-06-05 02:40:52 +02:00
NubsCarson 019f8d614f fix(mcp): expose MCP tool input parameters to the agent
MCP server tools were presented to the agent with only their name and a
truncated description: get_tool_descriptions_for_prompt() emitted
"- name: description" and get_all_tools() dropped input_schema entirely.
On the fenced-block tool path (used by Ollama models), the agent could
not see a tool's declared inputs and guessed argument names from the
description alone, so tool calls failed (issue #2509). MCP inspector
showed the schemas fine, confirming the loss was on our side.

- get_all_tools() now carries each tool's input_schema.
- get_tool_descriptions_for_prompt() renders a compact args hint
  (parameter names, coarse types, required-ness) via a new
  _format_mcp_params() helper, matching the "Args (JSON): {...}" style
  the built-in tool descriptions already use.

Fixes #2509
2026-06-04 12:51:31 +00:00
Paulo Victor Cordeiro 1feb2ae7d5 fix: close AsyncExitStack on MCP init/tool-discovery failure (#1493)
If session.initialize() or list_tools() raises after the stdio
subprocess or SSE connection is already open, the AsyncExitStack is
never closed — leaking the child process or HTTP connection. Wrap the
setup phase in try/except to aclose() the stack before re-raising.
2026-06-03 14:23:46 +09:00
Shreyas S Joshi b29c200801 fix(mcp): invalidate tool prompt cache on connect/disconnect/error (#1235)
* fix(mcp): invalidate tool prompt cache on connect/disconnect/error

get_tool_descriptions_for_prompt cached its result keyed only on
(disabled_map, len(_tools)). If a server reconnects with the same
tool count (or transitions to error state), the cache was never
busted — the agent received stale tool descriptions for the new
connection state.

Add a _generation counter incremented on every structural change
(successful connect, disconnect, connection error) and include it in
the cache key.

* test(mcp): regression test for _generation cache invalidation
2026-06-03 00:49:29 +09:00
Prakhya bdc99d746a fix: add Browser MCP connection diagnostics (#662) 2026-06-02 11:50:17 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00