Add ask_user tool: agent-posed multiple-choice questions (#2111)

Let the agent pause and ask the user a multiple-choice question when a task is genuinely ambiguous and the answer changes what it does next — choosing between approaches, confirming an assumption, picking a target — instead of guessing. Modeled on the existing `ui_control` marker pattern: the `ask_user` tool returns an `ask_user` payload that the agent loop emits as an SSE event and then ends the turn. The frontend renders the question with clickable option buttons, a free-text "Other" input, and an x to dismiss; the user's choice is sent as the next message and the agent resumes with it in context. - src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O. Validates a non-empty question + 2..6 options, normalizes string/object options, returns the payload. - src/agent_loop.py: emit the `ask_user` event and break the round loop so the turn ends and waits for the user's selection. Stream the question as assistant text so it persists/replays (prevents a re-ask loop). - Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS, FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any user can be asked); the structured args serialize via the default json.dumps path. - routes/chat_routes.py: relay the `ask_user` event to the client. - static/js/chat.js + static/style.css: render the question card (options + free-text Other + dismiss x; removed once answered). Reuses CSS vars and the .modal-close button; emoji go through the monochrome-SVG pipeline. Bump chat.js cache pin. - tests/test_ask_user_tool.py: payload, multi flag, string options, option cap, validation errors, serializer round-trip, registration.
2026-06-23 05:05:24 -04:00 · 2026-06-05 11:49:11 +02:00
parent 621885ac06
commit 0a2adc9c96
9 changed files with 432 additions and 1 deletions
@@ -1184,6 +1184,53 @@ async def execute_tool_block(
        logger.warning("Public tool policy blocked owner=%r tool=%s", owner, tool)
        return desc, result

+    # ask_user: the agent poses a multiple-choice question to the user to get a
+    # decision/clarification. This is a pure UI-control marker — no subprocess,
+    # no filesystem. It returns an `ask_user` payload that the agent loop turns
+    # into an `ask_user` SSE event and then ENDS the turn, so the chat waits for
+    # the user's selection (their choice arrives as the next message).
+    if tool == "ask_user":
+        import json as _json
+        question, options, multi = "", [], False
+        raw = (content or "").strip()
+        try:
+            parsed = _json.loads(raw) if raw else {}
+        except (ValueError, TypeError):
+            parsed = {}
+        if isinstance(parsed, dict):
+            question = str(parsed.get("question", "")).strip()
+            multi = bool(parsed.get("multi") or parsed.get("multiSelect"))
+            for opt in (parsed.get("options") or []):
+                if isinstance(opt, dict):
+                    label = str(opt.get("label", "")).strip()
+                    descr = str(opt.get("description", "")).strip()
+                elif isinstance(opt, str):
+                    label, descr = opt.strip(), ""
+                else:
+                    continue
+                if label:
+                    options.append({"label": label, "description": descr})
+        else:
+            question = raw
+        if not question or len(options) < 2:
+            return "ask_user: invalid", {
+                "error": (
+                    "ask_user needs a non-empty `question` and at least 2 `options` "
+                    "(each an object with a `label`, optional `description`)."
+                ),
+                "exit_code": 1,
+            }
+        options = options[:6]  # keep the choice list sane
+        desc = f"ask_user: {question[:80]}"
+        labels = ", ".join(o["label"] for o in options)
+        result = {
+            "ask_user": {"question": question, "options": options, "multi": multi},
+            "output": f"Asked the user: {question}\nOptions: {labels}\nAwaiting their selection.",
+            "exit_code": 0,
+        }
+        logger.info("Tool executed: %s (%d options, multi=%s)", desc, len(options), multi)
+        return desc, result
+
    # Background execution: a `bash` block whose first line is the `#!bg`
    # marker runs DETACHED — returns a job id immediately so the chat stream
    # isn't held open for a multi-minute install/ffmpeg/download. The always-on