Add ask_user tool: agent-posed multiple-choice questions (#2111)

Let the agent pause and ask the user a multiple-choice question when a
task is genuinely ambiguous and the answer changes what it does next —
choosing between approaches, confirming an assumption, picking a target —
instead of guessing.

Modeled on the existing `ui_control` marker pattern: the `ask_user` tool
returns an `ask_user` payload that the agent loop emits as an SSE event
and then ends the turn. The frontend renders the question with clickable
option buttons, a free-text "Other" input, and an x to dismiss; the user's
choice is sent as the next message and the agent resumes with it in
context.

- src/tool_execution.py: `ask_user` handler — pure UI marker, no I/O.
  Validates a non-empty question + 2..6 options, normalizes string/object
  options, returns the payload.
- src/agent_loop.py: emit the `ask_user` event and break the round loop so
  the turn ends and waits for the user's selection. Stream the question as
  assistant text so it persists/replays (prevents a re-ask loop).
- Registration: TOOL_TAGS, ALWAYS_AVAILABLE, BUILTIN_TOOL_DESCRIPTIONS,
  FUNCTION_TOOL_SCHEMAS, the system-prompt blurb. Not admin-gated (any
  user can be asked); the structured args serialize via the default
  json.dumps path.
- routes/chat_routes.py: relay the `ask_user` event to the client.
- static/js/chat.js + static/style.css: render the question card (options +
  free-text Other + dismiss x; removed once answered). Reuses CSS vars and
  the .modal-close button; emoji go through the monochrome-SVG pipeline.
  Bump chat.js cache pin.
- tests/test_ask_user_tool.py: payload, multi flag, string options, option
  cap, validation errors, serializer round-trip, registration.
This commit is contained in:
Kenny Van de Maele
2026-06-05 11:49:11 +02:00
committed by GitHub
parent 621885ac06
commit 0a2adc9c96
9 changed files with 432 additions and 1 deletions
+47
View File
@@ -1184,6 +1184,53 @@ async def execute_tool_block(
logger.warning("Public tool policy blocked owner=%r tool=%s", owner, tool)
return desc, result
# ask_user: the agent poses a multiple-choice question to the user to get a
# decision/clarification. This is a pure UI-control marker — no subprocess,
# no filesystem. It returns an `ask_user` payload that the agent loop turns
# into an `ask_user` SSE event and then ENDS the turn, so the chat waits for
# the user's selection (their choice arrives as the next message).
if tool == "ask_user":
import json as _json
question, options, multi = "", [], False
raw = (content or "").strip()
try:
parsed = _json.loads(raw) if raw else {}
except (ValueError, TypeError):
parsed = {}
if isinstance(parsed, dict):
question = str(parsed.get("question", "")).strip()
multi = bool(parsed.get("multi") or parsed.get("multiSelect"))
for opt in (parsed.get("options") or []):
if isinstance(opt, dict):
label = str(opt.get("label", "")).strip()
descr = str(opt.get("description", "")).strip()
elif isinstance(opt, str):
label, descr = opt.strip(), ""
else:
continue
if label:
options.append({"label": label, "description": descr})
else:
question = raw
if not question or len(options) < 2:
return "ask_user: invalid", {
"error": (
"ask_user needs a non-empty `question` and at least 2 `options` "
"(each an object with a `label`, optional `description`)."
),
"exit_code": 1,
}
options = options[:6] # keep the choice list sane
desc = f"ask_user: {question[:80]}"
labels = ", ".join(o["label"] for o in options)
result = {
"ask_user": {"question": question, "options": options, "multi": multi},
"output": f"Asked the user: {question}\nOptions: {labels}\nAwaiting their selection.",
"exit_code": 0,
}
logger.info("Tool executed: %s (%d options, multi=%s)", desc, len(options), multi)
return desc, result
# Background execution: a `bash` block whose first line is the `#!bg`
# marker runs DETACHED — returns a job id immediately so the chat stream
# isn't held open for a multi-minute install/ffmpeg/download. The always-on