feat: round-limit handling — Continue affordance at the cap + configurable cap (#1999)

* feat: round-limit handling - Continue affordance at the cap + configurable cap

  When the agent loop ran out of rounds (per-message step cap, default 20) while still actively using tools, it stopped silently mid-task. Now:

  1. The loop emits a `rounds_exhausted` SSE event at the cap, and the UI shows a "Continue" pill at the bottom of the chat that resumes the task from where it left off. Each repeated cap-hit gets a fresh Continue, so several continues in a row work.
  2. The cap is configurable in Settings → Agent ("Max steps per message"), validated on the client, at the save endpoint, and at the read site.

  - src/agent_loop.py: track `_exhausted_rounds` (set only when a full tool-executing round completes on the last allowed round, i.e. the agent wanted to keep going); emit `{"type":"rounds_exhausted","rounds":N}` (logged).
  - routes/chat_routes.py: read `agent_max_rounds` (clamped 1..200), pass as `max_rounds`; forward the new event through the SSE relay.
  - routes/auth_routes.py: validate numeric settings on save (int + clamp; agent_max_rounds 1..200, agent_max_tool_calls 0..1000; 400 on non-int).
  - src/settings.py: default `agent_max_rounds = 20`.
  - static/: Settings input + client-side clamp; the Continue pill (reuses the existing .stopped-indicator / .continue-btn classes and theme vars --border/--fg/--bg/--accent), appended to the chat container so it survives the message re-render at stream finalize. chat.js cache version bumped.

* test: cover rounds_exhausted emission (cap-hit vs normal finish)

  Drives the real stream_agent_loop with mocked LLM stream / tool exec / settings: a tool block every round exhausts the cap and must emit rounds_exhausted; a plain answer hits the done-break and must not. Guards the for/else logic.
2026-06-17 02:05:22 -04:00 · 2026-06-04 22:36:05 +02:00
parent a54f41037d
commit 64d65b73c1
9 changed files with 215 additions and 14 deletions
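
The for/else logic that the commit message says the test guards is not part of the hunks shown here. As a rough sketch of the idea only: a `for` loop's `else` branch runs when the loop finishes without `break`, which is the case where every allowed round executed tools. The names `run_rounds` and `step`, and the emitted `agent_step` payload, are hypothetical stand-ins and not the real `stream_agent_loop`.

```python
from typing import Callable, Iterator


def run_rounds(step: Callable[[int], bool], max_rounds: int) -> Iterator[dict]:
    """Yield events for up to `max_rounds` rounds (hypothetical helper).

    `step(round_no)` stands in for one LLM round and returns True when the
    round executed a tool, False when the model gave a plain answer.
    """
    for round_no in range(1, max_rounds + 1):
        yield {"type": "agent_step", "round": round_no}
        if not step(round_no):
            # Plain answer: the task is finished, so leave via break.
            break
    else:
        # Reached only when the loop was never broken, i.e. the last
        # allowed round still used tools and the agent wanted to go on.
        yield {"type": "rounds_exhausted", "rounds": max_rounds}


if __name__ == "__main__":
    # A tool call in every round exhausts the cap.
    print(list(run_rounds(lambda r: True, 3))[-1])
    # A plain answer in round 2 ends the loop without the event.
    print(list(run_rounds(lambda r: r < 2, 3))[-1])
```

The sketch assumes `max_rounds` is at least 1, which the clamping in this commit guarantees.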
@@ -438,9 +438,24 @@ def setup_auth_routes(auth_manager: AuthManager) -> APIRouter:
            raise HTTPException(403, "Admin only")
        body = await request.json()
        current = _load_settings()
+        # Per-key validation for numeric settings: coerce to int and clamp to a
+        # sane range so a bad value can't disable the agent or let it run away.
+        _INT_RANGES = {
+            "agent_max_rounds": (1, 200),
+            "agent_max_tool_calls": (0, 1000),  # 0 = unlimited
+        }
        for key in DEFAULT_SETTINGS:
-            if key in body:
-                current[key] = body[key]
+            if key not in body:
+                continue
+            val = body[key]
+            if key in _INT_RANGES:
+                lo, hi = _INT_RANGES[key]
+                try:
+                    val = int(val)
+                except (TypeError, ValueError):
+                    raise HTTPException(400, f"{key} must be an integer")
+                val = max(lo, min(val, hi))
+            current[key] = val
        _save_settings(current)
        return current
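
The save-time rule in the hunk above (coerce to `int`, reject non-integers, clamp to a range) can be looked at in isolation. The following is a minimal framework-free sketch: `coerce_setting` is a hypothetical helper, and it raises a plain `ValueError` where the route raises `HTTPException(400, ...)`.

```python
from typing import Any

# Ranges copied from the hunk above; 0 means "unlimited" for tool calls.
INT_RANGES = {
    "agent_max_rounds": (1, 200),
    "agent_max_tool_calls": (0, 1000),
}


def coerce_setting(key: str, value: Any) -> Any:
    """Return `value` unchanged, or coerced and clamped for ranged int keys."""
    if key not in INT_RANGES:
        return value
    lo, hi = INT_RANGES[key]
    try:
        number = int(value)
    except (TypeError, ValueError):
        # The route answers with HTTP 400 at this point.
        raise ValueError(f"{key} must be an integer") from None
    return max(lo, min(number, hi))


if __name__ == "__main__":
    print(coerce_setting("agent_max_rounds", "500"))    # clamped to the upper bound
    print(coerce_setting("agent_max_tool_calls", -5))   # clamped to the lower bound
```

Note that `int()` truncates floats and accepts booleans, so under this rule `3.9` is stored as `3` and `True` as `1`, while a string such as `"3.5"` is rejected.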

@@ -981,7 +981,15 @@ def setup_chat_routes(
                _answered_by = None  # set if the selected model failed and a fallback answered
                try:
                    from src.settings import get_setting
+                    from src.agent_tools import MAX_AGENT_ROUNDS as _DEFAULT_ROUNDS
                    _tool_budget = int(get_setting("agent_max_tool_calls", 0))
+                    # Per-message round cap from settings; clamp defensively in
+                    # case settings.json was hand-edited to a bad value.
+                    try:
+                        _max_rounds = int(get_setting("agent_max_rounds", _DEFAULT_ROUNDS) or _DEFAULT_ROUNDS)
+                    except (TypeError, ValueError):
+                        _max_rounds = _DEFAULT_ROUNDS
+                    _max_rounds = max(1, min(_max_rounds, 200))

                    async for chunk in stream_agent_loop(
                        sess.endpoint_url,
@@ -992,6 +1000,7 @@ def setup_chat_routes(
                        max_tokens=ctx.preset.max_tokens,
                        prompt_type=preset_id,
                        max_tool_calls=_tool_budget,
+                        max_rounds=_max_rounds,
                        context_length=ctx.context_length,
                        active_document=active_doc,
                        session_id=session,
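
Unlike the save endpoint, the read site never fails: a bad stored value falls back to the default and is then clamped. The sketch below reproduces that expression in a standalone function. `resolve_max_rounds` is a hypothetical name, and `DEFAULT_ROUNDS = 20` is an assumption taken from the default stated in the commit message, since the value of `MAX_AGENT_ROUNDS` is not shown in this diff.

```python
from typing import Any

DEFAULT_ROUNDS = 20  # assumed; the real constant lives in src.agent_tools


def resolve_max_rounds(raw: Any) -> int:
    """Turn a stored setting into a usable round cap without raising."""
    try:
        # `or` replaces falsy values (None, 0, "") with the default
        # before conversion, mirroring the expression in the hunk.
        rounds = int(raw or DEFAULT_ROUNDS)
    except (TypeError, ValueError):
        rounds = DEFAULT_ROUNDS
    return max(1, min(rounds, 200))


if __name__ == "__main__":
    for raw in (None, 0, "0", "abc", 999, -3, "15"):
        print(repr(raw), "->", resolve_max_rounds(raw))
```

One consequence of the `or` fallback is that an integer `0` resolves to the default, whereas the string `"0"` is truthy, converts to 0, and is clamped up to 1.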
@@ -1017,6 +1026,7 @@ def setup_chat_routes(
                                    "tool_start", "tool_output", "agent_step",
                                    "doc_stream_open", "doc_stream_delta",
                                    "doc_update", "doc_suggestions", "ui_control",
+                                    "rounds_exhausted",
                                ):
                                    if data.get("type") == "agent_step":
                                        _agent_rounds = max(_agent_rounds, data.get("round", 1))
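
The last hunk adds `rounds_exhausted` to the event types the relay passes through to the browser. A simplified sketch of such an allow-list filter follows. It assumes each chunk is one `data: <json>` SSE frame, lists only the types visible in the hunk (the real tuple may contain more), and simply drops everything else, which the real relay may not do. `relay` and `FORWARDED_TYPES` are hypothetical names.

```python
import json
from typing import Iterable, Iterator

# Only the event types visible in the hunk above.
FORWARDED_TYPES = {
    "tool_start", "tool_output", "agent_step",
    "doc_stream_open", "doc_stream_delta",
    "doc_update", "doc_suggestions", "ui_control",
    "rounds_exhausted",
}

PREFIX = "data: "


def relay(chunks: Iterable[str]) -> Iterator[str]:
    """Forward only frames whose JSON payload has an allow-listed type."""
    for chunk in chunks:
        if not chunk.startswith(PREFIX):
            continue
        try:
            data = json.loads(chunk[len(PREFIX):])
        except json.JSONDecodeError:
            continue  # not a JSON frame; skipped in this sketch
        if isinstance(data, dict) and data.get("type") in FORWARDED_TYPES:
            yield chunk


if __name__ == "__main__":
    frames = [
        'data: {"type": "rounds_exhausted", "rounds": 20}\n\n',
        'data: {"type": "internal_debug"}\n\n',
    ]
    print(list(relay(frames)))
```

Without the new entry in the allow-list, the event emitted by the loop would never reach the client, and the Continue pill would not appear.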