fix(images): render agent-generated images in chat (#2809)

* fix(images): render agent-generated images in chat

  When a chat model calls generate_image mid-conversation (agentic flow), the image does not display; it survives only as a URL the model echoes in prose. generate_image runs as a text-only MCP server, so result['image_url'] is never populated and the existing buildImageBubble render path never fires.

  Promote the image URL out of the tool's stdout in tool_execution so the agent loop's existing forwarding renders it via buildImageBubble deterministically, with no dependence on the model echoing the URL.

  Backend-only; reuses dev's image bubble, forwarding, and the tool's existing parseable output.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(images): fully-qualified, valid generated-image links

  The chat model often mangled the generated-image URL it echoed in prose (relative path, or copying the 'image_url:' label into the link href). Build a fully-qualified link by prefixing the existing app_public_url setting (empty default keeps relative paths), and present it as a clean 'Direct link:' the model can echo verbatim (the frontend auto-links bare https URLs).

  One file; independent of how the image is rendered.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(images): cover _promote_image_fields; make exit-code guard self-contained

  Adds the unit tests requested in review on #2809: absolute URL, relative URL, no URL (result unchanged), and non-zero exit_code (not promoted). Moves the dict/exit_code==0 guard from the call site into _promote_image_fields so the function is self-contained and the failure case is unit-testable; call-site behavior is unchanged.

  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
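
The promotion step and the four reviewed cases (absolute URL, relative URL, no URL, non-zero exit code) can be sketched in isolation as below. This is a minimal standalone copy of the logic shown in the diff, not the project's test file. The sample stdout, the host `chat.example.com`, the model name `example-model`, and the function name `promote_image_fields` are illustrative assumptions; the real tool output may be worded differently.

```python
import re
from typing import Any

# URL pattern copied from the diff: an optional absolute prefix followed by
# the generated-image path.
IMAGE_URL_RE = re.compile(
    r'(?:https?://[^\s)\]]+)?/api/generated-image/[A-Za-z0-9._-]+'
)

# Line-anchored patterns for the optional metadata fields, also from the diff.
FIELD_PATTERNS = (
    ("image_prompt", r'^Generated image for:\s*(.+)$'),
    ("image_model", r'^model:\s*(.+)$'),
    ("image_size", r'^size:\s*(.+)$'),
)


def promote_image_fields(result: Any) -> None:
    """Copy the image URL and metadata from stdout into structured keys.

    Mutates `result` in place; does nothing unless it is a dict with
    exit_code 0 whose stdout contains a generated-image URL.
    """
    if not isinstance(result, dict) or result.get("exit_code") != 0:
        return
    out = result.get("stdout") or ""
    match = IMAGE_URL_RE.search(out)
    if not match:
        return
    result["image_url"] = match.group(0)
    for field, pattern in FIELD_PATTERNS:
        field_match = re.search(pattern, out, re.M)
        if field_match:
            result[field] = field_match.group(1).strip()


# Hypothetical tool output; the exact wording is an assumption.
SAMPLE_STDOUT = (
    "Generated image for: a red fox in snow\n"
    "model: example-model\n"
    "size: 1024x1024\n"
    "Direct link: https://chat.example.com/api/generated-image/abc123.png\n"
)

absolute = {"exit_code": 0, "stdout": SAMPLE_STDOUT}
relative = {"exit_code": 0, "stdout": "Saved to /api/generated-image/abc123.png\n"}
no_url = {"exit_code": 0, "stdout": "nothing to show\n"}
failed = {"exit_code": 1, "stdout": SAMPLE_STDOUT}

for case in (absolute, relative, no_url, failed):
    promote_image_fields(case)

print(absolute["image_url"])
print(relative["image_url"])
print("image_url" in no_url, "image_url" in failed)
```

One property worth noting: the filename character class includes `.`, so a sentence-ending period placed directly after the URL in stdout would be captured as part of `image_url`. Emitting the link on its own line, as the 'Direct link:' format does, avoids that.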
2026-06-16 01:35:36 -04:00 · 2026-06-05 19:04:33 +08:00
parent 201e207b56
commit 0f8d12363a
3 changed files with 101 additions and 2 deletions
@@ -13,6 +13,7 @@ import json
 import logging
 import os
 import pathlib
+import re
 import sys
 import time
 from typing import Any, Awaitable, Callable, Dict, Optional, Tuple
@@ -594,9 +595,40 @@ async def _call_mcp_tool(
        if fallback:
            return fallback

+    # generate_image runs as a text-only MCP tool, so the saved image URL never
+    # reaches the agent loop's structured forwarding (which renders the image via
+    # buildImageBubble on result["image_url"]). Lift it out of the tool's stdout so
+    # the image renders deterministically — no dependence on the model echoing the
+    # URL into its prose (which it mangles/hallucinates).
+    if tool == "generate_image":
+        _promote_image_fields(result)
+
    return result


+def _promote_image_fields(result: Dict) -> None:
+    """Lift the image URL (+ prompt/model/size) from a successful generate_image MCP
+    text result into structured fields the agent loop already forwards to
+    buildImageBubble. Only acts on a dict result with exit_code 0; matches the
+    generated-image URL by pattern (absolute or relative) so it's robust to the
+    result's wording."""
+    if not isinstance(result, dict) or result.get("exit_code") != 0:
+        return
+    out = result.get("stdout") or ""
+    m = re.search(r'(?:https?://[^\s)\]]+)?/api/generated-image/[A-Za-z0-9._-]+', out)
+    if not m:
+        return
+    result["image_url"] = m.group(0).strip()
+    for field, pat in (
+        ("image_prompt", r'^Generated image for:\s*(.+)$'),
+        ("image_model", r'^model:\s*(.+)$'),
+        ("image_size", r'^size:\s*(.+)$'),
+    ):
+        fm = re.search(pat, out, re.M)
+        if fm:
+            result[field] = fm.group(1).strip()
+
+
 _BG_MARKERS = {"#!bg", "#bg", "# bg", "#background", "# background", "@background", "# @background"}