fix(agent): index api_call so RAG tool selection can retrieve it (#3923)

* fix(agent): index api_call so RAG tool selection can retrieve it

api_call exists in FUNCTION_TOOL_SCHEMAS and the agent's system prompt
advertises configured API integrations, but the tool had no entry in
BUILTIN_TOOL_DESCRIPTIONS. RAG tool selection embeds those descriptions and
retrieves the top-K per message, so a tool without one can never be selected:
the agent claims it can call Home Assistant/Miniflux/Gitea/etc. and then
never receives the api_call schema (unless the Personal Assistant
ASSISTANT_ALWAYS_AVAILABLE path applies).

Add a retrieval-rich description for api_call, plus an ast-based parity test
asserting every FUNCTION_TOOL_SCHEMAS tool has an index description so the
next added tool cannot silently drift the same way.

Fixes #3794

* fix(agent): route API-integration intent to api_call at selection time

Addresses review (RaresKeY) on #3923: indexing api_call in the ToolIndex
description was necessary but not sufficient — the #3794 repro ('Use the
api_call tool to call Home Assistant GET /api/states') matched no domain in
_classify_agent_request, classified as low-signal, so the agent loop skipped
retrieval entirely and the schema filter sent only ALWAYS_AVAILABLE
(manage_memory/ask_user/update_plan). api_call never reached the model.

- _classify_agent_request: detect API-integration intent (api_call,
  integration(s), Home Assistant/Miniflux/Gitea/Linkding/Jellyfin) -> new
  'integrations' domain, so the turn is no longer low-signal.
- _DOMAIN_TOOL_MAP['integrations'] = {api_call}: deterministically seeds
  api_call into relevant tools after retrieval, independent of embeddings.
- _DOMAIN_RULES['integrations']: rule pack (required — _domain_rules_for_tools
  indexes _DOMAIN_RULES[domain] directly).
- tool_index _KEYWORD_HINTS: parity hint for the retrieval / keyword-fallback
  paths.
- Regression drives the real classifier -> domain-map -> FUNCTION_TOOL_SCHEMAS
  filter chain and asserts api_call is advertised for the #3794 prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Mazen Tamer Salah
2026-06-18 11:43:25 +03:00
committed by GitHub
parent f70db19cc6
commit b51d83b16d
4 changed files with 157 additions and 0 deletions
+9
View File
@@ -94,6 +94,7 @@ BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = {
"manage_endpoints": "Endpoint management: list, add, delete, enable, or disable model API endpoints.",
"manage_mcp": "MCP server management: list, add, delete, reconnect servers, or list available tools.",
"manage_webhooks": "Webhook management: list, add, delete, enable, or disable webhooks.",
"api_call": "Call a configured API integration by name (Home Assistant, Miniflux, Gitea, Linkding, Jellyfin, RSS reader, git forge, bookmark manager, smart home, or any other registered service). Make a GET/POST/PUT/PATCH/DELETE request to the integration's endpoint path, with an optional JSON body. Use whenever the user asks to query or control one of their connected integrations/services.",
"manage_tokens": "API token management: list, create, or delete API access tokens.",
"manage_documents": "List, read, delete, or tidy documents in the editor panel. action='list' returns clickable rows (most-recent first) so the user can open any doc by clicking. action='read' (aka view/open/get) with document_id returns the content; supports offset=<N> + limit=<N> to page through large docs (response includes next_offset when more remains, so you can keep calling with offset=next_offset). action='delete' with document_id removes a doc (only way to delete). Use this for ANY 'show/read/list/open my documents/docs/files/notes' request — never shell or curl.",
"manage_research": "List, read/open, or delete saved DEEP RESEARCH results from the Library. action='list' returns clickable [query](#research-<id>) rows (most-recent first). action='read' (aka open/view/get) with id returns the report + sources. action='delete' with id removes it. Use this for ANY 'open/read/find/delete my research / that report / the research on X' request. NOTE: this is for EXISTING research; to START new research use trigger_research.",
@@ -414,6 +415,14 @@ class ToolIndex:
"my settings", "change setting", "change a setting", "set setting",
"preference", "preferences", "configure"}):
{"manage_settings", "ui_control"},
# API-integration intent → the api_call tool. Mirrors the agent-loop
# "integrations" domain so api_call still surfaces on the retrieval and
# keyword-fallback paths (not just the deterministic domain seed) when a
# user names a connected service.
frozenset({"api_call", "api call", "integration", "integrations",
"home assistant", "homeassistant", "miniflux", "gitea",
"linkding", "jellyfin"}):
{"api_call"},
# Managing EXISTING research in the Library — open/read/find/delete.
frozenset({"my research", "the research", "research on", "open research",
"read research", "find research", "delete research",