feat: add code-navigation tools (grep, glob, ls) + read_file line ranges (#1670)

Gives the agent first-class code navigation instead of shelling out via bash
(token-heavy, unreliable on weaker models, unstructured). Mirrors the
Grep/Glob/Read primitives that Claude Code / opencode expose.

- grep: regex search over file contents across a tree. Uses ripgrep when
  available (with explicit excludes so junk dirs are skipped even without a
  .gitignore); falls back to a pure-Python walk+regex when rg is absent.
  Returns file:line:match, capped.
- glob: find files by glob pattern (recursive), newest first.
- ls: list a directory (folders first, then files with sizes).
- read_file: optional offset/limit for line-range reads of large files
  (plain-path calls stay back-compatible).
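
As an illustration of the pure-Python fallback described for grep, here is a minimal sketch. The names `python_grep`, `SKIP_DIRS`, `MAX_HITS` and `MAX_LINE_CHARS` are hypothetical, the junk-dir set is only the subset named in this message, and the sketch handles a directory root only; the real implementation lives in src/tool_execution.py and may differ.

```python
import fnmatch
import os
import re
from typing import List, Optional

# Hypothetical constants; the cap values mirror the numbers quoted in this message.
SKIP_DIRS = {".git", "node_modules", "venv", "__pycache__", "dist", "build"}
MAX_HITS = 200
MAX_LINE_CHARS = 400


def python_grep(pattern: str, root: str, glob: Optional[str] = None,
                ignore_case: bool = False, max_results: int = MAX_HITS) -> List[str]:
    """Walk `root` and return 'file:line:match' strings for matching lines."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    hits: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune junk directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if glob and not fnmatch.fnmatch(name, glob):
                continue
            full = os.path.join(dirpath, name)
            try:
                with open(full, "r", encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if regex.search(line):
                            text = line.rstrip("\n")[:MAX_LINE_CHARS]
                            rel = os.path.relpath(full, root)
                            hits.append(f"{rel}:{lineno}:{text}")
                            if len(hits) >= max_results:
                                return hits
            except OSError:
                continue  # unreadable file: skip it rather than fail the search
    return hits
```

Pruning `dirnames` in place is what keeps the walk out of junk directories even when no .gitignore exists.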

All confined by the same path policy as read_file (_resolve_tool_path:
data/tmp allowlist + sensitive-file deny). Junk dirs (.git, node_modules,
venv, __pycache__, dist/build, …) skipped. Output capped (200 hits,
400 chars/line). Admin-gated like the other filesystem tools.
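
The confinement idea can be sketched as follows. This is not `_resolve_tool_path` itself, whose body is not part of this diff: `resolve_confined`, `SENSITIVE_NAMES` and the deny entries are hypothetical stand-ins, assuming the policy amounts to "resolve the path, deny sensitive names, require an allowed root as ancestor".

```python
from pathlib import Path
from typing import Iterable

# Hypothetical deny list; the real policy's entries are not shown in this commit.
SENSITIVE_NAMES = {".env", "id_rsa"}


def resolve_confined(path: str, allowed_roots: Iterable[str]) -> Path:
    """Resolve `path`; raise PermissionError unless it sits under an allowed root."""
    resolved = Path(path).resolve()  # collapses '..' and follows symlinks
    if resolved.name in SENSITIVE_NAMES:
        raise PermissionError(f"sensitive file denied: {resolved.name}")
    for root in allowed_roots:
        root_path = Path(root).resolve()
        if resolved == root_path or root_path in resolved.parents:
            return resolved
    raise PermissionError(f"path outside allowed roots: {path}")
```

Resolving before comparing matters: a request such as `data/../secrets` only reveals that it escapes the allowlist once `..` has been collapsed.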

Wiring: schemas + native arg->content serializer (src/tool_schemas.py), tool
tags (src/agent_tools.py), always-available + descriptions (src/tool_index.py),
admin gate (src/tool_security.py), dispatch + impls (src/tool_execution.py).
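
The serializer behaviour for the new tools can be exercised in isolation with a small sketch. `serialize_tool_args` is a hypothetical standalone helper that copies only the `read_file` and `grep`/`glob`/`ls` branches of `function_call_to_tool_block` from the diff below; the surrounding function is not reproduced.

```python
import json
from typing import Any, Dict


def serialize_tool_args(tool_type: str, args: Dict[str, Any]) -> str:
    """Turn native function-call arguments into tool-block content."""
    if tool_type == "read_file":
        # Plain path (back-compat) unless a line range is requested, then JSON.
        if args.get("offset") or args.get("limit"):
            return json.dumps(args)
        return args.get("path", "")
    if tool_type in ("grep", "glob", "ls"):
        return json.dumps(args) if args else "{}"
    raise ValueError(f"tool type not covered by this sketch: {tool_type}")


print(serialize_tool_args("read_file", {"path": "a.py"}))
# prints: a.py
print(serialize_tool_args("read_file", {"path": "a.py", "offset": 10, "limit": 5}))
# prints: {"path": "a.py", "offset": 10, "limit": 5}
print(serialize_tool_args("ls", {}))
# prints: {}
```

Because a plain-path `read_file` call still serializes to the bare path, existing consumers that expect a path string see no change.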

Tests: tests/test_code_nav_tools.py — match/skip-junk/ignore-case/glob-filter,
allowlist rejection, glob/ls, read-range, and the no-ripgrep Python fallback.
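
A read-range check could look roughly like the sketch below. It is not taken from tests/test_code_nav_tools.py: `read_line_range` is a hypothetical stand-in for the read_file implementation, assuming the 1-based `offset` and line-count `limit` semantics given in the schema, and `tmp_path` is pytest's built-in temporary-directory fixture.

```python
from itertools import islice
from pathlib import Path
from typing import Optional


def read_line_range(path: str, offset: Optional[int] = None,
                    limit: Optional[int] = None) -> str:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    start = max((offset or 1) - 1, 0)  # 1-based offset -> 0-based index
    stop = None if limit is None else start + max(limit, 0)
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return "".join(islice(fh, start, stop))


def test_read_range(tmp_path: Path) -> None:
    target = tmp_path / "big.txt"
    target.write_text("l1\nl2\nl3\nl4\nl5\n", encoding="utf-8")
    # A range read returns only the requested window.
    assert read_line_range(str(target), offset=2, limit=2) == "l2\nl3\n"
    # Omitting offset/limit keeps the plain whole-file behaviour.
    assert read_line_range(str(target)) == "l1\nl2\nl3\nl4\nl5\n"
```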
Kenny Van de Maele
2026-06-04 18:37:32 +02:00
committed by GitHub
parent 7443c36bd9
commit 1f00fff837
6 changed files with 464 additions and 8 deletions
+58 -3
@@ -82,16 +82,65 @@ FUNCTION_TOOL_SCHEMAS = [
         "type": "function",
         "function": {
             "name": "read_file",
-            "description": "Read a file from disk",
+            "description": "Read a file from disk. Optionally read a line range with offset/limit for large files.",
             "parameters": {
                 "type": "object",
                 "properties": {
-                    "path": {"type": "string", "description": "File path to read"}
+                    "path": {"type": "string", "description": "File path to read"},
+                    "offset": {"type": "integer", "description": "1-based line to start reading from (optional)"},
+                    "limit": {"type": "integer", "description": "Max number of lines to read from offset (optional)"}
                 },
                 "required": ["path"]
             }
         }
     },
+    {
+        "type": "function",
+        "function": {
+            "name": "grep",
+            "description": "Search file contents for a regular expression across a directory tree (uses ripgrep when available, respecting .gitignore). Returns file:line:match. PREFER this over `bash grep/rg` for code search — confined to the allowed roots, structured output.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "pattern": {"type": "string", "description": "Regular expression to search for"},
+                    "path": {"type": "string", "description": "Directory or file to search (optional; defaults to the project root)"},
+                    "glob": {"type": "string", "description": "Only search files matching this glob, e.g. '*.py' (optional)"},
+                    "ignore_case": {"type": "boolean", "description": "Case-insensitive match (optional)"},
+                    "max_results": {"type": "integer", "description": "Max matches to return (optional)"}
+                },
+                "required": ["pattern"]
+            }
+        }
+    },
+    {
+        "type": "function",
+        "function": {
+            "name": "glob",
+            "description": "Find files by glob pattern (recursive), newest first. e.g. '**/*.py'. PREFER this over `bash find/ls` for locating files — confined to the allowed roots.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "pattern": {"type": "string", "description": "Glob pattern, e.g. '**/*.ts' or 'src/**/test_*.py'"},
+                    "path": {"type": "string", "description": "Base directory (optional; defaults to the project root)"}
+                },
+                "required": ["pattern"]
+            }
+        }
+    },
+    {
+        "type": "function",
+        "function": {
+            "name": "ls",
+            "description": "List the entries of a directory (folders first, then files with sizes). PREFER this over `bash ls` — confined to the allowed roots.",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "path": {"type": "string", "description": "Directory to list (optional; defaults to the project root)"}
+                },
+                "required": []
+            }
+        }
+    },
     {
         "type": "function",
         "function": {
@@ -1128,7 +1177,13 @@ def function_call_to_tool_block(name: str, arguments: str) -> Optional[ToolBlock
         else:
             content = args.get("query", "")
     elif tool_type == "read_file":
-        content = args.get("path", "")
+        # Plain path (back-compat) unless a line range is requested → JSON.
+        if args.get("offset") or args.get("limit"):
+            content = json.dumps(args)
+        else:
+            content = args.get("path", "")
+    elif tool_type in ("grep", "glob", "ls"):
+        content = json.dumps(args) if args else "{}"
     elif tool_type == "write_file":
         content = args.get("path", "") + "\n" + args.get("content", "")
     elif tool_type == "edit_file":