7 Commits

Author SHA1 Message Date
Alexandre Teixeira b0ade9964d test: refresh oversized split plan 2026-06-16 02:14:25 +01:00
Alexandre Teixeira b010b99bd4 test: add oversized test split plan 2026-06-16 02:14:13 +01:00
RaresKeY b58af4267b fix(companion): require chat scope for model inventory (#4319) 2026-06-16 01:15:05 +02:00
AkioKoneko 8ff76f083c fix(cookbook): avoid launching Ollama during Windows cache scan (#4368) 2026-06-16 01:00:40 +02:00
Wei Hong 2196869c86 fix(webhooks): route public emitters through fire_and_forget (#3964) (#4336)
The three public webhook emitters in chat_helpers and webhook_routes
schedule deliveries via asyncio.create_task(webhook_manager.fire(...)),
which bypasses WebhookManager._bg_tasks. asyncio only holds a weak
reference to the outer task, so the GC can collect it mid-delivery and
the webhook is silently dropped.

Route all three through webhook_manager.fire_and_forget() so the task
is tracked by _spawn_tracked() and the manager owns the full lifecycle.

Adds an AST-level guard test that scans routes/ for direct
asyncio.create_task wrapping webhook_manager.fire(...) to prevent
regressions.
2026-06-16 00:41:45 +02:00
holden093 dd2e23c9af fix(agent): report phone numbers from resolve_contact when a matched contact has no email (#4327)
When a CardDAV contact matched the search query but had no email
address (only phone numbers), the tool silently dropped it and
returned 'No contacts found'.  Fall back to the contact's phone
number(s) so the caller still receives usable information.

Refs: #4178 (the contacts-domain classifier fix that made the model
actually call resolve_contact for contacts queries, surfacing this
pre-existing gap)
2026-06-16 00:03:33 +02:00
Fahim facc50cb0f fix(api): attribute bearer-token actions to the token owner on owner-scoped routes (#4054)
* fix(api): attribute bearer-token actions to the token owner on owner-scoped routes

Owner-scoped chat, session, and upload routes called
get_current_user(), which resolves a bearer ody_ API token to the
sandboxed "api" pseudo-user. A paired API-token client (companion, CLI,
IDE extension) therefore saw and created a separate "api"-owned silo
instead of the owner's data.

effective_user() already exists for exactly this: it attributes a token's
actions to request.state.api_token_owner, is identical to
get_current_user() for cookie sessions, and falls back safely when a
token has no owner. session_routes.py was already migrated; this
completes the migration for the remaining owner-scoped routes:

- chat_helpers.py: chat-privilege enforcement, message attribution, prefs/context
- chat_routes.py: orphaned-endpoint owner, session-auth owner, message search
- upload_routes.py: upload owner attribution + access checks

The /api/models swap is intentionally omitted: #4292 already migrated it
to effective_user (plus the chat-scope gate and ownerless-token 403), so
this PR keeps dev's version of routes/model_routes.py unchanged.

chat_routes.py keeps importing get_current_user for the workspace owner
gate; session_routes.py drops the now-unused import.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test: target effective_user in auth monkeypatches and owner-scope assertion

The owner-scoped routes now call effective_user() instead of
get_current_user(), so the tests that stubbed get_current_user (or
asserted on it) follow suit:

- test_chat_helpers.py, test_review_regressions.py,
  test_kv_cache_invalidation_2927.py: monkeypatch effective_user
- test_session_endpoint_owner_scope.py: assert the owner-scope guard uses
  effective_user(request)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:56:22 +02:00
18 changed files with 1098 additions and 44 deletions
+17 -3
View File
@@ -5,8 +5,9 @@ offers and pair to it, without duplicating any LLM logic.
Auth is enforced globally by AuthMiddleware (app.py), so reaching a handler here
means the caller is authenticated by either a cookie session or a Bearer `ody_`
API token. The read endpoints (ping/info/models) accept either; the pairing
endpoints are admin-cookie only.
API token. Ping/info accept either credential type, models requires a chat-
scoped API token for bearer callers, and the pairing endpoints are admin-cookie
only.
Pairing CSRF posture: minting happens ONLY on POST. The session cookie is
SameSite=Lax (routes/auth_routes.py), which a browser does not send on a
@@ -18,7 +19,7 @@ on a GET would be unsafe (Lax cookies ride top-level GET navigations), so GET
import html
from fastapi import APIRouter, Request
from fastapi import APIRouter, HTTPException, Request
from fastapi.responses import HTMLResponse
from core.middleware import require_admin
@@ -52,6 +53,18 @@ def owner_can_see(row_owner, owner) -> bool:
return row_owner is None or row_owner == owner
def require_models_scope(request: Request) -> None:
"""Require the companion chat scope for bearer-token model inventory."""
if not getattr(request.state, "api_token", False):
return
scopes = getattr(request.state, "api_token_scopes", None) or []
if isinstance(scopes, str):
scopes = [scope.strip() for scope in scopes.split(",")]
scope_set = {str(scope).strip() for scope in scopes if str(scope).strip()}
if _pairing.COMPANION_SCOPE not in scope_set:
raise HTTPException(403, "API token requires chat scope")
def mint_pairing_token(owner: str, invalidate=None) -> tuple[str, str]:
"""Mint a pairing token AND invalidate the auth middleware's in-memory token
cache, so the new token is accepted on the very next request without a server
@@ -103,6 +116,7 @@ def setup_companion_routes() -> APIRouter:
rows -- the same rule as owner_filter. Read-only; never returns api_key
material.
"""
require_models_scope(request)
import json as _json
from core.database import SessionLocal, ModelEndpoint
+8 -8
View File
@@ -14,7 +14,7 @@ from core.database import Session as DBSession, ModelEndpoint
from src.llm_core import normalize_model_id
from src.endpoint_resolver import normalize_base
from src.context_compactor import maybe_compact, trim_for_context
from src.auth_helpers import get_current_user
from src.auth_helpers import effective_user
from src.prompt_security import untrusted_context_message
from routes.prefs_routes import _load_for_user as load_prefs_for_user
@@ -78,7 +78,7 @@ def _enforce_chat_privileges(request, sess) -> None:
which means unrestricted allowed_models / zero cap -> no-op for them.
"""
try:
user = get_current_user(request)
user = effective_user(request)
except Exception:
user = None
if not user:
@@ -346,11 +346,11 @@ def add_user_message(sess, chat_handler, preprocessed: PreprocessedMessage, inco
def fire_message_event(request, webhook_manager, session_id: str, sess, message: str, compare_mode: bool = False):
"""Fire webhook and event_bus events for a new user message."""
if webhook_manager and not compare_mode:
asyncio.create_task(webhook_manager.fire("chat.message", {
webhook_manager.fire_and_forget("chat.message", {
"session_id": session_id, "model": sess.model, "message": message[:2000],
}))
})
from src.event_bus import fire_event
user = get_current_user(request)
user = effective_user(request)
fire_event("message_sent", user)
@@ -577,7 +577,7 @@ async def build_chat_context(
fire_message_event(request, webhook_manager, session_id, sess, message, compare_mode)
# Resolve user prefs
user = get_current_user(request)
user = effective_user(request)
uprefs = load_prefs_for_user(user)
# Memory enabled?
@@ -1120,10 +1120,10 @@ def run_post_response_tasks(
# Webhook
if webhook_manager and not compare_mode:
asyncio.create_task(webhook_manager.fire("chat.completed", {
webhook_manager.fire_and_forget("chat.completed", {
"session_id": session_id, "model": sess.model,
"user_message": message, "response": full_response[:2000],
}))
})
# Auto-name
if needs_auto_name(sess.name):
+5 -5
View File
@@ -23,7 +23,7 @@ from src.endpoint_resolver import normalize_base as _normalize_base, build_chat_
from src.session_search import search_session_messages
from src.prompt_security import untrusted_context_message
from core.exceptions import SessionNotFoundError
from src.auth_helpers import get_current_user
from src.auth_helpers import effective_user, get_current_user
from routes.session_routes import _verify_session_owner
from routes.document_helpers import _owner_session_filter
from core.database import SessionLocal, get_session_mode, set_session_mode
@@ -363,7 +363,7 @@ def setup_chat_routes(
sess = session_manager.get_session(session)
except KeyError:
raise HTTPException(404, f"Session '{session}' not found")
owner = get_current_user(request)
owner = effective_user(request)
if _clear_orphaned_session_endpoint(sess, owner=owner):
raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.")
@@ -603,7 +603,7 @@ def setup_chat_routes(
# but BEFORE loading. Prevents cross-user session hijack.
_verify_session_owner(request, session)
sess = session_manager.get_session(session)
owner = get_current_user(request)
owner = effective_user(request)
if _clear_orphaned_session_endpoint(sess, owner=owner):
raise HTTPException(400, "Selected model endpoint was removed. Pick another model in Settings.")
# Issue #587: picker shows a model from the endpoint cache but
@@ -634,7 +634,7 @@ def setup_chat_routes(
_enforce_chat_privileges(request, sess)
# Ensure session has auth headers
resolve_session_auth(sess, session, owner=get_current_user(request))
resolve_session_auth(sess, session, owner=effective_user(request))
# Check for research_pending BEFORE mode persist overwrites it
do_research = str(use_research).lower() == "true"
@@ -1485,7 +1485,7 @@ def setup_chat_routes(
if not q or not q.strip():
return []
_user = get_current_user(request)
_user = effective_user(request)
return [
result.to_dict()
for result in search_session_messages(
+3 -1
View File
@@ -505,6 +505,8 @@ def _cached_model_scan_script(model_dirs: list[str] | None = None, add_hf_cache:
" if u.startswith('KB'): return int(n * 1024)",
" return int(n)",
"def scan_ollama():",
" if any(m.get('is_ollama') for m in models): return",
" if os.name == 'nt' and not os.environ.get('ODYSSEUS_ALLOW_OLLAMA_CLI_SCAN'): return",
" if not shutil.which('ollama'): return",
" try:",
" p = subprocess.run(['ollama', 'list'], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True, timeout=6)",
@@ -535,8 +537,8 @@ def _cached_model_scan_script(model_dirs: list[str] | None = None, add_hf_cache:
" models.append({'repo_id':name,'size_bytes':size_bytes,'nb_files':1,'has_incomplete':False,'path':'ollama','backend':'ollama','is_ollama':True})",
" return",
"for _hf_cache in hf_cache_paths(): scan_hf(_hf_cache)",
"scan_ollama()",
"scan_ollama_api()",
"scan_ollama()",
]
for model_dir in model_dirs or []:
lines.append(f"scan_dir(os.path.expanduser({model_dir!r}))")
+3 -3
View File
@@ -11,7 +11,7 @@ from core.session_manager import SessionManager
from core.models import ChatMessage
from src.request_models import SessionResponse
from core.database import Session as DbSession, SessionLocal, Document, GalleryImage, utcnow_naive
from src.auth_helpers import get_current_user, effective_user, _auth_disabled, owner_filter
from src.auth_helpers import effective_user, _auth_disabled, owner_filter
from src.session_actions import is_session_recently_active
@@ -328,7 +328,7 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
endpoint_id: str = Form(""),
):
skip_val = str(skip_validation).lower() == "true"
user = get_current_user(request)
user = effective_user(request)
endpoint_api_key = ""
endpoint_base_url = ""
_reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url)
@@ -477,7 +477,7 @@ def setup_session_routes(session_manager: SessionManager, config: dict, webhook_
db.close()
# Switch model/endpoint mid-session
if model is not None and endpoint_url is not None:
user = get_current_user(request)
user = effective_user(request)
_reject_raw_endpoint_url_for_non_admin(request, user, endpoint_id, endpoint_url)
endpoint_api_key = ""
endpoint_base_url = ""
+5 -5
View File
@@ -7,7 +7,7 @@ from fastapi import APIRouter, Request, File, UploadFile, HTTPException
from typing import List
import logging
from core.middleware import require_admin
from src.auth_helpers import get_current_user
from src.auth_helpers import effective_user
from src.upload_handler import count_recent_uploads
logger = logging.getLogger(__name__)
@@ -78,7 +78,7 @@ def setup_upload_routes(upload_handler):
for u in files:
try:
meta = upload_handler.save_upload(u, client_ip, owner=get_current_user(request))
meta = upload_handler.save_upload(u, client_ip, owner=effective_user(request))
out.append({
"id": meta["id"],
"name": meta["name"],
@@ -138,7 +138,7 @@ def setup_upload_routes(upload_handler):
original_name = info.get("name", file_id)
auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request)
current_user = effective_user(request)
file_owner = info.get("owner") if info else None
if auth_configured:
if not current_user:
@@ -204,7 +204,7 @@ def setup_upload_routes(upload_handler):
info = _load_upload_info(file_id)
auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request)
current_user = effective_user(request)
file_owner = info.get("owner") if info else None
if auth_configured:
if not current_user:
@@ -247,7 +247,7 @@ def setup_upload_routes(upload_handler):
raise HTTPException(404, "File not found")
auth_mgr = getattr(request.app.state, "auth_manager", None)
auth_configured = bool(auth_mgr and auth_mgr.is_configured)
current_user = get_current_user(request)
current_user = effective_user(request)
file_owner = info.get("owner")
if auth_configured:
if not current_user:
+2 -3
View File
@@ -1,6 +1,5 @@
"""Webhook, API Token, and sync chat routes."""
import asyncio
import uuid
import logging
from typing import Optional
@@ -385,10 +384,10 @@ def setup_webhook_routes(
sess.add_message(ChatMessage("assistant", reply))
session_manager.save_sessions()
asyncio.create_task(webhook_manager.fire("chat.completed", {
webhook_manager.fire_and_forget("chat.completed", {
"session_id": session_id, "model": sess.model,
"user_message": message[:2000], "response": reply[:2000],
}))
})
return {"response": reply, "session_id": session_id, "model": sess.model}
+14 -3
View File
@@ -3797,7 +3797,7 @@ async def do_resolve_contact(content: str, owner: Optional[str] = None) -> Dict:
if not name:
return {"error": "name is required", "exit_code": 1}
contacts = {} # email -> {name, source}
contacts = {} # email_or_phone -> {name, source, phone?}
# 1. CardDAV (Radicale) — structured contacts. Call in-process: a
# server-side httpx GET to /api/contacts/search carries no session
@@ -3812,10 +3812,18 @@ async def do_resolve_contact(content: str, owner: Optional[str] = None) -> Dict:
match = q in hay_name or any(q in (e or "").lower() for e in c.get("emails", []))
if not match:
continue
has_email = False
for email in (c.get("emails") or []):
email = (email or "").strip().lower()
if email and "@" in email:
contacts[email] = {"name": c.get("name") or email, "source": "contacts"}
has_email = True
# Fall back to phone numbers when the contact has no email address
if not has_email:
for phone in (c.get("phones") or []):
phone = (phone or "").strip()
if phone:
contacts[phone] = {"name": c.get("name") or phone, "source": "contacts", "phone": phone}
except Exception:
pass
@@ -3835,8 +3843,11 @@ async def do_resolve_contact(content: str, owner: Optional[str] = None) -> Dict:
return {"output": f"No contacts found matching '{name}'.", "exit_code": 0}
lines = [f"Contacts matching '{name}':"]
for email, info in contacts.items():
lines.append(f"- {info['name']} <{email}> ({info['source']})")
for key, info in contacts.items():
if info.get("phone"):
lines.append(f"- {info['name']} — phone: {info['phone']} ({info['source']})")
else:
lines.append(f"- {info['name']} <{key}> ({info['source']})")
return {"output": "\n".join(lines), "exit_code": 0}
+1 -1
View File
@@ -1009,7 +1009,7 @@ FUNCTION_TOOL_SCHEMAS = [
"type": "function",
"function": {
"name": "resolve_contact",
"description": "Look up a contact's email address by name. Searches CardDAV address book and sent email history. Use when the user says 'message [name]' or 'email [name]' without an email address.",
"description": "Look up a contact by name. Searches CardDAV address book and sent email history. Returns email addresses (when available) or phone numbers. Use when the user says 'message [name]', 'email [name]', or asks for someone's contact details.",
"parameters": {
"type": "object",
"properties": {
+326
View File
@@ -0,0 +1,326 @@
# Oversized Test File Split Plan
## Purpose
This document plans future oversized test-file splits using current repo data.
It does not move files, rewrite assertions, extract helpers, or change CI.
## Roadmap context
- Issue: #3983
- Parent tracker: #2523
- Follows #3973 / #3982, the report-only order-sensitivity diagnostics slice.
## Methodology
Metrics were generated from the current test tree using:
- physical line counts for every recursive `test_*.py` file under `tests/`;
- AST counts for `test_*` functions and `Test*` classes;
- one `pytest --collect-only -q tests` run to count collected items per file;
- current taxonomy classification from `tests._taxonomy.classify_test_path`; and
- static setup-signal scans for route/API, DB/session, import-state, security, filesystem, subprocess/script, async/threading, and UI/static indicators.
Static signals are not proof of risk. They are review prompts.
Future split PRs must still inspect each file manually before editing.
## Current summary
- test files scanned: 583
- collected pytest items counted: 3586
- large-file threshold: 300 lines
- large-collected threshold: 20 collected items
Area distribution:
| Value | Files |
|---|---:|
| cli | 28 |
| helpers | 1 |
| js | 39 |
| routes | 23 |
| security | 77 |
| services | 144 |
| uncategorized | 234 |
| unit | 37 |
Sub-area distribution:
| Value | Files |
|---|---:|
| api | 6 |
| atomic | 3 |
| auth | 9 |
| calendar | 10 |
| cli | 28 |
| confinement | 7 |
| cookbook | 13 |
| document | 11 |
| email | 12 |
| embedding | 3 |
| gallery | 5 |
| history | 3 |
| js | 39 |
| llm | 16 |
| mcp | 8 |
| memory | 15 |
| nondict | 7 |
| nonstring | 22 |
| owner | 14 |
| owner_scope | 23 |
| parse | 4 |
| provider | 6 |
| research | 16 |
| route | 6 |
| routes | 9 |
| scheduler | 3 |
| scope | 5 |
| security | 9 |
| session | 16 |
| ssrf | 3 |
| webhook | 3 |
| xss | 5 |
Values below 2 files: 244 values covering 244 files.
## Top files by collected pytest items
| File | Lines | Collected tests | Test defs | Test classes | Area | Sub-area | Signals |
|---|---:|---:|---:|---:|---|---|---|
| `tests/test_model_routes.py` | 1778 | 139 | 116 | 10 | routes | routes | route/api, db/session, import-state, async/threading |
| `tests/test_security_regressions.py` | 1224 | 92 | 68 | 0 | security | security | route/api, db/session, import-state, security, filesystem, async/threading, ui/static |
| `tests/test_provider_classification.py` | 188 | 67 | 21 | 4 | services | provider | - |
| `tests/test_cookbook_helpers.py` | 912 | 65 | 65 | 0 | services | cookbook | route/api, filesystem, subprocess/script, async/threading, ui/static |
| `tests/test_shell_routes.py` | 481 | 63 | 48 | 8 | routes | routes | route/api, import-state, filesystem |
| `tests/test_pr_blocker_audit.py` | 964 | 58 | 58 | 0 | uncategorized | pr_blocker_audit | import-state, security, filesystem |
| `tests/test_provider_endpoints.py` | 241 | 58 | 18 | 1 | services | provider | subprocess/script |
| `tests/test_agent_loop.py` | 469 | 52 | 52 | 5 | uncategorized | agent_loop | db/session, import-state |
| `tests/test_service_health.py` | 472 | 47 | 42 | 0 | uncategorized | service_health | async/threading |
| `tests/test_run_focus.py` | 399 | 47 | 44 | 0 | uncategorized | run_focus | security, filesystem, subprocess/script, ui/static |
| `tests/test_llm_core_temperature.py` | 196 | 41 | 17 | 0 | services | llm | - |
| `tests/test_endpoint_probing.py` | 411 | 34 | 30 | 6 | uncategorized | endpoint_probing | route/api, db/session, import-state |
| `tests/test_llm_core_anthropic_temp_omit.py` | 94 | 32 | 6 | 0 | services | llm | db/session |
| `tests/test_chat_helpers.py` | 264 | 31 | 18 | 0 | uncategorized | chat_helpers | route/api |
| `tests/test_provider_detection.py` | 148 | 31 | 31 | 5 | services | provider | - |
| `tests/test_model_context.py` | 251 | 30 | 30 | 4 | uncategorized | model_context | db/session, import-state |
| `tests/test_endpoint_resolver.py` | 148 | 30 | 30 | 6 | uncategorized | endpoint_resolver | - |
| `tests/test_embedding_lanes.py` | 1104 | 29 | 29 | 0 | services | embedding | filesystem |
| `tests/test_upload_limits_centralized.py` | 110 | 29 | 5 | 0 | uncategorized | upload_limits_centralized | import-state, filesystem |
| `tests/test_email_oauth.py` | 580 | 28 | 25 | 0 | services | email | route/api, db/session, security, async/threading |
| `tests/test_review_regressions.py` | 930 | 26 | 26 | 0 | uncategorized | review_regressions | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_rename_user_owner_sync.py` | 686 | 26 | 26 | 0 | security | owner | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_helpers_import_state.py` | 426 | 26 | 26 | 0 | helpers | helpers | route/api, db/session, import-state |
| `tests/test_taxonomy.py` | 145 | 26 | 16 | 0 | uncategorized | taxonomy | security, ui/static |
| `tests/test_tool_path_confinement.py` | 282 | 24 | 24 | 0 | security | confinement | import-state, filesystem, async/threading |
| `tests/test_copilot.py` | 170 | 23 | 16 | 0 | uncategorized | copilot | - |
| `tests/test_research_utils.py` | 97 | 23 | 23 | 2 | services | research | - |
| `tests/test_api_chat_security.py` | 401 | 22 | 8 | 0 | security | security | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_tool_support_heuristic.py` | 166 | 22 | 22 | 3 | uncategorized | tool_support_heuristic | - |
| `tests/test_platform_compat.py` | 318 | 21 | 21 | 0 | uncategorized | platform_compat | import-state, filesystem, subprocess/script |
## Top files by physical line count
| File | Lines | Collected tests | Test defs | Test classes | Area | Sub-area | Signals |
|---|---:|---:|---:|---:|---|---|---|
| `tests/test_model_routes.py` | 1778 | 139 | 116 | 10 | routes | routes | route/api, db/session, import-state, async/threading |
| `tests/test_security_regressions.py` | 1224 | 92 | 68 | 0 | security | security | route/api, db/session, import-state, security, filesystem, async/threading, ui/static |
| `tests/test_embedding_lanes.py` | 1104 | 29 | 29 | 0 | services | embedding | filesystem |
| `tests/test_pr_blocker_audit.py` | 964 | 58 | 58 | 0 | uncategorized | pr_blocker_audit | import-state, security, filesystem |
| `tests/test_review_regressions.py` | 930 | 26 | 26 | 0 | uncategorized | review_regressions | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_cookbook_helpers.py` | 912 | 65 | 65 | 0 | services | cookbook | route/api, filesystem, subprocess/script, async/threading, ui/static |
| `tests/test_rename_user_owner_sync.py` | 686 | 26 | 26 | 0 | security | owner | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_email_oauth.py` | 580 | 28 | 25 | 0 | services | email | route/api, db/session, security, async/threading |
| `tests/test_api_token_routes.py` | 578 | 17 | 17 | 0 | routes | api_routes | route/api, db/session, import-state, async/threading |
| `tests/test_shell_routes.py` | 481 | 63 | 48 | 8 | routes | routes | route/api, import-state, filesystem |
| `tests/test_email_owner_scope.py` | 474 | 9 | 9 | 0 | security | owner_scope | route/api, db/session, filesystem, async/threading |
| `tests/test_service_health.py` | 472 | 47 | 42 | 0 | uncategorized | service_health | async/threading |
| `tests/test_agent_loop.py` | 469 | 52 | 52 | 5 | uncategorized | agent_loop | db/session, import-state |
| `tests/test_kv_cache_invalidation_2927.py` | 463 | 8 | 8 | 0 | uncategorized | kv_cache_invalidation_2927 | route/api, db/session, import-state, async/threading |
| `tests/test_helpers_import_state.py` | 426 | 26 | 26 | 0 | helpers | helpers | route/api, db/session, import-state |
| `tests/test_endpoint_owner_scope_followup.py` | 414 | 11 | 11 | 0 | security | owner_scope | route/api, db/session, filesystem |
| `tests/test_endpoint_probing.py` | 411 | 34 | 30 | 6 | uncategorized | endpoint_probing | route/api, db/session, import-state |
| `tests/test_imap_leak_fixes.py` | 404 | 15 | 15 | 0 | uncategorized | imap_leak_fixes | route/api, db/session, security, filesystem |
| `tests/test_companion_readonly.py` | 402 | 17 | 17 | 0 | uncategorized | companion_readonly | db/session, import-state |
| `tests/test_api_chat_security.py` | 401 | 22 | 8 | 0 | security | security | route/api, db/session, import-state, filesystem, async/threading |
| `tests/test_upload_handler_atomicity.py` | 401 | 9 | 9 | 0 | uncategorized | upload_handler_atomicity | filesystem, async/threading |
| `tests/test_run_focus.py` | 399 | 47 | 44 | 0 | uncategorized | run_focus | security, filesystem, subprocess/script, ui/static |
| `tests/test_auth_regressions.py` | 375 | 15 | 15 | 0 | security | auth | route/api, db/session, import-state, async/threading |
| `tests/test_calendar_owner_scope.py` | 345 | 7 | 7 | 0 | security | owner_scope | route/api, db/session, import-state, filesystem, async/threading, ui/static |
| `tests/test_null_owner_gates.py` | 342 | 20 | 20 | 0 | security | owner | route/api, db/session, import-state |
| `tests/test_agent_migration_manifest.py` | 340 | 15 | 15 | 0 | uncategorized | agent_migration_manifest | import-state, filesystem |
| `tests/test_calendar_recurrence.py` | 338 | 19 | 19 | 0 | services | calendar | - |
| `tests/test_tool_policy.py` | 330 | 13 | 13 | 0 | uncategorized | tool_policy | import-state, async/threading |
| `tests/test_workspace_confine.py` | 328 | 18 | 18 | 0 | uncategorized | workspace_confine | route/api, filesystem, subprocess/script, async/threading |
| `tests/test_diffusion_server_security.py` | 325 | 14 | 14 | 0 | security | security | route/api, import-state, security, filesystem, async/threading, ui/static |
## Split planning candidates
This section is generated from metrics, not from manual judgement.
Files are included when they meet at least one threshold:
- at least 300 physical lines; or
- at least 20 collected pytest items.
These are planning candidates only. A later split PR still needs a focused manual review of each file before moving tests.
| File | Why included | Setup/risk signals | Suggested handling |
|---|---|---|---|
| `tests/test_model_routes.py` | 1778 lines, 139 collected tests | route/api, db/session, import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_security_regressions.py` | 1224 lines, 92 collected tests | route/api, db/session, import-state, security, filesystem, async/threading, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_provider_classification.py` | 67 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_cookbook_helpers.py` | 912 lines, 65 collected tests | route/api, filesystem, subprocess/script, async/threading, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_shell_routes.py` | 481 lines, 63 collected tests | route/api, import-state, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_pr_blocker_audit.py` | 964 lines, 58 collected tests | import-state, security, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_provider_endpoints.py` | 58 collected tests | subprocess/script | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_agent_loop.py` | 469 lines, 52 collected tests | db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_service_health.py` | 472 lines, 47 collected tests | async/threading | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_run_focus.py` | 399 lines, 47 collected tests | security, filesystem, subprocess/script, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_llm_core_temperature.py` | 41 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_endpoint_probing.py` | 411 lines, 34 collected tests | route/api, db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_llm_core_anthropic_temp_omit.py` | 32 collected tests | db/session | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_chat_helpers.py` | 31 collected tests | route/api | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_provider_detection.py` | 31 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_model_context.py` | 30 collected tests | db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_endpoint_resolver.py` | 30 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_embedding_lanes.py` | 1104 lines, 29 collected tests | filesystem | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_upload_limits_centralized.py` | 29 collected tests | import-state, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_email_oauth.py` | 580 lines, 28 collected tests | route/api, db/session, security, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_review_regressions.py` | 930 lines, 26 collected tests | route/api, db/session, import-state, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_rename_user_owner_sync.py` | 686 lines, 26 collected tests | route/api, db/session, import-state, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_helpers_import_state.py` | 426 lines, 26 collected tests | route/api, db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_taxonomy.py` | 26 collected tests | security, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_tool_path_confinement.py` | 24 collected tests | import-state, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_copilot.py` | 23 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_research_utils.py` | 23 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_api_chat_security.py` | 401 lines, 22 collected tests | route/api, db/session, import-state, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_tool_support_heuristic.py` | 22 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_platform_compat.py` | 318 lines, 21 collected tests | import-state, filesystem, subprocess/script | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_context_compactor.py` | 21 collected tests | db/session, import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_prompt_security.py` | 21 collected tests | No obvious setup signals from static scan. | Good first manual-review candidate if test themes are cohesive. |
| `tests/test_null_owner_gates.py` | 342 lines, 20 collected tests | route/api, db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_youtube_handler_consolidation.py` | 20 collected tests | route/api, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_calendar_recurrence.py` | 338 lines | No obvious setup signals from static scan. | Plan split boundaries before editing. |
| `tests/test_workspace_confine.py` | 328 lines | route/api, filesystem, subprocess/script, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_api_token_routes.py` | 578 lines | route/api, db/session, import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_companion_readonly.py` | 402 lines | db/session, import-state | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_set_admin.py` | 317 lines | route/api, import-state, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_imap_leak_fixes.py` | 404 lines | route/api, db/session, security, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_auth_regressions.py` | 375 lines | route/api, db/session, import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_agent_migration_manifest.py` | 340 lines | import-state, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_diffusion_server_security.py` | 325 lines | route/api, import-state, security, filesystem, async/threading, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_tool_policy.py` | 330 lines | import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_endpoint_owner_scope_followup.py` | 414 lines | route/api, db/session, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_upload_routes_owner_scope.py` | 315 lines | route/api, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_email_owner_scope.py` | 474 lines | route/api, db/session, filesystem, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_upload_handler_atomicity.py` | 401 lines | filesystem, async/threading | Plan split boundaries before editing. |
| `tests/test_kv_cache_invalidation_2927.py` | 463 lines | route/api, db/session, import-state, async/threading | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_calendar_owner_scope.py` | 345 lines | route/api, db/session, import-state, filesystem, async/threading, ui/static | Defer mechanical split until setup/risk boundaries are mapped. |
| `tests/test_skills_manager_owner_isolation.py` | 306 lines | import-state, filesystem | Defer mechanical split until setup/risk boundaries are mapped. |
## Taxonomy coverage gaps among split candidates
`uncategorized` is a current taxonomy area, not a builder failure.
This plan does not reclassify tests because taxonomy changes should be reviewed separately from oversized-file split planning.
Before using any of these files as a split target, first decide whether the taxonomy should be refined in a separate focused issue/PR.
| File | Lines | Collected tests | Sub-area | Signals | Suggested follow-up |
|---|---:|---:|---|---|---|
| `tests/test_pr_blocker_audit.py` | 964 | 58 | pr_blocker_audit | import-state, security, filesystem | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_agent_loop.py` | 469 | 52 | agent_loop | db/session, import-state | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_service_health.py` | 472 | 47 | service_health | async/threading | Review taxonomy mapping before using as a split target. |
| `tests/test_run_focus.py` | 399 | 47 | run_focus | security, filesystem, subprocess/script, ui/static | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_endpoint_probing.py` | 411 | 34 | endpoint_probing | route/api, db/session, import-state | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_chat_helpers.py` | 264 | 31 | chat_helpers | route/api | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_model_context.py` | 251 | 30 | model_context | db/session, import-state | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_endpoint_resolver.py` | 148 | 30 | endpoint_resolver | - | Review taxonomy mapping before using as a split target. |
| `tests/test_upload_limits_centralized.py` | 110 | 29 | upload_limits_centralized | import-state, filesystem | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_review_regressions.py` | 930 | 26 | review_regressions | route/api, db/session, import-state, filesystem, async/threading | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_taxonomy.py` | 145 | 26 | taxonomy | security, ui/static | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_copilot.py` | 170 | 23 | copilot | - | Review taxonomy mapping before using as a split target. |
| `tests/test_tool_support_heuristic.py` | 166 | 22 | tool_support_heuristic | - | Review taxonomy mapping before using as a split target. |
| `tests/test_platform_compat.py` | 318 | 21 | platform_compat | import-state, filesystem, subprocess/script | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_context_compactor.py` | 233 | 21 | context_compactor | db/session, import-state, async/threading | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_youtube_handler_consolidation.py` | 104 | 20 | youtube_handler_consolidation | route/api, import-state | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_workspace_confine.py` | 328 | 18 | workspace_confine | route/api, filesystem, subprocess/script, async/threading | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_companion_readonly.py` | 402 | 17 | companion_readonly | db/session, import-state | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_set_admin.py` | 317 | 17 | set_admin | route/api, import-state, filesystem, async/threading | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_imap_leak_fixes.py` | 404 | 15 | imap_leak_fixes | route/api, db/session, security, filesystem | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_agent_migration_manifest.py` | 340 | 15 | agent_migration_manifest | import-state, filesystem | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_tool_policy.py` | 330 | 13 | tool_policy | import-state, async/threading | Review taxonomy and setup/risk boundaries before any split. |
| `tests/test_upload_handler_atomicity.py` | 401 | 9 | upload_handler_atomicity | filesystem, async/threading | Review taxonomy mapping before using as a split target. |
| `tests/test_kv_cache_invalidation_2927.py` | 463 | 8 | kv_cache_invalidation_2927 | route/api, db/session, import-state, async/threading | Review taxonomy and setup/risk boundaries before any split. |
## Suggested first manual-review candidates
These are not automatic split approvals. They are categorized candidates with enough size/collection value and no route/API, DB/session, import-state, or security signal from the static scan.
Files still in the `uncategorized` taxonomy area are listed separately below so taxonomy review does not get mixed into the first split decision.
| File | Lines | Collected tests | Area | Sub-area | Signals | Why this is a candidate |
|---|---:|---:|---|---|---|---|
| `tests/test_provider_classification.py` | 188 | 67 | services | provider | - | 67 collected tests |
| `tests/test_provider_endpoints.py` | 241 | 58 | services | provider | subprocess/script | 58 collected tests |
| `tests/test_llm_core_temperature.py` | 196 | 41 | services | llm | - | 41 collected tests |
| `tests/test_provider_detection.py` | 148 | 31 | services | provider | - | 31 collected tests |
| `tests/test_embedding_lanes.py` | 1104 | 29 | services | embedding | filesystem | 1104 lines, 29 collected tests |
| `tests/test_research_utils.py` | 97 | 23 | services | research | - | 23 collected tests |
| `tests/test_prompt_security.py` | 203 | 21 | security | security | - | 21 collected tests |
| `tests/test_calendar_recurrence.py` | 338 | 19 | services | calendar | - | 338 lines |
## High-risk candidates to defer first
These files may still be split later, but not as the first implementation slice without a separate manual boundary review.
| File | Lines | Collected tests | High-risk signals |
|---|---:|---:|---|
| `tests/test_model_routes.py` | 1778 | 139 | db/session, import-state, route/api |
| `tests/test_security_regressions.py` | 1224 | 92 | db/session, import-state, route/api, security |
| `tests/test_cookbook_helpers.py` | 912 | 65 | route/api |
| `tests/test_shell_routes.py` | 481 | 63 | import-state, route/api |
| `tests/test_pr_blocker_audit.py` | 964 | 58 | import-state, security |
| `tests/test_agent_loop.py` | 469 | 52 | db/session, import-state |
| `tests/test_run_focus.py` | 399 | 47 | security |
| `tests/test_endpoint_probing.py` | 411 | 34 | db/session, import-state, route/api |
| `tests/test_llm_core_anthropic_temp_omit.py` | 94 | 32 | db/session |
| `tests/test_chat_helpers.py` | 264 | 31 | route/api |
| `tests/test_model_context.py` | 251 | 30 | db/session, import-state |
| `tests/test_upload_limits_centralized.py` | 110 | 29 | import-state |
| `tests/test_email_oauth.py` | 580 | 28 | db/session, route/api, security |
| `tests/test_review_regressions.py` | 930 | 26 | db/session, import-state, route/api |
| `tests/test_rename_user_owner_sync.py` | 686 | 26 | db/session, import-state, route/api |
## Rules for future split PRs
- One file or one coherent file-family per PR.
- No assertion rewrites mixed with file moves.
- No helper extraction mixed with file moves.
- No production code changes.
- No CI workflow changes.
- Preserve existing markers and taxonomy unless the split issue explicitly says otherwise.
- Validate the original file's collected tests before and after the split.
- Validate any neighboring taxonomy/focused-runner behavior if paths change.
- Treat files with route/API, DB/session, import-state, or security signals as higher-risk until manually reviewed.
## Suggested next step
Use this plan to choose the first actual oversized-file split issue.
The first split should prefer a file with high review value and low setup risk.
Do not start a split PR from this planning issue alone if the file's boundaries are still ambiguous.
## Reproduction command
This document was generated with:
```bash
.venv/bin/python tests/tools/build_oversized_test_split_plan.py
```
## Freshness check
After editing the builder or rebasing the branch, regenerate the plan and confirm no unexpected plan drift:
```bash
.venv/bin/python tests/tools/build_oversized_test_split_plan.py
git diff --exit-code -- tests/OVERSIZED_TEST_SPLIT_PLAN.md
```
+7 -7
View File
@@ -30,7 +30,7 @@ class _Session:
def test_allowed_models_legacy_empty_list_remains_unrestricted(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
_enforce_chat_privileges(
_Request({"allowed_models": [], "max_messages_per_day": 0}),
@@ -39,7 +39,7 @@ def test_allowed_models_legacy_empty_list_remains_unrestricted(monkeypatch):
def test_allowed_models_explicit_empty_restricted_list_blocks_all_models(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
with pytest.raises(HTTPException) as exc:
_enforce_chat_privileges(
@@ -56,7 +56,7 @@ def test_allowed_models_explicit_empty_restricted_list_blocks_all_models(monkeyp
def test_allowed_models_nonempty_list_still_restricts_without_new_flag(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
_enforce_chat_privileges(
_Request({"allowed_models": ["provider/model-a"], "max_messages_per_day": 0}),
@@ -70,7 +70,7 @@ def test_allowed_models_nonempty_list_still_restricts_without_new_flag(monkeypat
def test_no_restriction_allows_any_model(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
privs = {"allowed_models": [], "block_all_models": False, "max_messages_per_day": 0}
_enforce_chat_privileges(_Request(privs), _Session("provider/model-a"))
@@ -78,7 +78,7 @@ def test_no_restriction_allows_any_model(monkeypatch):
def test_specific_allowlist_blocks_models_outside_it(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
privs = {
"allowed_models": ["gpt-4"],
@@ -92,7 +92,7 @@ def test_specific_allowlist_blocks_models_outside_it(monkeypatch):
def test_block_all_models_blocks_regardless_of_allowed_models_contents(monkeypatch):
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "alice")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "alice")
# Even if allowed_models contains entries, block_all_models wins.
privs = {
@@ -111,7 +111,7 @@ def test_block_all_models_blocks_regardless_of_allowed_models_contents(monkeypat
def test_admin_user_is_never_blocked(monkeypatch):
from core.auth import ADMIN_PRIVILEGES
monkeypatch.setattr("routes.chat_helpers.get_current_user", lambda request: "admin")
monkeypatch.setattr("routes.chat_helpers.effective_user", lambda request: "admin")
class _AdminAuthManager:
def get_privileges(self, username):
+32 -2
View File
@@ -13,6 +13,9 @@ import json
from types import SimpleNamespace
from unittest.mock import MagicMock
import pytest
from fastapi import HTTPException
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
# core.database instantiates SQLAlchemy declarative classes at import time, which
@@ -225,12 +228,34 @@ def test_models_route_scopes_api_token_to_token_owner(monkeypatch):
endpoints = _call_models_route(
monkeypatch,
rows,
_request(api_token=True, api_token_owner="alice", current_user="api"),
_request(
api_token=True,
api_token_owner="alice",
api_token_scopes=["chat"],
current_user="api",
),
)
assert _endpoint_names(endpoints) == ["alice-endpoint", "shared-endpoint"]
def test_models_route_rejects_api_token_without_chat_scope(monkeypatch):
monkeypatch.setattr(companion_routes, "get_current_user", lambda request: "api")
with pytest.raises(HTTPException) as exc:
_models_route()(
_request(
api_token=True,
api_token_owner="alice",
api_token_scopes=["todos:read"],
current_user="api",
)
)
assert exc.value.status_code == 403
assert "chat scope" in exc.value.detail
def test_models_route_unresolved_owner_returns_only_shared_rows(monkeypatch):
rows = [
_ep(1, "alice-endpoint", "alice"),
@@ -242,7 +267,12 @@ def test_models_route_unresolved_owner_returns_only_shared_rows(monkeypatch):
endpoints = _call_models_route(
monkeypatch,
rows,
_request(api_token=True, api_token_owner=None, current_user="api"),
_request(
api_token=True,
api_token_owner=None,
api_token_scopes=["chat"],
current_user="api",
),
)
assert _endpoint_names(endpoints) == ["shared-endpoint"]
+44
View File
@@ -786,6 +786,50 @@ def test_cached_model_scan_reports_plain_dir_gguf(tmp_path):
assert ggufs[3]["quant"] == "BF16"
def test_cached_model_scan_uses_ollama_api_before_cli_and_windows_opt_in():
script = _cached_model_scan_script()
assert "scan_ollama_api()\nscan_ollama()" in script
assert "if any(m.get('is_ollama') for m in models): return" in script
assert "os.name == 'nt'" in script
assert "ODYSSEUS_ALLOW_OLLAMA_CLI_SCAN" in script
@pytest.mark.skipif(os.name != "nt", reason="Windows Ollama CLI startup guard")
def test_cached_model_scan_does_not_launch_ollama_cli_on_windows(tmp_path):
"""Official Ollama for Windows can auto-start the tray/server on `ollama list`.
The read-only cache scanner must not invoke that CLI unless explicitly opted in.
"""
marker = tmp_path / "ollama-called.txt"
fake_ollama = tmp_path / "ollama.cmd"
fake_ollama.write_text(
"@echo off\r\n"
f'echo called>"{marker}"\r\n'
"echo NAME ID SIZE MODIFIED\r\n"
"echo local-model:latest abc 1 GB now\r\n",
encoding="utf-8",
)
empty_home = tmp_path / "home"
empty_home.mkdir()
scan_py = tmp_path / "scan_cache.py"
scan_py.write_text(_cached_model_scan_script(), encoding="utf-8")
env = dict(os.environ)
env["PATH"] = str(tmp_path) + os.pathsep + env.get("PATH", "")
env["HOME"] = str(empty_home)
env.pop("ODYSSEUS_ALLOW_OLLAMA_CLI_SCAN", None)
proc = subprocess.run(
[sys.executable, str(scan_py)],
check=True,
capture_output=True,
text=True,
env=env,
)
assert marker.exists() is False
assert all(m.get("backend") != "ollama" for m in json.loads(proc.stdout))
def test_cached_model_scan_uses_huggingface_cache_env(tmp_path):
"""Docker recreates can leave the persisted HF cache outside HOME.
The Serve scanner should honor the cache env path instead of only ~/.cache.
+1 -1
View File
@@ -79,7 +79,7 @@ def _build_context_harness(monkeypatch, chat_helpers, history):
monkeypatch.setattr(chat_helpers, "extract_preset", fake_extract_preset)
monkeypatch.setattr(chat_helpers, "add_user_message", fake_add_user_message)
monkeypatch.setattr(chat_helpers, "load_prefs_for_user", lambda user: {})
monkeypatch.setattr(chat_helpers, "get_current_user", lambda request: "tester")
monkeypatch.setattr(chat_helpers, "effective_user", lambda request: "tester")
monkeypatch.setattr(chat_helpers, "normalize_model_id", lambda endpoint_url, model, **kwargs: None)
monkeypatch.setattr(chat_helpers, "maybe_compact", fake_maybe_compact)
monkeypatch.setattr(chat_helpers, "trim_for_context", lambda messages, context_length: messages)
+1 -1
View File
@@ -385,7 +385,7 @@ async def test_build_chat_context_incognito_does_not_duplicate_current_user_mess
monkeypatch.setattr(chat_helpers, "extract_preset", fake_extract_preset)
monkeypatch.setattr(chat_helpers, "add_user_message", fake_add_user_message)
monkeypatch.setattr(chat_helpers, "load_prefs_for_user", lambda user: {})
monkeypatch.setattr(chat_helpers, "get_current_user", lambda request: "tester")
monkeypatch.setattr(chat_helpers, "effective_user", lambda request: "tester")
monkeypatch.setattr(chat_helpers, "normalize_model_id", lambda endpoint_url, model, **kwargs: None)
monkeypatch.setattr(chat_helpers, "maybe_compact", fake_maybe_compact)
monkeypatch.setattr(chat_helpers, "trim_for_context", lambda messages, context_length: messages)
+1 -1
View File
@@ -52,6 +52,6 @@ def test_chat_endpoint_recovery_paths_are_owner_scoped():
assert "def _clear_orphaned_session_endpoint(sess, owner:" in chat_routes
assert "def _recover_empty_session_model(sess, session_id: str, owner:" in chat_routes
assert "q = owner_filter(q, ModelEndpoint, owner)" in chat_routes
assert "resolve_session_auth(sess, session, owner=get_current_user(request))" in chat_routes
assert "resolve_session_auth(sess, session, owner=effective_user(request))" in chat_routes
assert "def resolve_session_auth(sess, session_id: str, owner:" in chat_helpers
assert "update_q = update_q.filter(DBSession.owner == owner)" in chat_helpers
@@ -0,0 +1,54 @@
"""Guard: every public webhook emitter goes through the manager.
Public emitters in `routes/` must schedule their fire through
`webhook_manager.fire_and_forget(...)` (or `_spawn_tracked`). A bare
`asyncio.create_task(webhook_manager.fire(...))` escapes
`WebhookManager._bg_tasks`, so asyncio only holds a weak reference to the
delivery task and the GC can collect it before it sends silently dropping
the webhook. Catching this with a scan stops a regression from sneaking
back in via a copy-paste.
"""
import ast
from pathlib import Path
ROUTES_DIR = Path(__file__).resolve().parent.parent / "routes"
def _untracked_fire_calls(tree: ast.AST) -> list[tuple[int, str]]:
"""Return (lineno, snippet) for any asyncio.create_task(webhook_manager.fire(...))."""
hits: list[tuple[int, str]] = []
for node in ast.walk(tree):
if not isinstance(node, ast.Call):
continue
func = node.func
if not (isinstance(func, ast.Attribute) and func.attr == "create_task"):
continue
if not (isinstance(func.value, ast.Name) and func.value.id == "asyncio"):
continue
if not node.args:
continue
inner = node.args[0]
if not isinstance(inner, ast.Call):
continue
inner_func = inner.func
if (
isinstance(inner_func, ast.Attribute)
and inner_func.attr == "fire"
and isinstance(inner_func.value, ast.Name)
and inner_func.value.id == "webhook_manager"
):
hits.append((node.lineno, ast.unparse(node)))
return hits
def test_no_untracked_webhook_fire_in_routes():
offenders: list[str] = []
for path in ROUTES_DIR.rglob("*.py"):
tree = ast.parse(path.read_text(), filename=str(path))
for lineno, snippet in _untracked_fire_calls(tree):
offenders.append(f"{path.relative_to(ROUTES_DIR.parent)}:{lineno}: {snippet}")
assert not offenders, (
"Public webhook emitters must use webhook_manager.fire_and_forget(...) "
"so the delivery task is tracked in WebhookManager._bg_tasks. Found "
"untracked emitter(s):\n " + "\n ".join(offenders)
)
@@ -0,0 +1,574 @@
#!/usr/bin/env python3
"""Build the oversized test-file split plan for issue #3983.
The output is a planning document only. It does not move tests, rewrite
assertions, extract helpers, or change CI.
"""
from __future__ import annotations
import ast
import json
import os
import re
import subprocess
import sys
from collections import Counter
from dataclasses import dataclass
from pathlib import Path
ROOT = Path(__file__).resolve().parents[2]
TESTS_DIR = ROOT / "tests"
OUTPUT = TESTS_DIR / "OVERSIZED_TEST_SPLIT_PLAN.md"
RAW_OUTPUT = Path("/tmp/oversized-test-file-metrics.json")
LARGE_LINE_THRESHOLD = 300
LARGE_NODE_THRESHOLD = 20
TOP_LIMIT = 30
HIGH_RISK_SIGNALS = {"route/api", "db/session", "import-state", "security"}
@dataclass(frozen=True)
class FileMetric:
path: str
lines: int
nonblank: int
test_defs: int
test_classes: int
collected: int
area: str
sub_area: str
signals: tuple[str, ...]
def read_text(path: Path) -> str:
return path.read_text(encoding="utf-8", errors="replace")
def count_ast_tests(text: str) -> tuple[int, int]:
tree = ast.parse(text)
test_defs = 0
test_classes = 0
for node in ast.walk(tree):
if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
if node.name.startswith("test_"):
test_defs += 1
elif isinstance(node, ast.ClassDef):
if node.name.startswith("Test"):
test_classes += 1
return test_defs, test_classes
def load_taxonomy_classifier():
sys.path.insert(0, str(ROOT))
from tests._taxonomy import classify_test_path
return classify_test_path
def classify(path: Path, classify_test_path) -> tuple[str, str]:
rel_path = Path(path.relative_to(ROOT).as_posix())
try:
result = classify_test_path(rel_path)
except Exception:
return "unknown", "unknown"
return getattr(result, "area", "unknown"), getattr(result, "sub_area", "unknown")
def collect_node_counts() -> Counter[str]:
cmd = [
sys.executable,
"-m",
"pytest",
"--collect-only",
"-q",
"tests",
]
env = dict(os.environ)
env["PY_COLORS"] = "0"
result = subprocess.run(
cmd,
cwd=ROOT,
env=env,
text=True,
capture_output=True,
)
if result.returncode != 0:
print(result.stdout)
print(result.stderr, file=sys.stderr)
raise SystemExit(result.returncode)
counts: Counter[str] = Counter()
for line in result.stdout.splitlines():
line = line.strip()
if "::" not in line:
continue
if not line.startswith("tests/"):
continue
file_path = line.split("::", 1)[0]
counts[file_path] += 1
return counts
def detect_signals(text: str, path: str) -> tuple[str, ...]:
signal_patterns = {
"route/api": [
r"\bTestClient\b",
r"\bapp\.",
r"\broutes\.",
r"\bfrom routes\b",
r"\bimport routes\b",
],
"db/session": [
r"\bSessionLocal\b",
r"\bsqlite\b",
r"\bDATABASE_URL\b",
r"\bcore\.database\b",
r"\bdb\.query\b",
r"\bcommit\(",
],
"import-state": [
r"\bsys\.modules\b",
r"\bimportlib\b",
r"\bclear_module\b",
r"\bpreserve_import_state\b",
r"\bmonkeypatch\.setitem\b",
],
"security": [
r"\bsecurity\b",
r"\bssrf\b",
r"\bpath traversal\b",
r"\bcsrf\b",
r"\bpermission\b",
],
"filesystem": [
r"\btmp_path\b",
r"\bTemporaryDirectory\b",
r"\bPath\(",
r"\bmkdir\b",
r"\bwrite_text\b",
r"\bread_text\b",
],
"subprocess/script": [
r"\bsubprocess\b",
r"\brunpy\b",
r"\bload_script\b",
r"\bsys\.argv\b",
],
"async/threading": [
r"\basyncio\b",
r"\bthreading\b",
r"\bconcurrent\.futures\b",
r"\bThreadPoolExecutor\b",
],
"ui/static": [
r"\bstatic/",
r"\bjsdom\b",
r"\bnode\b",
r"\.js\b",
],
}
signals = []
for name, patterns in signal_patterns.items():
if any(re.search(pattern, text, flags=re.IGNORECASE) for pattern in patterns):
signals.append(name)
if path.startswith("tests/cli/"):
signals.append("cli-directory")
return tuple(signals)
def metric_for(path: Path, node_counts: Counter[str], classify_test_path) -> FileMetric:
rel = path.relative_to(ROOT).as_posix()
text = read_text(path)
lines = len(text.splitlines())
nonblank = sum(1 for line in text.splitlines() if line.strip())
test_defs, test_classes = count_ast_tests(text)
area, sub_area = classify(path, classify_test_path)
return FileMetric(
path=rel,
lines=lines,
nonblank=nonblank,
test_defs=test_defs,
test_classes=test_classes,
collected=node_counts.get(rel, 0),
area=area,
sub_area=sub_area,
signals=detect_signals(text, rel),
)
def test_files() -> list[Path]:
return sorted(TESTS_DIR.rglob("test_*.py"))
def as_metric_row(metric: FileMetric) -> str:
signals = ", ".join(metric.signals) if metric.signals else "-"
return (
f"| `{metric.path}` | {metric.lines} | {metric.collected} | "
f"{metric.test_defs} | {metric.test_classes} | "
f"{metric.area} | {metric.sub_area} | {signals} |"
)
def metric_table(title: str, metrics: list[FileMetric]) -> list[str]:
lines = [
f"## {title}",
"",
"| File | Lines | Collected tests | Test defs | Test classes | Area | Sub-area | Signals |",
"|---|---:|---:|---:|---:|---|---|---|",
]
lines.extend(as_metric_row(metric) for metric in metrics)
lines.append("")
return lines
def candidate_metrics(metrics: list[FileMetric]) -> list[FileMetric]:
return [
metric
for metric in metrics
if metric.lines >= LARGE_LINE_THRESHOLD
or metric.collected >= LARGE_NODE_THRESHOLD
]
def include_reasons(metric: FileMetric) -> str:
reasons = []
if metric.lines >= LARGE_LINE_THRESHOLD:
reasons.append(f"{metric.lines} lines")
if metric.collected >= LARGE_NODE_THRESHOLD:
reasons.append(f"{metric.collected} collected tests")
return ", ".join(reasons)
def risk_notes(metric: FileMetric) -> str:
if not metric.signals:
return "No obvious setup signals from static scan."
return ", ".join(metric.signals)
def suggested_handling(metric: FileMetric) -> str:
if HIGH_RISK_SIGNALS.intersection(metric.signals):
return "Defer mechanical split until setup/risk boundaries are mapped."
if metric.collected >= LARGE_NODE_THRESHOLD:
return "Good first manual-review candidate if test themes are cohesive."
return "Plan split boundaries before editing."
def candidate_section(metrics: list[FileMetric]) -> list[str]:
lines = [
"## Split planning candidates",
"",
"This section is generated from metrics, not from manual judgement.",
"Files are included when they meet at least one threshold:",
"",
f"- at least {LARGE_LINE_THRESHOLD} physical lines; or",
f"- at least {LARGE_NODE_THRESHOLD} collected pytest items.",
"",
"These are planning candidates only. A later split PR still needs a focused manual review of each file before moving tests.",
"",
"| File | Why included | Setup/risk signals | Suggested handling |",
"|---|---|---|---|",
]
for metric in metrics:
lines.append(
f"| `{metric.path}` | {include_reasons(metric)} | "
f"{risk_notes(metric)} | {suggested_handling(metric)} |"
)
lines.append("")
return lines
def first_manual_review_section(metrics: list[FileMetric]) -> list[str]:
low_risk = [
metric
for metric in metrics
if metric.area != "uncategorized"
and not HIGH_RISK_SIGNALS.intersection(metric.signals)
]
low_risk = sorted(low_risk, key=lambda m: (m.collected, m.lines), reverse=True)
lines = [
"## Suggested first manual-review candidates",
"",
"These are not automatic split approvals. They are categorized candidates with enough size/collection value and no route/API, DB/session, import-state, or security signal from the static scan.",
"",
"Files still in the `uncategorized` taxonomy area are listed separately below so taxonomy review does not get mixed into the first split decision.",
"",
"| File | Lines | Collected tests | Area | Sub-area | Signals | Why this is a candidate |",
"|---|---:|---:|---|---|---|---|",
]
if not low_risk:
lines.append("| _None_ | - | - | - | - | - | - |")
for metric in low_risk[:10]:
signals = ", ".join(metric.signals) if metric.signals else "-"
lines.append(
f"| `{metric.path}` | {metric.lines} | {metric.collected} | "
f"{metric.area} | {metric.sub_area} | {signals} | {include_reasons(metric)} |"
)
lines.append("")
return lines
def taxonomy_gap_section(metrics: list[FileMetric]) -> list[str]:
uncategorized = [
metric
for metric in metrics
if metric.area == "uncategorized"
]
uncategorized = sorted(
uncategorized,
key=lambda m: (m.collected, m.lines),
reverse=True,
)
lines = [
"## Taxonomy coverage gaps among split candidates",
"",
"`uncategorized` is a current taxonomy area, not a builder failure.",
"This plan does not reclassify tests because taxonomy changes should be reviewed separately from oversized-file split planning.",
"",
"Before using any of these files as a split target, first decide whether the taxonomy should be refined in a separate focused issue/PR.",
"",
"| File | Lines | Collected tests | Sub-area | Signals | Suggested follow-up |",
"|---|---:|---:|---|---|---|",
]
if not uncategorized:
lines.append("| _None_ | - | - | - | - | - |")
for metric in uncategorized:
signals = ", ".join(metric.signals) if metric.signals else "-"
follow_up = "Review taxonomy mapping before using as a split target."
if HIGH_RISK_SIGNALS.intersection(metric.signals):
follow_up = "Review taxonomy and setup/risk boundaries before any split."
lines.append(
f"| `{metric.path}` | {metric.lines} | {metric.collected} | "
f"{metric.sub_area} | {signals} | {follow_up} |"
)
lines.append("")
return lines
def deferred_section(metrics: list[FileMetric]) -> list[str]:
deferred = [
metric
for metric in metrics
if HIGH_RISK_SIGNALS.intersection(metric.signals)
]
deferred = sorted(deferred, key=lambda m: (m.collected, m.lines), reverse=True)
lines = [
"## High-risk candidates to defer first",
"",
"These files may still be split later, but not as the first implementation slice without a separate manual boundary review.",
"",
"| File | Lines | Collected tests | High-risk signals |",
"|---|---:|---:|---|",
]
for metric in deferred[:15]:
signals = ", ".join(sorted(HIGH_RISK_SIGNALS.intersection(metric.signals)))
lines.append(
f"| `{metric.path}` | {metric.lines} | {metric.collected} | {signals} |"
)
lines.append("")
return lines
def write_distribution(
lines: list[str],
title: str,
values: Counter[str],
*,
min_count: int = 1,
) -> None:
displayed = [
(value, count)
for value, count in sorted(values.items())
if count >= min_count
]
omitted_values = sum(1 for count in values.values() if count < min_count)
omitted_files = sum(count for count in values.values() if count < min_count)
lines.extend([
f"{title}:",
"",
"| Value | Files |",
"|---|---:|",
])
for value, count in displayed:
lines.append(f"| {value} | {count} |")
if omitted_values:
lines.extend([
"",
f"Values below {min_count} files: {omitted_values} values covering {omitted_files} files.",
])
lines.append("")
def write_report(metrics: list[FileMetric], node_count_total: int) -> None:
by_lines = sorted(metrics, key=lambda m: (m.lines, m.collected), reverse=True)
by_collected = sorted(metrics, key=lambda m: (m.collected, m.lines), reverse=True)
candidates = sorted(
candidate_metrics(metrics),
key=lambda m: (m.collected, m.lines),
reverse=True,
)
areas = Counter(metric.area for metric in metrics)
sub_areas = Counter(metric.sub_area for metric in metrics)
lines = [
"# Oversized Test File Split Plan",
"",
"## Purpose",
"",
"This document plans future oversized test-file splits using current repo data.",
"It does not move files, rewrite assertions, extract helpers, or change CI.",
"",
"## Roadmap context",
"",
"- Issue: #3983",
"- Parent tracker: #2523",
"- Follows #3973 / #3982, the report-only order-sensitivity diagnostics slice.",
"",
"## Methodology",
"",
"Metrics were generated from the current test tree using:",
"",
"- physical line counts for every recursive `test_*.py` file under `tests/`;",
"- AST counts for `test_*` functions and `Test*` classes;",
"- one `pytest --collect-only -q tests` run to count collected items per file;",
"- current taxonomy classification from `tests._taxonomy.classify_test_path`; and",
"- static setup-signal scans for route/API, DB/session, import-state, security, filesystem, subprocess/script, async/threading, and UI/static indicators.",
"",
"Static signals are not proof of risk. They are review prompts.",
"Future split PRs must still inspect each file manually before editing.",
"",
"## Current summary",
"",
f"- test files scanned: {len(metrics)}",
f"- collected pytest items counted: {node_count_total}",
f"- large-file threshold: {LARGE_LINE_THRESHOLD} lines",
f"- large-collected threshold: {LARGE_NODE_THRESHOLD} collected items",
"",
]
write_distribution(lines, "Area distribution", areas)
write_distribution(lines, "Sub-area distribution", sub_areas, min_count=2)
lines.extend(metric_table("Top files by collected pytest items", by_collected[:TOP_LIMIT]))
lines.extend(metric_table("Top files by physical line count", by_lines[:TOP_LIMIT]))
lines.extend(candidate_section(candidates))
lines.extend(taxonomy_gap_section(candidates))
lines.extend(first_manual_review_section(candidates))
lines.extend(deferred_section(candidates))
lines.extend([
"## Rules for future split PRs",
"",
"- One file or one coherent file-family per PR.",
"- No assertion rewrites mixed with file moves.",
"- No helper extraction mixed with file moves.",
"- No production code changes.",
"- No CI workflow changes.",
"- Preserve existing markers and taxonomy unless the split issue explicitly says otherwise.",
"- Validate the original file's collected tests before and after the split.",
"- Validate any neighboring taxonomy/focused-runner behavior if paths change.",
"- Treat files with route/API, DB/session, import-state, or security signals as higher-risk until manually reviewed.",
"",
"## Suggested next step",
"",
"Use this plan to choose the first actual oversized-file split issue.",
"The first split should prefer a file with high review value and low setup risk.",
"Do not start a split PR from this planning issue alone if the file's boundaries are still ambiguous.",
"",
"## Reproduction command",
"",
"This document was generated with:",
"",
"```bash",
".venv/bin/python tests/tools/build_oversized_test_split_plan.py",
"```",
"",
"## Freshness check",
"",
"After editing the builder or rebasing the branch, regenerate the plan and confirm no unexpected plan drift:",
"",
"```bash",
".venv/bin/python tests/tools/build_oversized_test_split_plan.py",
"git diff --exit-code -- tests/OVERSIZED_TEST_SPLIT_PLAN.md",
"```",
"",
])
OUTPUT.write_text("\n".join(lines), encoding="utf-8")
def write_raw(metrics: list[FileMetric]) -> None:
raw = [
{
"area": metric.area,
"collected": metric.collected,
"lines": metric.lines,
"nonblank": metric.nonblank,
"path": metric.path,
"signals": list(metric.signals),
"sub_area": metric.sub_area,
"test_classes": metric.test_classes,
"test_defs": metric.test_defs,
}
for metric in metrics
]
RAW_OUTPUT.write_text(json.dumps(raw, indent=2, sort_keys=True), encoding="utf-8")
def assert_taxonomy_worked(metrics: list[FileMetric]) -> None:
if not metrics:
raise SystemExit("ERROR: no test files were scanned")
unknown = sum(1 for metric in metrics if metric.area == "unknown")
if unknown == len(metrics):
raise SystemExit("ERROR: taxonomy classification returned unknown for every file")
def main() -> int:
if not TESTS_DIR.exists():
print("ERROR: tests/ directory not found", file=sys.stderr)
return 1
classify_test_path = load_taxonomy_classifier()
node_counts = collect_node_counts()
metrics = [metric_for(path, node_counts, classify_test_path) for path in test_files()]
assert_taxonomy_worked(metrics)
write_report(metrics, sum(node_counts.values()))
write_raw(metrics)
print(f"Wrote {OUTPUT.relative_to(ROOT)}")
print(f"Wrote {RAW_OUTPUT}")
return 0
if __name__ == "__main__":
raise SystemExit(main())