* feat: add NVIDIA as an AI provider (integrate.api.nvidia.com) * feat: add NVIDIA option to provider settings dropdown and aliases * test: add NVIDIA provider detection and endpoint tests * Add NVIDIA to _HOST_TO_CURATED and expand non-chat model filtering - nvidia.com -> 'nvidia' curated key for proper provider routing - _NON_CHAT_PREFIXES: bge, snowflake/arctic-embed, nvidia/nv-embed - _NON_CHAT_CONTAINS: content-safety, -safety, -reward, nvclip, kosmos, fuyu, deplot, vila, neva, gliner, riva, -parse, -embedqa, -nemoretriever * Expand non-chat model filtering for NVIDIA embedding/guard/video models Add _NON_CHAT_PREFIXES: embed, recurrent Add _NON_CHAT_CONTAINS: topic-control, guard, calibration, ai-synthetic-video, cosmos-reason2 Catches remaining unfiltered non-chat models from NVIDIA catalog: embedding (llama-nemotron-embed, embed-qa), guard (llama-guard, nemoguard-topic-control), calibration (ising-calibration), video (ai-synthetic-video-detector, cosmos-reason2), recurrent (recurrentgemma-2b) * Filter non-chat models in _probe_endpoint via _is_chat_model() Previously _is_chat_model() was only used in the per-model probe and _first_chat_model(), so non-chat models still appeared in the model picker even though they were filtered in those specific paths. Applying the filter at _probe_endpoint() return ensures non-chat models (embeddings, safety guards, reward, calibration, video detectors, CLIP, VLM, translation, parsing, recurrent, etc.) never enter cached_models and never appear in the picker. * Fix _NON_CHAT_CONTAINS to catch org-prefixed embedding models Prefix checks (mid.startswith) miss models with org prefixes like baai/bge-m3, nvidia/embed-qa-4, google/recurrentgemma-2b, etc. Adding the same terms to _NON_CHAT_CONTAINS ensures they are caught regardless of the org prefix. Adds: embed, bge, recurrent, starcoder, gemma-2b * fix(model-routes): drop collision-prone substrings from global non-chat filter The NVIDIA PR added several substrings to the shared _NON_CHAT_PREFIXES and _NON_CHAT_CONTAINS tuples. These are intended to filter out embedding, retrieval, safety, and vision models from NVIDIA's catalog that are not chat-completions-capable. However, four of the added substrings collide with legitimate chat models served by other providers: - gemma-2b matches google/gemma-2b-it (instruct chat model) - starcoder matches bigcode/starcoder2-15b (code completion model) - recurrent matches google/recurrentgemma-2b (language model) - guard matches meta-llama/Llama-Guard-3-8B (safety classifier) Removing these four from the global tuples keeps the NVIDIA-specific filtering intact (safety, embedding, retrieval, and vision models are still caught by other tokens such as content-safety, -safety, -reward, embed, bge, -embedqa, -nemoretriever, nvclip, deplot, etc.) while preventing false negatives for instruct/code models on other providers. Tests added for gemma-2b-it, google/gemma-2b-it, and bigcode/starcoder2-15b-instruct asserting they are recognized as chat models. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): remove duplicate bge/embed tokens from _NON_CHAT_CONTAINS Tokens already present in _NON_CHAT_PREFIXES, making the CONTAINS entries redundant since the prefix check runs first. Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * fix(nvidia): move bge to CONTAINS, add llama-guard, remove stray blanks Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be> * style: fix indentation of groq and xai test cases in test_provider_endpoints.py --------- Co-authored-by: Kenny Van de Maele <kenny@kvandemaele.be>
Test Suite Notes
Purpose
This file documents the shared test helpers and the review expectations that go
with them. The suite is being refactored incrementally, so this is a working
reference for that effort - not a claim that the suite is already fully
organized. Read it before adding a new helper or before reviewing a PR that
touches tests/helpers/.
For the broader rules - test taxonomy, determinism/isolation rules, the
behavioral-vs-source-text policy, and helper/factory extraction rules - see
TESTING_STANDARD.md. This file is the concrete helper
reference; that file is the standard the refactor works toward.
Running focused subsets (taxonomy markers)
tests/conftest.py tags every test at collection time with two markers derived
from its filename by tests/_taxonomy.py: an area_* marker (e.g.
area_security) and a finer sub_* marker (e.g. sub_owner_scope). This adds
markers only - it moves no files and changes no test behavior. Use them to run a
focused slice:
python3 -m pytest -m area_security
python3 -m pytest -m "area_services and sub_cookbook"
Areas are security, routes, services, cli, js, helpers, unit, and
uncategorized. Classification is conservative and token-based: a file that
matches no area keyword falls back to area_uncategorized with its filename as
the sub-area. The area_* names are registered in pyproject.toml; the dynamic
sub_* names are registered before collection by pytest_configure in
tests/conftest.py, so unknown-mark warnings still flag genuine typos.
Core principles
- Keep PRs small and homogeneous: one kind of change per PR.
- Prefer explicit local setup over hidden global fixtures.
- Avoid expanding the root
conftest.pyunless absolutely necessary. - Do not mix file moves with logic changes in the same PR.
- Do not weaken tests with
skip/xfailjust to make CI pass. - Validate the focused files you changed, plus any neighboring or order-sensitive groups they interact with.
Helper conventions
The helpers below live under tests/helpers/. They exist to remove repeated
boilerplate that already appeared across multiple tests. Reach for one only when
your test matches its intended use; do not stretch a helper to cover a new case.
tests.helpers.cli_loader.load_script
Use when a test needs to import a script under scripts/ without repeating
SourceFileLoader / importlib.util boilerplate.
- Intended for script/CLI tests that load a single file from
scripts/. - Not for arbitrary package imports - use a normal
importfor those. - When migrating an existing test to it, keep the existing stubs and assertions
unchanged. Any
sys.modulesstubs the script needs at import time must still be injected (e.g. viamonkeypatch) before callingload_script.
tests.helpers.import_state.clear_module
Use when a test must drop one cached module and its parent-package attribute before a fresh import.
- Clears
sys.modules[name]. - Clears the parent-package attribute when present.
- Good replacement for local
sys.modules.pop(...)+delattr(parent, child)blocks.
tests.helpers.import_state.preserve_import_state
Use when a test temporarily installs stubs into sys.modules and needs
deterministic cleanup afterward.
- Context manager: restores both
sys.modulesentries and parent-package attributes on exit (normal or exception). - Useful around module-level stubs or temporary imports.
- Prefer narrow, explicit module names over broad ones.
tests.helpers.import_state.clear_fake_database_modules
Use only for the guarded fake/stub database cleanup pattern.
- Preserves a real-looking
core.database(one with a string__file__). - Removes a fake/stub
core.databaseand the relatedsrc.databasestate. - Do not use as a general database reset fixture.
tests.helpers.import_state.clear_fake_endpoint_resolver_modules
Use only for the guarded fake/stub src.endpoint_resolver cleanup pattern.
- Preserves real resolver modules (those with a truthy
__file__). - Evicts fake/stub resolver modules and the dependent route modules that were cached against them.
- Accepts explicit extra dependent module names to evict alongside the defaults.
tests.helpers.sqlite_db.make_temp_sqlite
Use for the repeated file-backed temp sqlite setup in tests.
- Only constructs
(SessionLocal, engine, tmpfile)from the repeated block. - Does not patch modules and does not clean up the temp file.
- The caller must bind
SessionLocalexplicitly onto whatever module the code under test reads, and must keep the returned objects alive. - Do not use it as a general DB fixture framework.
What not to abstract yet
Some remaining patterns should stay as-is for now rather than being forced into helpers:
- Large mixed files such as security/review regression files.
- Setup-oriented
sys.modulesstub installers. - One-off custom module patching.
- DB/session/route setup, until it has been audited separately.
Validation expectations
Run validation locally before opening or approving a PR. Practical checks:
git diff --check- catch whitespace and conflict-marker errors.python3 -m py_compile <changed files>- confirm changed files compile.- Focused
pyteston the changed test files. pyteston neighboring or order-sensitive test groups that share import state with the changed files.grepfor the old boilerplate when replacing it, to confirm no stragglers remain.- A fresh audit worktree when changing the helpers themselves, so stale
__pycache__or import state cannot mask a regression.
Current roadmap
- Import-state cleanup - complete.
- Document helper conventions (this file).
- Audit fake DB /
SessionLocal/ route setup duplication. - Add tiny helpers only when the repeated semantics are clear.
- Start low-risk file moves only after helper conventions are documented.
- Avoid moving high-risk security/route regression files first.