feat(discovery): detect llama.cpp servers and label local providers (#4729)

* feat(discovery): detect llama.cpp servers and label local providers

Scan port 8080 (llama-server) and 11435 (APFEL) during discovery, fingerprint
llama.cpp via its native /props endpoint, and label well-known local serving
ports (8080 llama.cpp, 8000 vLLM, 1234 LM Studio, 11434 Ollama) consistently
in both the Python provider helper and the JS endpoint UI. Adds a llama.cpp
hint to the /setup slash command.

* fix(discovery): don't infer the serving tool from the port alone

Per review: vLLM, SGLang, llama.cpp and plain OpenAI-compatible servers all
share 8000/8080, so labeling by port mislabels real setups (a vLLM box on 8080
shown as llama.cpp). Drop the port->tool assertions from _provider_label and
providerLabel; the authoritative signal is the /props fingerprint performed
during discovery, which is unchanged. Loopback URLs are now labeled with a
neutral 'local endpoint' (_provider_label) / 'Local' (providerLabel). Tests
updated to assert the neutral labels.
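
To illustrate the /props fingerprint the message refers to, here is a minimal sketch. It is not the project's ModelDiscovery._fingerprint_provider, which this commit page does not show. The helper names (looks_like_llama_cpp, fetch_props, fingerprint) are hypothetical, and the marker keys checked in the response are an assumption about what llama-server returns from /props.

```python
import http.client
import json
import urllib.request
from typing import Any, Optional

# Assumption: keys expected in llama-server's /props JSON response.
# The real discovery code may check different fields.
LLAMA_CPP_PROPS_KEYS = {"default_generation_settings", "total_slots"}


def looks_like_llama_cpp(props: Any) -> bool:
    """Return True if a parsed /props payload carries an assumed llama.cpp marker key."""
    return isinstance(props, dict) and any(k in props for k in LLAMA_CPP_PROPS_KEYS)


def fetch_props(base_url: str, timeout: float = 1.0) -> Optional[dict]:
    """GET <base_url>/props and return the parsed JSON object, or None on any failure."""
    url = base_url.rstrip("/") + "/props"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            data = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, bad URL, or non-JSON body.
        return None
    return data if isinstance(data, dict) else None


def fingerprint(base_url: str) -> str:
    """Name the tool only when the probe confirms it; otherwise stay neutral."""
    props = fetch_props(base_url)
    return "llama.cpp" if looks_like_llama_cpp(props) else "local endpoint"
```

The point of the design is that the label comes from what the server answers, not from the port it listens on, so a vLLM instance on 8080 would fall through to the neutral label in this sketch.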
Joel Alejandro Escareño Fernández
2026-06-23 23:39:56 +02:00
committed by GitHub
parent 72c0bde8a9
commit e0ccf250a4
9 changed files with 330 additions and 15 deletions
+8 -1
@@ -777,10 +777,17 @@ def _provider_label(url: str) -> str:
         pass
     if _is_ollama_native_url(url): return "Ollama"
     try:
-        host = (urlparse(url).hostname or "").lower()
+        _parsed_local = urlparse(url)
+        host = (_parsed_local.hostname or "").lower()
+        port = _parsed_local.port
     except Exception:
         return "provider"
     if host in {"localhost", "127.0.0.1", "::1", "0.0.0.0"}:
+        # A port alone is not authoritative: vLLM, SGLang, llama.cpp and plain
+        # OpenAI-compatible servers all routinely share 8000/8080, so naming the
+        # serving tool from the port here would mislabel real setups. The tool is
+        # identified by probing llama-server's native /props endpoint during
+        # discovery (see ModelDiscovery._fingerprint_provider); this stays neutral.
         return "local endpoint"
     return host or "provider"
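
As a standalone illustration of the neutral labeling in the hunk above, the following sketch reproduces only the host-based branch. The name neutral_provider_label and the LOOPBACK_HOSTS constant are hypothetical; the real _provider_label has earlier branches (including the Ollama check) that are outside this hunk and omitted here.

```python
from urllib.parse import urlparse

# Hosts treated as loopback, copied from the set in the hunk.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1", "0.0.0.0"}


def neutral_provider_label(url: str) -> str:
    """Label a provider URL by host only; never infer the serving tool from the port."""
    try:
        host = (urlparse(url).hostname or "").lower()
    except Exception:
        # urlparse can raise ValueError on malformed input such as a bad IPv6 literal.
        return "provider"
    if host in LOOPBACK_HOSTS:
        return "local endpoint"
    return host or "provider"


if __name__ == "__main__":
    for candidate in ("http://localhost:8080/v1", "http://127.0.0.1:8000", "https://api.example.com/v1"):
        print(candidate, "->", neutral_provider_label(candidate))
```

Ports 8080 and 8000 produce the same label in this sketch, which is the behavior the fix commit asks the tests to assert.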