* fix(llm): suppress thinking for qwen3/gemma4 on Ollama /v1 compat endpoint
When qwen3, QwQ, gemma4, or another thinking model is used via Ollama's
OpenAI-compatible /v1 endpoint, the model routes all output into its
<think>...</think> reasoning block. Because Odysseus strips thinking
content from round_response and only accumulates native tool_calls,
the round ends with 0 chars, 0 native calls, and 0 tool blocks, so
the agent appears to silently do nothing.
Root cause: Odysseus classifies the /v1 endpoint as provider="openai"
(not "ollama"), so the payload is built as a standard OpenAI payload
without any Ollama-specific options. Ollama's /v1 endpoint accepts
"think": false as a top-level parameter to suppress extended thinking,
but this was never sent.
Fix:
- Add _is_ollama_openai_compat_url() to detect local Ollama /v1 URLs
- Inject "think": false in both stream_llm and llm_call_async for
thinking models (qwen3, QwQ, gemma4, DeepSeek-R1, etc.) on this
endpoint
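A minimal sketch of the injection step described in the fix list. The
helper names, the model-name list, and the on_ollama_compat flag are
hypothetical and chosen for illustration; the actual logic lives in
stream_llm and llm_call_async and may be structured differently.

```python
from typing import Any

# Assumed name fragments; the real list of thinking models may differ.
THINKING_MODEL_HINTS = ("qwen3", "qwq", "gemma4", "deepseek-r1")


def is_thinking_model(model: str) -> bool:
    """Return True if the model name contains a known thinking-model hint."""
    name = model.lower()
    return any(hint in name for hint in THINKING_MODEL_HINTS)


def with_thinking_suppressed(
    payload: dict[str, Any], *, on_ollama_compat: bool
) -> dict[str, Any]:
    """Return a copy of payload, adding "think": False when it applies.

    on_ollama_compat stands in for the result of the URL check
    (_is_ollama_openai_compat_url in this change).
    """
    if on_ollama_compat and is_thinking_model(str(payload.get("model", ""))):
        # Top-level parameter, next to "model" and "messages".
        return {**payload, "think": False}
    return dict(payload)
```

Returning a copy rather than mutating the caller's payload is a design
choice of this sketch only, not a statement about the real code.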
Verified with qwen3:14b on Ollama 0.24: with "think": false the model
correctly emits native tool_calls in a single streaming chunk and
the agent executes bash/file/web tools as expected.
* fix(llm): extend _is_ollama_openai_compat_url to match localhost on any port
Per reviewer feedback on PR #3228:
1. Generalize host detection to mirror _is_ollama_native_url: match any
localhost/127.0.0.1/0.0.0.0/::1 host (not just port 11434) so that
custom OLLAMA_HOST ports and container remaps are also covered.
2. Add tests/test_llm_core_ollama_thinking.py covering:
- _is_ollama_openai_compat_url for all positive/negative URL cases
including IPv6, non-default port, native /api path, and real OpenAI
- Payload injection: "think": false is set for a thinking model on
Ollama /v1, not set for a non-thinking model, not set for the real
OpenAI endpoint, and set for localhost on a non-default port (the
new case)
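The generalized host check from item 1 could look roughly like the
sketch below. The function name and LOCAL_HOSTS set are hypothetical
stand-ins, not the actual _is_ollama_openai_compat_url implementation;
the sketch only assumes the rule stated above (any local host, any
port, path under /v1).

```python
from urllib.parse import urlparse

# Hosts treated as "local", per the list in item 1 above.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}


def looks_like_ollama_openai_compat_url(url: str) -> bool:
    """Return True for a local URL whose path is /v1 or lies below it.

    The port is deliberately ignored so that custom OLLAMA_HOST ports
    and container remaps match as well.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()  # brackets are stripped for IPv6
    path = parsed.path.rstrip("/")
    return host in LOCAL_HOSTS and (path == "/v1" or path.startswith("/v1/"))
```

With this sketch, http://localhost:11434/v1, http://127.0.0.1:8080/v1
and http://[::1]:11434/v1 match, while http://localhost:11434/api/chat
and https://api.openai.com/v1 do not. A host-only rule like this would
also match any other OpenAI-compatible server listening on a local
/v1 path.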