odysseus

Salastil/odysseus

Fork 0

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-15 17:25:26 -04:00

Commit Graph

Author	SHA1	Message	Date
onemorethan0	8ae2b5f58c	fix(llm): suppress thinking mode for qwen3/gemma4 on Ollama /v1 endpoint (#3228 ) * fix(llm): suppress thinking for qwen3/gemma4 on Ollama /v1 compat endpoint When using qwen3, QwQ, gemma4, or other thinking models via Ollama's OpenAI-compatible /v1 endpoint, the model routes all output into its <think>...</think> reasoning block. Since Odysseus strips thinking content from round_response and only accumulates native tool_calls, this produces a round with 0 chars, 0 native calls, 0 tool blocks — the agent appears to silently do nothing. Root cause: Odysseus classifies the /v1 endpoint as provider="openai" (not "ollama"), so the payload is built as a standard OpenAI payload without any Ollama-specific options. Ollama's /v1 endpoint accepts "think": false as a top-level parameter to suppress extended thinking, but this was never sent. Fix: - Add _is_ollama_openai_compat_url() to detect local Ollama /v1 URLs - Inject "think": false in both stream_llm and llm_call_async for thinking models (qwen3, QwQ, gemma4, DeepSeek-R1, etc.) on this endpoint Verified with qwen3:14b on Ollama 0.24: with think=False the model correctly emits native tool_calls in a single streaming chunk and the agent executes bash/file/web tools as expected. * fix(llm): extend _is_ollama_openai_compat_url to match localhost on any port Per reviewer feedback on PR #3228: 1. Generalize host detection to mirror _is_ollama_native_url: match any localhost/127.0.0.1/0.0.0.0/::1 host (not just port 11434) so that custom OLLAMA_HOST ports and container remaps are also covered. 2. Add tests/test_llm_core_ollama_thinking.py covering: - _is_ollama_openai_compat_url for all positive/negative URL cases including IPv6, non-default port, native /api path, and real OpenAI - Payload injection: think:false set for Ollama /v1 thinking model, not set for non-thinking model, not set for real OpenAI endpoint, and set for localhost on a non-default port (the new case)	2026-06-09 07:35:15 +02:00

Author

SHA1

Message

Date

onemorethan0

8ae2b5f58c

fix(llm): suppress thinking mode for qwen3/gemma4 on Ollama /v1 endpoint (#3228 )

* fix(llm): suppress thinking for qwen3/gemma4 on Ollama /v1 compat endpoint

When using qwen3, QwQ, gemma4, or other thinking models via Ollama's
OpenAI-compatible /v1 endpoint, the model routes all output into its
<think>...</think> reasoning block. Since Odysseus strips thinking
content from round_response and only accumulates native tool_calls,
this produces a round with 0 chars, 0 native calls, 0 tool blocks —
the agent appears to silently do nothing.

Root cause: Odysseus classifies the /v1 endpoint as provider="openai"
(not "ollama"), so the payload is built as a standard OpenAI payload
without any Ollama-specific options. Ollama's /v1 endpoint accepts
"think": false as a top-level parameter to suppress extended thinking,
but this was never sent.

Fix:
- Add _is_ollama_openai_compat_url() to detect local Ollama /v1 URLs
- Inject "think": false in both stream_llm and llm_call_async for
  thinking models (qwen3, QwQ, gemma4, DeepSeek-R1, etc.) on this
  endpoint

Verified with qwen3:14b on Ollama 0.24: with think=False the model
correctly emits native tool_calls in a single streaming chunk and
the agent executes bash/file/web tools as expected.

* fix(llm): extend _is_ollama_openai_compat_url to match localhost on any port

Per reviewer feedback on PR #3228:

1. Generalize host detection to mirror _is_ollama_native_url: match any
   localhost/127.0.0.1/0.0.0.0/::1 host (not just port 11434) so that
   custom OLLAMA_HOST ports and container remaps are also covered.

2. Add tests/test_llm_core_ollama_thinking.py covering:
   - _is_ollama_openai_compat_url for all positive/negative URL cases
     including IPv6, non-default port, native /api path, and real OpenAI
   - Payload injection: think:false set for Ollama /v1 thinking model,
     not set for non-thinking model, not set for real OpenAI endpoint,
     and set for localhost on a non-default port (the new case)

2026-06-09 07:35:15 +02:00

1 Commits