mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-28 07:35:27 -04:00
c098355778
* fix(security): prevent ReDoS in LLM-output tool/think parsers The regexes that parse untrusted model output in text_helpers.py and tool_parsing.py are delimiter-bounded with a lazy [\s\S]*? (or an ambiguous (\s+[^>]*)?). Applied with re.sub/re.finditer over a whole response, they degrade to O(n^2) when the closing delimiter is absent: the engine rescans to end-of-string from every opener. Model output is untrusted, so a prompt-injected or malicious model can stall the agent loop with many unclosed openers (measured ~25s on a 60KB <thought flood). - text_helpers.py: replace ambiguous <thought(\s+[^>]*)?> with <thought([^>]*)> (identical capture, no \s+/[^>]* overlap); skip the Gemma <|channel>...<channel|> subs when no <channel|> closer is present. - tool_parsing.py: gate _TOOL_CALL_RE, _XML_TOOL_CALL_RE and _TOOL_CODE_RE (in parse_tool_blocks and strip_tool_blocks) on a cheap presence check for their closing delimiter. With no closer the regex cannot match, so skipping is equivalent; only the wasted O(n^2) rescan is removed. Resolves CodeQL py/polynomial-redos #230, #231, #232, #233, #235, #236, #524. The _XML_OPEN_TOOL_CALL_RE alerts (#234, #477) are false positives (its greedy [\s\S]*\Z is linear) and left untouched. * fix(security): close ReDoS gaps in tool/think parsers from review Addresses two review findings on the closer-guard approach: - Whole-string "closer exists?" checks were bypassable: a stale closer before an opener flood, or a closer with no reachable inner `}`, kept the guard true while every opener still rescanned to end-of-string (O(n^2)). Replace the substring guards with `_iter_delimited`, a forward-only scan that pairs each opener with a *later* closer and stops once none is reachable (O(n)). `parse_tool_blocks` and `strip_tool_blocks` (via `_strip_delimited`) both use it for the [TOOL_CALL], <tool_call>/<function_call>, and <tool_code> formats. Verified equivalent to the original regexes on well-formed inputs. - `<thought([^>]*)>` dropped the tag-name boundary and corrupted unrelated tags (`<thoughtful>` -> `<thinkful>`). Use `<thought(\s[^>]*)?>`: the single fixed `\s` keeps the pattern linear (no `\s+`/`[^>]*` overlap) while restoring the boundary; capture is byte-for-byte identical for real `<thought ...>` openers. Adds regressions for stale-closer-before-opener, closer-present-without- inner-brace, and the <thoughtful>/<thoughts> passthrough. * fix(security): close Gemma channel ReDoS guard flagged in review vdmkenny noted the same bypassable whole-string guard remained in text_helpers.py: `if "<channel|>" in out.lower()` gating the Gemma thought/response channel subs. A stale `<channel|>` before a `<|channel>thought` opener flood keeps the guard true while every opener still rescans to end-of-string (measured ~7.3s at 4k openers). Replace it with `_sub_delimited`, the same forward-only scan used for the tool-call parsers: pair each opener with a later closer, stop when none is reachable (O(n)). Verified output-equivalent to the original capture regexes on well-formed multi-channel inputs; the stale-closer case now runs in <2ms. Adds a regression for stale-closer-before-opener on the Gemma path. * fix(security): harden strip_think() think-tag ReDoS flagged in review The earlier fixes hardened normalize_thinking_markup and the delimiter scanners, but the production entrypoint strip_think() still ran _THINK_CLOSED_RE / _THINK_ATTR_RE / _THINK_OPEN_RE (and the stray-tag _THINK_TAG_RE) over untrusted model output. Those kept the same ReDoS shapes: the lazy `<open>[\s\S]*?</close>` rescanned to end-of-string from every opener, and `(?:\s+[^>]*)?` / `[^>]*` attribute scans ran to end-of-string from every opener on a "many openers, no closer" flood. On the prior head, malformed `<think` / `<thinking` / `<thought` floods took 6-14s through strip_think(). The shipped `<thought>` normalization had the same residual: the single-opener case was linear but an opener flood was still O(n^2) (~4.4s). - Replace the lazy multi-pass _THINK_CLOSED_RE loop with the existing forward-only _sub_delimited scan (pair each opener with the first reachable closer, stop when none is reachable). One pass collapses sequential and nested blocks as before. - Bound every opener/stray-tag attribute scan at `<` (`[^<>]` not `[^>]`) so a no-`>` opener flood can't drive a single match attempt to end-of-string. Identical capture for well-formed think/thought tags. - email_helpers._strip_think: compute had_think from the single linear _THINK_TAG_RE instead of the lazy closed/open `.search()` calls, which had the same O(n^2) on the email reply/summary/extraction paths. All flood variants now finish in <10ms (were 6-14s). Output verified byte-for-byte identical to the prior implementation over a 34-case corpus (nested, mismatched, attr, uppercase, Gemma, prose, prompt-echo). Adds strip_think() timing regressions for malformed openers, opener floods (all three tag names), the closed-opener flood, and the malformed-closer flood. * docs: trim verbose comments in think-tag ReDoS fix