mirror of https://github.com/pewdiepie-archdaemon/odysseus.git (synced 2026-06-30 00:22:10 -04:00)
fix(security): prevent ReDoS in verdict-prose and continuation matchers (#4943)
Two py/polynomial-redos sinks ran regexes with two adjacent \s-matching
quantifiers over untrusted model text, backtracking O(n^2) when the tail failed
on a whitespace flood:
- routes/skills_routes.py: the last-resort verdict-from-prose extractor used
`["\'\s:]*\s*` — the class already matches \s, so the trailing \s* was a
redundant second quantifier. Dropped it (extracted to a documented module
constant _VERDICT_PROSE_RE); the matched text is identical, the scan linear.
- src/agent_loop.py _EXPLICIT_CONTINUATION_RE: `\s*[.!?]*\s*$` put two \s*
around `[.!?]*`. Rewrote as `\s*(?:[.!?]+\s*)?$` — same accepted tails (no
two \s* adjacent), linear. Portable form (no possessive quantifiers).
Both verified output-equivalent to the originals across a fuzz corpus. Adds
tests/test_redos_verdict_continuation.py pinning the unchanged match sets and
bounding the flood inputs (old patterns took seconds at 40k whitespace chars).
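
The equivalence claim for the continuation tail can be checked exhaustively on short strings. The sketch below is an illustration only, not the test file added by this commit: it uses just the two tail fragments quoted above (the real `_EXPLICIT_CONTINUATION_RE` has a prefix that is not shown here), and `ALPHABET`, `OLD_TAIL`, `NEW_TAIL` and `tails_agree` are hypothetical names.

```python
import itertools
import re

# Only the tail fragments quoted in the commit message. The prefix of the
# real _EXPLICIT_CONTINUATION_RE is not shown, so it is left out here.
OLD_TAIL = re.compile(r"\s*[.!?]*\s*$")
NEW_TAIL = re.compile(r"\s*(?:[.!?]+\s*)?$")

# Hypothetical small alphabet: whitespace, the three punctuation marks,
# and one character that neither tail should accept.
ALPHABET = " \t\n.!?a"


def tails_agree(max_len: int = 5) -> bool:
    """Return True if both tails accept exactly the same strings up to max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product(ALPHABET, repeat=n):
            s = "".join(chars)
            if bool(OLD_TAIL.match(s)) != bool(NEW_TAIL.match(s)):
                return False
    return True
```

An exhaustive check over a tiny alphabet complements a fuzz corpus: it covers every ordering of whitespace and punctuation up to the length bound, which is where the two forms could plausibly diverge.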
+11
-1
@@ -22,6 +22,16 @@ from core.middleware import require_admin
 
 logger = logging.getLogger(__name__)
 
+# Last-resort verdict extraction from a teacher/verifier model's prose (run when
+# JSON parsing fails). `["\'\s:]*` already consumes whitespace, so the original
+# trailing `\s*` made two adjacent \s-matching quantifiers that backtrack O(n^2)
+# on a `verdict` + whitespace flood in untrusted model output (CodeQL
+# py/polynomial-redos). Without it a single unbounded quantifier remains — the
+# matched text is identical, and the scan is linear.
+_VERDICT_PROSE_RE = re.compile(
+    r'verdict["\'\s:]*["\']?(pass|needs_work|fail|inconclusive)', re.I
+)
+
 
 class SkillAddRequest(BaseModel):
     # New schema (preferred)
@@ -196,7 +206,7 @@ async def _eval_skill_run(skill_md: str, task: str, transcript: str,
     # Last resort: pull the verdict keyword straight out of the prose so a
     # clearly-decided run isn't thrown away as "unparseable".
     if v not in _VERDICTS:
-        km = _re.search(r'verdict["\'\s:]*\s*["\']?(pass|needs_work|fail|inconclusive)', text, _re.I)
+        km = _VERDICT_PROSE_RE.search(text)
         if km:
             v = km.group(1).lower()
     if data is None:
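
A minimal sketch of how the old and new verdict patterns compare, assuming only the two regexes shown in the hunks above. `OLD`, `NEW`, `SAMPLES`, `describe`, `same_match` and `extract` are hypothetical names for illustration and the sample strings are invented; none of this is the project's actual test code.

```python
import re

# The pattern removed by this commit (two adjacent \s-matching quantifiers).
OLD = re.compile(
    r'verdict["\'\s:]*\s*["\']?(pass|needs_work|fail|inconclusive)', re.I
)
# The pattern now stored in _VERDICT_PROSE_RE (one unbounded quantifier).
NEW = re.compile(
    r'verdict["\'\s:]*["\']?(pass|needs_work|fail|inconclusive)', re.I
)

# Invented examples of verifier prose; not taken from the repository.
SAMPLES = [
    "Verdict: PASS",
    '{"verdict": "needs_work"}',
    "verdict   :  'fail'",
    "no decision here",
    "verdict = pass",  # '=' is not in the character class, so no match
]


def describe(match):
    """Reduce a match to (span, lowercased verdict), or None if no match."""
    return None if match is None else (match.span(), match.group(1).lower())


def same_match(text: str) -> bool:
    """True if both patterns match the same span and capture the same verdict."""
    return describe(OLD.search(text)) == describe(NEW.search(text))


def extract(text: str):
    """Return the verdict keyword found by the new pattern, or None."""
    m = NEW.search(text)
    return m.group(1).lower() if m else None
```

For example, `extract("Verdict: PASS")` returns `"pass"`, and `extract("verdict" + " " * 40000 + "x")` returns `None` after a single linear backtrack over the whitespace run. Running `OLD.search` on that same flood input is the quadratic case the commit message describes, so it is best avoided outside a deliberate benchmark.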