odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-28 23:52:09 -04:00

nopoz c098355778 fix(security): prevent ReDoS in LLM-output tool/think parsers (#4704)

* fix(security): prevent ReDoS in LLM-output tool/think parsers

The regexes that parse untrusted model output in text_helpers.py and
tool_parsing.py are delimiter-bounded with a lazy [\s\S]*? (or an
ambiguous (\s+[^>]*)?). Applied with re.sub/re.finditer over a whole
response, they degrade to O(n^2) when the closing delimiter is absent:
the engine rescans to end-of-string from every opener. Model output is
untrusted, so a prompt-injected or malicious model can stall the agent
loop with many unclosed openers (measured ~25s on a 60KB <thought flood).
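
The quadratic behavior described above can be reproduced with a minimal sketch. `LAZY_BLOCK_RE` and `time_flood` are hypothetical stand-ins written for illustration, not the project's actual regexes or helpers; the pattern only mimics the delimiter-bounded lazy shape the commit describes.

```python
import re
import time

# Hypothetical stand-in for a delimiter-bounded lazy pattern (an assumption,
# not the regex from text_helpers.py or tool_parsing.py).
LAZY_BLOCK_RE = re.compile(r"<thought>[\s\S]*?</thought>")


def time_flood(n_openers: int) -> float:
    """Time one re.sub pass over a payload of unclosed openers."""
    payload = "<thought>" * n_openers
    start = time.perf_counter()
    result = LAZY_BLOCK_RE.sub("", payload)
    elapsed = time.perf_counter() - start
    # No closer exists, so nothing can match and the text is unchanged.
    assert result == payload
    return elapsed


if __name__ == "__main__":
    # Each opener triggers a rescan to end-of-string, so doubling the
    # number of openers is expected to roughly quadruple the elapsed time.
    for n in (250, 500, 1000):
        print(n, f"{time_flood(n):.4f}s")
```

Exact timings depend on the machine and the Python version; the growth rate, not the absolute numbers, is the point.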

- text_helpers.py: replace ambiguous <thought(\s+[^>]*)?> with
  <thought([^>]*)> (identical capture, no \s+/[^>]* overlap); skip the
  Gemma <|channel>...<channel|> subs when no <channel|> closer is present.
- tool_parsing.py: gate _TOOL_CALL_RE, _XML_TOOL_CALL_RE and _TOOL_CODE_RE
  (in parse_tool_blocks and strip_tool_blocks) on a cheap presence check
  for their closing delimiter. With no closer the regex cannot match, so
  skipping is equivalent; only the wasted O(n^2) rescan is removed.

Resolves CodeQL py/polynomial-redos #230, #231, #232, #233, #235, #236,
#524. The _XML_OPEN_TOOL_CALL_RE alerts (#234, #477) are false positives
(its greedy [\s\S]*\Z is linear) and left untouched.

* fix(security): close ReDoS gaps in tool/think parsers from review

Addresses two review findings on the closer-guard approach:

- Whole-string "closer exists?" checks were bypassable: a stale closer
  before an opener flood, or a closer with no reachable inner `}`, kept
  the guard true while every opener still rescanned to end-of-string
  (O(n^2)). Replace the substring guards with `_iter_delimited`, a
  forward-only scan that pairs each opener with a *later* closer and
  stops once none is reachable (O(n)). `parse_tool_blocks` and
  `strip_tool_blocks` (via `_strip_delimited`) both use it for the
  [TOOL_CALL], <tool_call>/<function_call>, and <tool_code> formats.
  Verified equivalent to the original regexes on well-formed inputs.

- `<thought([^>]*)>` dropped the tag-name boundary and corrupted
  unrelated tags (`<thoughtful>` -> `<thinkful>`). Use `<thought(\s[^>]*)?>`:
  the single fixed `\s` keeps the pattern linear (no `\s+`/`[^>]*`
  overlap) while restoring the boundary; capture is byte-for-byte
  identical for real `<thought ...>` openers.
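
The forward-only scan from the first finding can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's `_iter_delimited`: the name `iter_delimited`, its signature, and the yielded tuple shape are hypothetical, and the sketch ignores the inner-`}` handling and case-insensitive matching the real parsers may need.

```python
from typing import Iterator, Tuple


def iter_delimited(text: str, opener: str, closer: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, inner) for each opener paired with a later closer.

    The search position only moves forward, so the total work is linear
    in len(text) even when many openers have no closer.
    """
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            return
        inner_start = start + len(opener)
        close = text.find(closer, inner_start)
        if close == -1:
            # No closer is reachable from this opener, so none is reachable
            # from any later opener either: stop instead of rescanning.
            return
        end = close + len(closer)
        yield start, end, text[inner_start:close]
        pos = end


def inner_blocks(text: str) -> list:
    """Convenience wrapper for the [TOOL_CALL] format."""
    return [inner for _, _, inner in iter_delimited(text, "[TOOL_CALL]", "[/TOOL_CALL]")]
```

A stale closer placed before an opener flood no longer matters here, because a closer is only searched for after the opener it is meant to pair with.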

Adds regressions for stale-closer-before-opener, closer-present-without-
inner-brace, and the <thoughtful>/<thoughts> passthrough.
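
The tag-name boundary regression covered by these tests can be illustrated with a small sketch. The two patterns are the ones quoted in the commit message; the `normalize` helper and the rewrite to `<think` are assumptions inferred from the `<thoughtful>` -> `<thinkful>` example, not the project's actual code.

```python
import re

# Pattern from the first commit: no boundary after the tag name.
LOOSE_RE = re.compile(r"<thought([^>]*)>")
# Pattern from the review fix: a single fixed \s restores the boundary.
BOUNDED_RE = re.compile(r"<thought(\s[^>]*)?>")


def normalize(pattern: re.Pattern, text: str) -> str:
    """Rewrite matched <thought ...> openers to <think ...> (hypothetical helper)."""
    return pattern.sub(lambda m: "<think" + (m.group(1) or "") + ">", text)


if __name__ == "__main__":
    for sample in ("<thought>", '<thought id="1">', "<thoughtful>", "<thoughts>"):
        print(sample, "->", normalize(LOOSE_RE, sample), "|", normalize(BOUNDED_RE, sample))
```

With the loose pattern, `<thoughtful>` is rewritten to `<thinkful>`; with the bounded pattern it passes through unchanged, while real `<thought ...>` openers are rewritten identically by both.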

* fix(security): close Gemma channel ReDoS guard flagged in review

vdmkenny noted the same bypassable whole-string guard remained in
text_helpers.py: `if "<channel|>" in out.lower()` gating the Gemma
thought/response channel subs. A stale `<channel|>` before a
`<|channel>thought` opener flood keeps the guard true while every opener
still rescans to end-of-string (measured ~7.3s at 4k openers).

Replace it with `_sub_delimited`, the same forward-only scan used for the
tool-call parsers: pair each opener with a later closer, stop when none is
reachable (O(n)). Verified output-equivalent to the original capture regexes
on well-formed multi-channel inputs; the stale-closer case now runs in <2ms.
Adds a regression for stale-closer-before-opener on the Gemma path.
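
A substitution built on the same forward-only pairing might look like the sketch below. The name `sub_delimited`, its signature, and the callback shape are hypothetical and chosen for illustration; the project's `_sub_delimited` may differ, for example in how it handles letter case.

```python
from typing import Callable


def sub_delimited(text: str, opener: str, closer: str, repl: Callable[[str], str]) -> str:
    """Replace each opener...closer span with repl(inner), scanning forward only."""
    out = []
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            break
        inner_start = start + len(opener)
        close = text.find(closer, inner_start)
        if close == -1:
            # No later closer exists: keep the remainder verbatim and stop.
            break
        out.append(text[pos:start])
        out.append(repl(text[inner_start:close]))
        pos = close + len(closer)
    out.append(text[pos:])
    return "".join(out)


def drop_thought_channel(text: str) -> str:
    """Example use: remove Gemma-style thought channel spans entirely."""
    return sub_delimited(text, "<|channel>thought", "<channel|>", lambda inner: "")
```

For the stale-closer case, the closer search starts after the first opener, finds nothing, and the input is returned unchanged after a single pass.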

* fix(security): harden strip_think() think-tag ReDoS flagged in review

The earlier fixes hardened normalize_thinking_markup and the delimiter
scanners, but the production entrypoint strip_think() still ran
_THINK_CLOSED_RE / _THINK_ATTR_RE / _THINK_OPEN_RE (and the stray-tag
_THINK_TAG_RE) over untrusted model output. Those kept the same ReDoS
shapes: the lazy `<open>[\s\S]*?</close>` rescanned to end-of-string from
every opener, and `(?:\s+[^>]*)?` / `[^>]*` attribute scans ran to
end-of-string from every opener on a "many openers, no closer" flood. On
the prior head, malformed `<think` / `<thinking` / `<thought` floods took
6-14s through strip_think(). The shipped `<thought>` normalization had the
same residual: the single-opener case was linear but an opener flood was
still O(n^2) (~4.4s).

- Replace the lazy multi-pass _THINK_CLOSED_RE loop with the existing
  forward-only _sub_delimited scan (pair each opener with the first
  reachable closer, stop when none is reachable). One pass collapses
  sequential and nested blocks as before.
- Bound every opener/stray-tag attribute scan at `<` (`[^<>]` not `[^>]`)
  so a no-`>` opener flood can't drive a single match attempt to
  end-of-string. Identical capture for well-formed think/thought tags.
- email_helpers._strip_think: compute had_think from the single linear
  _THINK_TAG_RE instead of the lazy closed/open `.search()` calls, which
  had the same O(n^2) behavior on the email reply/summary/extraction paths.
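
The effect of bounding the attribute scan at `<` can be seen in a short sketch. Both patterns are simplified illustrations of the shapes named in the list above, not the project's `_THINK_ATTR_RE` or `_THINK_OPEN_RE`.

```python
import re

# Unbounded shape: the attribute scan may run past later '<' characters.
UNBOUNDED_OPEN_RE = re.compile(r"<think(?:\s+[^>]*)?>")
# Bounded shape: the attribute scan stops at the next '<' or '>'.
BOUNDED_OPEN_RE = re.compile(r"<think(?:\s[^<>]*)?>")


def openers(pattern: re.Pattern, text: str) -> list:
    """Return every opener the pattern matches in text."""
    return pattern.findall(text)


if __name__ == "__main__":
    well_formed = '<think type="x">hi</think>'
    malformed = "<think a <think b>"
    print(openers(UNBOUNDED_OPEN_RE, well_formed), openers(BOUNDED_OPEN_RE, well_formed))
    print(openers(UNBOUNDED_OPEN_RE, malformed), openers(BOUNDED_OPEN_RE, malformed))
```

On a flood of `<think ` openers with no `>`, each attempt of the bounded pattern gives up at the next `<` instead of scanning to end-of-string, which is what keeps the whole pass linear. Well-formed openers are matched identically by both patterns.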

All flood variants now finish in <10ms (were 6-14s). Output verified
byte-for-byte identical to the prior implementation over a 34-case corpus
(nested, mismatched, attr, uppercase, Gemma, prose, prompt-echo). Adds
strip_think() timing regressions for malformed openers, opener floods
(all three tag names), the closed-opener flood, and the malformed-closer
flood.

* docs: trim verbose comments in think-tag ReDoS fix

2026-06-27 10:12:28 -07:00

.github

chore(deps): bump actions/checkout in the actions group (#4559)

2026-06-18 20:49:58 +02:00

companion

fix(companion): require chat scope for model inventory (#4319)

2026-06-16 01:15:05 +02:00

config/searxng

Generate SearXNG secret on first boot

2026-06-01 11:03:02 +09:00

core

Fix _parse_msg_content corrupting JSON-array-like text messages on reload (#2060)

2026-06-27 14:31:51 +01:00

docker

fix: Real-ESRGAN install + Cookbook deps-panel crash on the Python 3.14 image (#4694)

2026-06-23 19:31:00 +02:00

docs

docs(setup): add a self-host troubleshooting cookbook of common traps (#4834)

2026-06-26 20:24:02 +02:00

integrations

Add Codex and Claude document draft integration

2026-06-09 14:27:53 +09:00

licenses

feat(a11y): add a Text size control and an OpenDyslexic font option (#4210)

2026-06-22 13:53:46 +02:00

mcp_servers

fix(email): enforce MCP owner boundaries (#4335)

2026-06-16 04:31:24 +01:00

routes

fix(security): prevent ReDoS in LLM-output tool/think parsers (#4704)

2026-06-27 10:12:28 -07:00

scripts

Fix odysseus-calendar list dropping in-progress / multi-day events (#2065)

2026-06-16 14:04:56 +02:00

services

feat(catalog): add Gemma 4 12B/QAT entries and RTX 3050 bandwidth (#4728)

2026-06-23 18:23:46 +02:00

specs

docs(architecture): add Phase 0 runtime inventory document (#4148)

2026-06-16 04:57:24 +01:00

src

fix(security): prevent ReDoS in LLM-output tool/think parsers (#4704)

2026-06-27 10:12:28 -07:00

static

fix(notes): allow inline editing of checklist items (#4832)

2026-06-27 17:37:28 +02:00

tests

fix(security): prevent ReDoS in LLM-output tool/think parsers (#4704)

2026-06-27 10:12:28 -07:00

.dockerignore

fix(devops): harden docker config defaults (#4349)

2026-06-16 04:03:43 +01:00

.env.example

Parameterize Docker Compose volume host paths (#3907)

2026-06-15 20:30:18 +09:00

.gitattributes

Add native Windows compatibility layer

2026-06-01 15:09:47 +09:00

.gitignore

pwa missing icons added (#428)

2026-06-15 16:00:13 +09:00

ACKNOWLEDGMENTS.md

feat(a11y): add a Text size control and an OpenDyslexic font option (#4210)

2026-06-22 13:53:46 +02:00

app.py

fix(routes): 500 (not 404) when the app-shell index.html is missing (#4791)

2026-06-23 19:47:22 +02:00

build-macos-app.sh

macOS app: force native arm64 uvicorn on Apple Silicon

2026-06-02 20:56:53 +09:00

build-windows-portable.ps1

feat(launcher): add portable windows launcher (#976)

2026-06-16 04:58:16 +01:00

CONTRIBUTING.md

Change host from 0.0.0.0 to 127.0.0.1 in CONTRIBUTING.md (#4422)

2026-06-16 13:40:47 +00:00

docker-compose.gpu-amd.yml

CI fixes for cookbook workflow sync

2026-06-22 02:08:25 +00:00

docker-compose.gpu-nvidia.yml

CI fixes for cookbook workflow sync

2026-06-22 02:08:25 +00:00

docker-compose.yml

Merge origin/dev into main

2026-06-21 11:08:50 +00:00

Dockerfile

fix(docker): install python-magic and libmagic for upload MIME sniffing

2026-06-27 17:31:46 +01:00

install-service.sh

Odysseus v1.0

2026-05-31 23:58:26 +09:00

launch-windows.ps1

feat(launcher): add portable windows launcher (#976)

2026-06-16 04:58:16 +01:00

launcher.py

feat(launcher): add portable windows launcher (#976)

2026-06-16 04:58:16 +01:00

LICENSE

chore: backport main-only changes to dev AGPL relicense + Cookbook serve fix (#3704)

2026-06-09 23:20:34 +02:00

odysseus-ui.service

fix: systemd service should serve on port 7000 to match Docker/setup/README (#1297)

2026-06-03 02:04:37 +09:00

Odysseus.spec

feat(launcher): add portable windows launcher (#976)

2026-06-16 04:58:16 +01:00

package-lock.json

chore(deps): remove unused @anthropic-ai/sdk dependency (#4566)

2026-06-19 09:40:35 +02:00

package.json

chore(deps): remove unused @anthropic-ai/sdk dependency (#4566)

2026-06-19 09:40:35 +02:00

pyproject.toml

test: add fast lane and duration visibility (#3659)

2026-06-09 20:11:47 +02:00

README.md

Refresh README screenshot

2026-06-22 04:54:15 +00:00

requirements-optional.txt

chore(deps): bump the python group with 3 updates (#3991)

2026-06-15 19:25:15 +09:00

requirements.txt

chore(deps): bump the python group with 3 updates (#3991)

2026-06-15 19:25:15 +09:00

ROADMAP.md

Fix typos in the ROADMAP intro (#1421)

2026-06-03 14:12:10 +09:00

SECURITY.md

Clarify private deployment hardening docs

2026-06-02 13:01:12 +09:00

setup.py

fix(setup): load .env so a pre-seeded admin password is honored on native installs (#4787)

2026-06-23 20:08:05 +02:00

start-macos.sh

fix(macos): rebuild incomplete venv instead of failing on re-run (#3106)

2026-06-15 16:12:19 +09:00

THREAT_MODEL.md

docs: add THREAT_MODEL.md (#1111)

2026-06-02 22:40:37 +09:00

update_windows.bat

Windows: add Docker update script

2026-06-02 20:45:32 +09:00

README.md

A self-hosted AI workspace for chat, agents, research, documents, email, notes, calendar, and local model workflows.

Quick Start · Setup Guide · Contributing · Roadmap

Quick Start

dev is the default branch and gets the newest changes first. Use main if you want the more curated branch.

git clone https://github.com/pewdiepie-archdaemon/odysseus.git
cd odysseus
cp .env.example .env
docker compose up -d --build

Open http://localhost:7000 once the containers are healthy. The first admin password is printed in the output of `docker compose logs odysseus`.

Native installs, GPU notes, Windows/macOS instructions, HTTPS, and configuration live in the setup guide.

Features

Chat + Agents — local/API models, tools, MCP, files, shell, skills, and memory.
Cookbook — hardware-aware model recommendations, downloads, and serving.
Deep Research — multi-step web research with source reading and report generation.
Compare — blind side-by-side model testing and synthesis.
Documents — writing-first editor with AI edits, suggestions, Markdown, HTML, CSV, and syntax highlighting.
Email — IMAP/SMTP inbox with triage, tags, summaries, reminders, and reply drafts.
Notes, Tasks + Calendar — reminders, todos, scheduled agent tasks, and CalDAV sync.
Extras — gallery/image editor, themes, uploads, web search, presets, sessions, and 2FA.

Demo

A full hover-to-play tour lives on the landing page: docs/index.html.

Contributing

Help is welcome. The best entry points are fresh-install testing, provider setup bugs, mobile/editor polish, docs, and small focused refactors. See CONTRIBUTING.md and ROADMAP.md.

Security

Odysseus is a self-hosted workspace with powerful local tools. Keep auth enabled, keep private data out of Git, and do not expose raw model/service ports publicly. Deployment details are in the setup guide.

License

AGPL-3.0-or-later -- see LICENSE and ACKNOWLEDGMENTS.md.

Languages

Python 50.9%

JavaScript 38.8%

CSS 8.1%

HTML 1.7%

Shell 0.4%