odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-18 02:35:23 -04:00

Kenny Van de Maele 074a1e6eff fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955)

* fix(search): add download budgets to web_fetch with truncation notice and hard ceiling

MAX_OUTPUT_CHARS only trims what the agent sees; fetch_webpage_content
buffered and cached the entire response body first, so a large or hostile
URL could pull arbitrarily many bytes into memory and the content cache.

The fetch is now a capped streaming GET (SSRF redirect guard unchanged):
a soft default budget (WEB_FETCH_SOFT_MAX_BYTES, 2 MB), a per-call
override via full/max_bytes on the web_fetch tool, and a hard ceiling
(WEB_FETCH_HARD_MAX_BYTES, 20 MB) that the override can never exceed.
When Content-Length already declares a body over the ceiling the fetch
is refused before any body bytes are buffered. Truncated results carry
truncated/fetched_bytes/total_bytes, the tool output leads with a
partial-content notice telling the model how to re-fetch with full=true,
and the tool schema documents the flag. A truncated PDF is reported as
a budget error since a cut PDF is unparseable. The effective cap is part
of the content-cache key so a truncated fetch is never served to a
full-budget request.
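The budget rules described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_budget`, `read_capped`, and `cache_key` are hypothetical helper names, the byte values assume "MB" means 1024 * 1024 bytes, and treating `full=true` as "raise the cap to the hard ceiling" is an assumption drawn from the description.

```python
from typing import Iterable, Optional, Tuple

# Constant names come from the commit message; the exact byte values are an
# assumption (binary megabytes).
WEB_FETCH_SOFT_MAX_BYTES = 2 * 1024 * 1024    # soft default budget
WEB_FETCH_HARD_MAX_BYTES = 20 * 1024 * 1024   # hard ceiling


def resolve_budget(full: bool = False, max_bytes: Optional[int] = None) -> int:
    """Hypothetical helper: pick the effective byte cap for one call."""
    if max_bytes is not None:
        requested = max_bytes
    elif full:
        # Assumption: full=true asks for as much as the ceiling allows.
        requested = WEB_FETCH_HARD_MAX_BYTES
    else:
        requested = WEB_FETCH_SOFT_MAX_BYTES
    # The per-call override can never exceed the hard ceiling.
    return max(1, min(requested, WEB_FETCH_HARD_MAX_BYTES))


def read_capped(chunks: Iterable[bytes], cap: int) -> Tuple[bytes, bool]:
    """Buffer streamed chunks up to `cap` bytes; report whether we truncated."""
    buf = bytearray()
    for chunk in chunks:
        remaining = cap - len(buf)
        if len(chunk) > remaining:
            buf += chunk[:remaining]
            return bytes(buf), True   # stop reading: budget exhausted
        buf += chunk
    return bytes(buf), False


def cache_key(url: str, cap: int) -> str:
    """Hypothetical key format: the cap is part of the key, so a truncated
    fetch is never served to a request with a larger budget."""
    return f"{url}|cap={cap}"
```

With a real HTTP client the `chunks` argument would be the streamed response body; keeping it a plain iterable here makes the capping logic easy to test without a network.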

Existing tests that faked httpx.get or the old _get_public_url signature
are adapted to the streaming interface; behavior pins are unchanged.

Fixes #3812

* fix(search): close compressed-body cap bypass and protect the partial notice

Addresses RaresKeY's review on #3955:

- Force Accept-Encoding: identity for the capped fetch. With gzip/deflate the
  wire bytes (and Content-Length) can be a fraction of the decoded body, so a
  tiny compressed response could pass the hard-cap preflight and then expand
  past the ceiling in a single decoded chunk before the streamed cap could
  slice it. Identity makes Content-Length the true body size and keeps each
  streamed chunk bounded by the network read, so the hard ceiling actually
  bounds memory.
- Lead web_fetch output with the partial-content notice and cap the page
  title. The notice is the user-facing contract for partial fetches, but the
  title is untrusted, uncapped page content; placed ahead of the notice a giant
  title could push it past MAX_OUTPUT_CHARS and drop it. The notice now leads
  and the title is capped as a second guard.

Adds regressions: the fetch advertises identity encoding, and a truncated
result with an oversized title still surfaces the partial notice.
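The ordering fix can be illustrated with a small sketch. `format_output`, `MAX_TITLE_CHARS`, the notice wording, and the numeric value of `MAX_OUTPUT_CHARS` are all assumptions made for the example; only the constant name `MAX_OUTPUT_CHARS` and the idea of leading with the notice come from the commit message.

```python
from typing import Optional

MAX_OUTPUT_CHARS = 20_000   # assumed value; the commit only names the constant
MAX_TITLE_CHARS = 200       # hypothetical cap on the untrusted page title


def format_output(
    title: str,
    text: str,
    truncated: bool,
    fetched_bytes: int,
    total_bytes: Optional[int] = None,
) -> str:
    """Hypothetical formatter: notice first, capped title second, body last."""
    parts = []
    if truncated:
        total = f" of {total_bytes}" if total_bytes is not None else ""
        parts.append(
            f"[partial content: fetched {fetched_bytes}{total} bytes; "
            "re-fetch with full=true for more]"
        )
    # The title is page-controlled, so it is capped as a second guard.
    parts.append(f"Title: {title[:MAX_TITLE_CHARS]}")
    parts.append(text)
    # The final trim cuts from the end, so the leading notice always survives.
    return "\n".join(parts)[:MAX_OUTPUT_CHARS]
```

Because the trim to `MAX_OUTPUT_CHARS` removes characters from the tail, anything that must reach the model belongs at the head of the output.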

* fix(search): reject compressed responses that ignore the identity request

Requesting Accept-Encoding: identity is not enough on its own: a server can
ignore it and still return Content-Encoding: gzip, and httpx.iter_bytes would
decode that, so a tiny compressed body could balloon into one decoded chunk
far past the hard cap before the streamed loop slices it (and Content-Length,
the compressed wire length, makes the preflight and size metadata unreliable).

Refuse a non-identity Content-Encoding before reading the body. Adds a
regression where the server ignores the identity request and returns gzip;
the fetch is refused before any body is decoded.
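A header-only preflight along these lines would cover both refusals described in these commits. This is a sketch under stated assumptions: `preflight`, `FetchRefused`, and `REQUEST_HEADERS` are hypothetical names, and the check operates on a plain header mapping rather than a real response object.

```python
from typing import Mapping, Optional

HARD_MAX_BYTES = 20 * 1024 * 1024  # assumed value for the 20 MB hard ceiling

# Sent with the request so Content-Length reflects the true body size.
REQUEST_HEADERS = {"Accept-Encoding": "identity"}


class FetchRefused(Exception):
    """Hypothetical error raised before any body bytes are read."""


def preflight(headers: Mapping[str, str], hard_cap: int = HARD_MAX_BYTES) -> Optional[int]:
    """Inspect response headers only; return the declared size if usable."""
    lowered = {k.lower(): v for k, v in headers.items()}

    # A server may ignore the identity request; refuse rather than decode.
    encoding = lowered.get("content-encoding", "identity").strip().lower()
    if encoding not in ("", "identity"):
        raise FetchRefused(f"unexpected Content-Encoding: {encoding}")

    try:
        size = int(lowered.get("content-length"))
    except (TypeError, ValueError):
        return None  # header missing or malformed: rely on the streamed cap
    if size < 0:
        return None
    if size > hard_cap:
        raise FetchRefused(f"declared body of {size} bytes exceeds hard cap")
    return size
```

A missing or malformed `Content-Length` is not an error in this sketch; the streamed byte cap still bounds what is buffered in that case.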

2026-06-15 17:38:09 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| .github | Remove duplicate CodeQL workflow | 2026-06-15 22:53:29 +09:00 |
| companion | refactor(constants): single source of truth for data dir (#3368) | 2026-06-08 09:58:52 +02:00 |
| config/searxng | Generate SearXNG secret on first boot | 2026-06-01 11:03:02 +09:00 |
| core | feat(paths): abstract runtime path logic for frozen distribution packages (#969) | 2026-06-15 17:44:10 +01:00 |
| docker | fix(docker): invoke setup.py on first container start (#1657) | 2026-06-03 14:12:20 +09:00 |
| docs | fix(ci): avoid duplicate CodeQL setup (#4297) | 2026-06-15 16:39:13 +01:00 |
| integrations | Add Codex and Claude document draft integration | 2026-06-09 14:27:53 +09:00 |
| licenses | Odysseus v1.0 | 2026-05-31 23:58:26 +09:00 |
| mcp_servers | Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615 | 2026-06-15 21:20:15 +09:00 |
| routes | fix(api): normalize non-object JSON bodies to empty dict in token PATCH (#3976) | 2026-06-15 18:05:15 +01:00 |
| scripts | Merge remote-tracking branch 'origin/dev' into test-main-dev-merge-20260615 | 2026-06-15 21:20:15 +09:00 |
| services | fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955) | 2026-06-15 17:38:09 +00:00 |
| src | fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955) | 2026-06-15 17:38:09 +00:00 |
| static | feat(email): add Google OAuth2 for Google Workspace / .edu IMAP & SMTP (#237) | 2026-06-15 17:02:58 +01:00 |
| tests | fix(search): add download budgets to web_fetch with truncation notice and hard ceiling (#3955) | 2026-06-15 17:38:09 +00:00 |
| .dockerignore | chore: align secrets env ignore patterns | 2026-06-15 14:49:46 +09:00 |
| .env.example | Parameterize Docker Compose volume host paths (#3907) | 2026-06-15 20:30:18 +09:00 |
| .gitattributes | Add native Windows compatibility layer | 2026-06-01 15:09:47 +09:00 |
| .gitignore | pwa missing icons added (#428) | 2026-06-15 16:00:13 +09:00 |
| ACKNOWLEDGMENTS.md | docs: fix stale documentation references (#1769) | 2026-06-03 13:23:21 +09:00 |
| app.py | chore: add warnings to silent except Exception blocks (#3212) | 2026-06-15 17:49:27 +01:00 |
| build-macos-app.sh | macOS app: force native arm64 uvicorn on Apple Silicon | 2026-06-02 20:56:53 +09:00 |
| CONTRIBUTING.md | refactor(constants): single source of truth for data dir (#3368) | 2026-06-08 09:58:52 +02:00 |
| docker-compose.gpu-amd.yml | Parameterize Docker Compose volume host paths (#3907) | 2026-06-15 20:30:18 +09:00 |
| docker-compose.gpu-nvidia.yml | Parameterize Docker Compose volume host paths (#3907) | 2026-06-15 20:30:18 +09:00 |
| docker-compose.yml | Parameterize Docker Compose volume host paths (#3907) | 2026-06-15 20:30:18 +09:00 |
| Dockerfile | chore(deps): bump python from 3.12-slim to 3.14-slim (#3988) | 2026-06-15 19:23:27 +09:00 |
| install-service.sh | Odysseus v1.0 | 2026-05-31 23:58:26 +09:00 |
| launch-windows.ps1 | fix(windows): detect installed CUDA toolkit on launch (#2639) | 2026-06-15 20:26:07 +09:00 |
| LICENSE | Change project license to AGPL-3.0-or-later | 2026-06-09 14:25:04 +09:00 |
| odysseus-ui.service | fix: systemd service should serve on port 7000 to match Docker/setup/README (#1297) | 2026-06-03 02:04:37 +09:00 |
| package-lock.json | chore(deps): bump the npm group with 2 updates (#3989) | 2026-06-15 20:21:04 +09:00 |
| package.json | chore(deps): bump the npm group with 2 updates (#3989) | 2026-06-15 20:21:04 +09:00 |
| pyproject.toml | test: add fast lane and duration visibility (#3659) | 2026-06-09 20:11:47 +02:00 |
| README.md | Refresh README presentation | 2026-06-15 23:26:10 +09:00 |
| requirements-optional.txt | chore(deps): bump the python group with 3 updates (#3991) | 2026-06-15 19:25:15 +09:00 |
| requirements.txt | chore(deps): bump the python group with 3 updates (#3991) | 2026-06-15 19:25:15 +09:00 |
| ROADMAP.md | Fix typos in the ROADMAP intro (#1421) | 2026-06-03 14:12:10 +09:00 |
| SECURITY.md | Clarify private deployment hardening docs | 2026-06-02 13:01:12 +09:00 |
| setup.py | refactor(constants): single source of truth for data dir (#3368) | 2026-06-08 09:58:52 +02:00 |
| start-macos.sh | fix(macos): rebuild incomplete venv instead of failing on re-run (#3106) | 2026-06-15 16:12:19 +09:00 |
| THREAT_MODEL.md | docs: add THREAT_MODEL.md (#1111) | 2026-06-02 22:40:37 +09:00 |
| update_windows.bat | Windows: add Docker update script | 2026-06-02 20:45:32 +09:00 |

README.md

A self-hosted AI workspace for chat, agents, research, documents, email, notes, calendar, and local model workflows.

Quick Start · Setup Guide · Contributing · Roadmap

Quick Start

dev is the default branch and gets the newest changes first. Use main if you want the more curated branch.

git clone https://github.com/pewdiepie-archdaemon/odysseus.git
cd odysseus
cp .env.example .env
docker compose up -d --build

Open http://localhost:7000 when the containers are healthy. The first admin password is printed in docker compose logs odysseus.

Native installs, GPU notes, Windows/macOS instructions, HTTPS, and configuration live in the setup guide.

Features

Chat + Agents — local/API models, tools, MCP, files, shell, skills, and memory.
Cookbook — hardware-aware model recommendations, downloads, and serving.
Deep Research — multi-step web research with source reading and report generation.
Compare — blind side-by-side model testing and synthesis.
Documents — writing-first editor with AI edits, suggestions, Markdown, HTML, CSV, and syntax highlighting.
Email — IMAP/SMTP inbox with triage, tags, summaries, reminders, and reply drafts.
Notes, Tasks + Calendar — reminders, todos, scheduled agent tasks, and CalDAV sync.
Extras — gallery/image editor, themes, uploads, web search, presets, sessions, and 2FA.

Demo

A full hover-to-play tour lives on the landing page: docs/index.html.

Contributing

Help is welcome. The best entry points are fresh-install testing, provider setup bugs, mobile/editor polish, docs, and small focused refactors. See CONTRIBUTING.md and ROADMAP.md.

Security

Odysseus is a self-hosted workspace with powerful local tools. Keep auth enabled, keep private data out of Git, and do not expose raw model/service ports publicly. Deployment details are in the setup guide.

License

AGPL-3.0-or-later -- see LICENSE and ACKNOWLEDGMENTS.md.

Languages

Python 49.8% · JavaScript 39.6% · CSS 8.3% · HTML 1.8% · Shell 0.4%