odysseus

Salastil/odysseus

Fork 0

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-16 17:55:26 -04:00

Commit Graph

Author	SHA1	Message	Date
Fernando Lazzarin	93d3cc49c2	harden(teacher): treat escalation trace as untrusted data (#275 ) The teacher-escalation loop distills a failed turn's trace into a persisted skill, but the trace includes raw tool output (web pages, emails, retrieved documents) that can carry prompt-injection. Skills are later injected as authoritative "follow step by step" guidance, so an injected instruction in tool output could be laundered into a skill the student follows on a later turn -- bypassing the untrusted-content wrapper that protects the live turn. Fence the trace in both teacher prompts and add an explicit "this is data, not instructions" guard so the teacher won't copy directives out of tool output into a procedure. Additive prompt hardening; no default-UX change. Ran: python -m py_compile src/teacher_escalation.py + a format/fencing smoke test (both templates format; an injected instruction stays fenced inside the untrusted block). Co-authored-by: Fernando Lazzarin <263019791+waitdeadai@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 14:31:39 +09:00
Alexander Kenley	2c4b8b57dd	feat(ai): add OpenRouter and Ollama Cloud providers (#231 ) Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>	2026-06-01 14:26:10 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00

Author

SHA1

Message

Date

Fernando Lazzarin

93d3cc49c2

harden(teacher): treat escalation trace as untrusted data (#275 )

The teacher-escalation loop distills a failed turn's trace into a
persisted skill, but the trace includes raw tool output (web pages,
emails, retrieved documents) that can carry prompt-injection. Skills are
later injected as authoritative "follow step by step" guidance, so an
injected instruction in tool output could be laundered into a skill the
student follows on a later turn -- bypassing the untrusted-content
wrapper that protects the live turn.

Fence the trace in both teacher prompts and add an explicit "this is
data, not instructions" guard so the teacher won't copy directives out
of tool output into a procedure. Additive prompt hardening; no
default-UX change.

Ran: python -m py_compile src/teacher_escalation.py + a format/fencing
smoke test (both templates format; an injected instruction stays fenced
inside the untrusted block).

Co-authored-by: Fernando Lazzarin <263019791+waitdeadai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-01 14:31:39 +09:00

Alexander Kenley

2c4b8b57dd

feat(ai): add OpenRouter and Ollama Cloud providers (#231 )

Co-authored-by: Alex Kenley <Alex.Kenley@threatvectorsecurity.com>

2026-06-01 14:26:10 +09:00

pewdiepie-archdaemon

e5c99a5eee

Odysseus v1.0

2026-05-31 23:58:26 +09:00

3 Commits