mirror of
https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-28 15:45:22 -04:00
Isolate untrusted context from visible user prompts (#3584)
Prevent untrusted source/context guard text from being merged into the current visible user request during provider message sanitization. Changes: - Detect untrusted context blocks during LLM message sanitization - Insert a short assistant boundary before the current user request - Keep the visible user prompt as its own user message - Preserve normal consecutive user-message merging for non-untrusted cases - Strengthen prompt-security wording to avoid mentioning guard wrappers - Add regression coverage for untrusted context followed by a user prompt Notes: - Untrusted context remains role:user for safety - This does not add prompt debug logging - This does not change frontend draft persistence
This commit is contained in:
@@ -38,6 +38,8 @@ def test_untrusted_context_policy_marks_sources_as_data():
|
||||
|
||||
assert "not instructions" in UNTRUSTED_CONTEXT_POLICY
|
||||
assert "overrides" in UNTRUSTED_CONTEXT_POLICY
|
||||
assert "Do not quote" in UNTRUSTED_CONTEXT_POLICY
|
||||
assert "acknowledge untrusted-source wrapper labels" in UNTRUSTED_CONTEXT_POLICY
|
||||
|
||||
|
||||
# ── secret_storage ─────────────────────────────────────────────
|
||||
|
||||
Reference in New Issue
Block a user