Agent email safety: stage drafts for user approval instead of auto-send

Closes the auto-send hole that let earlier models invent signatures
(e.g. signing 'David' for a user named Felix) and SMTP them to real
recipients before the user could review.

New setting: agent_email_confirm (default True).

When on, the MCP send_email and reply_to_email tools no longer SMTP
directly — they write the composed email to scheduled_emails with a new
status 'agent_draft' (far-future send_at so the scheduled-send poller
ignores them) and return a {pending: true, pending_id, to, subject,
body, message: ...} payload. The model surfaces that to the user.

Backend endpoints to approve / cancel:
- GET    /api/email/pending          → list staged drafts for the owner
- POST   /api/email/pending/{id}/approve → flip status to 'pending' +
                                           backdate send_at so the
                                           existing scheduled-send
                                           poller delivers immediately
- DELETE /api/email/pending/{id}     → status = 'cancelled'

UI:
- Settings / AI Defaults gets a new 'Email Safety' card with the
  toggle, default on.
- Tool descriptions for send_email and reply_to_email now include the
  pending behavior + an explicit 'DO NOT invent a signature, do not
  type a person's name' guardrail.

Pass 2 (next): inline chat card with Send / Discard buttons so the user
doesn't have to type a confirmation reply. Today's prompt + the listing
endpoint give the model a clean path to surface drafts.
This commit is contained in:
pewdiepie-archdaemon
2026-06-11 08:50:06 +09:00
parent 2b1e2e9e20
commit bc2d934b94
6 changed files with 212 additions and 3 deletions
+6 -2
View File
@@ -416,7 +416,9 @@ Notes, checklists, AND user reminders. Use this for "create/add/write a note", t
```send_email
{"to": "recipient@example.com", "subject": "Re: Your question", "body": "Hi, ...", "account": "gmail"}
```
Send a new email via SMTP. Use `resolve_contact` first if you only have a name. If multiple email accounts exist, call `list_email_accounts` first and pass the chosen `account`.""",
Send a new email via SMTP. Use `resolve_contact` first if you only have a name. If multiple email accounts exist, call `list_email_accounts` first and pass the chosen `account`.
CRITICAL — signatures: DO NOT invent a sign-off name. End the body with just `Thanks,` or similar — never type a person's name unless the user explicitly told you what to sign as. When `agent_email_confirm` is on (default), the tool returns `{pending: true, pending_id: ...}` and stages the email for the user to approve in the chat UI instead of SMTPing immediately.""",
"list_emails": """\
```list_emails
{"folder": "INBOX", "max_results": 20, "unread_only": false, "account": "gmail"}
@@ -427,7 +429,9 @@ List recent emails from a folder, newest first, including read messages by defau
```reply_to_email
{"uid": "1234", "body": "Sounds good — talk Friday.", "account": "gmail"}
```
SEND a reply email immediately by UID. Do not use this for "open a reply" or "start a reply" — those should use `ui_control` with `open_email_reply <uid> <folder> reply` to open the email draft document. For follow-up requests like "reply ..." after reading/listing email where the user clearly wants to send now, use the exact UID and account from the latest `read_email`/`list_emails` result. Never invent UID `1`. Threads automatically (In-Reply-To/References handled).""",
SEND a reply email immediately by UID. Do not use this for "open a reply" or "start a reply" — those should use `ui_control` with `open_email_reply <uid> <folder> reply` to open the email draft document. For follow-up requests like "reply ..." after reading/listing email where the user clearly wants to send now, use the exact UID and account from the latest `read_email`/`list_emails` result. Never invent UID `1`. Threads automatically (In-Reply-To/References handled).
CRITICAL — signatures: DO NOT invent a sign-off name. End the body with just `Thanks,` or similar — never type a person's name unless the user explicitly told you what to sign as. When `agent_email_confirm` is on (default), the tool returns `{pending: true, pending_id: ...}` and stages the email for the user to approve in the chat UI instead of SMTPing immediately.""",
"bulk_email": """\
```bulk_email
{"action": "delete", "uids": ["10997", "10998"], "folder": "INBOX", "account": "Gmail"}
+8
View File
@@ -29,6 +29,14 @@ def _invalidate_caches():
# ── Default values ──
DEFAULT_SETTINGS = {
# Agent email safety: when True, the MCP send_email / reply_to_email
# tools don't SMTP directly. They stage the composed message into the
# scheduled_emails table with status='agent_draft' and return a
# pending_id + the rendered email so the user can review and approve
# (or cancel) before it actually goes out. Default ON because models
# have been observed inventing signatures and sending to real
# recipients without confirmation.
"agent_email_confirm": True,
"image_gen_enabled": False,
"image_model": "",
"image_quality": "medium",