Scope email calendar extraction to account owner

The email auto-calendar pass (settings.email_auto_calendar / the
extract_email_events task) scans recently received mail and lets an LLM
create / update / cancel calendar events. Two problems made it a cross-tenant,
remotely triggerable hole:

1. No owner scoping. _auto_summarize_pass(account_id=None) fans out over EVERY
   enabled account of EVERY user. For each message it fetched an upcoming-events
   snapshot with NO owner filter (all tenants' events) and handed those uids +
   titles to the extraction LLM, then executed the model's ops via
   do_manage_calendar(...) with owner=None. do_manage_calendar only filters by
   owner when owner is not None, so create/update/delete ran across ALL users'
   calendars. Net: every user's event titles/times were disclosed to the model,
   and the model could cancel/move/duplicate any tenant's events by uid.

2. No prompt-injection wrapping. The raw email From/Subject/body were
   interpolated straight into an instruction-shaped extraction prompt (unlike
   the chat path, which wraps external text via src/prompt_security). Anyone
   who can email a user whose instance has auto-calendar enabled could inject
   operations: create attacker-controlled "meeting" events (the path even
   auto-harvests URLs from the body into the event location/description — a
   phishing primitive) or cancel/modify the victim's real events, with zero
   human in the loop.

Fix:
- Add core.database.get_upcoming_events(owner) and use it for the snapshot, so
  the LLM only ever sees the processed account owner's events.
- Look up the EmailAccount owner in _auto_summarize_pass_single and pass owner=
  to every do_manage_calendar call, so create/update/delete are scoped to that
  user (owner=None stays the single-user / legacy escape hatch).
- Tell the extraction model the email is untrusted data and not to follow
  instructions inside it (defense-in-depth against injection).

Add tests/test_calendar_owner_scope.py: get_upcoming_events returns only the
given owner's events (and everything when owner is None). Fails against the old
unscoped query.
This commit is contained in:
Collin
2026-06-01 10:12:32 -04:00
committed by GitHub
parent 11c2931efb
commit 70a71f603c
3 changed files with 101 additions and 26 deletions
+26
View File
@@ -1787,6 +1787,32 @@ def get_session_by_id(session_id: str):
with get_db_session() as db:
return db.query(Session).filter(Session.id == session_id).first()
def get_upcoming_events(owner, horizon_days: int = 60, limit: int = 40):
"""Upcoming, non-cancelled events as {uid, title, start} dicts, soonest first.
owner=None means NO owner scoping (single-user / legacy). Multi-user callers
MUST pass the owning username — otherwise they read every tenant's events.
The autonomous email->calendar pass relies on this to avoid disclosing (and
acting on) other users' calendars."""
from datetime import timedelta
now = datetime.utcnow()
with get_db_session() as db:
q = db.query(CalendarEvent).join(CalendarCal).filter(
CalendarEvent.dtstart >= now,
CalendarEvent.dtstart <= now + timedelta(days=horizon_days),
CalendarEvent.status != "cancelled",
)
if owner is not None:
q = q.filter(CalendarCal.owner == owner)
return [
{
"uid": e.uid,
"title": e.summary or "",
"start": e.dtstart.isoformat() if e.dtstart else "",
}
for e in q.order_by(CalendarEvent.dtstart).limit(limit).all()
]
def archive_session(session_id: str):
"""Archive a session"""
with get_db_session() as db: