odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-15 17:25:26 -04:00

Author	SHA1	Message	Date
Hriday Ranka	270b8570fc	feat(email): add Google OAuth2 for Google Workspace / .edu IMAP & SMTP (#237 ) * feat(email): add Google OAuth2 for Google Workspace / .edu IMAP & SMTP Google deprecated basic-auth (password) access for Google Workspace accounts in May 2025. This means any .edu or org Google email account could no longer connect via IMAP/SMTP with a username + password — the email feature was silently broken for a large class of users. This PR adds full OAuth2 (XOAUTH2) support for Google accounts so Workspace / .edu emails work out of the box. ## What changed ### Backend - `core/database.py`: add `oauth_provider`, `oauth_access_token`, `oauth_refresh_token`, `oauth_token_expiry`, and `display_name` columns to `EmailAccount` + idempotent migration - `routes/email_helpers.py`: XOAUTH2 auth in `_imap_connect()` and `_send_smtp_message()`, automatic token refresh, OAuth fields in `_get_email_config()` - `routes/email_routes.py`: OAuth authorize + callback routes, `_smtp_ready()` fix, OAuth fields through `_deliver()` closure, `display_name` in `From:` header ### Frontend - `static/js/settings.js`: "Google Workspace / .edu" provider preset, "Connect with Google" button, success/error banner, display name field - `static/js/document.js`: `_accountCanSend()` recognises OAuth accounts as SMTP-capable * security: sign OAuth state, scope callback by owner, fix quotes & logs Addresses reviewer feedback on the email OAuth2 PR: - OAuth state is now HMAC-SHA256 signed (keyed with the app secret from secret_storage) encoding account_id + owner + a random nonce, and is verified with constant-time comparison in the callback before any token write. Replaces the bare account_id state, closing the CSRF / state-guessing gap. - Callback extracts the owner from the verified state and re-checks it against EmailAccount.owner before writing tokens, matching the ownership guards used elsewhere in the email routes. Single-user mode (owner == "") still accepts any account, consistent with _assert_owns_account. - Replaced curly/smart quotes in the Name/Email/Display Name input rows with plain ASCII so getElementById lookups and event wiring work. - Stripped account name, SMTP host/user, owner, and raw provider error text from send-config and OAuth logs; failures now surface as generic error codes in the redirect instead of raw exception strings. * test(email): add OAuth2 state, _smtp_ready, and XOAUTH2 tests Move the OAuth state sign/verify helpers out of the setup_email_routes closure into module-level make_oauth_state/verify_oauth_state in email_helpers.py so they can be unit-tested, then add tests/test_email_oauth.py: - signed state round-trips account_id + owner, nonce is unique per call - tampered account_id, forged signature, and garbage states are rejected - _smtp_ready treats an OAuth account (no password) as send-capable, and still rejects host+user-only accounts with neither password nor OAuth - _xoauth2_string / _xoauth2_bytes produce the correct SASL XOAUTH2 framing 14 new tests; existing test_security_regressions.py still passes (28). * refactor(email): single XOAUTH2 frame helper, use RuntimeError Polish from self-review before merge: - Collapse the XOAUTH2 framing to one source of truth: _xoauth2_raw() returns the unencoded SASL string used by both the SMTP and IMAP auth callbacks (each library base64-encodes it), and _xoauth2_bytes() is just its .encode(). Removes the unused base64 _xoauth2_string helper and the duplicated inline frame in _send_smtp_message. - Raise RuntimeError (not bare Exception) for the "OAuth token unavailable" path, matching the convention used across src/. - Update tests accordingly. All 14 OAuth tests + 28 security regressions pass; SMTP/IMAP XOAUTH2 verified live against a real Workspace account. * tests(email-oauth): cover the security-sensitive OAuth paths before merge The previous tests only exercised pure helpers (state signing, _smtp_ready, XOAUTH2 framing). This adds coverage for the actual token-custody and ownership behaviour, pinning the real route handlers rather than re-implementations of their logic. Real OAuth callback route (pulled live from setup_email_routes()): - missing code -> generic missing_code redirect, no account id / owner in URL - provider error -> generic google_error redirect, raw error not echoed - tampered/invalid state -> invalid_state redirect, auth code never leaked - signed state with owner mismatch -> token write refused (ownership_error), DB row left untouched - signed state with matching owner -> tokens written encrypted, and only to the intended account (a second account stays untouched) Real accounts-list route: - exposes oauth_provider status but never the access/refresh token values, encrypted or otherwise Token storage / refresh helpers (isolated in-memory SQLite, mocked HTTP): - refreshed access token stored encrypted; expiry is a timestamp, not a token - fresh token uses cache (no refresh call); expired token triggers refresh - refresh HTTP failure returns None silently, no exception or secret surfaced - missing client credentials short-circuits to None Password-account regression: - password IMAP accounts call conn.login(); OAuth accounts call XOAUTH2 authenticate() and never login() 28 tests pass (14 prior + 14 new). * fix(email-oauth): drop raw exception text from token-refresh log Google token refresh failures now log the account id only, matching the conservative logging used elsewhere on the OAuth path — no raw provider/exception details surfacing in logs. * fix(email-oauth): bring OAuth UI parity to the Integrations email form The Google Workspace / .edu provider preset, Display Name field, and Connect-with-Google flow were only wired into the Email-tab account form. The Integrations-tab form (a separate code path for the same account type) was missing all three, so the OAuth option was invisible from that entry point. Mirrors the same PROVIDERS entry, OAuth section, and connect handler so both forms behave identically. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-15 17:02:58 +01:00
Achilleas90	ffc0f1dccc	Harden CalDAV write-back with retries (#1193 ) Co-authored-by: Alexandre Teixeira <111787685+alteixeira20@users.noreply.github.com>	2026-06-15 15:59:31 +09:00
Shashwat Deep	e384c5a2a6	fix(db): close sqlite migration connections on exception paths (#3600 ) The _migrate_* startup helpers in core/database.py opened a raw sqlite3.connect() inside a try and called conn.close() as the last statement in that try. If any earlier statement raised (locked DB, unexpected schema, a failed ALTER), close() was skipped and the bare except only logged the error — leaking the connection (file handle + lock) for the lifetime of the process. These migrations run on every startup. Wrap each in the conn = None + try/except/finally pattern already used by _migrate_chat_messages_fts in this same file, so the connection is closed on all exit paths. 25 functions; no change on the success path. Helpers that already close safely are left untouched: _migrate_chat_messages_fts and _migrate_backfill_task_folders (the latter uses SQLAlchemy's engine.connect() context manager). Same bug class as the previously merged DB-connection-leak fix (#64) and the IMAP logout-on-all-paths fix (#1530).	2026-06-10 17:03:01 +02:00
stocky789	1e0d9b92af	feat: add ChatGPT Subscription provider (#2876 ) * feat: Add ChatGPT Subscription support and related features - Introduced a new provider option for ChatGPT Subscription in the endpoint selection UI. - Implemented OAuth flow for ChatGPT Subscription sign-in, including polling for authorization status. - Updated admin interface to handle ChatGPT Subscription, including disabling API key input and providing user guidance. - Enhanced cost tracking logic to differentiate between subscription and non-subscription endpoints. - Added new slash commands for managing skills, including listing, searching, and invoking skills. - Implemented caching for skill catalog to optimize performance. - Updated tests to cover new ChatGPT Subscription functionality and ensure proper endpoint probing. - Refactored existing code to accommodate new features and improve maintainability. * refactor: share provider device-flow setup - reuse one device-flow backend for Copilot and ChatGPT Subscription - add one frontend device-flow helper for Settings and /setup - put GitHub Copilot back into Add Models, now as a dropdown option - make provider selection just select; clicking Add starts sign-in - stop ChatGPT Subscription setup from opening auth tabs automatically - make /setup copilot and /setup chatgpt-subscription work from chat - show ChatGPT Subscription in the /setup suggestions - show the real error message when setup fails - add focused tests for the shared flow and setup UI * feat(chatgpt-subscription): harden credential lifecycle and streamline auth UX Backend: - Resolve runtime bearer for provider-auth endpoints at probe time via a shared _resolve_probe_key() that delegates to resolve_endpoint_runtime, applied across all probe/refresh call sites. - Skip live completion probes and health pings for discovery-only providers (centralized behind _is_discovery_only_provider) — the Codex/Responses API has no such endpoints, so status is derived from cached models. - Never persist the short lived ChatGPT bearer to the plaintext sessions table; proactively clear any stale bearer left by an earlier code path. - Revoke orphaned ProviderAuthSession credentials when the last endpoint backing them is deleted (_delete_orphaned_provider_auth), surfaced via cleared_provider_auth in the delete response. Frontend (admin.js): - Auto-start the device-auth flow on provider selection so the authorization panel (code + Authorize) shows immediately instead of behind a "Sign in" click. - Remove the redundant top button for device auth providers, move retry into the panel via an inline "Try again". - Drop the self-evident hint text and add an execCommand clipboard fallback so Copy works in non-secure (HTTP/LAN) contexts. * fix: harden chatgpt subscription provider * chore: remove PR media from branch * Fix chatgpt subscription recovery and token handling --------- Co-authored-by: 5p00kyy <admin@5p00ky.dev>	2026-06-08 10:19:18 +02:00
Mike	ac94885c84	refactor(constants): single source of truth for data dir (#3368 ) * refactor(constants): single source of truth for data dir + merge core/src constants Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(contributing): use named src.constants for data paths, drop core/constants references Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 09:58:52 +02:00
danielroytel	5d3e3c7053	feat(tasks): assign folder='Tasks' at creation + backfill migration (#2834 ) * feat: assign folder='Tasks' to task sessions at creation Task sessions (LLM, action, research) now set folder='Tasks' on their DbSession row, matching the pattern used by the Assistant folder. This enables sidebar lens filtering without changing existing session behaviour. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add backfill script for task session folders One-shot script to set folder='Tasks' on existing [Task]/[Research] sessions that predate the folder assignment in task_scheduler.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: replace standalone backfill script with automatic migration Convert scripts/backfill_task_folders.py into _migrate_backfill_task_folders() in core/database.py, called from init_db(). The migration is idempotent (only touches rows where folder IS NULL/empty) and runs automatically on upgrade, so operators no longer need a manual step to tag pre-existing task sessions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 15:33:17 +02:00
Nicholai	463713c2c6	feat(search): unify session transcript search (#2877 )	2026-06-05 18:08:31 -06:00
Logan Davis	ad82ee1c83	feat(calendar): support multiple CalDAV accounts (#2942 ) * feat(calendar): support multiple CalDAV accounts Replaces the single CalDAV credential slot with a named account list so users can sync both a personal and work calendar simultaneously. - Add `account_id` column to `CalendarCal` + startup migration - `_load_caldav_accounts()` in caldav_sync.py reads `caldav_accounts` list from prefs, auto-migrating the legacy single `caldav` key on first use (no user action required) - `sync_caldav()` iterates all accounts and aggregates counts/errors - `writeback_event()` resolves credentials via `CalendarCal.account_id`, falling back to the first account for legacy rows - New REST endpoints: GET/POST/PUT/DELETE `/api/calendar/config/accounts` - Legacy GET/POST `/api/calendar/config` preserved for backward compat - Settings UI: one card per account with Label, URL, Username, Password fields; Test button works for both unsaved (inline creds) and saved (by account_id) accounts; delete removes only that account - Update test_caldav_url_hardening.py mock to include `_save_for_user` and updated `_sync_blocking` signature * fix(calendar): restore #2765 PK scoping and #2819 writeback URL validation Two regressions introduced by the multi-account refactor: 1. PK collision (#2765): _stable_cal_id was back to hashing only the URL, so two users — or one user with two accounts on the same server — would collide on the primary key. Restore owner+account_id in the hash key (format: "{owner}\n{account_id}\n{url}") and thread both values through _sync_blocking → _writeback_blocking → push_event → find_remote_calendar so the hash round-trips correctly on write-back. 2. URL validation dropped (#2819): _load_caldav_accounts imported _save_for_user at function scope, causing an ImportError on test mocks that only provide _load_for_user, which prevented writeback_event from reaching the validate_caldav_url call. Move the import inside the migration branch and wrap in try/except (best-effort save; next call re-migrates from the still-present legacy key). Update fake_writeback_blocking in test_caldav_writeback.py to accept the new owner/account_id optional params.	2026-06-05 20:32:50 +02:00
L1	f8cf791491	fix(caldav): don't prune locally-created events on sync (#2706 ) The CalDAV pull prunes events in the synced calendar+window whose UID the server didn't just return, to propagate upstream deletions. But CalendarEvent had no field distinguishing a server-pulled row from a locally-created one, so the prune also deleted events that were never on the server: events created by the agent / email triage (which never write back to the server) and UI events whose best-effort write-back failed. Result: silent, unrecoverable loss of the user's appointments (hard db.delete, no soft-delete). Add an 'origin' column to calendar_events (lightweight idempotent migration, mirroring _migrate_add_calendar_is_utc), set origin='caldav' on rows the sync inserts/updates, and gate the prune on origin == 'caldav'. Locally-created events carry origin NULL and are never pruned. On the first sync after the migration nothing is pruned (all rows NULL until re-marked), erring toward keeping data. Fixes #2704	2026-06-05 02:48:03 +02:00
Abylaikhan Zulbukharov	1d80bf5e65	feat(mcp): add Streamable HTTP transport with OAuth 2.0 (#1033 ) * feat(mcp): add Streamable HTTP transport with OAuth 2.0 Odysseus could only reach MCP servers over stdio and SSE, so modern remote servers like https://mcp.higgsfield.ai/mcp (Streamable HTTP, gated behind OAuth) could not be connected. Add an `http` transport that connects via the SDK's streamablehttp_client and authenticates with the SDK's OAuthClientProvider: RFC 9728 protected-resource discovery, RFC 8414 authorization-server metadata, Dynamic Client Registration, authorization-code + PKCE, and token refresh. A small bridge (src/mcp_oauth.py) connects the SDK's blocking callback to the existing web callback route via an asyncio.Future keyed by the OAuth `state`, and the dynamic client registration plus tokens persist per-server in a new encrypted `oauth_tokens` column. The connect runs as a bounded background task so the "Add server" request returns immediately; redirect_handler publishes needs_auth + auth_url to connection state as soon as discovery/DCR completes (which can exceed the bounded wait), and the UI polls until connected. Remote users finish via the existing paste-back flow. The Google OAuth path is left unchanged. - core/database.py: encrypted oauth_tokens column + migration - src/mcp_oauth.py: OAuth provider, DB-backed TokenStorage, state registry - src/mcp_manager.py: http dispatch, background connect, _connect_http - routes/mcp_routes.py: http validation, needs_auth/auth_url, callback bridge - static/js/settings.js: Streamable HTTP option + OAuth flow with polling - tests: 5 new unit tests (transport dispatch, registry, token storage) Verified against the live Higgsfield server: discovery, DCR (client_id issued), loopback redirect accepted, and a PKCE authorization URL with needs_auth status. No regressions (full suite delta is only the 5 added passing tests). * fix(mcp): address PR #1033 review feedback - mcp_oauth: derive redirect URI from OAUTH_REDIRECT_BASE_URL/APP_PUBLIC_URL (default http://localhost:7000) instead of hardcoding the port - mcp_oauth: leave OAuth scope unset so the SDK derives it from the server's WWW-Authenticate/protected-resource metadata; hardcoding an OIDC scope broke non-OpenID MCP servers (verified: Higgsfield still gets its server-derived scope) - mcp_oauth: prune abandoned OAuth flows (_prune_stale + _pending_ts) so the module-level registries can't grow unbounded - mcp_oauth: persist tokens/client-info in a single DB session/commit (_update) instead of a load+save double round-trip - mcp_manager: cancel and drop the background connect task in disconnect_server so a deleted server stops publishing status - database: document why the oauth_tokens migration uses TEXT while the model declares EncryptedText (encryption is applied at the Python layer) - settings.js: surface persistent OAuth-poll failures and an explicit timeout message instead of silently swallowing errors - tests: cover the stale-flow pruning * static/js/settings.js now shows an in-flight loading state on the buttons that fire requests:	2026-06-05 02:40:52 +02:00
Yuri	a2e691da2b	fix(models): stabilize proxy endpoint refresh behavior * fix: support large proxy model endpoint refresh Large OpenAI-compatible proxy endpoints can expose hundreds of models and make /v1/models slow. Treating those endpoints like local model servers caused model picker opens and background probes to repeatedly hit /models, producing timeouts and making otherwise usable endpoints appear offline. Make model endpoint discovery cached-first for normal UI usage, add explicit proxy/API classification and refresh policy fields, exclude proxy/API endpoints from aggressive local probing, and preserve cached models when refresh fails. Manual Test/Add/Refresh actions still fetch the full model list with longer timeouts so users can intentionally import large proxy model lists without blocking normal model picker usage. * fix: preserve endpoint ping status semantics	2026-06-04 04:56:11 +01:00
ghreprimand	82fcec6bb6	Replace core database utcnow defaults (#1457 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-04 02:50:19 +01:00
pewdiepie-archdaemon	6861c41580	Reapply "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus" This reverts commit `cc8fe2f6e3`.	2026-06-03 22:47:00 +09:00
pewdiepie-archdaemon	cc8fe2f6e3	Revert "Merge branch 'main' of github.com:pewdiepie-archdaemon/odysseus" This reverts commit `8161c1253d`, reversing changes made to `8c2705b42a`.	2026-06-03 22:46:19 +09:00
Alexandre Teixeira	145f4fd2b4	feat(models): support pinned endpoint model IDs	2026-06-03 13:00:07 +01:00
Tatlatat	8ad436d25a	DB: enable SQLite foreign key cascades * fix(db): enable SQLite foreign keys so ondelete cascades actually fire core/database.py declares DB-level FK actions throughout (ondelete="CASCADE" / "SET NULL"), but SQLite disables foreign-key enforcement per connection by default and the engine had no connect-event listener turning it on. So every one of those ondelete actions was dead. Concrete impact: cleanup_old_sessions() in src/cleanup_service.py removes old sessions with a bulk `query(Session).delete()`, which bypasses the ORM-level relationship cascade and relies solely on the DB-level ondelete="CASCADE" on ChatMessage.session_id. With foreign keys off, the messages are never deleted — they pile up as orphaned rows on every cleanup cycle. Add the standard SQLAlchemy connect listener issuing `PRAGMA foreign_keys=ON`, guarded by `isinstance(conn, sqlite3.Connection)` so it only affects SQLite and leaves other backends untouched. tests/test_sqlite_foreign_keys.py inserts a Session + ChatMessage, deletes the Session via bulk `query().delete()`, and asserts the ChatMessage is cascade-deleted. Fails before this change (orphan remains). * docs(db): clarify FK pragma scope per review; trim test comments Address review feedback on the foreign_keys PRAGMA change: - Note that the class-level connect listener fires for every Engine in the process and is a no-op on non-SQLite backends (isinstance guard). - Warn near init_db() that FK enforcement is now global, so a migration that temporarily violates FK constraints must disable foreign_keys around that work. - Drop the step-by-step narration comments from the regression test. No behavior change.	2026-06-02 20:36:13 +09:00
mechramc	9d0a18a5b5	Email: add explicit SMTP security mode	2026-06-02 13:15:06 +09:00
Collin	70a71f603c	Scope email calendar extraction to account owner The email auto-calendar pass (settings.email_auto_calendar / the extract_email_events task) scans recently received mail and lets an LLM create / update / cancel calendar events. Two problems made it a cross-tenant, remotely triggerable hole: 1. No owner scoping. _auto_summarize_pass(account_id=None) fans out over EVERY enabled account of EVERY user. For each message it fetched an upcoming-events snapshot with NO owner filter (all tenants' events) and handed those uids + titles to the extraction LLM, then executed the model's ops via do_manage_calendar(...) with owner=None. do_manage_calendar only filters by owner when owner is not None, so create/update/delete ran across ALL users' calendars. Net: every user's event titles/times were disclosed to the model, and the model could cancel/move/duplicate any tenant's events by uid. 2. No prompt-injection wrapping. The raw email From/Subject/body were interpolated straight into an instruction-shaped extraction prompt (unlike the chat path, which wraps external text via src/prompt_security). Anyone who can email a user whose instance has auto-calendar enabled could inject operations: create attacker-controlled "meeting" events (the path even auto-harvests URLs from the body into the event location/description — a phishing primitive) or cancel/modify the victim's real events, with zero human in the loop. Fix: - Add core.database.get_upcoming_events(owner) and use it for the snapshot, so the LLM only ever sees the processed account owner's events. - Look up the EmailAccount owner in _auto_summarize_pass_single and pass owner= to every do_manage_calendar call, so create/update/delete are scoped to that user (owner=None stays the single-user / legacy escape hatch). - Tell the extraction model the email is untrusted data and not to follow instructions inside it (defense-in-depth against injection). Add tests/test_calendar_owner_scope.py: get_upcoming_events returns only the given owner's events (and everything when owner is None). Fails against the old unscoped query.	2026-06-01 23:12:32 +09:00
pewdiepie-archdaemon	0888a3b3e6	Add native Windows compatibility layer	2026-06-01 15:09:47 +09:00
Collin	0a7de1fdf4	fix: stop leaking DB connections when persisting session mode (#64 ) chat_routes.py persisted a session's "mode" in three best-effort spots — reading the current mode, writing the effective mode, and setting research_pending on the stream path. Each opened a session with SessionLocal() and called .close() as the LAST statement inside a try/except, so if anything before close() raised (e.g. a SQLite "database is locked" under concurrent chat streams) the except only logged and the connection was never returned to the pool. DATABASE_URL defaults to file-backed SQLite, whose engine uses SQLAlchemy's default QueuePool (5 connections + 10 overflow). Repeated leaks on these hot paths exhaust the pool; later requests then block for pool_timeout and fail with "QueuePool limit ... reached", taking the app down until restart. Move the logic into two best-effort helpers in core.database, next to the existing session helpers (update_session_last_accessed, get_session_by_id): - get_session_mode(session_id) -> Optional[str] - set_session_mode(session_id, mode) -> bool Both route through the existing get_db_session() context manager, which commits on success, rolls back on error, and always closes in a finally, so the connection is returned to the pool on every path. chat_routes.py now calls these instead of hand-rolling sessions, also removing three copies of the same try/except. Add tests/test_session_mode_helpers.py: the helpers commit+close on success and, on a mid-operation DB error, swallow + roll back + close (no leak). The error-path tests fail against the old close()-inside-try pattern.	2026-06-01 13:57:48 +09:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00

21 Commits