Commit Graph

19 Commits

Author SHA1 Message Date
Lucas Daniel f4e8990635 chore: add warnings to silent except Exception blocks (#3212)
* log(app): add warnings to silent except Exception blocks

- Internal tool auth header failure now logs a warning instead of
  silently passing, making auth bypass easier to spot in logs.
- Token last_used_at update failure now logs at DEBUG (fire-and-forget,
  non-critical, but useful when debugging token tracking issues).
- Image ownership verification failure now logs a warning so unexpected
  access-check errors surface instead of silently allowing the request.

* log(chat_routes): add warnings to silent except Exception blocks

- clear_orphaned_session_endpoint: log before rollback so failures
  appear in traces when users see stale/deleted model options.
- _endpoint_has_model (JSON parse): log malformed cached_models instead
  of silently treating endpoint as valid.
- _has_any_visible_model (JSON parse): log malformed cached_models
  instead of silently returning empty list.
- timezone header parse: log failure so time-zone-related tool bugs
  (wrong scheduled times, calendar events) are traceable.
- attachments JSON parse: log failure so silently-dropped attachments
  are visible in server logs.

* log(email_routes): add warnings to silent except Exception blocks

- Email alias resolution failure now logs a warning instead of silently
  returning an empty list, making broken account configs diagnosable.

* log(document_routes): add warnings to silent except Exception blocks

- Export ZIP request body parse failure now logs a warning so empty
  exports caused by malformed requests are diagnosable.
- clear_active_document failure on detach now logs a warning to help
  trace doc re-injection bugs like #1160.

* log(agent_loop): add warnings to silent except Exception blocks

- builtin tool overrides load failure now logs a warning so misconfigured
  settings don't silently fall back to defaults without a trace.
- Timezone context injection failure now logs a warning to help debug
  incorrect scheduled times in agent-created tasks.
- PDF form-backed document detection failure now logs a warning so
  broken form-doc UI is traceable to the root cause.

* log(llm_core): add warnings to silent except Exception blocks

- Malformed URL in _is_ollama_native_url now logs a warning so bad
  endpoint configs are traceable instead of silently returning False.
- Model list fetch failure now logs a warning with the endpoint URL so
  endpoints that silently vanish from the model picker are diagnosable.

* log: pass exception via exc_info instead of string interpolation

* fix(logging): avoid logging raw URLs in llm_core error paths

Drop the raw url/base_chat_url from the Ollama-detection and
model-list-fetch warning logs added by this sweep, since these values
can contain private hostnames, internal IPs, credentials, or other
deployment details.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-15 17:49:27 +01:00
Yeoh Ing Ji 3e49658204 refactor(tools): extract document tools to handle registry (#3666)
* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* feat(tools): add document management tool handlers to the agent_tools module

* feat(tools): extraced document tools for create, update, edit, suggest, and manage from tool_implementations.py

* feat(tests): refactor document tool tests to use TOOL_HANDLERS and document_tools

* refactor(tools): add document tool dispatcher and updated tool calling path

* refactor(tools): remove duplicated document management functions

* refactor(tools): removing unused functions and adding new import paths

* refactor(tools): update document tool execute methods to use context dictionary

* refactor(tests): update import paths for document tools in test files

* refactor(tests): update owner parameter format in document management tests

* refactor(tests): update import path for _owned_document_query

* refactor: update import paths for document tools

* fix(tests): correct source path for document ID test
2026-06-10 10:41:52 +02:00
Mike ac94885c84 refactor(constants): single source of truth for data dir (#3368)
* refactor(constants): single source of truth for data dir + merge core/src constants

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(contributing): use named src.constants for data paths, drop core/constants references

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 09:58:52 +02:00
nubs 1a0e1c5d69 fix(documents): restore PDF library metadata and preview (#2483)
PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf.

Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs.

The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none.

Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker.

Fixes #2468
2026-06-07 23:23:27 +02:00
Kenny Van de Maele 613bbb0dba fix: port main-only fixes to dev (#2761 sharpen auth, #2762 doc version 404) (#3303)
* fix(gallery): add auth check to /api/image/sharpen endpoint (#2761)

Every other image-processing endpoint (denoise, upscale, remove-bg,
enhance-face, inpaint, harmonize) calls require_privilege(request,
"can_generate_images"). The sharpen endpoint was missing this check,
allowing unauthenticated users to trigger CPU-intensive image processing.

* fix(document): add 404 guard to version list/get endpoints (#2762)

list_versions and get_version used a soft 'if doc:' guard that skipped
ownership verification when the Document row was missing (e.g. after
hard delete). Orphaned DocumentVersion rows would be returned to any
caller without auth. Now raises 404 when the parent document is gone,
matching the pattern already used in restore_version.

---------

Co-authored-by: Ernest Hysa <59969602+ErnestHysa@users.noreply.github.com>
2026-06-07 17:19:24 +02:00
Vykos 06d28e23ac Scope document session links by owner (#3005) 2026-06-07 12:47:20 +02:00
Vykos 3cff06781e Scope model helper endpoint resolution (#3007) 2026-06-07 12:40:23 +02:00
Vykos ff4508d396 Scope vision model resolution by owner (#3009) 2026-06-07 12:39:02 +02:00
ooovenenoso c9d0c6db18 fix: quote IMAP mailbox arguments (#2170)
* fix: quote IMAP mailbox arguments

* fix: quote MCP move destinations

---------

Co-authored-by: Kevin <120500656+oooindefatigable@users.noreply.github.com>
2026-06-05 16:00:20 +02:00
Kenny Van de Maele e9dfd1a747 Remove unused UPLOAD_DIR imports in document_routes
routes/document_routes.py imports UPLOAD_DIR from src.constants in 8
separate function bodies but never uses it (pyflakes: 'imported but
unused' ×8). Drop the dead imports — no behaviour change.
2026-06-04 13:17:21 +02:00
Afonso Coutinho 992866e167 fix: document library language facet undercounts text documents (#1758) 2026-06-03 13:28:38 +09:00
Paulo Victor Cordeiro 441905bc13 fix: guard AI tidy verdict against non-string LLM output (#1486)
The AI document-tidy endpoint parses verdicts from LLM JSON output
and calls .lower().strip() directly. If the model returns null or a
non-string element, this crashes with AttributeError. Coerce to str
so malformed output is treated as 'keep' instead of crashing.
2026-06-03 08:14:10 +09:00
lekt8 adde94e430 fix: closed document stays active & leaks into new chats (#1160) (#1238)
* fix: closed document no longer stays active and leaks into new chats (#1160)

Closing a document tab calls _detachDocFromSession: a doc with content is
PATCHed to session_id="" (unlinked, session_id -> NULL, is_active stays True),
an empty one is DELETEd. But the in-memory active-document pointer
(tool_implementations._active_document_id) was never cleared on either path.

The chat doc-injection last-resort looks up that pointer by id and injects it
when `not cand.session_id or cand.session_id == session`. An unlinked doc has
session_id NULL, so the stale pointer re-surfaced a closed document in later,
unrelated chats — the agent kept reading/suggesting edits to a doc the user
had closed.

Fix: add clear_active_document(doc_id) and call it when a document is unlinked
(PATCH session_id="") or deleted, so the pointer no longer resurrects a closed
document. clear_active_document only clears when the id matches (or no id), so a
different active doc is left untouched.

Covered by tests/test_active_document_clear.py (4 cases).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: add route-level regression for #1160 (detach/delete clears active doc)

Per review: prove the actual API path, not just the helper. Drives
PATCH /api/document/{id} (session_id="") and DELETE /api/document/{id}
through TestClient against a temp SQLite DB under real owner routing, and
asserts get_active_document() is cleared (and untouched when a different
document is closed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: make #1160 route regression hang-proof and dev-DB-independent

The route test could hang in other environments: it set DATABASE_URL at import
time, which is ignored if core.database was already imported, so it fell back to
the real dev DB and could contend for its locks (maintainer saw it hang, exit
124).

Rebind to a DEDICATED temporary SQLite engine (NullPool) and patch the document
route module's SessionLocal to it via an autouse fixture — so the test never
touches the dev DB and is independent of import order. Runs in ~0.3s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test: drive #1160 route regression without TestClient (fixes local hang)

The route test used Starlette TestClient (middleware app + threadpool), which
hung in the maintainer's environment. Rework it to call the async route handlers
directly — extracted from the router — with a minimal fake request against a
temp-SQLite-patched SessionLocal. Same real coverage (handler + DB + owner
routing), but it completes reliably (~0.3s) with no TestClient/threadpool.

Verified the maintainer's exact batch now passes:
  pytest tests/test_document_close_clears_active_route.py \
         tests/test_active_document_clear.py \
         tests/test_document_tool_owner_scope.py  -> 14 passed

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 01:47:13 +09:00
SurprisedDuck 78747b56ca Documents: strip PDF marker without corrupting text
_process_pdf prepends "\n\n[PDF content]:" to extracted text, and two
call sites in document_routes.py stripped it with .lstrip("\n[PDF content]:").
str.lstrip(chars) treats its argument as a *set of characters*, so it keeps
eating into the page text that follows the marker — e.g. a body starting
with "to the board" loses its leading "to" because 't'/'o' are in the
marker's character set. Replace both sites with a shared
strip_pdf_content_marker() helper that uses str.removeprefix.
2026-06-02 20:35:27 +09:00
Duarte Antunes 448401a0fc Harden PDF document markers against cross-owner upload access (#445)
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-01 22:38:14 +09:00
Miles df7d32c70c Require document privilege for PDF imports 2026-06-01 18:28:15 +09:00
red person 2f87dbcfbc Show a clear message when PyMuPDF is missing 2026-06-01 18:27:17 +09:00
Duarte Antunes e77d87fa80 Enforce owner checks for upload attachments 2026-06-01 16:47:48 +09:00
pewdiepie-archdaemon e5c99a5eee Odysseus v1.0 2026-05-31 23:58:26 +09:00