* refactor(constants): single source of truth for data dir + merge core/src constants
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* docs(contributing): use named src.constants for data paths, drop core/constants references
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf.
Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs.
The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none.
Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker.
Fixes#2468
* fix(gallery): add auth check to /api/image/sharpen endpoint (#2761)
Every other image-processing endpoint (denoise, upscale, remove-bg,
enhance-face, inpaint, harmonize) calls require_privilege(request,
"can_generate_images"). The sharpen endpoint was missing this check,
allowing unauthenticated users to trigger CPU-intensive image processing.
* fix(document): add 404 guard to version list/get endpoints (#2762)
list_versions and get_version used a soft 'if doc:' guard that skipped
ownership verification when the Document row was missing (e.g. after
hard delete). Orphaned DocumentVersion rows would be returned to any
caller without auth. Now raises 404 when the parent document is gone,
matching the pattern already used in restore_version.
---------
Co-authored-by: Ernest Hysa <59969602+ErnestHysa@users.noreply.github.com>
routes/document_routes.py imports UPLOAD_DIR from src.constants in 8
separate function bodies but never uses it (pyflakes: 'imported but
unused' ×8). Drop the dead imports — no behaviour change.
The AI document-tidy endpoint parses verdicts from LLM JSON output
and calls .lower().strip() directly. If the model returns null or a
non-string element, this crashes with AttributeError. Coerce to str
so malformed output is treated as 'keep' instead of crashing.
* fix: closed document no longer stays active and leaks into new chats (#1160)
Closing a document tab calls _detachDocFromSession: a doc with content is
PATCHed to session_id="" (unlinked, session_id -> NULL, is_active stays True),
an empty one is DELETEd. But the in-memory active-document pointer
(tool_implementations._active_document_id) was never cleared on either path.
The chat doc-injection last-resort looks up that pointer by id and injects it
when `not cand.session_id or cand.session_id == session`. An unlinked doc has
session_id NULL, so the stale pointer re-surfaced a closed document in later,
unrelated chats — the agent kept reading/suggesting edits to a doc the user
had closed.
Fix: add clear_active_document(doc_id) and call it when a document is unlinked
(PATCH session_id="") or deleted, so the pointer no longer resurrects a closed
document. clear_active_document only clears when the id matches (or no id), so a
different active doc is left untouched.
Covered by tests/test_active_document_clear.py (4 cases).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: add route-level regression for #1160 (detach/delete clears active doc)
Per review: prove the actual API path, not just the helper. Drives
PATCH /api/document/{id} (session_id="") and DELETE /api/document/{id}
through TestClient against a temp SQLite DB under real owner routing, and
asserts get_active_document() is cleared (and untouched when a different
document is closed).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: make #1160 route regression hang-proof and dev-DB-independent
The route test could hang in other environments: it set DATABASE_URL at import
time, which is ignored if core.database was already imported, so it fell back to
the real dev DB and could contend for its locks (maintainer saw it hang, exit
124).
Rebind to a DEDICATED temporary SQLite engine (NullPool) and patch the document
route module's SessionLocal to it via an autouse fixture — so the test never
touches the dev DB and is independent of import order. Runs in ~0.3s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* test: drive #1160 route regression without TestClient (fixes local hang)
The route test used Starlette TestClient (middleware app + threadpool), which
hung in the maintainer's environment. Rework it to call the async route handlers
directly — extracted from the router — with a minimal fake request against a
temp-SQLite-patched SessionLocal. Same real coverage (handler + DB + owner
routing), but it completes reliably (~0.3s) with no TestClient/threadpool.
Verified the maintainer's exact batch now passes:
pytest tests/test_document_close_clears_active_route.py \
tests/test_active_document_clear.py \
tests/test_document_tool_owner_scope.py -> 14 passed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_process_pdf prepends "\n\n[PDF content]:" to extracted text, and two
call sites in document_routes.py stripped it with .lstrip("\n[PDF content]:").
str.lstrip(chars) treats its argument as a *set of characters*, so it keeps
eating into the page text that follows the marker — e.g. a body starting
with "to the board" loses its leading "to" because 't'/'o' are in the
marker's character set. Replace both sites with a shared
strip_pdf_content_marker() helper that uses str.removeprefix.
Route PDF lookups through UploadHandler.resolve_upload, reject poisoned pdf_source markers on document create/update, and add regression tests.
Co-authored-by: Cursor <cursoragent@cursor.com>