odysseus

mirror of https://github.com/pewdiepie-archdaemon/odysseus.git synced 2026-06-18 18:55:28 -04:00

Author	SHA1	Message	Date
nubs	1a0e1c5d69	fix(documents): restore PDF library metadata and preview (#2483 ) PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf. Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs. The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none. Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker. Fixes #2468	2026-06-07 23:23:27 +02:00
Vykos	ff4508d396	Scope vision model resolution by owner (#3009 )	2026-06-07 12:39:02 +02:00
ooovenenoso	e163384015	fix: treat Nix files as readable uploads (#2249 )	2026-06-04 12:06:24 +02:00
ghreprimand	8c4ea484a9	Cap inline attachment context across files (#1498 ) Co-authored-by: ghreprimand <203024559+ghreprimand@users.noreply.github.com>	2026-06-03 14:23:43 +09:00
Wes Huber	49885ff9e7	fix(documents): use strip_pdf_content_marker instead of lstrip for PDF auto-open (#1727 ) lstrip("\n[PDF content]:") treats the argument as a character set, not a prefix, so it chews into the following [Page N text]: marker — e.g. turning [Page 1 text]: into "age 1 text]:". The correct helper strip_pdf_content_marker (which uses removeprefix) already exists in the same file and is used by other call sites. Fixes #1663 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-06-03 13:30:04 +09:00
SurprisedDuck	78747b56ca	Documents: strip PDF marker without corrupting text _process_pdf prepends "\n\n[PDF content]:" to extracted text, and two call sites in document_routes.py stripped it with .lstrip("\n[PDF content]:"). str.lstrip(chars) treats its argument as a set of characters, so it keeps eating into the page text that follows the marker — e.g. a body starting with "to the board" loses its leading "to" because 't'/'o' are in the marker's character set. Replace both sites with a shared strip_pdf_content_marker() helper that uses str.removeprefix.	2026-06-02 20:35:27 +09:00
Marius Oppedal Ringsby	f58fbc8b85	Add optional markitdown extraction for Office/EPUB documents (#766 ) Office documents were dropped server-side: .docx fell through to "[Attached document file]", .xlsx/.pptx weren't recognized at all, and the personal-docs RAG index only covered txt/md/json/pdf. Wire the optional markitdown dependency (MIT, Microsoft) into both the chat-attachment path (build_user_content) and the RAG indexer (personal_docs), converting .docx/.xlsx/.pptx/.xls/.epub to Markdown. It is lazy-imported with graceful fallback (mirrors src/pdf_runtime.py): without it those formats show an "install to extract" banner and the MIT core is unaffected. pypdf stays the default PDF path. - src/markitdown_runtime.py: optional-dep loader + convert_to_markdown - upload_handler: recognize Office/EPUB extensions + MIME types - document_processor: extract Office docs in the chat else-branch - personal_docs: index Office docs (DEFAULT_EXTENSIONS + dispatch) - requirements-optional.txt + ACKNOWLEDGMENTS.md: pinned markitdown 0.1.5 - tests: markitdown_runtime + office index coverage Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 11:28:52 +09:00
Duarte Antunes	e77d87fa80	Enforce owner checks for upload attachments	2026-06-01 16:47:48 +09:00
Juan Pablo Jiménez	4a04068818	Fix vision attachment timeout and stale cache Increase local vision model timeout and avoid caching transient VL failure placeholders.\n\nCloses #202.	2026-06-01 02:04:46 +00:00
pewdiepie-archdaemon	e5c99a5eee	Odysseus v1.0	2026-05-31 23:58:26 +09:00

10 Commits