Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:
Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
with a cross-link to AI Defaults. Provider hints use real clickable
anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
carries only Endpoint+Model with a cross-link to Search; Vision /
Default / Utility fallbacks unified under one numbered-row design
matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
Permissions toggle that expands the scope-edit panel, in-field
copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
under the button (drop-up on tight viewports); each integration
form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
row at the bottom that loops over every category.
AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
persona's system prompt; empty/unknown falls back to warm-neutral.
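The fallback rule described for dispatch_reminder() could be sketched roughly as below. Only the reminder_llm_persona setting and the warm-neutral default come from this change; the table contents, the "brisk" entry, and the resolve_persona_prompt name are hypothetical placeholders, not the real src/reminder_personas.py.

```python
from typing import Optional

DEFAULT_PERSONA = "warm-neutral"

# Hypothetical stand-in for the built-in persona table; the real prompts
# and the other built-in names are not shown in this change.
PERSONA_PROMPTS = {
    "warm-neutral": "You write short, friendly, neutral reminders.",
    "brisk": "You write terse, to-the-point reminders.",
}


def resolve_persona_prompt(reminder_llm_persona: Optional[str]) -> str:
    """Return the system prompt for the configured persona.

    An empty, missing, or unknown persona key falls back to warm-neutral.
    """
    key = (reminder_llm_persona or "").strip()
    return PERSONA_PROMPTS.get(key, PERSONA_PROMPTS[DEFAULT_PERSONA])
```

Resolving through a single lookup with a default keeps the empty and unknown cases on the same path, so a stale setting value cannot break reminder dispatch.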
Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
so dismissing a popup no longer closes the whole Settings modal.
Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
search/integrations/reminders for card h2 icons + the Added Models
sub-section icons.
Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
blocks; shorter subtitle wording.
Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
every integration form.
- Provider logos render inline next to the search fallback selects
and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
button.
- Image generation default flipped to off.
PDF uploads are stored as markdown wrappers with pdf_source or pdf_form_source markers so the editor can preserve extracted text, form fields, and annotations. The library exposed that internal wrapper: auto-created PDF documents used the hashed storage filename as the title, and row/facet language reported markdown instead of pdf.
Derive chat-upload PDF titles from the original upload name, derive document-library display language from the PDF source marker for rows, filters, and facets, and keep markdown wrappers excluded from the markdown facet when they represent PDFs.
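As a rough illustration of the display-language rule, the sketch below treats a markdown-stored document as a PDF when its body carries one of the two source markers. The substring check, the function names, and the argument shapes are assumptions made for this example; the real wrapper format is not shown here.

```python
# Marker names come from the description above; how they are embedded in
# the wrapper is an assumption (a plain substring check is used here).
PDF_MARKERS = ("pdf_source", "pdf_form_source")


def display_language(stored_language: str, body: str) -> str:
    """Language to show in library rows, filters, and facets."""
    if stored_language == "markdown" and any(m in body for m in PDF_MARKERS):
        return "pdf"
    return stored_language


def matches_facet(facet: str, stored_language: str, body: str) -> bool:
    """A PDF wrapper counts toward the pdf facet, not the markdown facet."""
    return display_language(stored_language, body) == facet
```

Routing rows, filters, and facets through the same helper is one way to keep the three views from disagreeing about what counts as a PDF.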
The expanded library card already renders PDF-backed documents through /api/document/{id}/render-pdf. Allow only that inline PDF preview endpoint to be framed by same-origin app pages while leaving normal routes on X-Frame-Options: DENY and frame-ancestors none.
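The per-route framing policy might look something like the following sketch. The frame_headers helper and the path pattern are hypothetical; the header values are the standard same-origin and deny forms, with the CSP keywords quoted as the specification requires.

```python
import re
from typing import Dict

# Assumed shape of the inline preview route: /api/document/{id}/render-pdf
_PDF_PREVIEW_PATH = re.compile(r"/api/document/[^/]+/render-pdf")


def frame_headers(path: str) -> Dict[str, str]:
    """Pick framing headers: same-origin for the PDF preview, deny otherwise."""
    if _PDF_PREVIEW_PATH.fullmatch(path):
        return {
            "X-Frame-Options": "SAMEORIGIN",
            "Content-Security-Policy": "frame-ancestors 'self'",
        }
    return {
        "X-Frame-Options": "DENY",
        "Content-Security-Policy": "frame-ancestors 'none'",
    }
```

Using fullmatch rather than a prefix test keeps neighbouring routes, such as a longer path that merely starts with the preview URL, on the strict default.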
Also tighten the existing PDF marker regression assertion so it matches the actual historical corruption signature instead of contradicting the preserved [Page 1 text]: marker.
Fixes #2468
lstrip("\n[PDF content]:") treats the argument as a character set,
not a prefix, so it chews into the following [Page N text]: marker —
e.g. turning [Page 1 text]: into "age 1 text]:". The correct helper
strip_pdf_content_marker (which uses removeprefix) already exists in
the same file and is used by other call sites.
Fixes #1663
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_process_pdf prepends "\n\n[PDF content]:" to extracted text, and two
call sites in document_routes.py stripped it with .lstrip("\n[PDF content]:").
str.lstrip(chars) treats its argument as a *set of characters*, so it keeps
eating into the page text that follows the marker: a body starting with
"to the board" comes back as "he board", because 't', 'o', and the space
are all in the marker's character set. Replace both sites with a shared
strip_pdf_content_marker() helper that uses str.removeprefix.
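The difference between the two calls can be reproduced in a few lines. The helper body below is a guess at what strip_pdf_content_marker() does, based only on the description that it uses str.removeprefix (Python 3.9+); the sample strings are made up for illustration.

```python
PDF_CONTENT_MARKER = "\n\n[PDF content]:"


def strip_pdf_content_marker(text: str) -> str:
    """Remove the marker only when it is an exact prefix (assumed body)."""
    return text.removeprefix(PDF_CONTENT_MARKER)


raw = PDF_CONTENT_MARKER + "[Page 1 text]: to the board"

# lstrip() strips any leading characters found in the argument, so it
# continues past the marker and eats "[P" from the page marker as well.
buggy = raw.lstrip("\n[PDF content]:")

# removeprefix() removes the exact leading string once and stops.
fixed = strip_pdf_content_marker(raw)

print(repr(buggy))  # 'age 1 text]: to the board'
print(repr(fixed))  # '[Page 1 text]: to the board'
```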
Office documents were dropped server-side: .docx fell through to
"[Attached document file]", .xlsx/.pptx weren't recognized at all, and
the personal-docs RAG index only covered txt/md/json/pdf.
Wire the optional markitdown dependency (MIT, Microsoft) into both the
chat-attachment path (build_user_content) and the RAG indexer
(personal_docs), converting .docx/.xlsx/.pptx/.xls/.epub to Markdown.
It is lazy-imported with graceful fallback (mirrors src/pdf_runtime.py):
without it those formats show an "install to extract" banner and the
MIT core is unaffected. pypdf stays the default PDF path.
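A minimal sketch of the lazy-import pattern follows, assuming markitdown's MarkItDown().convert(path).text_content interface. The load_markitdown name, the converter parameter, and the banner wording are illustrative only and do not reproduce src/markitdown_runtime.py.

```python
import importlib
from typing import Any, Optional


def load_markitdown(module_name: str = "markitdown") -> Optional[Any]:
    """Return a converter instance, or None when the optional dep is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return module.MarkItDown()


def convert_to_markdown(path: str, converter: Optional[Any]) -> str:
    """Convert an Office/EPUB file to Markdown, or return a fallback banner."""
    if converter is None:
        # Hypothetical banner text; the real wording is not shown here.
        return f"[Install markitdown to extract text from {path}]"
    return converter.convert(path).text_content
```

A caller would do something like convert_to_markdown("report.docx", load_markitdown()). Passing the converter in explicitly keeps the import attempt in one place and makes the fallback path easy to exercise without the dependency installed.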
- src/markitdown_runtime.py: optional-dep loader + convert_to_markdown
- upload_handler: recognize Office/EPUB extensions + MIME types
- document_processor: extract Office docs in the chat else-branch
- personal_docs: index Office docs (DEFAULT_EXTENSIONS + dispatch)
- requirements-optional.txt + ACKNOWLEDGMENTS.md: pinned markitdown 0.1.5
- tests: markitdown_runtime + office index coverage
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>