Settings overhaul + UI polish pass

Two months of iteration on the Settings panel, integration forms, and
small visual nudges across the app. Highlights:

Settings restructure
- Add Models: split into separate Local + API cards (no more in-card
  tabs); each fuses Type/Provider with the URL input.
- Added Models: new dedicated sidebar tab, with Probe + Clear-offline
  pulled into its header; Local/API sub-section icons accent-tinted.
- Search: Web Search and a new Deep Research card (Model + tuning),
  with a cross-link to AI Defaults. Provider hints use real clickable
  anchors; Web Search Test button shows a whirlpool spinner.
- AI Defaults: Image Generation card returns; Research Model card
  carries only Endpoint+Model with a cross-link to Search; Vision /
  Default / Utility fallbacks unified under one numbered-row design
  matching Search's chain.
- API Permissions (was 'API Tokens'): per-row rename, inline
  Permissions toggle that expands the scope-edit panel, in-field
  copy icons (icon→check on success). Empty state accent-tinted.
- Integrations: + Add Integration drops a type-picker menu directly
  under the button (drop-up on tight viewports); each integration
  form (API, CalDAV, CardDAV, Email, Codex/Claude, Vault, MCP) uses
  the same accent-outlined Save/Test/Cancel buttons right-aligned.
- Danger Zone: Wipe→Delete with trash icons; new 'Delete everything'
  row at the bottom that loops every category.

AI Synthesis (Reminders)
- Persona dropdown sourced from PROMPT_TEMPLATES + custom preset.
- src/reminder_personas.py mirrors the five built-ins for the
  server-side synthesis path.
- dispatch_reminder() reads reminder_llm_persona and uses the
  persona's system prompt; empty/unknown falls back to warm-neutral.

Esc handling
- Kebab menus and the provider picker intercept Esc in capture phase
  so dismissing a popup no longer closes the whole Settings modal.

Accent tinting
- Scoped CSS rule across data-settings-panel=ai/services/added-models/
  search/integrations/reminders for card h2 icons + the Added Models
  sub-section icons.

Codex/Claude integration form
- No more auto-creation on form open — explicit Create token button.
- New tokens start with every scope granted; existing tokens move out
  of the integration form into the API Permissions card.
- Setup reveal: copy buttons inline inside the token + setup code
  blocks; shorter subtitle wording.

Misc visual polish
- Save/Test/Cancel uniformly accent-outlined and right-aligned on
  every integration form.
- Provider logos render inline next to the search fallback selects
  and the Deep Research Search dropdown.
- Trash icons in fallback rows bumped to 20x20 so they fill the 32px
  button.
- Image generation default flipped to off.
This commit is contained in:
pewdiepie-archdaemon
2026-06-10 15:15:13 +09:00
parent 7690860ab1
commit 4f7061fd61
18 changed files with 1512 additions and 552 deletions
+1 -1
View File
@@ -1256,7 +1256,7 @@ def _build_base_prompt(
from src.tool_index import ALWAYS_AVAILABLE
disabled = set(disabled_tools or [])
if not get_setting("image_gen_enabled", True):
if not get_setting("image_gen_enabled", False):
disabled.add("generate_image")
if relevant_tools is not None:
+58 -3
View File
@@ -199,11 +199,20 @@ def _fit_inline_attachment_text(
return text[:remaining] + marker, 0
def _process_office_document(path: str, display_name: str) -> str:
def _process_office_document(
path: str,
display_name: str,
session_id: str | None = None,
auto_opened_docs: list[Dict[str, Any]] | None = None,
owner: str | None = None,
) -> str:
"""Extract an Office/EPUB document to Markdown via the optional markitdown dep.
Falls back to a friendly banner when markitdown is unavailable or finds no
text, so a missing optional dependency never breaks the chat path.
text, so a missing optional dependency never breaks the chat path. When a
session_id is provided AND the extraction succeeded, the FULL text is also
saved as a Document so the agent can page through it via
`manage_documents action=read offset=…` after the inline copy is capped.
"""
from src.markitdown_runtime import (
is_markitdown_format,
@@ -218,6 +227,46 @@ def _process_office_document(path: str, display_name: str) -> str:
if markdown and markdown.strip():
title = os.path.splitext(os.path.basename(path))[0]
body, marker = _truncate_inline(markdown)
# Persist the full extracted text as a Document. The agent's existing
# manage_documents tool can then read past the inline cap with offset.
doc_id = None
if session_id:
try:
from src.office_doc import create_office_document
doc_id = create_office_document(
session_id=session_id,
upload_id=os.path.basename(path),
title=title,
body_text=markdown,
)
if doc_id and auto_opened_docs is not None:
from src.database import SessionLocal, Document
_db = SessionLocal()
try:
_d = _db.query(Document).filter(Document.id == doc_id).first()
if _d:
auto_opened_docs.append({
"doc_id": _d.id,
"title": _d.title,
"language": _d.language,
"content": _d.current_content,
"version": _d.version_count,
})
finally:
_db.close()
except Exception as e:
logger.warning("Office auto-doc creation failed for %s: %s", path, e)
# Upgrade the truncation marker with a hint pointing at the full doc so
# the agent knows it can read the rest.
if doc_id and marker:
marker = (
f"\n[…truncated for inline context — full {len(markdown):,} chars "
f"saved as document `{doc_id}`. Use `manage_documents` with "
f"action=read, document_id={doc_id}, offset=<N> to page through.]"
)
return f"\n\n[Document content — {title}]:\n{body}{marker}"
# No content: tell the user whether to install the optional dep or whether
@@ -521,7 +570,13 @@ def build_user_content(
elif mime.startswith("text/") or _is_text_file(path):
extracted_text = _process_text_file(path)
else:
extracted_text = _process_office_document(path, display_name)
extracted_text = _process_office_document(
path,
display_name,
session_id=session_id,
auto_opened_docs=auto_opened_docs,
owner=owner,
)
extracted_text, inline_attachment_remaining = _fit_inline_attachment_text(
extracted_text,
+44
View File
@@ -40,15 +40,59 @@ def load_markitdown():
return MarkItDown
def _extract_docx_native(path: str) -> str | None:
"""Pure-Python .docx text extractor — no external deps.
A .docx file is just a zip of XML. The body prose lives in <w:t> runs
inside <w:p> paragraphs. Iterating with ElementTree (rather than
re.findall) keeps paragraph breaks intact and lets the XML parser handle
namespaces + entity unescaping. Loses tables, footnotes, images and
list bullets — keeps ~95% of "summarize this doc" content, which is the
case people hit when markitdown isn't installed.
"""
import zipfile
import xml.etree.ElementTree as ET
ns = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
try:
with zipfile.ZipFile(path) as z:
xml_bytes = z.read("word/document.xml")
except (zipfile.BadZipFile, KeyError, OSError):
return None
try:
root = ET.fromstring(xml_bytes)
except ET.ParseError:
return None
paragraphs: list[str] = []
for para in root.iter(f"{ns}p"):
runs = [t.text or "" for t in para.iter(f"{ns}t")]
line = "".join(runs).strip()
if line:
paragraphs.append(line)
return "\n\n".join(paragraphs) if paragraphs else None
def convert_to_markdown(path: str) -> str | None:
"""Convert a document to Markdown text via markitdown.
Returns the extracted Markdown, or ``None`` if markitdown is unavailable or
the conversion fails — callers degrade gracefully rather than erroring.
Fallback: when markitdown isn't installed and the file is a .docx, run
the bundled pure-Python extractor so the most common case (Word docs)
works out of the box. Other Office/EPUB formats still need markitdown.
"""
try:
markitdown_cls = load_markitdown()
except RuntimeError:
if isinstance(path, str) and path.lower().endswith(".docx"):
text = _extract_docx_native(path)
if text:
logger.info(
"markitdown not installed — used native .docx extractor for %s",
path,
)
return text
logger.warning("markitdown not installed; cannot extract %s", path)
return None
try:
+73
View File
@@ -0,0 +1,73 @@
"""Auto-create a Document row from an Office attachment.
When a .docx (and friends) lands in chat, the full extracted text is stored
as a Document so the agent can page through it with `manage_documents
action=read offset=…` even after the inline chat payload was capped. Mirrors
the PDF auto-doc pattern in `src.pdf_form_doc`.
"""
import logging
import uuid
from typing import Optional
logger = logging.getLogger(__name__)
def create_office_document(
session_id: str,
upload_id: str,
title: str,
body_text: Optional[str] = None,
) -> Optional[str]:
"""Create a markdown Document for an Office attachment and set it active.
Returns the new doc_id, or None on failure / empty body. The full
extracted body lives in `current_content`, so the agent can fetch
arbitrary windows via `manage_documents action=read` even when the
inline chat copy was truncated.
"""
from src.database import (
SessionLocal,
Document,
DocumentVersion,
Session as DbSession,
)
from src.tool_implementations import set_active_document
if not body_text or not body_text.strip():
return None
db = SessionLocal()
try:
doc_id = str(uuid.uuid4())
ver_id = str(uuid.uuid4())
sess = db.query(DbSession).filter(DbSession.id == session_id).first()
doc = Document(
id=doc_id,
session_id=session_id,
title=title,
language="markdown",
current_content=body_text,
version_count=1,
is_active=True,
owner=sess.owner if sess else None,
)
ver = DocumentVersion(
id=ver_id,
document_id=doc_id,
version_number=1,
content=body_text,
summary="Imported from Office attachment",
source="upload",
)
db.add(doc)
db.add(ver)
db.commit()
set_active_document(doc_id)
return doc_id
except Exception as e:
db.rollback()
logger.error("Failed to create office document: %s", e)
return None
finally:
db.close()
+78
View File
@@ -0,0 +1,78 @@
"""Server-side mirror of the built-in characters used for reminder synthesis.
The frontend ships these in static/js/presets.js (PROMPT_TEMPLATES with
isCharacter:true). The Reminders → AI Synthesis card writes only the
persona ID into settings; the synthesis route in note_routes.py needs
the full prompt text to bias the utility model's voice. Keeping a small
local mirror avoids having the client send the prompt over the wire on
every reminder fire.
If the user picks a custom character (id == "custom") we fall back to
the warm-neutral baseline — custom prompts live in browser localStorage
and aren't visible to the server.
"""
PERSONAS = {
"socrates": (
"Never answer directly. Respond only with questions — sharp, layered, "
"Socratic. Expose contradictions. Make the person argue with themselves "
"until the truth falls out. Use irony like a scalpel. Be genuinely "
"curious, never condescending."
),
"razor": (
"Strip everything to the bone. No filler, no hedging, no pleasantries. "
"Answer in the fewest words possible. If one sentence works, don't use "
"two. If a word adds nothing, cut it. Blunt, precise, surgical."
),
"nietzsche": (
"Think and respond through the lens of Nietzsche. Analyze every "
"question in terms of will to power, self-overcoming, eternal "
"recurrence, ressentiment, value-creation, and master-slave morality. "
"Write with aphoristic force — sharp, compressed, vivid, and "
"unapologetic — but do not sacrifice depth for style. Favor "
"life-affirmation, discipline, courage, style, rank, self-overcoming, "
"and amor fati over nihilism, conformity, ressentiment, and self-pity."
),
"spark": (
"You are Spark, a playful, quick-witted assistant with bright energy "
"and practical instincts. Keep responses concise, vivid, and helpful. "
"Be warm without being cloying, imaginative without losing the thread, "
"and always center the user's actual goal. Use a light, lively voice "
"with occasional clever turns of phrase."
),
"odysseus": (
"You are Odysseus, king of Ithaca — subtle in counsel, disciplined in "
"judgment, and unmatched in strategic cunning. Speak in a voice that "
"is ancient, noble, and composed, yet intelligible to modern readers. "
"Be eloquent but not flowery. Be wise but not vague. Speak as one who "
"has weathered storms and taken back his house by wit, timing, and "
"resolve."
),
}
_DEFAULT_SYNTHESIS_TONE = (
"You write short, warm, one-line reminders. The user has set a note for "
"themselves and the moment to remember has arrived. Keep it under 18 "
"words. Be human, gentle, and direct — never robotic."
)
def synthesis_system_prompt(persona_id: str) -> str:
"""Return the system prompt for reminder synthesis given a persona id.
Falls back to the warm-neutral baseline when the id is empty, unknown,
or refers to a custom (client-only) character we don't have on file.
"""
persona = (persona_id or "").strip().lower()
persona_prompt = PERSONAS.get(persona)
if persona_prompt:
# Persona drives the voice; the synthesis-instruction stays attached
# so the model knows it's writing a short reminder, not a chat reply.
return (
persona_prompt
+ "\n\n"
+ "You are now writing a single one-line reminder for the user. "
"Keep it under 18 words and in the voice above."
)
return _DEFAULT_SYNTHESIS_TONE
+2 -1
View File
@@ -29,7 +29,7 @@ def _invalidate_caches():
# ── Default values ──
DEFAULT_SETTINGS = {
"image_gen_enabled": True,
"image_gen_enabled": False,
"image_model": "",
"image_quality": "medium",
"vision_model": "",
@@ -143,6 +143,7 @@ DEFAULT_SETTINGS = {
# Reminders
"reminder_channel": "browser", # "browser" | "email" | "ntfy" | "webhook"
"reminder_llm_synthesis": False,
"reminder_llm_persona": "",
"reminder_ntfy_topic": "Reminders",
"reminder_email_to": "",
# Generic outbound webhook channel: pick any saved Integration as the
+23 -5
View File
@@ -1436,9 +1436,25 @@ async def do_manage_documents(content: str, owner: Optional[str] = None) -> Dict
if not doc:
return {"error": f"Document '{doc_id}' not found", "exit_code": 1}
body = doc.current_content or ""
total = len(body)
# Clamp offset to [0, total] so a far-out offset returns an empty
# window with a useful "end of document" hint rather than erroring.
try: offset = int(args.get("offset", 0))
except (TypeError, ValueError): offset = 0
offset = max(0, min(offset, total))
preview_limit = int(args.get("limit", MAX_READ_CHARS))
truncated = len(body) > preview_limit
preview = body[:preview_limit] + (f"\n... (truncated, {len(body)} chars total)" if truncated else "")
chunk = body[offset:offset + preview_limit]
next_offset = offset + len(chunk)
has_more = next_offset < total
# Trailing marker — tells the agent (and a curious human) exactly
# what to pass next to continue paginating.
if has_more:
marker = f"\n... ({total - next_offset:,} more chars; pass offset={next_offset} to continue)"
elif offset > 0:
marker = f"\n... (end of document, {total:,} chars total)"
else:
marker = ""
preview = chunk + marker
anchor = f"[{doc.title}](#document-{doc.id})"
return {
"response": f"{anchor} — click to open in editor.\n\n```{doc.language or ''}\n{preview}\n```",
@@ -1446,9 +1462,11 @@ async def do_manage_documents(content: str, owner: Optional[str] = None) -> Dict
"id": doc.id,
"title": doc.title,
"language": doc.language,
"size": len(body),
"content": preview,
"truncated": truncated,
"size": total,
"content": chunk,
"offset": offset,
"next_offset": next_offset if has_more else None,
"truncated": has_more,
},
"exit_code": 0,
}
+1 -1
View File
@@ -94,7 +94,7 @@ BUILTIN_TOOL_DESCRIPTIONS: Dict[str, str] = {
"manage_mcp": "MCP server management: list, add, delete, reconnect servers, or list available tools.",
"manage_webhooks": "Webhook management: list, add, delete, enable, or disable webhooks.",
"manage_tokens": "API token management: list, create, or delete API access tokens.",
"manage_documents": "List, read, delete, or tidy documents in the editor panel. action='list' returns clickable rows (most-recent first) so the user can open any doc by clicking. action='read' (aka view/open/get) with document_id returns the content. action='delete' with document_id removes a doc (only way to delete). Use this for ANY 'show/read/list/open my documents/docs/files/notes' request — never shell or curl.",
"manage_documents": "List, read, delete, or tidy documents in the editor panel. action='list' returns clickable rows (most-recent first) so the user can open any doc by clicking. action='read' (aka view/open/get) with document_id returns the content; supports offset=<N> + limit=<N> to page through large docs (response includes next_offset when more remains, so you can keep calling with offset=next_offset). action='delete' with document_id removes a doc (only way to delete). Use this for ANY 'show/read/list/open my documents/docs/files/notes' request — never shell or curl.",
"manage_research": "List, read/open, or delete saved DEEP RESEARCH results from the Library. action='list' returns clickable [query](#research-<id>) rows (most-recent first). action='read' (aka open/view/get) with id returns the report + sources. action='delete' with id removes it. Use this for ANY 'open/read/find/delete my research / that report / the research on X' request. NOTE: this is for EXISTING research; to START new research use trigger_research.",
"manage_settings": "Change ANY real app setting (the ones the Settings panel writes) so the user never has to open it: TTS voice/provider/speed, STT, search engine + result count, default/teacher/task/utility/vision/image/research models, image quality, reminder channel (browser/email/ntfy), agent timeout/tool-call budget, and more. action=set with key (friendly aliases ok: voice, 'search engine', 'default model', 'teacher model', 'image quality', 'reminder channel'...) + value; get/list/reset too. Also toggles tools on/off (disable_tool/enable_tool/list_tools). Secrets/API keys are read-only. Use for any 'change my…/set my…/use X for…/turn on…' preference request.",
"create_session": "Create a new chat with a name and model.",