mirror of https://github.com/pewdiepie-archdaemon/odysseus.git
synced 2026-06-16 09:45:24 -04:00
eda0f1258abaf824b9f95349616ce1fff33823bb
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d42e6a7acc |
Scope skill mutations to caller owner
SkillsManager.update_skill walks every SKILL.md on disk and matches by
slug only; the 'owner' key in its scalar_keys whitelist meant a caller
could pass updates={'owner': 'attacker', 'description': 'pwned'} and the
first matching file on disk got silently re-owned. Two users with the
same slug under different category directories (which is supported by
the on-disk layout <category>/<name>/SKILL.md) could each stomp the
other's skill via the manage_skills tool or the in-process callers in
tool_implementations.py (edit, patch, publish, delete).
update_skill and delete_skill now require the caller's owner and only
match a file whose parsed owner field matches. The default of None
means 'no scope' and only matches ownerless skills, so an unsafe call
without an explicit owner is now a no-op. 'owner' is also removed from
scalar_keys so the updates dict cannot be used to reassign ownership
even when the manager is called from an in-process path that didn't
supply the owner argument.
The in-process callers in tool_implementations.py are updated to pass
owner=owner (which was already in scope at every call site) so the
HTTP and agent paths both go through the scoped check. The HTTP route
at routes/skills_routes.py:1499 was already owner-scoped via
sm.load(owner=user); the fix brings the in-process path up to the
same standard.
|
||
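The owner-scoped matching described in this commit can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual SkillsManager code: the `Skill` dataclass, the `SCALAR_KEYS` contents, and the list-based store are all assumptions standing in for the parsed SKILL.md files on disk.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical in-memory stand-in for a parsed SKILL.md file.
@dataclass
class Skill:
    slug: str
    category: str
    owner: Optional[str]
    description: str = ""


# 'owner' is deliberately absent, so an updates dict cannot reassign ownership.
SCALAR_KEYS = {"description"}


def update_skill(skills, slug, updates, owner=None):
    """Update the first skill matching both slug and owner; return it or None."""
    for skill in skills:
        # owner=None only matches ownerless skills, so an unscoped call
        # cannot touch a skill that belongs to someone.
        if skill.slug == slug and skill.owner == owner:
            for key, value in updates.items():
                if key in SCALAR_KEYS:  # silently ignore non-whitelisted keys
                    setattr(skill, key, value)
            return skill
    return None


# Two users share the slug "deploy" under different categories.
skills = [
    Skill("deploy", "ops", "alice", "alice's notes"),
    Skill("deploy", "dev", "bob", "bob's notes"),
]

# A scoped call only reaches bob's skill, and the 'owner' key is dropped.
hit = update_skill(
    skills, "deploy", {"owner": "attacker", "description": "pwned"}, owner="bob"
)

# An unscoped call finds no ownerless skill and is a no-op.
unscoped = update_skill(skills, "deploy", {"description": "pwned"})
```

In this sketch alice's skill is never touched, even though it comes first in the walk and shares the slug.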
|
|
9b1acf6612 |
Fix year extraction in research queries
* fix: extract full year in research query entities, not just the century
* fix: same year capture-group bug in the services search copy
* test: research query extracts the full year
|
||
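The commit does not show the pattern involved, but a capture-group bug that yields only the century typically looks like the sketch below. The regular expressions and the sample query are assumptions chosen to reproduce the symptom, not the repository's actual code.

```python
import re

query = "best open-weight LLM releases in 2024 vs 1999"

# Buggy form: with a capturing group, re.findall returns only the group,
# which here is the century prefix rather than the whole year.
buggy = re.findall(r"\b(19|20)\d{2}\b", query)

# Fixed form: a non-capturing group makes findall return the full match.
fixed = re.findall(r"\b(?:19|20)\d{2}\b", query)

print(buggy)  # ['20', '19']
print(fixed)  # ['2024', '1999']
```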
|
|
033852ab14 | fix: require GGUF sources for llama downloads (#368) | ||
|
|
9955f5bc95 |
Fix VRAM estimates for pre-quantized HF repos
The Cookbook fit scanner was reporting impossibly low VRAM requirements
for some pre-quantized models — e.g. cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit
shown as 7.1 GB ('perfect' on a 12 GB card) when the real load is ~40 GB.
Root cause is in the catalog builder. When _entry_from_modelinfo fell
back to safetensors metadata for the parameter count, it stored
safetensors.total directly. For pre-quantized repos that figure reflects
*packed* element counts: AWQ/GPTQ-Int4 pack 8x 4-bit weights into one
I32, AWQ-8bit/GPTQ-Int8/FP8 pack 4x. The catalog therefore recorded
~1/8 of the real parameter count, and min_vram_gb = packed * bpp
double-applied the quantization.
Fix the safetensors fallback:
* prefer the per-dtype parameters dict when available and unpack only the
I32/I64 entries (the F16/BF16 scale/zero tensors and embeddings are
already at their real element counts)
* fall back to total * pack_factor when only total is exposed
Patch the catalog entries that were affected by the old fallback so the
fit ratings reflect reality without waiting for a full catalog rebuild:
* cyankiwi/Qwen3-Coder-Next-REAM-AWQ-4bit 11.4B -> 79.7B (40.8 GB VRAM)
* stelterlab/Qwen3-Coder-30B-A3B-Instruct-AWQ 4.6B -> 30.5B
* stelterlab/NVIDIA-Nemotron-3-Nano-30B-A3B-AWQ 5.1B -> 30.5B
* warshanks/Qwen3-8B-abliterated-AWQ 2.2B -> 8.2B
* QuantTrio/sarvam-30b-AWQ 7B -> 30B
* QuantTrio/sarvam-105b-AWQ 19B -> 105B
Closes #377.
|
||
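The two-step fallback described in this commit could look something like the following sketch. The function name, its signature, and the sample counts are hypothetical; only the rule itself (unpack I32/I64 entries, otherwise multiply total by the pack factor) comes from the commit message. Applying one factor to both I32 and I64 is a simplification made here.

```python
# Assumption: only integer container dtypes hold packed quantized weights.
PACKED_DTYPES = {"I32", "I64"}


def real_param_count(pack_factor, parameters=None, total=None):
    """Estimate the unpacked parameter count from safetensors metadata.

    pack_factor: weights per packed element (8 for 4-bit, 4 for 8-bit).
    parameters: optional per-dtype element counts, e.g. {"I32": n, "F16": m}.
    total: optional overall element count, used only without parameters.
    """
    if parameters:
        # Unpack only the packed integer tensors; F16/BF16 scales, zeros and
        # embeddings are already at their real element counts.
        return sum(
            count * pack_factor if dtype in PACKED_DTYPES else count
            for dtype, count in parameters.items()
        )
    if total is not None:
        # Coarser fallback: treat every element as packed.
        return total * pack_factor
    raise ValueError("no safetensors metadata available")


# Made-up counts for illustration, not taken from any real repo.
per_dtype = {"I32": 1_000_000_000, "F16": 200_000_000}
print(real_param_count(8, parameters=per_dtype))  # 8200000000
print(real_param_count(8, total=1_200_000_000))   # 9600000000
```

The per-dtype path gives the lower figure because it does not inflate the unpacked F16 tensors, which is presumably why the commit prefers it when that dict is available.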
|
|
14e8cffa41 |
Fail closed on untrusted teacher draft confidence
Follow-up to #275. get_relevant_skills() treats a missing/unparseable confidence as 1.0, so it always clears the injection threshold. For teacher-escalation drafts -- auto-written from a possibly untrusted trace and then injected as authoritative guidance -- that means a draft can be auto-injected regardless of the configured confidence bar.
Require teacher-escalation drafts to carry an explicit, parseable confidence that meets min_confidence; fail closed otherwise. Hand-authored legacy drafts keep the lenient "unset -> keep" behavior so they don't silently vanish, and published skills are unaffected.
Ran: python -m py_compile services/memory/skills.py + a get_relevant_skills unit check (teacher drafts with None/garbage/0.8 excluded at min=0.85; 0.9 included; legacy + published unaffected; gate-off control unchanged).
Co-authored-by: Fernando Lazzarin <263019791+waitdeadai@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
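The fail-closed rule for drafts can be illustrated with a small sketch. The dict layout, the `teacher_escalation` flag, and both function names are assumptions made for this example; the real check lives inside get_relevant_skills() and may be structured differently. Published skills are left out of the sketch.

```python
def parse_confidence(raw):
    """Return the confidence as a float, or None if missing or unparseable."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def draft_passes_gate(draft, min_confidence):
    """Decide whether a draft skill may be injected under the confidence gate."""
    confidence = parse_confidence(draft.get("confidence"))
    if draft.get("teacher_escalation"):
        # Fail closed: an auto-written draft needs an explicit, parseable
        # confidence that meets the bar.
        return confidence is not None and confidence >= min_confidence
    # Legacy hand-authored drafts keep the lenient behavior: unset -> 1.0.
    if confidence is None:
        confidence = 1.0
    return confidence >= min_confidence


teacher_values = [None, "garbage", 0.8, 0.9]
results = [
    draft_passes_gate({"teacher_escalation": True, "confidence": v}, 0.85)
    for v in teacher_values
]
print(results)  # [False, False, False, True]

# A legacy draft with no confidence still passes.
print(draft_passes_gate({"confidence": None}, 0.85))  # True
```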
|
|
0888a3b3e6 | Add native Windows compatibility layer | ||
|
|
f1817fd560 |
Add macOS Apple Silicon Cookbook support
* Add Apple Silicon (Metal) GPU detection and unified-memory fit tuning
hardware.py detects Apple Silicon locally and over SSH, reporting backend=metal, the chip name, and a RAM-scaled fraction of unified memory as the usable GPU budget. fit.py gains an M1-M4 memory-bandwidth table for realistic tok/s and drops vLLM-only formats (AWQ/GPTQ/FP8) that can't be served on Metal.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
(cherry picked from commit |
||
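As a rough illustration of the two fit.py changes mentioned above, the sketch below filters out the formats the commit names as vLLM-only and estimates decode speed from memory bandwidth. The function names, the sample format list, and the sample numbers are hypothetical. The commit does not state the formula fit.py uses, so the bandwidth-bound estimate here is a common rule of thumb swapped in for illustration.

```python
# Formats the commit describes as vLLM-only and not servable on Metal.
METAL_UNSUPPORTED = {"AWQ", "GPTQ", "FP8"}


def servable_formats(formats, backend):
    """Drop formats that cannot be served when the backend is Metal."""
    if backend != "metal":
        return list(formats)
    return [f for f in formats if f.upper() not in METAL_UNSUPPORTED]


def estimate_tok_per_s(bandwidth_gb_s, model_size_gb):
    """Rough upper bound on decode speed.

    Assumes generation is memory-bandwidth bound, i.e. each generated token
    reads approximately all model weights once.
    """
    return bandwidth_gb_s / model_size_gb


# Illustrative inputs only; not values from the repository's table.
print(servable_formats(["GGUF", "AWQ", "MLX", "FP8"], "metal"))  # ['GGUF', 'MLX']
print(estimate_tok_per_s(100.0, 5.0))  # 20.0
```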
|
|
e5c99a5eee | Odysseus v1.0 |