Mirror of https://github.com/pewdiepie-archdaemon/odysseus.git (synced 2026-06-16 09:45:24 -04:00)
4074e77d93
These models run out of memory (OOM) with --kv-cache-dtype auto (≈bf16) at any usable context length under the current tensor-parallel layouts. _detectModelOptimizations now seeds opts.kvCacheDtype='fp8' for them, and the serve panel's KV Cache select uses that value as its default unless the user has a saved override on this skill.
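
The precedence described above (saved override first, then the detected default, then auto) can be sketched roughly as follows. This is an illustration only, not the repository's code: `detectModelOptimizationsSketch`, `resolveKvCacheDtype`, and the `needsFp8KvCache` flag are hypothetical names, and how the real `_detectModelOptimizations` decides which models qualify is not shown here.

```javascript
// Hypothetical sketch: none of these helpers are taken from the repository.

// Stand-in for the detection step: seed an fp8 KV cache for flagged models.
// `needsFp8KvCache` is an assumed flag, not a real field.
function detectModelOptimizationsSketch(model) {
  const opts = {};
  if (model.needsFp8KvCache) {
    opts.kvCacheDtype = 'fp8';
  }
  return opts;
}

// Precedence for the KV Cache select's default value:
// 1. a saved user override for this skill, if any
// 2. the dtype seeded by detection
// 3. 'auto'
function resolveKvCacheDtype(opts, savedOverride) {
  if (savedOverride !== undefined && savedOverride !== null) {
    return savedOverride;
  }
  return opts.kvCacheDtype ?? 'auto';
}

const flagged = detectModelOptimizationsSketch({ needsFp8KvCache: true });
const plain = detectModelOptimizationsSketch({ needsFp8KvCache: false });

console.log(resolveKvCacheDtype(flagged, undefined)); // detected default applies
console.log(resolveKvCacheDtype(flagged, 'auto'));    // saved override wins
console.log(resolveKvCacheDtype(plain, undefined));   // falls back to 'auto'
```

Under these assumptions, a flagged model with no saved override resolves to fp8, while a user who explicitly saved auto keeps that choice even though it may still OOM.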