diff --git a/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md b/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md
index d1e33be..fca5bac 100644
--- a/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md
+++ b/kischdle/llmux/docs/superpowers/specs/2026-04-03-llmux-design.md
@@ -93,7 +93,7 @@ When a request arrives for a model whose physical model is not loaded:
    - Evict LLM first
    - Evict TTS second
    - Evict ASR only as last resort
-   - Never evict a higher-priority model to load a lower-priority one
+   - Never evict a higher-priority model to load a lower-priority one (e.g., never evict ASR to make room for TTS; in that case, evict the LLM instead)
 4. Load the requested model.
 
 ### Concurrency
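
Review note: the eviction order this hunk clarifies could be sketched roughly as below. This is only an illustration of the rule as worded in the spec, not llmux's actual code. The `PRIORITY` table, the `pick_evictions` function, the `(name, kind, size_mb)` tuples, and the idea of accounting for memory in megabytes are all hypothetical. The sketch also assumes that evicting a model of equal priority is allowed, which the spec text does not state explicitly.

```python
# Hypothetical priority ranks: a higher number means higher priority,
# so the model is evicted later (LLM first, TTS second, ASR last).
PRIORITY = {"llm": 0, "tts": 1, "asr": 2}


def pick_evictions(loaded, requested_kind, needed_mb, free_mb):
    """Return names of models to evict, lowest priority first.

    loaded: list of (name, kind, size_mb) tuples for loaded models.
    Returns None if enough memory cannot be freed without evicting
    a model of higher priority than the requested one.
    """
    limit = PRIORITY[requested_kind]
    # Assumption: equal-priority models may be evicted; higher-priority never.
    candidates = [m for m in loaded if PRIORITY[m[1]] <= limit]
    candidates.sort(key=lambda m: PRIORITY[m[1]])

    evicted = []
    for name, _kind, size_mb in candidates:
        if free_mb >= needed_mb:
            break
        evicted.append(name)
        free_mb += size_mb
    return evicted if free_mb >= needed_mb else None


# Made-up example models and sizes, for illustration only.
loaded = [("asr-model", "asr", 3000), ("tts-model", "tts", 2000), ("llm-model", "llm", 8000)]
tts_plan = pick_evictions(loaded, "tts", needed_mb=4000, free_mb=1000)
```

With these made-up sizes, a TTS request frees memory by evicting only the LLM and leaves the ASR model loaded, which is the case the added example sentence describes.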