DesTEngSsv006_swd/kischdle/llmux/tests/test_vram_manager.py
tlg d7a091df8c feat: VRAM manager with priority-based model eviction
Tracks GPU VRAM usage (16 GB) and handles model loading/unloading with
priority-based eviction: LLM (lowest) -> TTS -> ASR (highest, protected).
Uses an asyncio.Lock for concurrency safety.
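
The eviction policy described above could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `VRAMManager`, `Priority`, `LoadedModel`, and `load` are hypothetical, and it assumes that any unprotected model may be evicted (lowest priority first) and that a load fails when eviction cannot free enough VRAM.

```python
import asyncio
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Eviction order from the commit message: lower values are evicted first."""
    LLM = 0  # lowest priority, evicted first
    TTS = 1
    ASR = 2  # highest priority, protected from eviction


@dataclass
class LoadedModel:
    name: str
    priority: Priority
    vram_gb: float


class VRAMManager:
    """Hypothetical sketch; the real class in llmux may look different."""

    def __init__(self, total_gb: float = 16.0) -> None:
        self.total_gb = total_gb
        self.loaded: dict[str, LoadedModel] = {}
        # One lock serializes every load so concurrent requests cannot
        # both claim the same free VRAM.
        self._lock = asyncio.Lock()

    @property
    def used_gb(self) -> float:
        return sum(m.vram_gb for m in self.loaded.values())

    async def load(self, name: str, priority: Priority, vram_gb: float) -> list[str]:
        """Register a model, evicting others if needed. Returns evicted names."""
        async with self._lock:
            if name in self.loaded:
                return []
            needed = vram_gb - (self.total_gb - self.used_gb)
            # Assumption: ASR models are never candidates; the rest are
            # considered lowest priority first.
            candidates = sorted(
                (m for m in self.loaded.values() if m.priority != Priority.ASR),
                key=lambda m: m.priority,
            )
            victims: list[LoadedModel] = []
            freed = 0.0
            for model in candidates:
                if freed >= needed:
                    break
                victims.append(model)
                freed += model.vram_gb
            if freed < needed:
                # Nothing has been unloaded yet, so the state is unchanged.
                raise MemoryError(f"cannot free {needed:.1f} GB for {name}")
            for model in victims:
                del self.loaded[model.name]
            self.loaded[name] = LoadedModel(name, priority, vram_gb)
            return [m.name for m in victims]
```

In this sketch, victims are selected before anything is unloaded, so a request that cannot be satisfied leaves the already loaded models in place.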

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:14:41 +02:00

4.9 KiB