DesTEngSsv006_swd/kischdle
tlg 7a0ff55eb5 fix: remove unsupported KV cache quantization in llama-cpp backend
GGML_TYPE_Q8_0 for type_k/type_v is not supported by this llama-cpp-python
version. Keep the reduced n_ctx=4096 for VRAM savings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:35:05 +02:00