feat: replace gpt-oss-20b-uncensored with HauhauCS MXFP4 GGUF
The aoxo model shipped unquantized (BF16, ~40GB), which caused OOM. The HauhauCS model uses the MXFP4 GGUF format and loads at 11.9GB via the llama-cpp backend. All three reasoning levels (Low/Medium/High) work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -36,8 +36,9 @@ physical_models:
   gpt-oss-20b-uncensored:
     type: llm
-    backend: transformers
-    model_id: "aoxo/gpt-oss-20b-uncensored"
+    backend: llamacpp
+    model_id: "HauhauCS/GPT-OSS-20B-Uncensored-HauhauCS-Aggressive"
+    model_file: "GPT-OSS-20B-Uncensored-HauhauCS-MXFP4-Aggressive.gguf"
     estimated_vram_gb: 13
     supports_vision: false
     supports_tools: true
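The 11.9GB load size and the 13GB `estimated_vram_gb` headroom are consistent with rough MXFP4 arithmetic. A minimal sketch, assuming MXFP4 stores 4-bit values with one shared 8-bit scale per 32-element block (≈4.25 bits/weight) and a ~20B parameter count — the block size and parameter figure are assumptions, not stated in the commit:

```python
# Back-of-envelope VRAM estimate for the quantized weights alone.
# Assumption: MXFP4 = FP4 element + shared 8-bit scale per 32-block.
PARAMS = 20e9                    # approximate gpt-oss-20b parameter count
BITS_PER_WEIGHT = 4 + 8 / 32     # 4.25 effective bits per weight

weight_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"~{weight_gb:.1f} GB of quantized weights")  # ~10.6 GB
```

The gap between ~10.6GB of weights and the observed 11.9GB would be runtime overhead such as the KV cache and scratch buffers, which is why rounding `estimated_vram_gb` up to 13 leaves sensible headroom.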