Anemll/anemll-flash-llama.cpp
by Anemll · Quantization & Formats · updated 28d ago
Flash-MoE sidecar slot-bank runtime for large GGUF MoE models on Apple Silicon — llama.cpp fork
momentum: 55 · stars: 106 · forks: 13 · rank: #116