Anemll/anemll-flash-llama.cpp

by Anemll · Quantization & Formats · updated 28d ago

Flash-MoE sidecar slot-bank runtime for large GGUF MoE models on Apple Silicon — llama.cpp fork

Momentum: 55 · Stars: 106 · Forks: 13 · Rank: #116