SharpAI/SwiftLM
by SharpAI · Quantization & Formats · updated 25d ago
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, macOS + iOS iPhone app.
63 momentum · 689 stars · 39 forks · rank #77
apple-sili · inference · ios · llm · metal · mlx · moe · on-device-ai · openai-api · swift