SharpAI/SwiftLM

by SharpAI · Quantization & Formats · updated 25d ago

⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, macOS + iOS iPhone app.

63 momentum · 689 stars · 39 forks · rank #77
apple-sili · inference · ios · llm · metal · mlx · moe · on-device-ai · openai-api · swift