ferrumox/fox
by ferrumox · Inference Engines · updated 1mo ago
High-performance LLM inference engine — drop-in replacement for Ollama with faster multi-turn inference, lower TTFT, and higher throughput through prefix caching and continuous batching.
momentum 54 · stars 157 · forks 22 · rank #121