ferrumox/fox

by ferrumox · Inference Engines · updated 1mo ago

High-performance LLM inference engine: a drop-in replacement for Ollama that uses prefix caching and continuous batching to deliver faster multi-turn inference, lower time to first token (TTFT), and higher throughput.

Momentum: 54 · Stars: 157 · Forks: 22 · Rank: #121