NEO-MLSys25/NEO

by NEO-MLSys25 · Inference Engines · updated 12mo ago

NEO is an LLM inference engine built to relieve the GPU memory crisis through CPU offloading.

22 momentum · 97 stars · 24 forks · rank #227