xaskasdf/ntransformer

by xaskasdf · Inference Engines · updated 3mo ago

High-efficiency LLM inference engine in C++/CUDA. Run Llama 70B on RTX 3090.

43 momentum · 461 stars · 20 forks · rank #147