zhihu/ZhiLight

by zhihu · Inference Engines · updated 2mo ago

A highly optimized LLM inference acceleration engine for Llama and its variants.

50 momentum · 905 stars · 102 forks · #130 rank
cuda · deepseek-r1 · gpt · inference-engine · llama · llm · llm-inference · llm-serving · model-serving · pytorch