Local LLM Index / local


intel/neural-speed

by intel · Quantization & Formats · updated 1y ago

An innovative library for efficient LLM inference via low-bit quantization

momentum 29 · stars 352 · forks 36 · rank #209

cpu · fp4 · fp8 · gaudi2 · gpu · int1 · int2 · int3 · int4 · int5 · int6 · int7
