intel/neural-speed

by intel · Quantization & Formats · updated 1y ago

An innovative library for efficient LLM inference via low-bit quantization

momentum 29 · stars 353 · forks 37 · rank #203
cpu · fp4 · fp8 · gaudi2 · gpu · int1 · int2 · int3 · int4 · int5 · int6 · int7