intel/neural-speed
by intel · Quantization & Formats · updated 1y ago
An innovative library for efficient LLM inference via low-bit quantization
momentum 29 · stars 353 · forks 37 · rank #203
cpu · fp4 · fp8 · gaudi2 · gpu · int1 · int2 · int3 · int4 · int5 · int6 · int7