
RahulSChand/gpu_poor

by RahulSChand · Quantization & Formats · updated 1y ago

Calculate tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization.

momentum: 35 · stars: 1,402 · forks: 89 · rank: #173
ggml · gpu · huggingface · language-model · llama · llama2 · llamacpp · llm · pytorch · quantization