RahulSChand/gpu_poor
by RahulSChand · Quantization & Formats · updated 1y ago
Calculates tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization.
35 momentum · 1,402 stars · 89 forks · rank #173
Tags: ggml · gpu · huggingface · language-model · llama · llama2 · llamacpp · llm · pytorch · quantization