RahulSChand/gpu_poor

by RahulSChand · Quantization & Formats · updated 1y ago

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
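As a rough illustration of the kind of estimate described above, the sketch below computes weights-only memory from a parameter count and a bits-per-weight figure. It is not taken from gpu_poor and does not reproduce its method. The function name `estimate_weight_memory_gib` and the 7B example are hypothetical, and the estimate deliberately ignores KV cache, activations, and framework overhead, which a real calculator would need to account for.

```python
def estimate_weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Weights-only memory in GiB (hypothetical helper, not gpu_poor's API).

    Ignores KV cache, activations, and framework overhead.
    """
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


if __name__ == "__main__":
    # Assumed example: a 7-billion-parameter model at three common precisions.
    for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
        gib = estimate_weight_memory_gib(7e9, bits)
        print(f"7B @ {label}: {gib:.1f} GiB (weights only)")
```

Actual quantized formats store extra metadata such as per-block scales, so effective bits per weight are usually somewhat higher than the nominal figure.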

1,403 stars · rank #178

ggml · gpu · huggingface · language-model · llama · llama2 · llamacpp · llm · pytorch · quantization
