AI-Hypercomputer/JetStream

by AI-Hypercomputer · Inference Engines · updated 6mo ago

JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (GPU support is planned; PRs welcome).

451 stars · rank #200

gemma · gpt · gpu · inference · jax · large-language-models · llama · llama2 · llm · llm-inference · llmops · mlops
