Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20, 2025 • 28
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache Paper • 2402.02750 • Published Feb 5, 2024 • 5
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 28