Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment Paper • 2405.03594 • Published May 6, 2024 • 7
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Paper • 2203.07259 • Published Mar 14, 2022 • 4