Hydra: A 1.6B-Parameter State-Space Language Model with Sparse Attention, Mixture-of-Experts, and Memory
Paper • 2508.15099 • Published Aug 20, 2025