Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 9 days ago • 45
Sparse Auto-Encoders (SAEs) for Mechanistic Interpretability Collection A compilation of sparse auto-encoders trained on large language models. • 37 items • Updated 25 days ago • 20
Featured The Smol Training Playbook The secrets to building world-class LLMs • 2.82k
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge Feb 7, 2025 • 270
Article Introducing smolagents: simple agents that write actions in code. Dec 31, 2024 • 1.16k