On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 28 days ago • 36
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 28 days ago • 75