GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning Paper • 2602.12099 • Published Feb 12 • 61
When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning Paper • 2602.10560 • Published Feb 11 • 31
G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design Paper • 2602.08253 • Published Feb 9 • 26
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression Paper • 2602.11008 • Published Feb 11 • 18
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published Feb 9 • 72
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 18 days ago • 136
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding Paper • 2603.19235 • Published 16 days ago • 94
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning Paper • 2603.16929 • Published 22 days ago • 13
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents Paper • 2603.19685 • Published 15 days ago • 19
Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck Paper • 2603.08462 • Published 26 days ago • 21
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States Paper • 2603.19987 • Published 15 days ago • 9
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published 11 days ago • 35
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning Paper • 2602.06960 • Published Feb 6 • 14
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting Paper • 2603.25745 • Published 9 days ago • 13