Embodied-R1.5: Evolving Physical Intelligence via Embodied Foundation Models Paper • 2606.11324 • Published 18 days ago • 170
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published May 20 • 111
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published May 25 • 103
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published May 21 • 171
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published May 14 • 147