LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 2 days ago • 207
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published 15 days ago • 100
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks Paper • 2604.08539 • Published 15 days ago • 48
CREval: An Automated Interpretable Evaluation for Creative Image Manipulation under Complex Instructions Paper • 2603.26174 • Published 28 days ago • 5