EgoX: Egocentric Video Generation from a Single Exocentric Video • Paper • 2512.08269 • Published 23 days ago • 115
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published 17 days ago • 103
LongVie 2: Multimodal Controllable Ultra-Long Video World Model • Paper • 2512.13604 • Published 16 days ago • 72
Omni-Video: Democratizing Unified Video Understanding and Generation • Paper • 2507.06119 • Published Jul 8 • 2
Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision • Paper • 2508.05606 • Published Aug 7
Unified Reward Model for Multimodal Understanding and Generation • Paper • 2503.05236 • Published Mar 7 • 122
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment • Paper • 2412.04814 • Published Dec 6, 2024 • 46
MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization • Paper • 2408.02555 • Published Aug 5, 2024 • 31
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models • Paper • 2402.01118 • Published Feb 2, 2024 • 32
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research • Paper • 2402.00159 • Published Jan 31, 2024 • 65