GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models Paper • 2512.15560 • Published 15 days ago • 22
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration Paper • 2510.10395 • Published Oct 12, 2025 • 30
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark Paper • 2509.24897 • Published Sep 29, 2025 • 46
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction Paper • 2410.21169 • Published Oct 28, 2024 • 30
MRAMG-Bench: A BeyondText Benchmark for Multimodal Retrieval-Augmented Multimodal Generation Paper • 2502.04176 • Published Feb 6, 2025