AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration • Paper • 2510.10395 • Published Oct 12, 2025 • 30
llava-hf/llava-onevision-qwen2-72b-ov-chat-hf • Image-Text-to-Text • 73B • Updated Jun 18, 2025 • 50 • 3