D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7, 2025 • 141 • 3
KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language Paper • 2503.23730 • Published Mar 31, 2025 • 3 • 2
CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction Paper • 2410.01273 • Published Oct 2, 2024 • 12 • 2