Describe Anything • Running on Zero • Featured • 344 • Describe masked regions in an image with natural language
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V Paper • 2310.11441 • Published Oct 17, 2023 • 29
IDEA-Research/grounding-dino-base Zero-Shot Object Detection • 0.2B • Updated May 12, 2024 • 2.59M • 159
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Paper • 2406.09403 • Published Jun 13, 2024 • 23
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27, 2025 • 109