Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published 4 days ago • 198
Granite 3.1 Language Models Collection A series of language models with 128K context length, trained by IBM and licensed under the Apache 2.0 license. • 9 items • Updated Nov 17, 2025 • 68
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning Paper • 2507.12841 • Published Jul 17, 2025 • 41
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11, 2025 • 42
Post • Manic few days in open source AI, with game-changing developments all over the place. Here's a roundup of the resources:
- The science team at @huggingface reproduced and open-sourced DeepSeek R1. https://github.com/huggingface/open-r1
- @qwen released a series of models with 1 million token context! https://qwenlm.github.io/blog/qwen2.5-1m/
- SmolVLM got even smaller with completely new variants at 256m and 500m. https://huggingface.co/blog/smolervlm
There's so much you could do with these developments, especially combining them into agentic applications or fine-tuning them on your use case.
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 53
CSGO: Content-Style Composition in Text-to-Image Generation Paper • 2408.16766 • Published Aug 29, 2024 • 18
StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements Paper • 2408.15666 • Published Aug 28, 2024 • 11
Fine-tuning Large Language Models with Human-inspired Learning Strategies in Medical Question Answering Paper • 2408.07888 • Published Aug 15, 2024 • 13