tarn59/book_flatten_and_crop_qwen_image_edit_2509 Image-to-Image • Updated Nov 18, 2025 • 38 • 39
Running on Zero Featured ReconViaGen • 166 • High-fidelity 3D Geometry Generation from multi-view images
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 9 items • Updated 20 days ago • 207
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29, 2025 • 69
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23, 2025 • 81
Runtime error TRELLIS - Multiple Imagen a 3D • 61 • Scalable and Versatile 3D Generation from images
Running Featured Qwen3 WebGPU • 100 • A hybrid reasoning model that runs locally in your browser.
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 40.5k • 1.61k
Article Llama can now see and run on your device - welcome Llama 3.2 • Sep 25, 2024 • 191
Article SmolVLM Grows Smaller: Introducing the 256M & 500M Models! • Jan 23, 2025 • 191
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 181k • 1.56k
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 655
Running Featured OpenVoice • 1.12k • Generate speech in a chosen voice from a short audio sample