Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains Paper • 2507.17746 • Published Jul 23, 2025 • 3
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data +7 Jun 3, 2025 • 305
Article Preference Tuning LLMs with Direct Preference Optimization Methods +3 Jan 18, 2024 • 75
MobileCLIP Models + DataCompDR Data Collection MobileCLIP: Mobile-friendly image-text models with SOTA zero-shot capabilities. DataCompDR: Improved datasets for training image-text SOTA models. • 25 items • Updated Aug 26, 2025 • 37
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 69
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch Paper • 2311.03099 • Published Nov 6, 2023 • 30
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks Paper • 2311.06242 • Published Nov 10, 2023 • 95