ShieldGemma 2: Robust and Tractable Image Content Moderation Paper • 2504.01081 • Published Apr 1, 2025 • 3
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents Paper • 2503.08026 • Published Mar 11, 2025
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7, 2025 • 67
Judging with Confidence: Calibrating Autoraters to Preference Distributions Paper • 2510.00263 • Published Sep 30, 2025 • 14
HEART: Emotionally-driven test-time scaling of Language Models Paper • 2509.22876 • Published Sep 26, 2025 • 3
LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science Paper • 2510.01285 • Published Sep 30, 2025
ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review Paper • 2601.22638 • Published Jan 30, 2026 • 1
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement Paper • 2603.12310 • Published Mar 12, 2026 • 8
TFRBench: A Reasoning Benchmark for Evaluating Forecasting Systems Paper • 2604.05364 • Published Apr 7, 2026
PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing Paper • 2604.05018 • Published Apr 6, 2026 • 3
Watch and Learn: Learning to Use Computers from Online Videos Paper • 2510.04673 • Published Oct 6, 2025 • 12
CoDA: Agentic Systems for Collaborative Data Visualization Paper • 2510.03194 • Published Oct 3, 2025 • 31
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published Jan 23, 2025 • 41
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack Paper • 2309.15807 • Published Sep 27, 2023 • 33