DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 16 days ago • 52
LawThinker: A Deep Research Legal Agent in Dynamic Environments Paper • 2602.12056 • Published 15 days ago • 34
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents Paper • 2602.07035 • Published 24 days ago • 30
GISA: A Benchmark for General Information-Seeking Assistant Paper • 2602.08543 • Published 18 days ago • 26
SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training Paper • 2602.03411 • Published 24 days ago • 37
SWE-World: Building Software Engineering Agents in Docker-Free Environments Paper • 2602.03419 • Published 24 days ago • 40
MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching Paper • 2601.10712 • Published Jan 15 • 24
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback Paper • 2601.10156 • Published Jan 15 • 26
OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG Paper • 2601.09028 • Published Jan 13 • 34
MemoBrain: Executive Memory as an Agentic Brain for Reasoning Paper • 2601.08079 • Published Jan 12 • 38
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration Paper • 2601.06860 • Published Jan 11 • 16
TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning Paper • 2601.04698 • Published Jan 8 • 10
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis Paper • 2601.05808 • Published Jan 9 • 36
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published Jan 1 • 131