AACR-Bench: Evaluating Automatic Code Review with Holistic Repository-Level Context Paper • 2601.19494 • Published 6 days ago • 15 • 2
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning Paper • 2601.18150 • Published 7 days ago • 5 • 2
LoL: Longer than Longer, Scaling Video Generation to Hour Paper • 2601.16914 • Published 10 days ago • 16 • 2
MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models Paper • 2601.21181 • Published 4 days ago • 8 • 3
WorldBench: Disambiguating Physics for Diagnostic Evaluation of World Models Paper • 2601.21282 • Published 4 days ago • 2
Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units Paper • 2601.21996 • Published 4 days ago • 3 • 4
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 4 days ago • 11 • 3
Latent Adversarial Regularization for Offline Preference Optimization Paper • 2601.22083 • Published 4 days ago • 11 • 2
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods Paper • 2601.21821 • Published 4 days ago • 49 • 3
ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published 4 days ago • 37 • 3
Segment Length Matters: A Study of Segment Lengths on Audio Fingerprinting Performance Paper • 2601.17690 • Published 8 days ago • 1 • 2
Reinforcement Learning from Meta-Evaluation: Aligning Language Models Without Ground-Truth Labels Paper • 2601.21268 • Published 4 days ago • 1 • 3
Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening Paper • 2601.21590 • Published 4 days ago • 11 • 11