Optimizing Agentic Reasoning with Retrieval via Synthetic Semantic Information Gain Reward • Paper • 2602.00845 • Published 8 days ago
MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling • Paper • 2602.03359 • Published 6 days ago
EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models • Paper • 2602.05000 • Published 4 days ago
SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs • Paper • 2602.06040 • Published 3 days ago
Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR • Paper • 2602.05261 • Published 4 days ago
Pathwise Test-Time Correction for Autoregressive Long Video Generation • Paper • 2602.05871 • Published 3 days ago
InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions • Paper • 2602.06035 • Published 3 days ago
Context Forcing: Consistent Autoregressive Video Generation with Long Context • Paper • 2602.06028 • Published 3 days ago
Assessing Domain-Level Susceptibility to Emergent Misalignment from Narrow Finetuning • Paper • 2602.00298 • Published 9 days ago
DASH: Faster Shampoo via Batched Block Preconditioning and Efficient Inverse-Root Solvers • Paper • 2602.02016 • Published 6 days ago
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks • Paper • 2602.05547 • Published 3 days ago
Failing to Explore: Language Models on Interactive Tasks • Paper • 2601.22345 • Published 10 days ago
Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention • Paper • 2602.04789 • Published 4 days ago