User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published 3 days ago • 45
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training Paper • 2509.25758 • Published Sep 30, 2025 • 22
CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction Paper • 2508.03159 • Published Aug 5, 2025 • 22
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models Paper • 2506.19697 • Published Jun 24, 2025 • 44
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20, 2025 • 26