Novel
Redundancy Principles for MLLMs Benchmarks
Paper
• 2501.13953
• Published
• 29
Autonomy-of-Experts Models
Paper
• 2501.13074
• Published
• 44
Distillation Scaling Laws
Paper
• 2502.08606
• Published
• 47
Large Language Diffusion Models
Paper
• 2502.09992
• Published
• 126
I-Con: A Unifying Framework for Representation Learning
Paper
• 2504.16929
• Published
• 30
Parallel Scaling Law for Language Models
Paper
• 2505.10475
• Published
• 83
UMoE: Unifying Attention and FFN with Shared Experts
Paper
• 2505.07260
• Published
• 9
Scaling Law for Quantization-Aware Training
Paper
• 2505.14302
• Published
• 76
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
Paper
• 2508.02193
• Published
• 136
Scaling Laws for Optimal Data Mixtures
Paper
• 2507.09404
• Published
• 37