Papers - RoPE
Resonance RoPE: Improving Context Length Generalization of Large Language Models
Paper
• 2403.00071
Scaling Laws of RoPE-based Extrapolation
Paper
• 2310.05209
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
Paper
• 2404.12387
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper
• 2404.14619
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Paper
• 2404.07129
Round and Round We Go! What makes Rotary Positional Encodings useful?
Paper
• 2410.06205
ThunderKittens: Simple, Fast, and Adorable AI Kernels
Paper
• 2410.20399