Papers - Encoders
• Functional Interpolation for Relative Positions Improves Long Context Transformers (arXiv:2310.04418)
• SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs (arXiv:2106.09997)
• Neural Machine Translation of Rare Words with Subword Units (arXiv:1508.07909)
• A Multimodal Approach to Device-Directed Speech Detection with Large Language Models (arXiv:2403.14438)
• EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters (arXiv:2402.04252)
• Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models (arXiv:2403.18814)
• Training LLMs over Neurally Compressed Text (arXiv:2404.03626)
• RoBERTa: A Robustly Optimized BERT Pretraining Approach (arXiv:1907.11692)
• CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation (arXiv:2103.06874)
• Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference (arXiv:2412.13663)
• Byte Latent Transformer: Patches Scale Better Than Tokens (arXiv:2412.09871)