Collections
Discover the best community collections!
Collections including paper arxiv:2403.03432
- Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models
  Paper • 2403.03432 • Published • 1
- Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
  Paper • 2310.20587 • Published • 18
- MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data
  Paper • 2304.08247 • Published • 2

- Robust Mixture-of-Expert Training for Convolutional Neural Networks
  Paper • 2308.10110 • Published • 2
- Experts Weights Averaging: A New General Training Scheme for Vision Transformers
  Paper • 2308.06093 • Published • 2
- ConstitutionalExperts: Training a Mixture of Principle-based Prompts
  Paper • 2403.04894 • Published • 2
- Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models
  Paper • 2403.03432 • Published • 1

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 189
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 56
- ResLoRA: Identity Residual Mapping in Low-Rank Adaption
  Paper • 2402.18039 • Published • 11

- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 44
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 28
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 53
- Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models
  Paper • 2403.03432 • Published • 1

- Adaptive sequential Monte Carlo by means of mixture of experts
  Paper • 1108.2836 • Published • 2
- Convergence Rates for Mixture-of-Experts
  Paper • 1110.2058 • Published • 2
- Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs
  Paper • 2310.12008 • Published • 2
- Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
  Paper • 2308.11793 • Published • 2

- A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings
  Paper • 2504.15610 • Published • 1
- Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
  Paper • 2502.13533 • Published • 13
- LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models
  Paper • 2403.08822 • Published
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
  Paper • 2407.18242 • Published