MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
Paper: arXiv:2409.05840
Here are the pretrained weights and the instruction-tuned weights:

| Model | Pretrained Projector | Base LLM | PT Data | IT Data | Download |
|---|---|---|---|---|---|
| MMEvol-LLaMA3-8B | mm_projector | LLaMA3-8B | LLaVA-Pretrain | MMEvol | ckpt |

| Model | MME_C | MMStar | HallBench | MathVista_mini | MMMU_val | AI2D | POPE | BLINK | RWQA |
|---|---|---|---|---|---|---|---|---|---|
| MMEvol-LLaMA3-8B | 47.8 | 50.1 | 62.3 | 50.0 | 40.8 | 73.9 | 86.8 | 46.4 | 62.6 |

| Model | VQA_v2 | GQA | MIA | MMSInst |
|---|---|---|---|---|
| MMEvol-LLaMA3-8B | 83.4 | 65.0 | 78.8 | 32.3 |

Llama 3 is licensed under the LLAMA 3 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.

Base model: meta-llama/Llama-3.1-8B