Ai2

Team

non-profit

Verified

https://allenai.org/

allen_ai

allenai

AI & ML interests

Building breakthrough AI to solve the world's biggest problems.

Recent Activity

jamesp-allenai new activity 2 days ago

allenai/Molmo2-8B:Update README.md

sanghol updated a model 2 days ago

allenai/Molmo2-O-7B

sanghol updated a model 2 days ago

allenai/Molmo2-4B

Papers

Bolmo: Byteifying the Next Generation of Language Models

Olmo 3

allenai's collections (33)

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets...

allenai/Olmo-3.1-32B-Think

Text Generation • 32B • Updated 10 days ago • 2.25k • 57
allenai/Olmo-3.1-32B-Instruct-SFT

32B • Updated 13 days ago • 1.83k • 5
allenai/Olmo-3.1-32B-Instruct-DPO

Text Generation • 32B • Updated 13 days ago • 723 • 4
allenai/Olmo-3.1-32B-Instruct

Text Generation • 32B • Updated 12 days ago • 3.53k • 35

Bolmo

Artifacts for the Bolmo release: https://allenai.org/papers/bolmo.

allenai/Bolmo-7B

Text Generation • 8B • Updated 2 days ago • 464 • 42
allenai/Bolmo-1B

Text Generation • 1B • Updated 2 days ago • 474 • 37
allenai/bolmo_mix

Updated 2 days ago • 532 • 6
Bolmo: Byteifying the Next Generation of Language Models

Paper • 2512.15586 • Published 7 days ago • 11

Olmo 3

Artifacts for the Olmo 3 release.

allenai/Olmo-3-1125-32B

Text Generation • 32B • Updated 22 days ago • 6.99k • 99
allenai/Olmo-3-32B-Think

Text Generation • 1.05M • Updated 12 days ago • 12.4k • 163
allenai/Olmo-3-1025-7B

Text Generation • 7B • Updated 22 days ago • 37.7k • 40
allenai/Olmo-3-7B-Think

Text Generation • 528k • Updated 14 days ago • 15.8k • 72

Olmo 3 Pre-training

All artifacts related to Olmo 3 pre-training

allenai/dolma3_pool

Viewer • Updated 13 days ago • 56.2M • 52.6k • 27
allenai/dolma3_dolmino_pool

Viewer • Updated about 1 month ago • 94.9M • 419k • 6
allenai/dolma3_longmino_pool

Updated 13 days ago • 11.7k • 8
allenai/dolma3_dolmino_mix-100B-1025

Viewer • Updated about 1 month ago • 17.5M • 11.8k • 1

OlmoEarth

OlmoEarth pre-trained and fine-tuned foundation models for remote sensing

allenai/OlmoEarth-v1-Base

Updated Nov 4 • 3.24k • 20
allenai/OlmoEarth-v1-Nano

Updated Nov 4 • 306 • 9
allenai/OlmoEarth-v1-Tiny

Updated Nov 4 • 49 • 4
allenai/OlmoEarth-v1-Large

Updated Nov 4 • 42 • 11

MolmoAct

All models for the MolmoAct (Multimodal Open Language Model for Action) release.

MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11 • 44
allenai/MolmoAct-7B-D-0812

Robotics • 8B • Updated Oct 24 • 654 • 48
allenai/MolmoAct-7B-O-0812

Robotics • 8B • Updated Sep 2 • 68 • 5
allenai/MolmoAct-7B-D-Pretrain-0812

Robotics • 8B • Updated Sep 2 • 1.08k • 8

Reward Bench 2

Datasets, spaces, and models for Reward Bench 2 benchmark and paper!

allenai/reward-bench-2

Viewer • Updated Jun 4 • 1.87k • 2.61k • 28
Reward Bench Leaderboard 📐

Space • Running • 416 • Display and analyze reward model evaluation results
allenai/reward-bench-2-results

Preview • Updated 14 days ago • 1.15k • 3
allenai/Llama-3.1-70B-Instruct-RM-RB2

Text Classification • Updated Jun 4 • 49 • 1

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org

allenai/olmOCR-2-7B-1025-FP8

Image-to-Text • 8B • Updated 15 days ago • 1.49M • 162
allenai/olmOCR-2-7B-1025

Image-to-Text • 8B • Updated Oct 22 • 58.1k • 105
allenai/olmOCR-mix-1025

Viewer • Updated Oct 21 • 270k • 3.11k • 20
allenai/olmOCR-synthmix-1025

Preview • Updated Oct 17 • 3.62k • 2

OLMoE (January 2025)

Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app

allenai/OLMoE-1B-7B-0125

Text Generation • 7B • Updated Mar 16 • 16.8k • 34
allenai/OLMoE-1B-7B-0125-Instruct

Text Generation • 7B • Updated Feb 4 • 31.2k • 57
allenai/OLMoE-1B-7B-0125-Instruct-GGUF

7B • Updated Feb 13 • 461 • 19
allenai/OLMoE-mix-0924

Preview • Updated Dec 2, 2024 • 6.53k • 51

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes.

allenai/Llama-3.1-Tulu-3.1-8B

Text Generation • 8B • Updated Feb 10 • 1.13k • 39
allenai/Llama-3.1-Tulu-3-8B

Text Generation • 8B • Updated Feb 13 • 164k • 176
allenai/Llama-3.1-Tulu-3-70B

Text Generation • 71B • Updated Feb 10 • 1.24k • 61
allenai/Llama-3.1-Tulu-3-405B

Text Generation • 406B • Updated Feb 10 • 139 • 110

Molmo

Artifacts for open multimodal language models.

allenai/Molmo-72B-0924

Image-Text-to-Text • 73B • Updated Oct 9 • 791 • 295
allenai/Molmo-7B-D-0924

Image-Text-to-Text • 8B • Updated 9 days ago • 37.3k • 559
allenai/Molmo-7B-O-0924

Image-Text-to-Text • 8B • Updated Oct 9 • 1.65k • 162
allenai/MolmoE-1B-0924

Image-Text-to-Text • Updated Apr 24 • 1.53k • 155

OLMo Suite

Artifacts for the first set of OLMo models.

allenai/OLMo-1B-0724-hf

Text Generation • 1B • Updated Aug 5, 2024 • 7.3k • 23
allenai/OLMo-7B-0724-hf

Text Generation • 7B • Updated Jul 16, 2024 • 257 • 16
allenai/OLMo-7B-0724-SFT-hf

Text Generation • 7B • Updated Jul 14, 2024 • 78 • 4
allenai/OLMo-7B-0724-Instruct-hf

Text Generation • 7B • Updated Sep 24, 2024 • 865 • 6

Reward Bench

Datasets, spaces, and models for the reward model benchmark!

Reward Bench Leaderboard 📐

Space • Running • 416 • Display and analyze reward model evaluation results
allenai/reward-bench

Viewer • Updated Sep 9, 2024 • 8.11k • 5.6k • 103
allenai/preference-test-sets

Viewer • Updated Mar 14, 2024 • 43.2k • 2.78k • 28
allenai/reward-bench-results

Updated May 7 • 8.9k • 3

Tulu V2 Suite

The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2"

allenai/tulu-v2-sft-mixture

Viewer • Updated May 24, 2024 • 326k • 980 • 134
allenai/tulu-2-dpo-70b

Text Generation • 69B • Updated Jan 31, 2024 • 2.58k • 157
allenai/tulu-2-dpo-13b

Text Generation • 13B • Updated May 17, 2024 • 2.39k • 20
allenai/tulu-2-dpo-7b

Text Generation • Updated May 14, 2024 • 2.46k • 20

SciRIFF

Data and models to enhance instruction-following for scientific literature understanding.

allenai/SciRIFF

Viewer • Updated Jun 13, 2024 • 433k • 373 • 46
allenai/SciRIFF-train-mix

Viewer • Updated Jun 13, 2024 • 70.7k • 58 • 9
allenai/scitulu-7b

Text Generation • Updated Jun 13, 2024 • 87 • 3

Zebra Logic Bench

ZebraLogic Bench: Testing the Limits of LLMs in Logical Reasoning

Zebra Logic Bench 🦓

Space • Running • 90 • Display and explore a leaderboard for model evaluations
allenai/ZebraLogicBench

Viewer • Updated Jul 11, 2024 • 4.26k • 1.3k • 23
allenai/ZebraLogicBench-private

Viewer • Updated Jul 4, 2024 • 4.26k • 612 • 12
Faith and Fate: Limits of Transformers on Compositionality

Paper • 2305.18654 • Published May 29, 2023 • 7

ACE

Ai2 Climate Emulator (ACE) is a family of fast ML models that simulate global atmospheric variability over time scales ranging from hours to centuries

allenai/SamudrACE-CM4-piControl

Updated Oct 17 • 15 • 3
allenai/ACE2-ERA5

Updated Nov 18 • 77 • 15
allenai/ACE2-EAMv3

Updated Sep 8 • 17 • 2
allenai/ACE2-SOM

Updated Jul 16 • 42 • 1

Molmo2

Artifacts for the Molmo2 release

allenai/Molmo2-4B

Video-Text-to-Text • 5B • Updated 2 days ago • 1.1k • 29
allenai/Molmo2-8B

Video-Text-to-Text • 9B • Updated 2 days ago • 3.02k • 85
allenai/Molmo2-O-7B

Video-Text-to-Text • 8B • Updated 2 days ago • 222 • 15
allenai/Molmo2-VideoPoint-4B

Video-Text-to-Text • 5B • Updated 8 days ago • 35 • 15

Molmo2 Data

Artifacts for the Molmo2 data release

allenai/Molmo2-Cap

Viewer • Updated 8 days ago • 108k • 187 • 6
allenai/Molmo2-CapEval

Viewer • Updated 9 days ago • 693 • 276 • 1
allenai/Molmo2-VideoCapQA

Viewer • Updated 9 days ago • 951k • 88 • 2
allenai/Molmo2-VideoSubtitleQA

Viewer • Updated 9 days ago • 469k • 82 • 1

SAGE

Smart Any-Horizon Agent for Long Video Reasoning

allenai/SAGE-MM-Qwen3-VL-8B-SFT_RL

Video-Text-to-Text • 9B • Updated 8 days ago • 44 • 4
allenai/SAGE-MM-Molmo2-8B-SFT_RL

Video-Text-to-Text • 9B • Updated 8 days ago • 15 • 3
allenai/SAGE-MM-Qwen3-VL-4B-SFT_RL

Video-Text-to-Text • 5B • Updated 8 days ago • 204 • 2
allenai/SAGE-MM-Qwen2.5-VL-7B-SFT_RL

Video-Text-to-Text • 8B • Updated 8 days ago • 20 • 1

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them.

allenai/Olmo-3-7B-Think-SFT

Text Generation • 7B • Updated about 1 month ago • 63.4k • 7
allenai/Dolci-Think-SFT-7B

Viewer • Updated 29 days ago • 2.27M • 3.12k • 8
allenai/Olmo-3-7B-Think-DPO

Text Generation • 528k • Updated 29 days ago • 3.42k • 4
allenai/Dolci-Think-DPO-7B

Viewer • Updated Nov 20 • 150k • 1.2k • 8

MolmoAct Data Mixture

All datasets for the MolmoAct (Multimodal Open Language Model for Action) release.

allenai/MolmoAct-Dataset

Viewer • Updated Sep 3 • 1.11M • 14.7k • 24
allenai/MolmoAct-Pretraining-Mixture

Viewer • Updated Sep 10 • 24.2M • 5.57k • 9
allenai/MolmoAct-Midtraining-Mixture

Viewer • Updated Aug 18 • 5.93M • 3.09k • 4
allenai/libero

Viewer • Updated Aug 27 • 521k • 539 • 2

IFBench

Datasets for IFBench benchmark and paper!

allenai/IF_multi_constraints_upto5

Viewer • Updated Oct 2 • 95.4k • 736 • 18
allenai/IFBench_test

Viewer • Updated Oct 17 • 300 • 3.61k • 7
allenai/IFBench_multi-turn

Viewer • Updated Jul 3 • 3.16k • 1.29k • 7

OLMo 2

Artifacts for the OLMo 2 release.

allenai/OLMo-2-0425-1B-Instruct

Text Generation • 1B • Updated Apr 30 • 26.9k • 54
allenai/OLMo-2-0425-1B-Instruct-GGUF

1B • Updated May 1 • 800 • 14
allenai/OLMo-2-0425-1B

Text Generation • 1B • Updated May 28 • 570k • 68
allenai/OLMo-2-0325-32B-Instruct

Text Generation • 32B • Updated Mar 14 • 3.42k • 148

DataDecide

A suite of models, data, and evals over 25 corpora, 14 sizes, and 3 seeds to measure how accurately small experiments predict rankings at large scale.

allenai/DataDecide-eval-results

Viewer • Updated Apr 16 • 1.41M • 179 • 7
allenai/DataDecide-eval-instances

Viewer • Updated Mar 10 • 1.17k • 414 • 2
allenai/DataDecide-data-recipes

Updated May 6 • 2.39k • 8
allenai/DataDecide-falcon-and-cc-qc-tulu-10p-60M

76.4M • Updated Apr 8 • 43 • 1

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog

allenai/pixmo-docs

Viewer • Updated Feb 24 • 255k • 2.38k • 33
allenai/pixmo-cap

Viewer • Updated Nov 27, 2024 • 717k • 460 • 34
allenai/pixmo-points

Viewer • Updated Nov 27, 2024 • 2.38M • 566 • 39
allenai/pixmo-cap-qa

Viewer • Updated Dec 5, 2024 • 272k • 263 • 8

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes.

allenai/tulu-3-sft-mixture

Viewer • Updated Dec 2, 2024 • 939k • 12.5k • 203
allenai/llama-3.1-tulu-3-8b-preference-mixture

Viewer • Updated Feb 4 • 273k • 1.67k • 25
allenai/llama-3.1-tulu-3-70b-preference-mixture

Viewer • Updated Feb 4 • 337k • 655 • 19
allenai/llama-3.1-tulu-3-405b-preference-mixture

Viewer • Updated Feb 5 • 361k • 183 • 6

OLMoE (November 2024)

Artifacts for open mixture-of-experts language models.

allenai/OLMoE-1B-7B-0924

Text Generation • 7B • Updated Oct 19, 2024 • 14k • 138
allenai/OLMoE-1B-7B-0924-SFT

7B • Updated Sep 4, 2024 • 400 • 19
allenai/OLMoE-1B-7B-0924-Instruct

Text Generation • 7B • Updated Sep 13, 2024 • 8.68k • 93
allenai/OLMoE-mix-0924

Preview • Updated Dec 2, 2024 • 6.53k • 51

Tulu V2.5 Suite

A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more!

allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm

Text Generation • Updated Jun 14, 2024 • 74 • 6
allenai/tulu-2.5-preference-data

Viewer • Updated Jul 22, 2024 • 2.12M • 1.38k • 19
allenai/tulu-2.5-prompts

Viewer • Updated Jul 6, 2024 • 189k • 139 • 4
allenai/tulu-v2.5-ppo-13b-uf-mean

Text Generation • 13B • Updated Jun 14, 2024 • 83

Paloma

Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains

allenai/paloma

Viewer • Updated Jun 6, 2024 • 309k • 3.29k • 40
allenai/paloma-1b-baseline-dolma

Text Generation • Updated Dec 18, 2023 • 2
allenai/paloma-1b-baseline-pile

Text Generation • Updated Dec 19, 2023 • 1
allenai/paloma-1b-baseline-c4

Text Generation • Updated Dec 18, 2023 • 2

WildBench

AI2 WildBench Leaderboard (V2) 🦁

Space • Running • 231 • Display and explore a leaderboard of language models
allenai/WildBench

Viewer • Updated Mar 4 • 2.3k • 2.15k • 37
allenai/WildBench-V2-Model-Outputs

Viewer • Updated Aug 1, 2024 • 62.5k • 3.6k • 2
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7, 2024 • 29

AI2 Safety Toolkit

Safety data, moderation tools and safe LLMs.

allenai/wildjailbreak

Viewer • Updated Aug 8, 2024 • 2.21k • 3.58k • 93
allenai/wildguard

Text Generation • 7B • Updated Jul 27 • 25.9k • 36
allenai/llama2-7b-WildJailbreak

Text Generation • Updated Jun 29, 2024 • 6
allenai/llama2-13b-WildJailbreak

Text Generation • Updated Jun 29, 2024 • 4 • 1

OLMo 2 Preview Post-trained Models

These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. This is resolved in the latest versions.

allenai/OLMo-2-1124-13B-Instruct-preview

Text Generation • 14B • Updated Jan 6 • 191 • 58
allenai/OLMo-2-1124-7B-Instruct-preview

Text Generation • 7B • Updated Jan 6 • 81 • 47
allenai/OLMo-2-1124-7B-SFT-Preview

Text Generation • Updated Jan 6 • 87 • 3
allenai/OLMo-2-1124-7B-DPO-Preview

Text Generation • Updated Jan 6 • 73 • 2

Team members 205
