Reward Modeling Datasets
• 37.1k rows • 1.56k downloads • 247 likes
• 169k rows • 22.1k downloads • 1.68k likes
• 386k rows • 3.46k downloads • 323 likes
PKU-Alignment/PKU-SafeRLHF
• 164k rows • 13.6k downloads • 178 likes
openai/webgpt_comparisons
• 19.6k rows • 893 downloads • 240 likes
openai/summarize_from_feedback
• 194k rows • 2.05k downloads • 217 likes
HuggingFaceH4/ultrafeedback_binarized
• 187k rows • 7.35k downloads • 324 likes
• 183k rows • 896 downloads • 295 likes
HuggingFaceH4/stack-exchange-preferences
• 10.8M rows • 6.41k downloads • 133 likes
HuggingFaceH4/hhh_alignment
• 221 rows • 330 downloads • 22 likes
Birchlabs/openai-prm800k-stepwise-critic
• 1.09M rows • 124 downloads • 45 likes
prometheus-eval/Feedback-Collection
• 100k rows • 466 downloads • 118 likes
argilla/OpenHermesPreferences
• 989k rows • 767 downloads • 212 likes
• 8.11k rows • 6.54k downloads • 106 likes
• 21.4k rows • 13.1k downloads • 440 likes
Magpie-Align/Magpie-Pro-DPO-200K
• 207k rows • 22 downloads • 7 likes
argilla/magpie-ultra-v0.1
• 50k rows • 689 downloads • 221 likes