Mehul Damani PRO

mehuldamani

https://damanimehul.github.io

AI & ML interests

Reinforcement Learning, Large Language Models

Recent Activity

published a model about 6 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConf_reasonUncertInPrompt

published a model about 12 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-1point4Temp-specifyConfSumLessThan1

published a model about 13 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConfSumLessThan1

Organizations

None yet

Collections 1

Papers 4

models 190

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConf_reasonUncertInPrompt

Updated about 6 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-1point4Temp-specifyConfSumLessThan1

Updated about 12 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConfSumLessThan1

Updated about 13 hours ago

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-point9Temp-specifyConfSumLessThan1

Updated about 15 hours ago

datasets 49

mehuldamani/medTroubleshootig-rlvr-220-evaled-on-rlcr

Viewer • Updated 2 days ago • 5k • 7

mehuldamani/medTroubleshootig-rlvr-220-evaled-on-rlvr

Viewer • Updated 2 days ago • 5k • 7

mehuldamani/medDataset_25k

Viewer • Updated 19 days ago • 75k • 287

mehuldamani/medDataset

Viewer • Updated 20 days ago • 1.29M • 102

mehuldamani/qwen3_8b_ambigQA_rlcr_multi_analysis

Viewer • Updated 22 days ago • 2k • 17

mehuldamani/qwen3_8b_ambigQA_rlcr_single_passk_tryAgain

Viewer • Updated 23 days ago • 2k • 10

mehuldamani/ambigQA

Viewer • Updated 26 days ago • 12k • 95

mehuldamani/judge-new-sft-instruct

Viewer • Updated Dec 10, 2025 • 100 • 3

mehuldamani/judge-new-sft-base

Viewer • Updated Dec 10, 2025 • 100 • 6

mehuldamani/judge-new-instruct

Viewer • Updated Dec 10, 2025 • 100 • 4

Collections 1

mehuldamani/big-math-digits-v2-correctness

mehuldamani/hotpot-v2-correctness-7b

mehuldamani/orm-big-math-digits-v2-correctness

mehuldamani/big-math-digits-v2-brier

models 190

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-specifyConfSumLessThan1

mehuldamani/rlcr-multi-from-rlvr-base-train-point2Brier-specifyConfSumLessThan1

mehuldamani/rlcr-multi-from-rlvr-base-train-point2Brier

mehuldamani/rlcr-multi-from-rlvr-base-train-1Brier

mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier

mehuldamani/qwen3_8b_medical_rlcr_multi_separate_uniqueness_reward
