Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
Published a model about 6 hours ago: mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConf_reasonUncertInPrompt
Published a model about 12 hours ago: mehuldamani/rlcr-multi-from-rlvr-base-train-point5Brier-1point4Temp-specifyConfSumLessThan1
Published a model about 13 hours ago: mehuldamani/rlcr-multi-from-rlvr-base-train-1pointBrier-point9Temp-specifyConfSumLessThan1