Dawei Li

wjldw

https://david-li0406.github.io/

AI & ML interests

LLM, NLP, Data Mining

Recent Activity

updated a model 1 day ago

wjldw/ToolPRM-GRPO-synthesis

updated a model 1 day ago

wjldw/ToolPRM-GRPO-v4

published a model 1 day ago

wjldw/ToolPRM-GRPO-v4

updated 2 models 1 day ago

wjldw/ToolPRM-GRPO-synthesis

4B • Updated 1 day ago • 13

wjldw/ToolPRM-GRPO-v4

4B • Updated 1 day ago • 7

published a model 1 day ago

wjldw/ToolPRM-GRPO-v4

4B • Updated 1 day ago • 7

updated a model 2 days ago

wjldw/ToolPRM-Base-v4

Text Generation • 196k • Updated 2 days ago • 15

published a model 2 days ago

wjldw/ToolPRM-Base-v4

Text Generation • 196k • Updated 2 days ago • 15

updated a model 2 days ago

wjldw/ToolPRM-CoT-v4

Text Generation • 196k • Updated 2 days ago • 7

published a model 2 days ago

wjldw/ToolPRM-CoT-v4

Text Generation • 196k • Updated 2 days ago • 7

updated a model 2 days ago

wjldw/ToolPRM-Base-synthesis

Text Generation • 196k • Updated 2 days ago • 10

published 2 models 2 days ago

wjldw/ToolPRM-Base-synthesis

Text Generation • 196k • Updated 2 days ago • 10

wjldw/ToolPRM-GRPO-synthesis

4B • Updated 1 day ago • 13

updated a model 4 days ago

wjldw/ToolPRM-GRPO-v3

4B • Updated 4 days ago • 14

published a model 4 days ago

wjldw/ToolPRM-GRPO-v3

4B • Updated 4 days ago • 14

updated a model 4 days ago

wjldw/ToolPRM-Checklist-v3

Text Generation • 196k • Updated 4 days ago • 13

published a model 4 days ago

wjldw/ToolPRM-Checklist-v3

Text Generation • 196k • Updated 4 days ago • 13

updated a model 4 days ago

wjldw/ToolPRM-Base-v3

Text Generation • 196k • Updated 4 days ago • 8

published a model 4 days ago

wjldw/ToolPRM-Base-v3

Text Generation • 196k • Updated 4 days ago • 8

updated a model 4 days ago

wjldw/ToolPRM-CoT-v3

Text Generation • 196k • Updated 4 days ago • 10

published a model 4 days ago

wjldw/ToolPRM-CoT-v3

Text Generation • 196k • Updated 4 days ago • 10

upvoted 2 papers 25 days ago

On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Paper • 2512.07783 • Published 28 days ago • 36

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published 28 days ago • 75
