arxiv:2508.16949
YANG ZHOU
Yang-Zhou
AI & ML interests
RLHF and DPO
Recent Activity
updated a dataset about 2 months ago: Yang-Zhou/DAPO-Math-17k-Qwen3-235B-A22B-Thinking-2507-rejection-distill
updated a dataset 2 months ago: Yang-Zhou/DAPO-Math-17k-Qwen3-235B-A22B-Thinking-2507-rejection-distill
published a dataset 2 months ago: Yang-Zhou/DAPO-Math-17k-Qwen3-235B-A22B-Thinking-2507-rejection-distill
Organizations
None yet