sungyub kim

sungyub

AI & ML interests

None yet

Recent Activity

new activity 3 days ago

sungyub/ifbench-verl:Hello, I find datatrove libarary doesn't have datatrove.utils.reward_score.ifeval

liked a Space about 1 month ago

HuggingFaceFW/finephrase

upvoted an article 2 months ago

We Got Claude to Build CUDA Kernels and teach open models!

Organizations

None yet

New activity in sungyub/ifbench-verl 3 days ago

Hello, I find datatrove libarary doesn't have datatrove.utils.reward_score.ifeval

#1 opened 3 days ago by Wuyangqian

liked a Space about 1 month ago

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

219

Explore synthetic data experiments on a virtual bookshelf

upvoted 2 articles 2 months ago

Article

We Got Claude to Build CUDA Kernels and teach open models!

Jan 28 • 154

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27

updated a collection 3 months ago

VERL QA Datasets

Collection

High-quality QA generation datasets in VERL format: document QA, table reasoning, and multi-hop reasoning tasks. • 6 items • Updated Mar 2

updated a dataset 3 months ago

sungyub/qa-verl-unified

Viewer • Updated Jan 8 • 86.4k • 48

published a dataset 3 months ago

sungyub/qa-verl-unified

Viewer • Updated Jan 8 • 86.4k • 48

updated 2 datasets 3 months ago

sungyub/docqa-rl-verl

Viewer • Updated Jan 8 • 3.6k • 43

sungyub/code-verl-unified

Viewer • Updated Jan 8 • 959k • 2.44k • 1

liked 2 Spaces 4 months ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.33k

Read a detailed overview of the FineWeb web‑scale text dataset

Evaluation Guidebook

📝

301

Explore LLM benchmark trends over time

updated a dataset 5 months ago

sungyub/codev-r1-verl

Viewer • Updated Nov 11, 2025 • 3.13k • 42

upvoted an article 5 months ago

Article

Let's talk about LLM evaluation

May 23, 2024 • 207

liked 2 Spaces 5 months ago

The Ultra-Scale Playbook

🌌

3.77k

The ultimate guide to training LLM on large GPU Clusters

The Smol Training Playbook

📚

3.1k

The secrets to building world-class LLMs

updated 5 datasets 5 months ago
