Jie Cheng

jinachris

https://github.com/CJReinforce

CJReinforce

AI & ML interests

Reinforcement learning, LLM

Organizations

None yet

Recent Activity

upvoted a paper about 21 hours ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 3 days ago • 137

upvoted a paper 4 days ago

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Paper • 2601.05593 • Published 8 days ago • 76

upvoted a collection about 1 month ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano v3. • 8 items • Updated about 12 hours ago • 56

upvoted a paper 4 months ago

VGGT-X: When VGGT Meets Dense Novel View Synthesis

Paper • 2509.25191 • Published Sep 29, 2025 • 18

liked 2 models 6 months ago

stepfun-ai/step3-fp8

Image-Text-to-Text • Updated Aug 2, 2025 • 67 • 20

stepfun-ai/step3

Image-Text-to-Text • 321B • Updated Aug 2, 2025 • 59.1k • 164

upvoted a collection 6 months ago

Step3

Collection

2 items • Updated Jul 31, 2025 • 20

liked a Space 6 months ago

🏆 AnyCoder

Space • Generate code with AI • 3.07k

upvoted a paper 7 months ago

TC-Light: Temporally Consistent Relighting for Dynamic Long Videos

Paper • 2506.18904 • Published Jun 23, 2025 • 10

liked a dataset 7 months ago

a-m-team/AM-DeepSeek-R1-0528-Distilled

Preview • Updated Jun 9, 2025 • 928 • 99

updated a model 8 months ago

jinachris/PURE-PRM-7B

Token Classification • 7B • Updated May 29, 2025 • 94 • 4

upvoted 2 papers 8 months ago

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Paper • 2505.22651 • Published May 28, 2025 • 48

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

Paper • 2410.00564 • Published Oct 1, 2024 • 1

authored a paper 8 months ago

Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning

Paper • 2504.15275 • Published Apr 21, 2025 • 2

liked a model 8 months ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29, 2025 • 337k • 2.39k

authored a paper 8 months ago

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

Paper • 2410.00564 • Published Oct 1, 2024 • 1

upvoted 3 papers 8 months ago

Skywork Open Reasoner 1 Technical Report

Paper • 2505.22312 • Published May 28, 2025 • 54

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 131

Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning

Paper • 2504.15275 • Published Apr 21, 2025 • 2

updated a collection 8 months ago

PURE

Collection

PRM and fine-tuned LLM used in our PURE GitHub repo: https://github.com/CJReinforce/PURE • 5 items • Updated May 22, 2025 • 2
