- Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience • Paper • 2512.17260 • Published 6 days ago • 47
- QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published 10 days ago • 98
- Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation • Paper • 2510.24821 • Published Oct 28 • 38
- Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • Paper • 2511.08577 • Published Nov 11 • 104
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence • Paper • 2511.07384 • Published Nov 10 • 16
- DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation • Paper • 2511.06307 • Published Nov 9 • 51
- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free • Paper • 2505.06708 • Published May 10 • 7
- QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs • Paper • 2510.11696 • Published Oct 13 • 176
- Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs • Paper • 2507.07996 • Published Jul 10 • 34
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • Collection • 7 items • Updated Jul 13 • 8
- TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments • Paper • 2510.01179 • Published Oct 1 • 25
- NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model • Paper • 2508.14444 • Published Aug 20 • 39
- TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling • Paper • 2508.17445 • Published Aug 24 • 80
- NVIDIA Nemotron V2: Open, Production-ready Enterprise Models (NVIDIA Open Model License) • Collection • 9 items • Updated 1 day ago • 100