tran minh thang
thangtm
AI & ML interests
None yet
Recent Activity
Updated a collection about 11 hours ago: DLM
Updated a collection about 12 hours ago: reasoning_model
Updated a collection about 12 hours ago: RL
Organizations
None yet
data
reasoning_model
- OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
  Paper • 2511.16334 • Published • 92
- Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
  Paper • 2509.07980 • Published • 102
- ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
  Paper • 2509.04475 • Published • 3
- Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
  Paper • 2512.01374 • Published • 98
RL
- Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
  Paper • 2510.20150 • Published • 4
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
  Paper • 2511.06221 • Published • 132
- We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
  Paper • 2508.10433 • Published • 144
- Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
  Paper • 2512.01374 • Published • 98
RAG
OCR
- PaddlePaddle/PaddleOCR-VL
  Image-Text-to-Text • 1.0B • Updated • 12.3k • 1.48k
- nanonets/Nanonets-OCR2-3B
  Image-Text-to-Text • 4B • Updated • 83.5k • 476
- deepseek-ai/DeepSeek-OCR
  Image-Text-to-Text • 3B • Updated • 3.15M • 3.07k
- MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
  Paper • 2509.22186 • Published • 139
code LLM
- X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
  Paper • 2601.06953 • Published • 39
- Dr. Zero: Self-Evolving Search Agents without Training Data
  Paper • 2601.07055 • Published • 11
- Controlled Self-Evolution for Algorithmic Code Optimization
  Paper • 2601.07348 • Published • 92
flow_matching_model
DLM
- TiDAR: Think in Diffusion, Talk in Autoregression
  Paper • 2511.08923 • Published • 122
- Diffusion Language Models are Super Data Learners
  Paper • 2511.03276 • Published • 128
- What Makes Diffusion Language Models Super Data Learners?
  Paper • 2510.04071 • Published
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
  Paper • 2512.15745 • Published • 78
ARC
Reduce_thinking
- FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
  33B • Updated • 14 • 129
- Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency
  Paper • 2506.08343 • Published • 54
- Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
  Paper • 2509.26226 • Published • 33
robot