Zhao Zihao

xishze

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

farhanangga89/MusicEmotionRecognition

Organizations

None yet

upvoted 2 papers 19 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 25 days ago • 10

Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Paper • 2605.29861 • Published 26 days ago • 16

upvoted a paper 20 days ago

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

Paper • 2605.30993 • Published 25 days ago • 59

upvoted a paper 24 days ago

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Paper • 2605.28816 • Published 27 days ago • 430

upvoted 3 papers about 1 month ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published May 12 • 196

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Paper • 2605.20025 • Published May 19 • 190

DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models

Paper • 2605.15055 • Published May 14 • 19

upvoted 2 papers 2 months ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published Apr 8 • 328

upvoted 2 papers 3 months ago

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 508

Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models

Paper • 2603.25716 • Published Mar 26 • 157

upvoted 3 papers 4 months ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198

The Trinity of Consistency as a Defining Principle for General World Models

Paper • 2602.23152 • Published Feb 26 • 202

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 525