Илья Михайлов

theodorejones

AI & ML interests

Open-source multimodal experiment workflows.

Recent Activity

liked a model 13 days ago

athena2634/coho00001

upvoted a paper 24 days ago

Kwai Keye-VL-2.0 Technical Report

liked a model 30 days ago

Likithp/v9_fixed_s42

Organizations

None yet

upvoted a paper 24 days ago

Kwai Keye-VL-2.0 Technical Report

Paper • 2606.10651 • Published 25 days ago • 192

upvoted 6 papers about 1 month ago

Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models

Paper • 2605.26895 • Published May 26 • 22

SkillGrad: Optimizing Agent Skills Like Gradient Descent

Paper • 2605.27760 • Published May 26 • 27

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

Paper • 2605.28816 • Published May 27 • 431

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models

Paper • 2605.25189 • Published May 24 • 4

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

Paper • 2605.20266 • Published May 18 • 56

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published May 12 • 196

upvoted a paper about 2 months ago

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

Paper • 2605.12882 • Published May 13 • 274

upvoted 7 papers 3 months ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248

Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 509

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

Understanding the Challenges in Iterative Generative Optimization with LLMs

Paper • 2603.23994 • Published Mar 25 • 29

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 353

GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

Paper • 2603.24329 • Published Mar 25 • 28

Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models

Paper • 2603.17051 • Published Mar 17 • 109

upvoted 4 papers 4 months ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published Mar 17 • 373

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published Mar 17 • 248

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 211

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198
