Training Day - a Sambosis Collection

Training Day

updated Sep 17, 2025

LucasThil/randomized_clean_miniwob_episodes__image0_5000_v2

Viewer • Updated May 16, 2023 • 2.5k • 36
LucasThil/miniwob_plusplus_hierarchical_training_actions_drain

Viewer • Updated Jun 21, 2023 • 40.2k • 10 • 1
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

Paper • 2503.22677 • Published Mar 28, 2025 • 5
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs

Paper • 2503.23022 • Published Mar 29, 2025 • 6
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published Apr 4, 2025 • 18
Scaling Autonomous Agents via Automatic Reward Modeling And Planning

Paper • 2502.12130 • Published Feb 17, 2025 • 2
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Paper • 2307.12856 • Published Jul 24, 2023 • 37
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 205
Personalize Anything for Free with Diffusion Transformer

Paper • 2503.12590 • Published Mar 16, 2025 • 44
Compositional Foundation Models for Hierarchical Planning

Paper • 2309.08587 • Published Sep 15, 2023 • 11
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Paper • 2309.10150 • Published Sep 18, 2023 • 26
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 189
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 144
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21, 2025 • 88
Physics of Language Models: Part 1, Context-Free Grammar

Paper • 2305.13673 • Published May 23, 2023 • 7
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details

Paper • 2506.16504 • Published Jun 19, 2025 • 31
AlphaGo Moment for Model Architecture Discovery

Paper • 2507.18074 • Published Jul 24, 2025 • 1
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27, 2025 • 37
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10, 2025 • 57
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence

Paper • 2509.12203 • Published Sep 15, 2025 • 20
SynCircuit: Automated Generation of New Synthetic RTL Circuits Can Enable Big Data in Circuits

Paper • 2509.00071 • Published Aug 26, 2025
Chunked TabPFN: Exact Training-Free In-Context Learning for Long-Context Tabular Data

Paper • 2509.00326 • Published Aug 30, 2025