Papers
-
When is Your LLM Steerable?
-
Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling
-
RLCSD: Reinforcement Learning with Contrastive On-Policy Self-Distillation
-
From AGI to ASI
-
FACTR 2: Learning External Force Sensing for Commodity Robot Arms Improves Policy Learning
-
i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models
-
Test-Time Gradient Guidance of Flow Policies in Reinforcement Learning
-
Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models
-
Self-Harness: Harnesses That Improve Themselves
-
Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models
-
How AI Agents Reshape Knowledge Work: Autonomy, Efficiency, and Scope
-
Latent Reasoning with Normalizing Flows
-
Slim attention: cut your context memory in half without loss – K-cache is all you need for MHA
-
MAI-Thinking-1: Building a Hill-Climbing Machine
-
AFUN: Towards an Affordance Foundation Model for Functionality Understanding
-
MPMWorlds: Material-Point-Method Simulations for Inferring and Extrapolating Physical Dynamics
-
Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses
-
Physical Atari: A Robust and Accessible Platform for Real-time Reinforcement Learning on Robots
-
StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement
-
RealityTest: How People Probe AI Identity and Whether Models Disclose It
-
Representation Forcing for Bottleneck-Free Unified Multimodal Models
-
mRNAutilus: Multi-Objective-Guided Discrete Generation of mRNA with Optimized Therapeutic Properties
-
Stable-Layers: Fine-Tuning Image Layer Decomposition Models with VLM-Scored Reinforcement Learning
-
The Little Book of Generative AI Foundations: An Intuitive Mathematical Primer
-
Scaling Laws for Agent Harnesses via Effective Feedback Compute
-
Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents
-
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
-
AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation
-
Self-Improving Language Models with Bidirectional Evolutionary Search
-
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players
-
Elias in the Lighthouse, Again? Diagnosing Low Diversity in LLM Stories
-
Laguna M.1/XS.2 Technical Report
-
Learn from your own latents and not from tokens: A sample-complexity theory
-
MobileMoE: Scaling On-Device Mixture of Experts
-
Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini
-
The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence
-
When Does LeJEPA Learn a World Model?
-
Unified Neural Scaling Laws
-
Language Models Need Sleep
-
LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence
-
Training-Free Looped Transformers
-
Polar: Agentic RL on Any Harness at Scale
-
GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
-
SkillOpt: Executive Strategy for Self-Evolving Agent Skills
-
Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings
-
Forecasting Scientific Progress with Artificial Intelligence
-
Vector Policy Optimization: Training for Diversity Improves Test-Time Search
-
Advancing Mathematics Research with AI-Driven Formal Proof Search
