Papers
-
HRM-Text: Efficient Pretraining Beyond Scaling
-
OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization
-
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation
-
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition
-
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration
-
Spectral classification of brown dwarfs using machine learning
-
Generative Recursive Reasoning
-
WavFlow: Audio Generation in Waveform Space
-
Stable Audio 3
-
ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents
-
Look Before You Leap: Autonomous Exploration for LLM Agents
-
ReactiveGWM: Steering NPC in Reactive Game World Models
-
Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
-
FutureSim: Replaying World Events to Evaluate Adaptive Agents
-
Self-Distilled Agentic Reinforcement Learning
-
Useful Memories Become Faulty When Continuously Updated by LLMs
-
Targeted Neuron Modulation via Contrastive Pair Search
-
Slicing and Dicing: Configuring Optimal Mixtures of Experts
-
$δ$-mem: Efficient Online Memory for Large Language Models
-
Solve the Loop: Attractor Models for Language and Reasoning
-
ELF: Embedded Language Flows
-
Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models
-
The Truth Lies Somewhere in the Middle (of the Generated Tokens)
-
Qwen-Image-2.0 Technical Report
-
Quantifying the Utility of User Simulators for Building Collaborative LLM Assistants
-
GLiGuard: Schema-Conditioned Classification for LLM Safeguard
-
Fast Byte Latent Transformer
-
Long Context Pre-Training with Lighthouse Attention
-
Efficient Pre-Training with Token Superposition
-
Continuous Latent Diffusion Language Model
-
MiniMind-O Technical Report: An Open Small-Scale Speech-Native Omni Model
-
VLMaxxing through FrameMogging Training-Free Anti-Recomputation for Video Vision-Language Models
-
Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting
-
HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness
-
Model Spec Midtraining: Improving How Alignment Training Generalizes
-
A Theory of Generalization in Deep Learning
-
Writing Code vs. Shipping Code: Productivity Effects Across Generations of AI Coding Tools (Microsoft / Massachusetts Institute of Technology, National Bureau of Economic Research (NBER), University of Pennsylvania)
-
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs
-
Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning
-
Map2World: Segment Map Conditioned Text to 3D World Generation
-
Let ViT Speak: Generative Language-Image Pre-training
-
Contextual Agentic Memory is a Memo, Not True Memory
-
From Context to Skills: Can Language Models Learn from Context Skillfully?
-
Synthetic Computers at Scale for Long-Horizon Productivity Simulation
-
Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
-
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
-
DORA: A Scalable Asynchronous Reinforcement Learning System for Language Model Training
-
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
-
Recursive Multi-Agent Systems
