Papers
-
The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications
-
Representational Curvature Modulates Behavioral Uncertainty in Large Language Models
-
Frontier Coding Agents Can Now Implement an AlphaZero Self-Play Machine Learning Pipeline For Connect Four That Performs Comparably to an External Solver
-
The Last Human-Written Paper: Agent-Native Research Artifacts
-
Long-Context Aware Upcycling: A New Frontier for Hybrid LLM Scaling
-
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
-
Kwai Summary Attention Technical Report
-
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
-
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
-
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
-
Video Analysis and Generation via a Semantic Progress Function
-
The Recurrent Transformer: Greater Effective Depth and Efficient Decoding
-
There Will Be a Scientific Theory of Deep Learning
-
Hyperloop Transformers
-
AgenticQwen: Training Small Agentic Language Models with Dual Data Flywheels for Industrial-Scale Tool Use
-
Building a Precise Video Language with Human-AI Oversight
-
SWE-chat: Coding Agent Interactions From Real Users in the Wild
-
Image Generators are Generalist Vision Learners
-
Synthesizing Multi-Agent Harnesses for Vulnerability Discovery
-
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
-
OpenGame: Open Agentic Coding for Games
-
Why Fine-Tuning Encourages Hallucinations and How to Fix It
-
Discovering Novel LLM Experts via Task-Capability Coevolution
-
Autonomous Evolution of EDA Tools: Multi-Agent Self-Evolved ABC
-
Language models transmit behavioural traits through hidden signals in data (Anthropic / Alignment Research Center, Anthropic, Truthful AI, UC Berkeley, Warsaw University of Technology)
-
Accelerating Speculative Decoding with Block Diffusion Draft Trees (Technion – Israel Institute of Technology)
-
Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems
-
Toward Autonomous Long-Horizon Engineering for ML Research
-
Nucleus-Image: Sparse MoE for Image Generation
-
Lyra 2.0: Explorable Generative 3D Worlds
-
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
-
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
-
A Mechanistic Analysis of Looped Reasoning Language Models
-
Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
-
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
-
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
-
GenTac: Generative Modeling and Forecasting of Soccer Tactics
-
Steered LLM Activations are Non-Surjective
-
ELT: Elastic Looped Transformers for Visual Generation
-
Strips as Tokens: Artist Mesh Generation with Native UV Segmentation
-
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
-
Scaffolding Human-AI Collaboration: A Field Experiment on Behavioral Protocols and Cognitive Reframing
-
Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring
-
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
-
Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers
-
Memento: Teaching LLMs to Manage Their Own Context
-
Action Images: End-to-End Policy Learning via Multiview Video Generation
-
In-Place Test-Time Training
-
Vero: An Open RL Recipe for General Visual Reasoning (Princeton University)
-
OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
