Papers
- Kwai Summary Attention Technical Report
- Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation
- From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company
- Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
- Video Analysis and Generation via a Semantic Progress Function
- The Recurrent Transformer: Greater Effective Depth and Efficient Decoding
- There Will Be a Scientific Theory of Deep Learning
- Hyperloop Transformers
- AgenticQwen: Training Small Agentic Language Models with Dual Data Flywheels for Industrial-Scale Tool Use
- Building a Precise Video Language with Human-AI Oversight
- SWE-chat: Coding Agent Interactions From Real Users in the Wild
- Image Generators are Generalist Vision Learners
- Synthesizing Multi-Agent Harnesses for Vulnerability Discovery
- Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
- OpenGame: Open Agentic Coding for Games
- Why Fine-Tuning Encourages Hallucinations and How to Fix It
- Discovering Novel LLM Experts via Task-Capability Coevolution
- Autonomous Evolution of EDA Tools: Multi-Agent Self-Evolved ABC
- Language models transmit behavioural traits through hidden signals in data (Anthropic / Alignment Research Center, Anthropic, Truthful AI, UC Berkeley, Warsaw University of Technology)
- Accelerating Speculative Decoding with Block Diffusion Draft Trees (Technion – Israel Institute of Technology)
- Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems
- Toward Autonomous Long-Horizon Engineering for ML Research
- Nucleus-Image: Sparse MoE for Image Generation
- Lyra 2.0: Explorable Generative 3D Worlds
- TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
- General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
- A Mechanistic Analysis of Looped Reasoning Language Models
- Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
- RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
- OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
- GenTac: Generative Modeling and Forecasting of Soccer Tactics
- Steered LLM Activations are Non-Surjective
- ELT: Elastic Looped Transformers for Visual Generation
- Strips as Tokens: Artist Mesh Generation with Native UV Segmentation
- Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
- Scaffolding Human-AI Collaboration: A Field Experiment on Behavioral Protocols and Cognitive Reframing
- Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring
- SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
- Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers
- Memento: Teaching LLMs to Manage Their Own Context
- In-Place Test-Time Training
- Vero: An Open RL Recipe for General Visual Reasoning (Princeton University)
- OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
- Learning, Potential, and Retention: An Approach for Evaluating Adaptive AI-Enabled Medical Devices
- AI Trust OS -- A Continuous Governance Framework for Autonomous AI Observability and Zero-Trust Compliance in Enterprise Environments
- Synthetic Sandbox for Training Machine Learning Engineering Agents
- AURA: Always-On Understanding and Real-Time Assistance via Video Streams
- InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking
- Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?
- Epicure: Multidimensional Flavor Structure in Food Ingredient Embeddings
