Papers

HRM-Text: Efficient Pretraining Beyond Scaling

Sapient Intelligence / Massachusetts Institute of Technology

Published on: 2026-05-20 Venue: arXiv preprint (cs.CL) 9 authors
OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization

Published on: 2026-05-20 4 authors
Mega-ASR: Towards In-the-wild^2 Speech Recognition via Scaling up Real-world Acoustic Simulation

Published on: 2026-05-19 7 authors
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition

Published on: 2026-05-19 7 authors
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

Published on: 2026-05-19 35 authors
Spectral classification of brown dwarfs using machine learning

Published on: 2026-05-19 3 authors
Generative Recursive Reasoning

Published on: 2026-05-19 6 authors
WavFlow: Audio Generation in Waveform Space

Published on: 2026-05-18 9 authors
Stable Audio 3

Published on: 2026-05-18 7 authors
ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents

Published on: 2026-05-17 6 authors
Look Before You Leap: Autonomous Exploration for LLM Agents

Published on: 2026-05-15 9 authors
ReactiveGWM: Steering NPC in Reactive Game World Models

Published on: 2026-05-14 7 authors
Is Grep All You Need? How Agent Harnesses Reshape Agentic Search

Published on: 2026-05-14 5 authors
FutureSim: Replaying World Events to Evaluate Adaptive Agents

Published on: 2026-05-14 8 authors
Self-Distilled Agentic Reinforcement Learning

Published on: 2026-05-14 11 authors
Useful Memories Become Faulty When Continuously Updated by LLMs

Published on: 2026-05-13 7 authors
Targeted Neuron Modulation via Contrastive Pair Search

Published on: 2026-05-12 3 authors
Slicing and Dicing: Configuring Optimal Mixtures of Experts

Published on: 2026-05-12 4 authors
$δ$-mem: Efficient Online Memory for Large Language Models

Published on: 2026-05-12 10 authors
Solve the Loop: Attractor Models for Language and Reasoning

Published on: 2026-05-12 2 authors
ELF: Embedded Language Flows

Published on: 2026-05-11 8 authors
Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models

Published on: 2026-05-11 5 authors
The Truth Lies Somewhere in the Middle (of the Generated Tokens)

Published on: 2026-05-11 3 authors
Qwen-Image-2.0 Technical Report

Published on: 2026-05-11 75 authors
Quantifying the Utility of User Simulators for Building Collaborative LLM Assistants

Published on: 2026-05-10 4 authors
GLiGuard: Schema-Conditioned Classification for LLM Safeguard

Published on: 2026-05-08 4 authors
Fast Byte Latent Transformer

Published on: 2026-05-08 8 authors
Long Context Pre-Training with Lighthouse Attention

Published on: 2026-05-07 3 authors
Efficient Pre-Training with Token Superposition

Published on: 2026-05-07 3 authors
Continuous Latent Diffusion Language Model

Published on: 2026-05-07 11 authors
MiniMind-O Technical Report: An Open Small-Scale Speech-Native Omni Model

Published on: 2026-05-05 1 author
VLMaxxing through FrameMogging: Training-Free Anti-Recomputation for Video Vision-Language Models

Published on: 2026-05-05 2 authors
Sharpness-Aware Pretraining Mitigates Catastrophic Forgetting

Published on: 2026-05-04 5 authors
HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

Published on: 2026-05-04 11 authors
Model Spec Midtraining: Improving How Alignment Training Generalizes

Published on: 2026-05-03 4 authors
A Theory of Generalization in Deep Learning

Published on: 2026-05-02 2 authors
Writing Code vs. Shipping Code: Productivity Effects Across Generations of AI Coding Tools

Microsoft / Massachusetts Institute of Technology, National Bureau of Economic Research (NBER), University of Pennsylvania

Published on: 2026-05-01 Venue: NBER Working Paper Series, No. 35275 3 authors
Persistent Visual Memory: Sustaining Perception for Deep Generation in LVLMs

Published on: 2026-05-01 9 authors
Odysseus: Scaling VLMs to 100+ Turn Decision-Making in Games via Reinforcement Learning

Published on: 2026-05-01 13 authors
Map2World: Segment Map Conditioned Text to 3D World Generation

Published on: 2026-05-01 5 authors
Let ViT Speak: Generative Language-Image Pre-training

Published on: 2026-05-01 10 authors
Contextual Agentic Memory is a Memo, Not True Memory

Published on: 2026-04-30 3 authors
From Context to Skills: Can Language Models Learn from Context Skillfully?

Published on: 2026-04-30 13 authors
Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Published on: 2026-04-30 4 authors
Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

Published on: 2026-04-29 3 authors
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

Published on: 2026-04-29 98 authors
DORA: A Scalable Asynchronous Reinforcement Learning System for Language Model Training

Published on: 2026-04-29 18 authors
Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

Published on: 2026-04-29 18 authors
Recursive Multi-Agent Systems

Published on: 2026-04-28 12 authors
