Papers
-
A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces
-
LLMs as Orchestrators: Constraint-Compliant Multi-Agent Optimization for Recommendation Systems
-
IRIS: Implicit Reward-Guided Internal Sifting for Mitigating Multimodal Hallucination
-
BlossomRec: Block-level Fused Sparse Attention Mechanism for Sequential Recommendations
-
ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution
-
HY3D-Bench: Generation of 3D Assets
-
Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
-
CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability
-
LIVE: Long-horizon Interactive Video World Modeling
-
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent
-
Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth
-
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
Google / Bar-Ilan University, Carnegie Mellon University, École Polytechnique Fédérale de Lausanne, Harvard University, Illinois Institute of Technology, MIT, Nanyang Technological University, Purdue University, Rutgers University, Texas A&M University, University of California, University of Maryland, University of Michigan, University of Southern California
-
Closing the Loop: Universal Repository Representation with RPG-Encoder
-
Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity
California Institute of Technology, Johns Hopkins University, Shanghai Jiao Tong University, University of California, Berkeley
-
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
Westlake University
-
Agent Primitives: Reusable Latent Building Blocks for Multi-Agent Systems
University of Illinois Urbana-Champaign
-
Generative AI for Enzyme Design and Biocatalysis
Universitat Pompeu Fabra
-
No Generation without Representation: Efficient Causal Protein Language Models Enable Zero-Shot Fitness Estimation
-
Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models
-
HunyuanImage 3.0 Technical Report
-
MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models
-
OneMall: One Architecture, More Scenarios -- End-to-End Generative Recommender Family at Kuaishou E-Commerce
-
SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
-
An Empirical Study on Noisy Data and LLM Pretraining Loss Divergence
-
Interpretable Tabular Foundation Models via In-Context Kernel Regression
-
RFS: Reinforcement Learning with Residual Flow Steering for Dexterous Manipulation
-
CUA-Skill: Develop Skills for Computer Using Agent
-
ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
-
SimMerge: Learning to Select Merge Operators from Similarity Signals
-
Argument Rarity-based Originality Assessment for AI-Assisted Writing
Ritsumeikan Global Innovation Research Organization
-
AgentRx: Diagnosing AI Agent Failures from Execution Trajectories
-
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
-
What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-Zoom
-
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Complex Real-World Tasks
-
Toward Fully Autonomous Driving: AI, Challenges, Opportunities, and Needs
-
AlignGemini: Generalizable AI-Generated Image Detection Through Task-Model Alignment
Tencent / East China University of Science and Technology, Hong Kong University of Science and Technology, Shenzhen University
-
SpanNorm: Reconciling Training Stability and Performance in Deep Transformers
-
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
-
MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers
-
LLM-42: Enabling Determinism in LLM Inference with Verified Speculation
-
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?
-
Lost in Transmission: When and Why LLMs Fail to Reason Globally
-
PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
-
PI-Light: Physics-Inspired Diffusion for Full-Image Relighting
-
How AI Impacts Skill Formation
-
M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization
-
Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding
-
Thought-Transfer: Indirect Targeted Poisoning Attacks on Chain-of-Thought Reasoning Models
-
DeepSeek-OCR 2: Visual Causal Flow
-
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models
