Papers
-
Slim attention: cut your context memory in half without loss – K-cache is all you need for MHA
-
Just-in-Time: Training-Free Spatial Acceleration for Diffusion Transformers
-
DiT4DiT: Jointly Modeling Video Dynamics and Actions for Generalizable Robot Control
-
MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production
-
Two Teachers Better Than One: Hardware-Physics Co-Guided Distributed Scientific Machine Learning
-
SCALAR: Learning and Composing Skills through LLM Guided Symbolic Planning and Deep RL Grounding
-
WS-Net: Weak-Signal Representation Learning and Gated Abundance Reconstruction for Hyperspectral Unmixing via State-Space and Weak Signal Attention Fusion
-
The Epistemic Support-Point Filter: Jaynesian Maximum Entropy Meets Popperian Falsification
-
Time, Identity and Consciousness in Language Model Agents
-
FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation
-
EPOCH: An Agentic Protocol for Multi-Round System Optimization
-
From Days to Minutes: An Autonomous AI Agent Achieves Reliable Clinical Triage in Remote Patient Monitoring
-
Sim2Act: Robust Simulation-to-Decision Learning via Adversarial Calibration and Group-Relative Perturbation
-
Spectral-Structured Diffusion for Single-Image Rain Removal
-
Streaming Autoregressive Video Generation via Diagonal Distillation
-
Reviving ConvNeXt for Efficient Convolutional Diffusion Models (Swiss Federal Institute of Technology in Zurich, University of Pisa)
-
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards (University of Chinese Academy of Sciences)
-
TiPToP: A Modular Open-Vocabulary Planning System for Robotic Manipulation (MIT Computer Science and Artificial Intelligence Laboratory, University of Pennsylvania)
-
ReCoSplat: Autoregressive Feed-Forward Gaussian Splatting Using Render-and-Compare (NVIDIA / Hong Kong University of Science and Technology, Shanghai Jiao Tong University, Swiss Federal Institute of Technology in Zurich, University of California, Merced)
-
Towards a Neural Debugger for Python (Johannes Kepler University Linz)
-
ZeroWBC: Learning Natural Visuomotor Humanoid Control Directly from Human Egocentric Video (Northwestern Polytechnical University, Shanghai Jiao Tong University, Tsinghua University, University of Science and Technology of China)
-
On the Width Scaling of Neural Optimizers Under Matrix Operator Norms I: Row/Column Normalization and Hyperparameter Transfer (Northwestern University, University of British Columbia, University of Chicago)
-
Reinforced Generation of Combinatorial Structures: Ramsey Numbers
-
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
-
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
-
OpenClaw-RL: Train Any Agent Simply by Talking (Princeton University)
-
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing (Fudan University, Nanjing University, Shanghai AI Laboratory, Shanghai Jiao Tong University, South China University of Technology, The Chinese University of Hong Kong Multimedia Laboratory, Tsinghua University, University of Science and Technology of China, Xiamen University)
-
Reward Prediction with Factorized World States (East China Normal University, Hong Kong University of Science and Technology)
-
Hybrid Quantum-Classical Encoding for Accurate Residue-Level pKa Prediction
-
Hybrid Quantum Neural Network for Multivariate Clinical Time Series Forecasting
-
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
-
Tiny Autoregressive Recursive Models
-
High-Fidelity Pruning for Large Language Models
-
From Reactive to Map-Based AI: Tuned Local LLMs for Semantic Zone Inference in Object-Goal Navigation
-
EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs
-
DSH-Bench: A Difficulty- and Scenario-Aware Benchmark with Hierarchical Subject Taxonomy for Subject-Driven Text-to-Image Generation
-
Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization
-
DC-W2S: Dual-Consensus Weak-to-Strong Training for Reliable Process Reward Modeling in Biological Reasoning
-
TrianguLang: Geometry-Aware Semantic Consensus for Pose-Free 3D Localization
-
Adaptive MLP Pruning for Large Vision Transformers
-
Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
-
Tau-BNO: Brain Neural Operator for Tau Transport Model
-
SAMoE-VLA: A Scene Adaptive Mixture-of-Experts Vision-Language-Action Model for Autonomous Driving
-
UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking
-
Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting
-
SaiVLA-0: Cerebrum--Pons--Cerebellum Tripartite Architecture for Compute-Aware Vision-Language-Action
-
Ramsa: A Large Sociolinguistically Rich Emirati Arabic Speech Corpus for ASR and TTS
-
Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows
-
EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery
-
TRIAGE: Type-Routed Interventions via Aleatoric-Epistemic Gated Estimation in Robotic Manipulation and Adaptive Perception -- Don't Treat All Uncertainty the Same
