Papers
-
Random Dot Product Graphs as Dynamical Systems: Limitations and Opportunities
-
Reasoning Models Struggle to Control their Chains of Thought
-
Interpretable Perception and Reasoning for Audiovisual Geolocation
Michigan State University
-
The Rise of AI in Weather and Climate Information and its Impact on Global Inequality
Barcelona Supercomputing Center, Catalan Institution for Research and Advanced Studies, Imperial College London
-
Any to Full: Prompting Depth Anything for Depth Completion in One Stage
Hong Kong University of Science and Technology, JD Logistics, Michigan State University, Rutgers University
-
Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy
Pacific Northwest National Laboratory, University of Washington
-
Cultural Perspectives and Expectations for Generative AI: A Global Survey Approach
-
Interpretable Motion Artifact Detection in structural Brain MRI
Harvard Medical School, Indian Institute of Technology
-
LTLGuard: Formalizing LTL Specifications with Compact Language Models and Lightweight Symbolic Reasoning
Aristotle University of Thessaloniki, Austrian Institute of Technology, Northeastern University
-
Structured Multidimensional Representation Learning for Large Language Models
Université du Littoral Côte d’Opale, University Mohammed VI Polytechnic
-
Unlocking ImageNet's Multi-Object Nature: Automated Large-Scale Multilabel Annotation
Rochester Institute of Technology, University of Rochester
-
From Phase Grounding to Intelligent Surgical Narratives
New Mexico Institute of Mining and Technology
-
Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment
Columbia University
-
Dynamic Targeting of Satellite Observations Using Supplemental Geostationary Satellite Data and Hierarchical Planning
California Institute of Technology, Carnegie Mellon University, Harvard University
-
Let's Talk, Not Type: An Oral-First Multi-Agent Architecture for Guaraní
University of Kansas
-
CodeScout: Contextual Problem Statement Enhancement for Software Agents
-
NERdME: a Named Entity Recognition Dataset for Indexing Research Artifacts in Code Repositories
FIZ Karlsruhe, Fraunhofer-Institut für Offene Kommunikationssysteme, Karlsruhe Institute of Technology, Technische Universität Berlin, Université de Toulouse Jean Jaurès
-
Uni-LVC: A Unified Method for Intra- and Inter-Mode Learned Video Compression
Purdue University
-
Full Dynamic Range Sky-Modelling For Image Based Lighting
Université Laval
-
MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation
-
Score-Guided Proximal Projection: A Unified Geometric Framework for Rectified Flow Editing
University of Texas
-
TML-Bench: Benchmark for Data Science Agents on Tabular ML Tasks
-
The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks
New York University
-
On-Policy Self-Distillation for Reasoning Compression
Columbia University, Cornell University, Iowa State University, Princeton University, Rice University, University of Michigan
-
A Simple Baseline for Unifying Understanding, Generation, and Editing via Vanilla Next-token Prediction
-
OpenFrontier: General Navigation with Visual-Language Grounded Frontiers
ETH Zurich
-
STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks
-
AI+HW 2035: Shaping the Next Decade
-
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling
-
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline
-
KARL: Knowledge Agents via Reinforcement Learning
-
FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
-
Helios: Real Real-Time Long Video Generation Model
-
$τ$-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge
-
AgentIR: Reasoning-Aware Retrieval for Deep Research Agents
Carnegie Mellon University, University of Queensland, University of Waterloo
-
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
-
Adaptive Memory Admission Control for LLM Agents
-
ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training
-
Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning
-
Single-minus graviton tree amplitudes are nonzero
-
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
-
Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
-
RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots
-
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions
-
ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning
-
V1: Unifying Generation and Self-Verification for Parallel Reasoners
-
Phi-4-reasoning-vision-15B Technical Report
-
The Controllability Trap: A Governance Framework for Military AI Agents
-
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines
