Papers
-
Towards Robust Retrieval-Augmented Generation Based on Knowledge Graph: A Comparative Analysis
-
Tureis: Transformer-based Unified Resilience for IoT Devices in Smart Homes
-
Random Dot Product Graphs as Dynamical Systems: Limitations and Opportunities
-
Reasoning Models Struggle to Control their Chains of Thought
-
Interpretable Perception and Reasoning for Audiovisual Geolocation
-
The Rise of AI in Weather and Climate Information and its Impact on Global Inequality
-
Any to Full: Prompting Depth Anything for Depth Completion in One Stage
-
Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy
-
Cultural Perspectives and Expectations for Generative AI: A Global Survey Approach
-
Interpretable Motion Artifact Detection in Structural Brain MRI
-
LTLGuard: Formalizing LTL Specifications with Compact Language Models and Lightweight Symbolic Reasoning
-
Structured Multidimensional Representation Learning for Large Language Models
-
Unlocking ImageNet's Multi-Object Nature: Automated Large-Scale Multilabel Annotation
-
From Phase Grounding to Intelligent Surgical Narratives
-
Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment
-
Dynamic Targeting of Satellite Observations Using Supplemental Geostationary Satellite Data and Hierarchical Planning
-
Let's Talk, Not Type: An Oral-First Multi-Agent Architecture for Guaraní
-
CodeScout: Contextual Problem Statement Enhancement for Software Agents
-
NERdME: a Named Entity Recognition Dataset for Indexing Research Artifacts in Code Repositories
-
Uni-LVC: A Unified Method for Intra- and Inter-Mode Learned Video Compression
-
Full Dynamic Range Sky-Modelling For Image Based Lighting
-
MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation
-
Score-Guided Proximal Projection: A Unified Geometric Framework for Rectified Flow Editing
-
TML-Bench: Benchmark for Data Science Agents on Tabular ML Tasks
-
The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks
New York University
-
On-Policy Self-Distillation for Reasoning Compression
Columbia University, Cornell University, Iowa State University, Princeton University, Rice University, University of Michigan
-
A Simple Baseline for Unifying Understanding, Generation, and Editing via Vanilla Next-token Prediction
-
OpenFrontier: General Navigation with Visual-Language Grounded Frontiers
ETH Zurich
-
STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks
-
AI+HW 2035: Shaping the Next Decade
-
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling
-
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline
-
KARL: Knowledge Agents via Reinforcement Learning
-
FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
-
AgentIR: Reasoning-Aware Retrieval for Deep Research Agents
Carnegie Mellon University, University of Queensland, University of Waterloo
-
SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
-
Adaptive Memory Admission Control for LLM Agents
-
ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training
-
Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning
-
Single-minus graviton tree amplitudes are nonzero
-
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
-
Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
-
RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots
-
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions
-
ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning
-
V1: Unifying Generation and Self-Verification for Parallel Reasoners
-
Phi-4-reasoning-vision-15B Technical Report
-
Helios: Real Real-Time Long Video Generation Model
-
EvoSkill: Automated Skill Discovery for Multi-Agent Systems
-
Speculative Speculative Decoding
