Papers
- Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned
- Reinforcing Structured Chain-of-Thought for Video Understanding
- Can Small Models Reason About Legal Documents? A Comparative Study
- Adapting Segment Anything Model 3 for Concept-Driven Lesion Segmentation in Medical Images: An Experimental Study
- Collision-Aware Vision-Language Learning for End-to-End Driving with Multimodal Infraction Datasets
- Toward Actionable Digital Twins for Radiation-Based Imaging and Therapy: Mathematical Formulation, Modular Workflow, and an OpenKBP-Based Dose-Surrogate Prototype
- Globalized Adversarial Regret Optimization: Robust Decisions with Uncalibrated Predictions
- Low-Rank-Modulated Functa: Exploring the Latent Space of Implicit Neural Representations for Interpretable Ultrasound Video Analysis
- Online Learning for Dynamic Constellation Topologies
- EngineAD: A Real-World Vehicle Engine Anomaly Detection Dataset
- Adversarial-Robust Multivariate Time-Series Anomaly Detection via Joint Information Retention
- On the Objective and Feature Weights of Minkowski Weighted k-Means
- When Chain-of-Thought Backfires: Evaluating Prompt Sensitivity in Medical Language Models
- BEVMAPMATCH: Multimodal BEV Neural Map Matching for Robust Re-Localization of Autonomous Vehicles
- Neuro-Cognitive Reward Modeling for Human-Centered Autonomous Vehicle Control
- MemoryCD: Benchmarking Long-Context User Memory of LLM Agents for Lifelong Cross-Domain Personalization
- Do Neurons Dream of Primitive Operators? Wake-Sleep Compression Rediscovers Schank's Event Semantics
- Second-Order, First-Class: A Composable Stack for Curvature-Aware Training
- Diffusion MRI Transformer with a Diffusion Space Rotary Positional Embedding (D-RoPE)
- A Priori Sampling of Transition States with Guided Diffusion
- Policy-Guided World Model Planning for Language-Conditioned Visual Navigation
- Epileptic Seizure Prediction Using Patient-Adaptive Transformer Networks
- Natural-Language Agent Harnesses
- Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
- MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data
- The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More
- Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs
- Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning
- DB SwinT: A Dual-Branch Swin Transformer Network for Road Extraction in Optical Remote Sensing Imagery
- UW-VOS: A Large-Scale Dataset for Underwater Video Object Segmentation
- CVPD at QIAS 2026: RAG-Guided LLM Reasoning for Al-Mawarith Share Computation and Heir Allocation
- Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing
- COVTrack++: Learning Open-Vocabulary Multi-Object Tracking from Continuous Videos via a Synergistic Paradigm
- ELITE: Experiential Learning and Intent-Aware Transfer for Self-improving Embodied Agents
- Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale
- i-IF-Learn: Iterative Feature Selection and Unsupervised Learning for High-Dimensional Complex Data
- Decompose and Transfer: CoT-Prompting Enhanced Alignment for Open-Vocabulary Temporal Action Detection
- Lagrangian Relaxation Score-based Generation for Mixed Integer linear Programming
- From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs
- SpectralSplats: Robust Differentiable Tracking via Spectral Moment Supervision
- A$^3$: Towards Advertising Aesthetic Assessment
- SemLayer: Semantic-aware Generative Segmentation and Layer Construction for Abstract Icons
- Minimal Sufficient Representations for Self-interpretable Deep Neural Networks
- HAM: A Training-Free Style Transfer Approach via Heterogeneous Attention Modulation for Diffusion Models
- MoE-Sieve: Routing-Guided LoRA for Efficient MoE Fine-Tuning
- LGEST: Dynamic Spatial-Spectral Expert Routing for Hyperspectral Image Classification
- FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval
- Hierarchical Spatial-Temporal Graph-Enhanced Model for Map-Matching
- Beyond Semantic Priors: Mitigating Optimization Collapse for Generalizable Visual Forensics
- Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification
