Papers
- Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints
- NeuroVLM-Bench: Evaluation of Vision-Enabled Large Language Models for Clinical Reasoning in Neurological Disorders
- CORA: A Pathology Synthesis Driven Foundation Model for Coronary CT Angiography Analysis and MACE Risk Assessment
- Gaze patterns predict preference and confidence in pairwise AI image evaluation
- Towards automatic smoke detector inspection: Recognition of the smoke detectors in industrial facilities and preparation for future drone integration
- Resisting Humanization: Ethical Front-End Design Choices in AI for Sensitive Contexts
- SentinelAI: A Multi-Agent Framework for Structuring and Linking NG9-1-1 Emergency Incident Data
- AI Security in the Foundation Model Era: A Comprehensive Survey from a Unified Perspective
- How Far Are Vision-Language Models from Constructing the Real World? A Benchmark for Physical Generative Reasoning
- OptiSAR-Net++: A Large-Scale Benchmark and Transformer-Free Framework for Cross-Domain Remote Sensing Visual Grounding
- More Than "Means to an End": Supporting Reasoning with Transparently Designed AI Data Science Processes
- Learning to Staff: Offline Reinforcement Learning and Fine-Tuned LLMs for Warehouse Staffing Optimization
- Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs
- Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies
- Generalizing Dynamics Modeling More Easily from Representation Perspective
- Large-Scale Avalanche Mapping from SAR Images with Deep Learning-based Change Detection
- Bounding Box Anomaly Scoring for simple and efficient Out-of-Distribution detection
- Improving LLM Predictions via Inter-Layer Structural Encoders
- Vision-based Deep Learning Analysis of Unordered Biomedical Tabular Datasets via Optimal Spatial Cartography
- MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation
- GeoTikzBridge: Advancing Multimodal Code Generation for Geometric Perception and Reasoning
- Think 360°: Evaluating the Width-centric Reasoning Capability of MLLMs Beyond Depth
- WiFi2Cap: Semantic Action Captioning from Wi-Fi CSI via Limb-Level Semantic Alignment
- Coordinate Encoding on Linear Grids for Physics-Informed Neural Networks
- TimeWeaver: Age-Consistent Reference-Based Face Restoration with Identity Preservation
- Synthetic or Authentic? Building Mental Patient Simulators from Longitudinal Evidence
- How Far Can VLMs Go for Visual Bug Detection? Studying 19,738 Keyframes from 41 Hours of Gameplay Videos
- Detecting Non-Membership in LLM Training Data via Rank Correlations
- Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics
- Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints
- PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset
- Labeled Compression Schemes for Concept Classes of Finite Functions
- HyFI: Hyperbolic Feature Interpolation for Brain-Vision Alignment
- Double Coupling Architecture and Training Method for Optimization Problems of Differential Algebraic Equations with Parameters
- Spiking Personalized Federated Learning for Brain-Computer Interface-Enabled Immersive Communication
- Behavioral Heterogeneity as Quantum-Inspired Representation
- How Utilitarian Are OpenAI's Models Really? Replicating and Reinterpreting Pfeffer, Krügel, and Uhl (2025)
- SOUPLE: Enhancing Audio-Visual Localization and Segmentation with Learnable Prompt Contexts
- Explanation Generation for Contradiction Reconciliation with LLMs
- Multitask-Informed Prior for In-Context Learning on Tabular Data: Application to Steel Property Prediction
- Algorithmic warm starts for Hamiltonian Monte Carlo
- Beyond Binary Correctness: Scaling Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks
- REALITrees: Rashomon Ensemble Active Learning for Interpretable Trees
- CLiGNet: Clinical Label-Interaction Graph Network for Medical Specialty Classification from Clinical Transcriptions
- PRISM: A Dual View of LLM Reasoning through Semantic Flow and Latent Computation
- KALAVAI: Predicting When Independent Specialist Fusion Works -- A Quantitative Model for Post-Hoc Cooperative LLM Training
- MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding
- Multimodal Industrial Anomaly Detection via Geometric Prior
- Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning
- ENC-Bench: A Benchmark for Evaluating Multimodal Large Language Models in Electronic Navigational Chart Understanding
