Papers
- Selective Fine-Tuning of GPT Architectures for Parameter-Efficient Clinical Text Classification
- Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models
- Relationship-Aware Safety Unlearning for Multimodal LLMs
- Fair Benchmarking of Emerging One-Step Generative Models Against Multistep Diffusion and Flow Models
- Deep Learning From Routine Histology Improves Risk Stratification for Biochemical Recurrence in Prostate Cancer
- Joint Segmentation and Grading with Iterative Optimization for Multimodal Glaucoma Diagnosis
- Walking Further: Semantic-aware Multimodal Gait Recognition Under Long-Range Conditions
- Flood Risk Follows Valleys, Not Grids: Graph Neural Networks for Flash Flood Susceptibility Mapping in Himachal Pradesh with Conformal Uncertainty Quantification
- Efficient Federated Conformal Prediction with Group-Conditional Guarantees
- Selective Noise Suppression and Discriminative Mutual Interaction for Robust Audio-Visual Segmentation
- DualTSR: Unified Dual-Diffusion Transformer for Scene Text Image Super-Resolution
- ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control
- Vavanagi: a Community-run Platform for Documentation of the Hula Language in Papua New Guinea
- Memory as Asset: From Agent-centric to Human-centric Memory Management
- Cryptographic Runtime Governance for Autonomous AI Systems: The Aegis Architecture for Verifiable Policy Enforcement
- UniFusion: A Unified Image Fusion Framework with Robust Representation and Source-Aware Preservation
- Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective
- Interleaved Resampling and Refitting: Data and Compute-Efficient Evaluation of Black-Box Predictors
- Safety-Potential Pruning for Enhancing Safety Prompts Against VLM Jailbreaking Without Retraining
- FIND: A Simple yet Effective Baseline for Diffusion-Generated Image Detection
- A Real-Time Neuro-Symbolic Ethical Governor for Safe Decision Control in Autonomous Robotic Manipulation
- Membership Inference for Contrastive Pre-training Models with Text-only PII Queries
- Self-Indexing KVCache: Predicting Sparse Attention from Compressed Keys
- "I'm Not Reading All of That": Understanding Software Engineers' Level of Cognitive Engagement with Agentic Coding Assistants
- Not All Directions Matter: Toward Structured and Task-Aware Low-Rank Adaptation
- Agentic DAG-Orchestrated Planner Framework for Multi-Modal, Multi-Hop Question Answering in Hybrid Data Lakes
- S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction
- Domain-Skewed Federated Learning with Feature Decoupling and Calibration
- QiMeng-CodeV-SVA: Training Specialized LLMs for Hardware Assertion Generation via RTL-Grounded Bidirectional Data Synthesis
- FOCUS: Bridging Fine-Grained Recognition and Open-World Discovery across Domains
- CamLit: Unified Video Diffusion with Explicit Camera and Lighting Control
- BIT: Matching-based Bi-directional Interaction Transformation Network for Visible-Infrared Person Re-Identification
- GoldenStart: Q-Guided Priors and Entropy Control for Distilling Flow Policies
- Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective
- OAHuman: Occlusion-Aware 3D Human Reconstruction from Monocular Images
- Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
- MistExit: Learning to Exit for Early Mistake Detection in Procedural Videos
- ZOTTA: Test-Time Adaptation with Gradient-Free Zeroth-Order Optimization
- ITKIT: Feasible CT Image Analysis based on SimpleITK and MMEngine
- Automatic Inter-document Multi-hop Scientific QA Generation
- Sampling Boltzmann distributions via normalizing flow approximation of transport maps
- Bringing Model Editing to Generative Recommendation in Cold-Start Scenarios
- MedPriv-Bench: Benchmarking the Privacy-Utility Trade-off of Large Language Models in Medical Open-End Question Answering
- DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
- Beyond Distance: Quantifying Point Cloud Dynamics with Persistent Homology and Dynamic Optimal Transport
- Toward Clinically Ready Foundation Models in Medical Image Analysis: Adaptation Mechanisms and Deployment Trade-offs
- Learning in Function Spaces: An Unified Functional Analytic View of Supervised and Unsupervised Learning
- Controllable Accent Normalization via Discrete Diffusion
- All-day Multi-scenes Lifelong Vision-and-Language Navigation with Tucker Adaptation
- DC-ViT: Modulating Spatial and Channel Interactions for Multi-Channel Images
