Papers
- Rethinking Evaluation in Retrieval-Augmented Personalized Dialogue: A Cognitive and Linguistic Perspective
- Interleaved Resampling and Refitting: Data and Compute-Efficient Evaluation of Black-Box Predictors
- Safety-Potential Pruning for Enhancing Safety Prompts Against VLM Jailbreaking Without Retraining
- FIND: A Simple yet Effective Baseline for Diffusion-Generated Image Detection
- A Real-Time Neuro-Symbolic Ethical Governor for Safe Decision Control in Autonomous Robotic Manipulation
- Membership Inference for Contrastive Pre-training Models with Text-only PII Queries
- Self-Indexing KVCache: Predicting Sparse Attention from Compressed Keys
- "I'm Not Reading All of That": Understanding Software Engineers' Level of Cognitive Engagement with Agentic Coding Assistants
- Not All Directions Matter: Toward Structured and Task-Aware Low-Rank Adaptation
- Agentic DAG-Orchestrated Planner Framework for Multi-Modal, Multi-Hop Question Answering in Hybrid Data Lakes
- S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction
- Domain-Skewed Federated Learning with Feature Decoupling and Calibration
- QiMeng-CodeV-SVA: Training Specialized LLMs for Hardware Assertion Generation via RTL-Grounded Bidirectional Data Synthesis
- FOCUS: Bridging Fine-Grained Recognition and Open-World Discovery across Domains
- CamLit: Unified Video Diffusion with Explicit Camera and Lighting Control
- BIT: Matching-based Bi-directional Interaction Transformation Network for Visible-Infrared Person Re-Identification
- GoldenStart: Q-Guided Priors and Entropy Control for Distilling Flow Policies
- Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective
- OAHuman: Occlusion-Aware 3D Human Reconstruction from Monocular Images
- Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
- MistExit: Learning to Exit for Early Mistake Detection in Procedural Videos
- ZOTTA: Test-Time Adaptation with Gradient-Free Zeroth-Order Optimization
- ITKIT: Feasible CT Image Analysis based on SimpleITK and MMEngine
- Automatic Inter-document Multi-hop Scientific QA Generation
- Sampling Boltzmann distributions via normalizing flow approximation of transport maps
- Bringing Model Editing to Generative Recommendation in Cold-Start Scenarios
- MedPriv-Bench: Benchmarking the Privacy-Utility Trade-off of Large Language Models in Medical Open-End Question Answering
- DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
- Beyond Distance: Quantifying Point Cloud Dynamics with Persistent Homology and Dynamic Optimal Transport
- Toward Clinically Ready Foundation Models in Medical Image Analysis: Adaptation Mechanisms and Deployment Trade-offs
- Learning in Function Spaces: An Unified Functional Analytic View of Supervised and Unsupervised Learning
- Controllable Accent Normalization via Discrete Diffusion
- All-day Multi-scenes Lifelong Vision-and-Language Navigation with Tucker Adaptation
- DC-ViT: Modulating Spatial and Channel Interactions for Multi-Channel Images
- Multi-Period Texture Contrast Enhancement for Low-Contrast Wafer Defect Detection and Segmentation
- AEX: Non-Intrusive Multi-Hop Attestation and Provenance for LLM APIs
- High-Fidelity Compression of Seismic Velocity Models via SIREN Auto-Decoders
- MorphSNN: Adaptive Graph Diffusion and Structural Plasticity for Spiking Neural Networks
- Windowed Fourier Propagator: A Frequency-Local Neural Operator for Wave Equations in Inhomogeneous Media
- RegFormer++: An Efficient Large-Scale 3D LiDAR Point Registration Network with Projection-Aware 2D Transformer
- Seeking Physics in Diffusion Noise
- RL-ScanIQA: Reinforcement-Learned Scanpaths for Blind 360° Image Quality Assessment
- Show Me When and Where: Towards Referring Video Object Segmentation in the Wild
- 4D Synchronized Fields: Motion-Language Gaussian Splatting for Temporal Scene Understanding
- SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
- A Physically-Grounded Attack and Adaptive Defense Framework for Real-World Low-Light Image Enhancement
- In-Field 3D Wheat Head Instance Segmentation From TLS Point Clouds Using Deep Learning Without Manual Labels
- Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange
- Mind the Shift: Decoding Monetary Policy Stance from FOMC Statements with Large Language Models
- Enhancing LLM Training via Spectral Clipping
