Papers
- Chain-of-Trajectories: Unlocking the Intrinsic Generative Optimality of Diffusion Models via Graph-Theoretic Planning
- AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
- Transition Flow Matching
- Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents
- Cross-RAG: Zero-Shot Retrieval-Augmented Time Series Forecasting via Cross-Attention
- Towards Next-Generation LLM Training: From the Data-Centric Perspective
- Training-Free Generation of Protein Sequences from Small Family Alignments via Stochastic Attention
- Multimodal Deep Learning for Early Prediction of Patient Deterioration in the ICU: Integrating Time-Series EHR Data with Clinical Notes
- Beyond Creed: A Non-Identity Safety Condition A Strong Empirical Alternative to Identity Framing in Low-Data LoRA Fine-Tuning
- GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation
- Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator
- Automated Diabetic Screening via Anterior Segment Ocular Imaging: A Deep Learning and Explainable AI Approach
- DeFRiS: Silo-Cooperative IoT Applications Scheduling via Decentralized Federated Reinforcement Learning
- GNNVerifier: Graph-based Verifier for LLM Task Planning
- Loosely-Structured Software: Engineering Context, Structure, and Evolution Entropy in Runtime-Rewired Multi-Agent Systems
- Criterion-referenceability determines LLM-as-a-judge validity across physics assessment formats
- A Skill-augmented Agentic Framework and Benchmark for Multi-Video Understanding
- Gauge-Equivariant Intrinsic Neural Operators for Geometry-Consistent Learning of Elliptic PDE Maps
- Efficient Event Camera Volume System
- TrajMamba: An Ego-Motion-Guided Mamba Model for Pedestrian Trajectory Prediction from an Egocentric Perspective
- PHAC: Promptable Human Amodal Completion
- CAMD: Coverage-Aware Multimodal Decoding for Efficient Reasoning of Multimodal Large Language Models
- Face-Guided Sentiment Boundary Enhancement for Weakly-Supervised Temporal Sentiment Localization
- Learning Constituent Headedness
- Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark
- BrainBench: Exposing the Commonsense Reasoning Gap in Large Language Models
- Online Learning for Supervisory Switching Control
- LiDAR-EVS: Enhance Extrapolated View Synthesis for 3D Gaussian Splatting with Pseudo-LiDAR Supervision
- Topology-Preserving Data Augmentation for Ring-Type Polygon Annotations
- SSR: A Training-Free Approach for Streaming 3D Reconstruction
- Investigating the Impact of Speech Enhancement on Audio Deepfake Detection in Noisy Environments
- Understanding the geometry of deep learning with decision boundary volume
- POLCA: Stochastic Generative Optimization with LLM
- AnyPhoto: Multi-Person Identity Preserving Image Generation with ID Adaptive Modulation on Location Canvas
- OpenHospital: A Thing-in-itself Arena for Evolving and Benchmarking LLM-based Collective Intelligence
- Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image
- HO-SFL: Hybrid-Order Split Federated Learning with Backprop-Free Clients and Dimension-Free Aggregation
- $p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval
- Vietnamese Automatic Speech Recognition: A Revisit
- High-Fidelity 3D Facial Avatar Synthesis with Controllable Fine-Grained Expressions
- Information Asymmetry across Language Varieties: A Case Study on Cantonese-Mandarin and Bavarian-German QA
- Orthogonal Subspace Clustering: Enhancing High-Dimensional Data Analysis through Adaptive Dimensionality Reduction and Efficient Clustering
- BadLLM-TG: A Backdoor Defender powered by LLM Trigger Generator
- Mind-of-Director: Multi-modal Agent-Driven Film Previsualization via Collaborative Decision-Making
- LaPro-DTA: Latent Dual-View Drug Representations and Salient Protein Feature Extraction for Generalizable Drug-Target Affinity Prediction
- GARCH-FIS: A Hybrid Forecasting Model with Dynamic Volatility-Driven Parameter Adaptation
- Face-to-Face: A Video Dataset for Multi-Person Interaction Modeling
- Global Truncated Loss Minimization for Robust and Threshold-Resilient Geometric Estimation
- Multi-Task Genetic Algorithm with Multi-Granularity Encoding for Protein-Nucleotide Binding Site Prediction
- Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces
