Papers
- POLCA: Stochastic Generative Optimization with LLM
- AnyPhoto: Multi-Person Identity Preserving Image Generation with ID Adaptive Modulation on Location Canvas
- OpenHospital: A Thing-in-itself Arena for Evolving and Benchmarking LLM-based Collective Intelligence
- Zero-Shot Reconstruction of Animatable 3D Avatars with Cloth Dynamics from a Single Image
- HO-SFL: Hybrid-Order Split Federated Learning with Backprop-Free Clients and Dimension-Free Aggregation
- $p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval
- Vietnamese Automatic Speech Recognition: A Revisit
- High-Fidelity 3D Facial Avatar Synthesis with Controllable Fine-Grained Expressions
- Information Asymmetry across Language Varieties: A Case Study on Cantonese-Mandarin and Bavarian-German QA
- Orthogonal Subspace Clustering: Enhancing High-Dimensional Data Analysis through Adaptive Dimensionality Reduction and Efficient Clustering
- BadLLM-TG: A Backdoor Defender powered by LLM Trigger Generator
- Mind-of-Director: Multi-modal Agent-Driven Film Previsualization via Collaborative Decision-Making
- LaPro-DTA: Latent Dual-View Drug Representations and Salient Protein Feature Extraction for Generalizable Drug-Target Affinity Prediction
- GARCH-FIS: A Hybrid Forecasting Model with Dynamic Volatility-Driven Parameter Adaptation
- Face-to-Face: A Video Dataset for Multi-Person Interaction Modeling
- Global Truncated Loss Minimization for Robust and Threshold-Resilient Geometric Estimation
- Multi-Task Genetic Algorithm with Multi-Granularity Encoding for Protein-Nucleotide Binding Site Prediction
- Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces
- Universe Routing: Why Self-Evolving Agents Need Epistemic Control
- OpenReservoirComputing: GPU-Accelerated Reservoir Computing in JAX
- VorTEX: Various overlap ratio for Target speech EXtraction
- Knowledge Activation: AI Skills as the Institutional Knowledge Primitive for Agentic Software Development
- Fold-CP: A Context Parallelism Framework for Biomolecular Modeling
- HiMemVLN: Enhancing Reliability of Open-Source Zero-Shot Vision-and-Language Navigation with Hierarchical Memory System
- Ego to World: Collaborative Spatial Reasoning in Embodied Systems via Reinforcement Learning
- M2IR: Proactive All-in-One Image Restoration via Mamba-style Modulation and Mixture-of-Experts
- SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression
- RAZOR: Ratio-Aware Layer Editing for Targeted Unlearning in Vision Transformers and Diffusion Models
- RadarXFormer: Robust Object Detection via Cross-Dimension Fusion of 4D Radar Spectra and Images for Autonomous Driving
- Planning as Goal Recognition: Deriving Heuristics from Intention Models - Extended Version
- Two Birds, One Projection: Harmonizing Safety and Utility in LVLMs via Inference-time Feature Projection
- SemanticFace: Semantic Facial Action Estimation via Semantic Distillation in Interpretable Space
- Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks
- Neural Networks as Local-to-Global Computations
- Halfway to 3D: Ensembling 2.5D and 3D Models for Robust COVID-19 CT Diagnosis
- Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections
- DamageArbiter: A CLIP-Enhanced Multimodal Arbitration Framework for Hurricane Damage Assessment from Street-View Imagery
- The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments
- Real-Time Driver Safety Scoring Through Inverse Crash Probability Modeling
- ContiGuard: A Framework for Continual Toxicity Detection Against Evolving Evasive Perturbations
- Integrating Weather Foundation Model and Satellite to Enable Fine-Grained Solar Irradiance Forecasting
- Lost in Aggregation: On a Fundamental Expressivity Limit of Message-Passing Graph Neural Networks
- Personalized Federated Learning with Residual Fisher Information for Medical Image Segmentation
- From Artefact to Insight: Efficient Low-Rank Adaptation of BrushNet for Scanning Probe Microscopy Image Restoration
- AutoMoT: A Unified Vision-Language-Action Model with Asynchronous Mixture-of-Transformers for End-to-End Autonomous Driving
- PCodeTrans: Translate Decompiled Pseudocode to Compilable and Executable Equivalent
- From Horizontal to Rotated: Cross-View Object Geo-Localization with Orientation Awareness
- Architecture-Agnostic Feature Synergy for Universal Defense Against Heterogeneous Generative Threats
- Video Detector: A Dual-Phase Vision-Based System for Real-Time Traffic Intersection Control and Intelligent Transportation Analysis
- A Score Filter Enhanced Data Assimilation Framework for Data-Driven Dynamical Systems
