Papers
- Data Darwinism Part II: DataEvolve -- AI can Autonomously Evolve Pretraining Data Curation
- MBD: A Model-Based Debiasing Framework Across User, Content, and Model Dimensions
- GenState-AI: State-Aware Dataset for Text-to-Video Retrieval on AI-Generated Videos
- Creative Convergence or Imitation? Genre-Specific Homogeneity in LLM-Generated Chinese Literature
- End-to-End Spatial-Temporal Transformer for Real-time 4D HOI Reconstruction
- DASH: Dynamic Audio-Driven Semantic Chunking for Efficient Omnimodal Token Compression
- AR-Flow VAE: A Structured Autoregressive Flow Prior Variational Autoencoder for Unsupervised Blind Source Separation
- Solution for 10th Competition on Ambivalence/Hesitancy (AH) Video Recognition Challenge using Divergence-Based Multimodal Fusion
- Echoes Across Centuries: Phonetic Signatures of Persian Poets
- Zoom to Essence: Trainless GUI Grounding by Inferring upon Interface Elements
- Life cycle assessment for all organic chemicals
- How to find expressible and trainable parameterized quantum circuits?
- On the Degrees of Freedom of Gridded Control Points in Learning-Based Medical Image Registration
- Uni-MDTrack: Learning Decoupled Memory and Dynamic States for Parameter-Efficient Visual Tracking in All Modality
- PARSA-Bench: A Comprehensive Persian Audio-Language Model Benchmark
- Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs
- Inclusive AI for Group Interactions: Predicting Gaze-Direction Behaviors in People with Intellectual and Developmental Disabilities
- STAG-CN: Spatio-Temporal Apiary Graph Convolutional Network for Disease Onset Prediction in Beehive Sensor Networks
- An Industrial-Scale Insurance LLM Achieving Verifiable Domain Mastery and Hallucination Control without Competence Trade-offs
- AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents
- LongVidSearch: An Agentic Benchmark for Multi-hop Evidence Retrieval Planning in Long Videos
- Physics-Informed Policy Optimization via Analytic Dynamics Regularization
- AI Can Learn Scientific Taste
- On the (Generative) Linear Sketching Problem
- Wi-Spike: A Low-power WiFi Human Multi-action Recognition Model with Spiking Neural Networks
- Geometric and Topological Deep Learning for Predicting Thermo-mechanical Performance in Cold Spray Deposition Process Modeling
- The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs
- Convergence of Two Time-Scale Stochastic Approximation: A Martingale Approach
- V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning
- Disentangling Dynamical Systems: Causal Representation Learning Meets Local Sparse Attention
- Unlearning-based sliding window for continual learning under concept drift
- Infinite Problem Generator: Verifiably Scaling Physics Reasoning Data with Agentic Workflows
- Predicting Stress-strain Behaviors of Additively Manufactured Materials via Loss-based and Activation-based Physics-informed Machine Learning
- Fine-tuning MLLMs Without Forgetting Is Easier Than You Think
- Bridging the Gap in the Responsible AI Divides
- Refining 3D Medical Segmentation with Verbal Instruction
- WorldVLM: Combining World Model Forecasting and Vision-Language Reasoning
- R3DP: Real-Time 3D-Aware Policy for Embodied Manipulation
- CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language
- Mapping Dark-Matter Clusters via Physics-Guided Diffusion Models
- Trust-Region Noise Search for Black-Box Alignment of Diffusion and Flow Models
- Unlocking the Latent Canvas: Eliciting and Benchmarking Symbolic Visual Expression in LLMs
- Expanding mmWave Datasets for Human Pose Estimation with Unlabeled Data and LiDAR Datasets
- High-Probability Bounds for SGD under the Polyak-Lojasiewicz Condition with Markovian Noise
- Excited Pfaffians: Generalized Neural Wave Functions Across Structure and State
- Learning to Forget: Sleep-Inspired Memory Consolidation for Resolving Proactive Interference in Large Language Models
- VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning
- MALicious INTent Dataset and Inoculating LLMs for Enhanced Disinformation Detection
- LatSearch: Latent Reward-Guided Search for Faster Inference-Time Scaling in Video Diffusion
- Interp3R: Continuous-time 3D Geometry Estimation with Frames and Events
