Papers
-
Looking Back and Forth: Cross-Image Attention Calibration and Attentive Preference Learning for Multi-Image Hallucination Mitigation
Beijing Institute of Technology, Harbin Institute of Technology, Tsinghua University
-
Hindsight Credit Assignment for Long-Horizon LLM Agents
City University of Hong Kong, Nanjing University
-
Animating Petascale Time-varying Data on Commodity Hardware with LLM-assisted Scripting
University of Utah, Vanderbilt University
-
Bi-directional digital twin prototype anchoring with multi-periodicity learning for few-shot fault diagnosis
Shanghai Jiao Tong University
-
SODA: Sensitivity-Oriented Dynamic Acceleration for Diffusion Transformer
Harbin Institute of Technology
-
MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
-
User Review Writing via Interview with Dialogue Systems
The University of Electro-Communications
-
VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding
Peking University
-
The Talking Robot: Distortion-Robust Acoustic Models for Robot-Robot Communication
Georgia Institute of Technology, Institute of Science Tokyo
-
Interpretable Maximum Margin Deep Anomaly Detection
Capital Normal University, Yunnan University
-
Physics-Guided VLM Priors for All-Cloud Removal
Wuhan University
-
Retinex Meets Language: A Physics-Semantics-Guided Underwater Image Enhancement Network
Ocean University of China
-
Aligning What EEG Can See: Structural Representations for Brain-Vision Matching
Beijing University of Posts and Telecommunications
-
CoTJudger: A Graph-Driven Framework for Automatic Evaluation of Chain-of-Thought Efficiency and Redundancy in LRMs
Beihang University, Chinese Academy of Sciences, Nanjing University, Shenzhen Institutes of Advanced Technology, Shenzhen University of Advanced Technology, Southeast University, The University of Manchester, The University of New South Wales, University of Science and Technology of China
-
Entropy-Aware On-Policy Distillation of Language Models
Korea Advanced Institute of Science & Technology, University of Toronto, Vector Institute
-
VLN-Cache: Enabling Token Caching for VLN Models with Visual/Semantic Dynamics Awareness
China University of Geosciences, Huazhong University of Science and Technology, Peking University
-
Dreamer-CDP: Improving Reconstruction-free World Models Via Continuous Deterministic Representation Prediction
Friedrich Miescher Institute for Biomedical Research
-
Countdown-Code: A Testbed for Studying The Emergence and Generalization of Reward Hacking in RLVR
University of Illinois Urbana-Champaign, University of Michigan
-
mAVE: A Watermark for Joint Audio-Visual Generation Models
Tsinghua University
-
Statistical Contraction for Chance-Constrained Trajectory Optimization of Non-Gaussian Stochastic Systems
University of Illinois Urbana-Champaign
-
Facial Expression Generation Aligned with Human Preference for Natural Dyadic Interaction
-
NuNext: Reframing Nucleus Detection as Next-Point Detection
Harbin Institute of Technology, Nanjing University, Shanghai Artificial Intelligence Laboratory, University of Science and Technology Beijing, Westlake University, Zhejiang University
-
Grounding Machine Creativity in Game Design Knowledge Representations: Empirical Probing of LLM-Based Executable Synthesis of Goal Playable Patterns under Structural Constraints
Chalmers University of Technology, University of Gothenburg
-
Efficient Personalized Reranking with Semi-Autoregressive Generation and Online Knowledge Distillation
Beijing University of Posts and Telecommunications, Shanghai Jiao Tong University, University of Science and Technology of China
-
Deep Generative Spatiotemporal Engression for Probabilistic Forecasting of Epidemics
Sorbonne Center for Artificial Intelligence, Sorbonne University, Sorbonne University Abu Dhabi
-
Vision Language Models Cannot Reason About Physical Transformation
Auburn University, Brown University, Carnegie Mellon University, Emory University, Johns Hopkins University, University of California, University of Michigan, University of Toronto
-
Enhancing Consistency of Werewolf AI through Dialogue Summarization and Persona Information
The University of Electro-Communications
-
Efficient Chest X-ray Representation Learning via Semantic-Partitioned Contrastive Learning
Shenzhen University of Advanced Technology
-
aCAPTCHA: Verifying That an Entity Is a Capable Agent via Asymmetric Hardness
Nankai University, Tsinghua University
-
Turn: A Language for Agentic Computation
-
TIQA: Human-Aligned Text Quality Assessment in Generated Images
Innopolis University, Ivannikov Institute for System Programming of the Russian Academy of Sciences, Moscow State University
-
Inter-Image Pixel Shuffling for Multi-focus Image Fusion
Huaqiao University
-
Combining Adam and its Inverse Counterpart to Enhance Generalization of Deep Learning Optimizers
-
Deep Expert Injection for Anchoring Retinal VLMs with Domain-Specific Knowledge
-
The Model Knows Which Tokens Matter: Automatic Token Selection via Noise Gating
-
Emotion Transcription in Conversation: A Benchmark for Capturing Subtle and Complex Emotional States through Natural Language
-
PDD: Manifold-Prior Diverse Distillation for Medical Anomaly Detection
-
CanoVerse: 3D Object Scalable Canonicalization and Dataset for Generation and Pose
-
LiveWorld: Simulating Out-of-Sight Dynamics in Generative Video World Models
-
Fine-Grained Table Retrieval Through the Lens of Complex Queries
Centrum Wiskunde & Informatica, University of Amsterdam, University of California
-
Agentic Planning with Reasoning for Image Styling via Offline RL
-
AMB-DSGDN: Adaptive Modality-Balanced Dynamic Semantic Graph Differential Network for Multimodal Emotion Recognition
Central South University of Forestry and Technology, State University of New York
-
Improving reasoning at inference time via uncertainty minimisation
Aarhus University
-
Spectral Conditioning of Attention Improves Transformer Performance
University of Adelaide
-
PromptGate Client Adaptive Vision Language Gating for Open Set Federated Active Learning
University of Bonn
-
ACD-U: Asymmetric co-teaching with machine unlearning for robust learning with noisy labels
Kansai University
-
Making LLMs Optimize Multi-Scenario CUDA Kernels Like Experts
Tsinghua University
-
Class Visualizations and Activation Atlases for Enhancing Interpretability in Deep Learning-Based Computational Pathology
Bavarian Cancer Research Center, Universität Augsburg, University of Chicago, University of Leeds
-
Learning to Rank the Initial Branching Order of SAT Solvers
Harvard University, KTH Royal Institute of Technology, Mohamed bin Zayed University of Artificial Intelligence
-
FreeFly-Thinking: Aligning Chain-of-Thought Reasoning with Continuous UAV Navigation
Shanghai Jiao Tong University, Tongji University
