Papers
-
Dynamic Chunking Diffusion Transformer
-
Frequency-Separable Hamiltonian Neural Network for Multi-Timescale Dynamics
Princeton University
-
LATO: 3D Mesh Flow Matching with Structured TOpology Preserving LAtents
-
Tiny, Hardware-Independent, Compression-based Classification
-
CLAIRE: Compressed Latent Autoencoder for Industrial Representation and Evaluation -- A Deep Learning Framework for Smart Manufacturing
Birmingham City University, New Jersey Institute of Technology
-
Computer vision-based estimation of invertebrate biomass
Aarhus University, Finnish Environment Institute, University of Duisburg-Essen, University of Jyväskylä
-
HiDE: Hierarchical Dictionary-Based Entropy Modeling for Learned Image Compression
Hohai University, Nanjing Audit University, Nanjing University, Nanjing University of Aeronautics and Astronautics
-
ESAA-Security: An Event-Sourced, Verifiable Architecture for Agent-Assisted Security Audits of AI-Generated Code
-
OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis
Chinese Academy of Sciences, Hong Kong University of Science and Technology, Peking University, Shanghai Jiao Tong University, Shenzhen University, The University of Hong Kong
-
Adaptive Lipschitz-Free Conditional Gradient Methods for Stochastic Composite Nonconvex Optimization
Shenzhen University of Advanced Technology
-
Rewis3d: Reconstruction Improves Weakly-Supervised Semantic Segmentation
ETH Zurich, Max Planck Institute for Informatics, Saarland University
-
Failure Detection in Chemical Processes using Symbolic Machine Learning: A Case Study on Ethylene Oxidation
-
MoEMambaMIL: Structure-Aware Selective State Space Modeling for Whole-Slide Image Analysis
Fudan University, Tongji University
-
Kinetic-based regularization: Learning spatial derivatives and PDE applications
Fondazione Istituto Italiano di Tecnologia, Jawaharlal Nehru Centre for Advanced Scientific Research
-
CHMv2: Improvements in Global Canopy Height Mapping using DINOv3
-
Prompt Group-Aware Training for Robust Text-Guided Nuclei Segmentation
Fudan University
-
REACT++: Efficient Cross-Attention for Real-Time Scene Graph Generation
Umeå University
-
Solving Jigsaw Puzzles in the Wild: Human-Guided Reconstruction of Cultural Heritage Fragments
Ca' Foscari University of Venice, Università degli Studi di Milano, Zhejiang Normal University
-
Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows
-
Efficient, Property-Aligned Fan-Out Retrieval via RL-Compiled Diffusion
-
DiffInf: Influence-Guided Diffusion for Supervision Alignment in Facial Attribute Learning
Johns Hopkins University
-
U6G XL-MIMO Radiomap Prediction: Multi-Config Dataset and Beam Map Approach
Southeast University, Sun Yat-sen University
-
Adapter-Augmented Bandits for Online Multi-Constrained Multi-Modal Inference Scheduling
Macquarie University, Sun Yat-sen University
-
Locating and Editing Figure-Ground Organization in Vision Transformers
Friedrich-Alexander-Universität Erlangen-Nürnberg
-
Physical Simulator In-the-Loop Video Generation
Google / Agency for Science, Technology and Research, Singapore, Max Planck Institute for Informatics, Saarbrücken Research Center for Visual Computing, Interaction and Artificial Intelligence, Singapore University of Technology and Design
-
A Reference Architecture of Reinforcement Learning Frameworks
McMaster Centre for Software Certification, McMaster University
-
Evaluation of Deontic Conditional Reasoning in Large Language Models: The Case of Wason's Selection Task
Keio University, The University of Tokyo
-
Non-invasive Growth Monitoring of Small Freshwater Fish in Home Aquariums via Stereo Vision
Fraunhofer Heinrich-Hertz-Institut, Humboldt-Universität zu Berlin
-
From Prompting to Preference Optimization: A Comparative Study of LLM-based Automated Essay Scoring
Vietnam National University, Vietnam National University Ho Chi Minh City
-
CLoPA: Continual Low Parameter Adaptation of Interactive Segmentation for Medical Image Annotation
King’s College London
-
Abductive Reasoning with Syllogistic Forms in Large Language Models
Keio University, The University of Tokyo
-
Certified and accurate computation of function space norms of deep neural networks
University of Vienna
-
Toward Generative Quantum Utility via Correlation-Complexity Map
-
Prosodic Boundary-Aware Streaming Generation for LLM-Based TTS with Streaming Text Input
Nanyang Technological University, Southeast University, Tianjin University
-
What if? Emulative Simulation with World Models for Situated Reasoning
ETH Zurich, Hunan University, Karlsruhe Institute of Technology, Robotics and AI Institute, Sofia University, The Institute for Computer Science, Artificial Intelligence and Technology
-
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization
Fudan University, Shanghai Innovation Institute, Shanghai Key Laboratory
-
Pinterest Canvas: Large-Scale Image Generation at Pinterest
-
Training Flow Matching: The Role of Weighting and Parameterization
Centre national de la recherche scientifique, École normale supérieure de Lyon, Laboratoire de l'Informatique du Parallélisme, National Institute for Research in Digital Science and Technology, Technische Universität Berlin, Université Claude Bernard Lyon 1
-
Do Foundation Models Know Geometry? Probing Frozen Features for Continuous Physical Measurement
-
GreenRFM: Toward a resource-efficient radiology foundation model
Center for Medical Imaging, Robotics, Analytic Computing & Learning, Chinese Academy of Sciences, Jiangsu Provincial Key Laboratory, Suzhou Institute for Advanced Research, University of Chinese Academy of Sciences, University of Science and Technology of China
-
Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching
Massachusetts Institute of Technology
-
PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations
-
Quantum Diffusion Models: Score Reversal Is Not Free in Gaussian Dynamics
Massachusetts Institute of Technology
-
NOBLE: Accelerating Transformers with Nonlinear Low-Rank Branches
-
COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics
Georgia Institute of Technology, Massachusetts Institute of Technology
-
Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding and Editing
-
Speak in Context: Multilingual ASR with Speech Context Alignment via Contrastive Learning
University of Essex
-
Semantics-Aware Caching for Concept Learning
Heinz Nixdorf Institute, Paderborn University
-
Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis
Black Forest Labs, Massachusetts Institute of Technology
-
When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models
Columbia University, Illinois Institute of Technology, University of Delaware
