Papers
- A Reference Architecture of Reinforcement Learning Frameworks
- Evaluation of Deontic Conditional Reasoning in Large Language Models: The Case of Wason's Selection Task
- Non-invasive Growth Monitoring of Small Freshwater Fish in Home Aquariums via Stereo Vision
- From Prompting to Preference Optimization: A Comparative Study of LLM-based Automated Essay Scoring
- CLoPA: Continual Low Parameter Adaptation of Interactive Segmentation for Medical Image Annotation
- Abductive Reasoning with Syllogistic Forms in Large Language Models
- Certified and accurate computation of function space norms of deep neural networks
- Toward Generative Quantum Utility via Correlation-Complexity Map
- Prosodic Boundary-Aware Streaming Generation for LLM-Based TTS with Streaming Text Input
- What if? Emulative Simulation with World Models for Situated Reasoning
- CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization
- Pinterest Canvas: Large-Scale Image Generation at Pinterest
- Training Flow Matching: The Role of Weighting and Parameterization
- Do Foundation Models Know Geometry? Probing Frozen Features for Continuous Physical Measurement
- GreenRFM: Toward a resource-efficient radiology foundation model
- Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching
- PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations
- Quantum Diffusion Models: Score Reversal Is Not Free in Gaussian Dynamics
- NOBLE: Accelerating Transformers with Nonlinear Low-Rank Branches
- COLD-Steer: Steering Large Language Models via In-Context One-step Learning Dynamics
- Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding and Editing
- Speak in Context: Multilingual ASR with Speech Context Alignment via Contrastive Learning
- Semantics-Aware Caching for Concept Learning
- Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis
- When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models
- SG-DOR: Learning Scene Graphs with Direction-Conditioned Occlusion Reasoning for Pepper Plants
- Gauge Freedom and Metric Dependence in Neural Representation Spaces
- HGT-Scheduler: Deep Reinforcement Learning for the Job Shop Scheduling Problem via Heterogeneous Graph Transformers
- Artificial Intelligence for Detecting Fetal Orofacial Clefts and Advancing Medical Education
- AV-Unified: A Unified Framework for Audio-visual Scene Understanding
- Spatial Calibration of Diffuse LiDARs
- NEGATE: Constrained Semantic Guidance for Linguistic Negation in Text-to-Video Diffusion
- SpatialMAGIC: A Hybrid Framework Integrating Graph Diffusion and Spatial Attention for Spatial Transcriptomics Imputation
- RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering
- SurgFormer: Scalable Learning of Organ Deformation with Resection Support and Real-Time Inference
- Modeling and Measuring Redundancy in Multisource Multimodal Data for Autonomous Driving
- LiveSense: A Real-Time Wi-Fi Sensing Platform for Range-Doppler on COTS Laptop
- KCLarity at SemEval-2026 Task 6: Encoder and Zero-Shot Approaches to Political Evasion Detection
- Hierarchical Industrial Demand Forecasting with Temporal and Uncertainty Explanations
- xaitimesynth: A Python Package for Evaluating Attribution Methods for Time Series with Synthetic Ground Truth
- Causal Interpretation of Neural Network Computations with Contribution Decomposition
- EgoReasoner: Learning Egocentric 4D Reasoning via Task-Adaptive Structured Thinking
- Physics-Informed Diffusion Model for Generating Synthetic Extreme Rare Weather Events Data
- Boosting deep Reinforcement Learning using pretraining with Logical Options
- A recipe for scalable attention-based MLIPs: unlocking long-range accuracy with all-to-all node attention
- Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
- SUREON: A Benchmark and Vision-Language-Model for Surgical Reasoning
- SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation
- Fly360: Omnidirectional Obstacle Avoidance within Drone View
- BEVLM: Distilling Semantic Knowledge from LLMs into Bird's-Eye View Representations
