Papers
- Robust Self-Training with Closed-loop Label Correction for Learning from Noisy Labels
- MO-SAE: Multi-Objective Stacked Autoencoders Optimization for Edge Anomaly Detection
- CT-Conditioned Diffusion Prior with Physics-Constrained Sampling for PET Super-Resolution
- Distributed Acoustic Sensing for Urban Traffic Monitoring: Spatio-Temporal Attention in Recurrent Neural Networks
- Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition
- LineMaster Pro: A Low-Cost Intelligent Line Following Robot with PID Control and Ultrasonic Obstacle Avoidance for Educational Robotics
- FedPBS: Proximal-Balanced Scaling Federated Learning Model for Robust Personalized Training for Non-IID Data
- Scene Generation at Absolute Scale: Utilizing Semantic and Geometric Guidance From Text for Accurate and Interpretable 3D Indoor Scene Generation
- AgriChat: A Multimodal Large Language Model for Agriculture Image Understanding
- The Phenomenology of Hallucinations
- Towards Stable Self-Supervised Object Representations in Unconstrained Egocentric Video
- Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics
- OpenCOOD-Air: Prompting Heterogeneous Ground-Air Collaborative Perception with Spatial Conversion and Offset Prediction
- Generative Inverse Design of Cold Metals for Low-Power Electronics
- SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization
- Close to Reality: Interpretable and Feasible Data Augmentation for Imbalanced Learning
- Discriminative Flow Matching Via Local Generative Predictors
- True 4-Bit Quantized Convolutional Neural Network Training on CPU: Achieving Full-Precision Parity
- OmniCompliance-100K: A Multi-Domain, Rule-Grounded, Real-World Safety Compliance Dataset
- Iterative Semantic Reasoning from Individual to Group Interests for Generative Recommendation with LLMs
- GroupGuard: A Framework for Modeling and Defending Collusive Attacks in Multi-Agent Systems
- Bidirectional Cross-Attention Fusion of High-Res RGB and Low-Res HSI for Multimodal Automated Waste Sorting
- Sat-JEPA-Diff: Bridging Self-Supervised Learning and Generative Diffusion for Remote Sensing
- ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering
- DCP-CLIP: A Coarse-to-Fine Framework for Open-Vocabulary Semantic Segmentation with Dual Interaction
- LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
- EviAgent: Evidence-Driven Agent for Radiology Report Generation
- GenLie: A Global-Enhanced Lie Detection Network under Sparsity and Semantic Interference
- IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation
- USIS-PGM: Photometric Gaussian Mixtures for Underwater Salient Instance Segmentation
- sebis at ArchEHR-QA 2026: How Much Can You Do Locally? Evaluating Grounded EHR QA on a Single Notebook
- VID-AD: A Dataset for Image-Level Logical Anomaly Detection under Vision-Induced Distraction
- vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models
- EchoLVFM: One-Step Video Generation via Latent Flow Matching for Echocardiogram Synthesis
- Leveraging a Statistical Shape Model for Efficient Generation of Annotated Training Data: A Case Study on Liver Landmarks Segmentation
- Shapes are not enough: CONSERVAttack and its use for finding vulnerabilities and uncertainties in machine learning applications
- Chunk-Guided Q-Learning
- FLUX: Data Worth Training On
- When Visual Privacy Protection Meets Multimodal Large Language Models
- Exploiting temporal parallelism for LSTM Autoencoder acceleration on FPGA
- Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models
- Faithful or Just Plausible? Evaluating the Faithfulness of Closed-Source LLMs in Medical Reasoning
- VAD4Space: Visual Anomaly Detection for Planetary Surface Imagery
- Human-like Object Grouping in Self-supervised Vision Transformers
- TDMM-LM: Bridging Facial Understanding and Animation via Language Models
- Location Aware Embedding for Geotargeting in Sponsored Search Advertising
- A Systematic Evaluation Protocol of Graph-Derived Signals for Tabular Machine Learning
- PhyGaP: Physically-Grounded Gaussians with Polarization Cues
- The Taxonomies, Training, and Applications of Event Stream Modelling for Electronic Health Records
- U-Face: An Efficient and Generalizable Framework for Unsupervised Facial Attribute Editing via Subspace Learning
