Papers
- AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models
- MedQ-UNI: Toward Unified Medical Image Quality Assessment and Restoration via Vision-Language Modeling
- Recolour What Matters: Region-Aware Colour Editing via Token-Level Diffusion
- GAIN: A Benchmark for Goal-Aligned Decision-Making of Large Language Models under Imperfect Norms
- Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding
- Do Post-Training Algorithms Actually Differ? A Controlled Study Across Model Scales Uncovers Scale-Dependent Ranking Inversions
- WASD: Locating Critical Neurons as Sufficient Conditions for Explaining and Controlling LLM Behavior
- UniFluids: Unified Neural Operator Learning with Conditional Flow-matching
- Do Vision Language Models Understand Human Engagement in Games?
- T-QPM: Enabling Temporal Out-Of-Distribution Detection and Domain Generalization for Vision-Language Models in Open-World
- The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices
- Precise Performance of Linear Denoisers in the Proportional Regime
- Diffusion-Guided Semantic Consistency for Multimodal Heterogeneity
- Collaborative Adaptive Curriculum for Progressive Knowledge Distillation
- TexEditor: Structure-Preserving Text-Driven Texture Editing
- EntropyCache: Decoded Token Entropy Guided KV Caching for Diffusion Language Models
- AIMER: Calibration-Free Task-Agnostic MoE Pruning
- FILT3R: Latent State Adaptive Kalman Filter for Streaming 3D Reconstruction
- Cross-Domain Demo-to-Code via Neurosymbolic Counterfactual Reasoning
- NymeriaPlus: Enriching Nymeria Dataset with Additional Annotations and Data
- Recovering Sparse Neural Connectivity from Partial Measurements: A Covariance-Based Approach with Granger-Causality Refinement
- Efficient Video Diffusion with Sparse Information Transmission for Video Compression
- HOMEY: Heuristic Object Masking with Enhanced YOLO for Property Insurance Risk Detection
- From Snapshots to Symphonies: The Evolution of Protein Prediction from Static Structures to Generative Dynamics and Multimodal Interactions
- Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM
- Foundations and Architectures of Artificial Intelligence for Motor Insurance
- OnlinePG: Online Open-Vocabulary Panoptic Mapping with 3D Gaussian Splatting
- CAFlow: Adaptive-Depth Single-Step Flow Matching for Efficient Histopathology Super-Resolution
- On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization
- Counting Circuits: Mechanistic Interpretability of Visual Reasoning in Large Vision-Language Models
- 3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model
- Correlation-Weighted Multi-Reward Optimization for Compositional Generation
- When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making
- Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds
- Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning
- Data-efficient pre-training by scaling synthetic megadocs
- Transformer-Based Predictive Maintenance for Risk-Aware Instrument Calibration
- Beyond Passive Aggregation: Active Auditing and Topology-Aware Defense in Decentralized Federated Learning
- iSatCR: Graph-Empowered Joint Onboard Computing and Routing for LEO Data Delivery
- GAPSL: A Gradient-Aligned Parallel Split Learning on Heterogeneous Data
- Remedying Target-Domain Astigmatism for Cross-Domain Few-Shot Object Detection
- SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement
- CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models
- HEP Statistical Inference for UAV Fault Detection: CLs, LRT, and SBI Applied to Blade Damage
- SINDy-KANs: Sparse identification of non-linear dynamics through Kolmogorov-Arnold networks
- Learning Decision-Sufficient Representations for Linear Optimization
- End-to-End QGAN-Based Image Synthesis via Neural Noise Encoding and Intensity Calibration
- Cross-Lingual LLM-Judge Transfer via Evaluation Decomposition
- HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering
- CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention
