Papers
- Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients
- Dropout Robustness and Cognitive Profiling of Transformer Models via Stochastic Inference
- ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation
- M2P: Improving Visual Foundation Models with Mask-to-Point Weakly-Supervised Learning for Dense Point Tracking
- Process Supervision for Chain-of-Thought Reasoning via Monte Carlo Net Information Gain
- Federated Distributional Reinforcement Learning with Distributional Critic Regularization
- Multi-Source Evidence Fusion for Audio Question Answering
- Discovering Decoupled Functional Modules in Large Language Models
- Intellectual Stewardship: Re-adapting Human Minds for Creative Knowledge Work in the Age of AI
- Symmetry-Reduced Physics-Informed Learning of Tensegrity Dynamics
- Steering Video Diffusion Transformers with Massive Activations
- FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair
- TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models
- CodeScout: An Effective Recipe for Reinforcement Learning of Code Search Agents
- RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy
- Text-to-Stage: Spatial Layouts from Long-form Narratives
- Generative Control as Optimization: Time Unconditional Flow Matching for Adaptive and Robust Robotic Control
- Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models
- Verification and Validation of Physics-Informed Surrogate Component Models for Dynamic Power-System Simulation
- The Silent Thought: Modeling Internal Cognition in Full-Duplex Spoken Dialogue Models via Latent Reasoning
- Event-Centric Human Value Understanding in News-Domain Texts: An Actor-Conditioned, Multi-Granularity Benchmark
- How do LLMs Compute Verbal Confidence
- Video Understanding: From Geometry and Semantics to Unified Models
- Omni-3DEdit: Generalized Versatile 3D Editing in One-Pass
- Revisiting foundation models for cell instance segmentation
- Physics-Aware Machine Learning for Seismic and Volcanic Signal Interpretation
- VISER: Visually-Informed System for Enhanced Robustness in Open-Set Iris Presentation Attack Detection
- Procedural Generation of Algorithm Discovery Tasks in Machine Learning
- RHYME-XT: A Neural Operator for Spatiotemporal Control Systems
- Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval
- Operator-Theoretic Foundations and Policy Gradient Methods for General MDPs with Unbounded Costs
- Edit Spillover as a Probe: Do Image Editing Models Implicitly Understand World Relations?
- Differential Attention-Augmented BiomedCLIP with Asymmetric Focal Optimization for Imbalanced Multi-Label Video Capsule Endoscopy Classification
- DebugLM: Learning Traceable Training Data Provenance for LLMs
- AI-Assisted Goal Setting Improves Goal Progress Through Social Accountability
- Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation
- MAED: Mathematical Activation Error Detection for Mitigating Physical Fault Attacks in DNN Inference
- RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
- scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns
- A Creative Agent is Worth a 64-Token Template
- A Noise Sensitivity Exponent Controls Large Statistical-to-Computational Gaps in Single- and Multi-Index Models
- Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs
- Don't Vibe Code, Do Skele-Code: Interactive No-Code Notebooks for Subject Matter Experts to Build Lower-Cost Agentic Workflows
- Understanding Task Aggregation for Generalizable Ultrasound Foundation Models
- SpiderCam: Low-Power Snapshot Depth from Differential Defocus
- Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages
- Noise-Aware Misclassification Attack Detection in Collaborative DNN Inference
- IndicSafe: A Benchmark for Evaluating Multilingual LLM Safety in South Asia
- Only relative ranks matter in weight-clustered large language models
- SegFly: A 2D-3D-2D Paradigm for Aerial RGB-Thermal Semantic Segmentation at Scale
