Papers
- Intelligent Co-Design: An Interactive LLM Framework for Interior Spatial Design via Multi-Modal Agents
- Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences
- Oscillating Dispersion for Maximal Light-throughput Spectral Imaging
- PMAx: An Agentic Framework for AI-Driven Process Mining
- NV-Bench: Benchmark of Nonverbal Vocalization Synthesis for Expressive Text-to-Speech Generation
- Conditional Rectified Flow-based End-to-End Rapid Seismic Inversion Method
- FuXiWeather2: Learning accurate atmospheric state estimation for operational global weather forecasting
- Deep learning and the rate of approximation by flows
- CRASH: Cognitive Reasoning Agent for Safety Hazards in Autonomous Driving
- A PPO-Based Bitrate Allocation Conditional Diffusion Model for Remote Sensing Image Compression
- Exploring Novelty Differences between Industry and Academia: A Knowledge Entity-centric Perspective
- Controlled Langevin Dynamics for Sampling of Feedforward Neural Networks Trained with Minibatches
- IRIS: Intersection-aware Ray-based Implicit Editable Scenes
- Trajectory-Diversity-Driven Robust Vision-and-Language Navigation
- Brain-Inspired Graph Multi-Agent Systems for LLM Reasoning
- SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations
- How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition
- GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks
- Spectral Rectification for Parameter-Efficient Adaptation of Foundation Models in Colonoscopy Depth Estimation
- More Test-Time Compute Can Hurt: Overestimation Bias in LLM Beam Search
- Why AI systems don't learn and what to do about it: Lessons on autonomous learning from cognitive science
- RARE disease detection from Capsule Endoscopic Videos based on Vision Transformers
- Persistence Spheres: a Bi-continuous Linear Representation of Measures for Partial Optimal Transport
- RieMind: Geometry-Grounded Spatial Agent for Scene Understanding
- Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization
- When Does Sparsity Mitigate the Curse of Depth in LLMs
- AI Evasion and Impersonation Attacks on Facial Re-Identification with Activation Map Explanations
- SFCoT: Safer Chain-of-Thought via Active Safety Evaluation and Calibration
- SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?
- A Closer Look into LLMs for Table Understanding
- Pointing-Based Object Recognition
- Detection of Autonomous Shuttles in Urban Traffic Images Using Adaptive Residual Context
- Fusian: Multi-LoRA Fusion for Fine-Grained Continuous MBTI Personality Control in Large Language Models
- TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems
- SEA-Vision: A Multilingual Benchmark for Comprehensive Document and Scene Text Understanding in Southeast Asia
- A Hybrid Modeling Framework for Crop Prediction Tasks via Dynamic Parameter Calibration and Multi-Task Learning
- Local Urysohn Width: A Topological Complexity Measure for Classification
- RESQ: A Unified Framework for REliability- and Security Enhancement of Quantized Deep Neural Networks
- GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
- AnyCrowd: Instance-Isolated Identity-Pose Binding for Arbitrary Multi-Character Animation
- Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities
- MA-VLCM: A Vision Language Critic Model for Value Estimation of Policies in Multi-Agent Team Settings
- CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents
- Invisible failures in human-AI interactions
- Physics-informed fine-tuning of foundation models for partial differential equations
- Gym-V: A Unified Vision Environment System for Agentic Vision Research
- Real-Time Human Frontal View Synthesis from a Single Image
- Listening to the Echo: User-Reaction Aware Policy Optimization via Scalar-Verbal Hybrid Reinforcement Learning
- MV2UV: Generating High-quality UV Texture Maps with Multiview Prompts
- Deep Reinforcement Learning for Fano Hypersurfaces
