Papers
- IRIS: Intersection-aware Ray-based Implicit Editable Scenes
- Trajectory-Diversity-Driven Robust Vision-and-Language Navigation
- Brain-Inspired Graph Multi-Agent Systems for LLM Reasoning
- SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations
- How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition
- GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks
- Spectral Rectification for Parameter-Efficient Adaptation of Foundation Models in Colonoscopy Depth Estimation
- More Test-Time Compute Can Hurt: Overestimation Bias in LLM Beam Search
- Why AI systems don't learn and what to do about it: Lessons on autonomous learning from cognitive science
- RARE disease detection from Capsule Endoscopic Videos based on Vision Transformers
- Persistence Spheres: a Bi-continuous Linear Representation of Measures for Partial Optimal Transport
- RieMind: Geometry-Grounded Spatial Agent for Scene Understanding
- Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization
- When Does Sparsity Mitigate the Curse of Depth in LLMs
- AI Evasion and Impersonation Attacks on Facial Re-Identification with Activation Map Explanations
- SFCoT: Safer Chain-of-Thought via Active Safety Evaluation and Calibration
- SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?
- A Closer Look into LLMs for Table Understanding
- Pointing-Based Object Recognition
- Detection of Autonomous Shuttles in Urban Traffic Images Using Adaptive Residual Context
- Fusian: Multi-LoRA Fusion for Fine-Grained Continuous MBTI Personality Control in Large Language Models
- TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems
- SEA-Vision: A Multilingual Benchmark for Comprehensive Document and Scene Text Understanding in Southeast Asia
- A Hybrid Modeling Framework for Crop Prediction Tasks via Dynamic Parameter Calibration and Multi-Task Learning
- Local Urysohn Width: A Topological Complexity Measure for Classification
- RESQ: A Unified Framework for REliability- and Security Enhancement of Quantized Deep Neural Networks
- GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
- AnyCrowd: Instance-Isolated Identity-Pose Binding for Arbitrary Multi-Character Animation
- Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities
- MA-VLCM: A Vision Language Critic Model for Value Estimation of Policies in Multi-Agent Team Settings
- CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents
- Invisible failures in human-AI interactions
- Physics-informed fine-tuning of foundation models for partial differential equations
- Gym-V: A Unified Vision Environment System for Agentic Vision Research
- Real-Time Human Frontal View Synthesis from a Single Image
- Listening to the Echo: User-Reaction Aware Policy Optimization via Scalar-Verbal Hybrid Reinforcement Learning
- MV2UV: Generating High-quality UV Texture Maps with Multiview Prompts
- Deep Reinforcement Learning for Fano Hypersurfaces
- Music Genre Classification: A Comparative Analysis of Classical Machine Learning and Deep Learning Approaches
- Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting
- GLANCE: Gaze-Led Attention Network for Compressed Edge-inference
- A Framework for Modeling Liquefaction-Induced Road Disruptions After Earthquakes: Implications for Emergency Response and Access in the Cascadia Region of North America
- Evasive Intelligence: Lessons from Malware Analysis for Evaluating AI Agents
- Evaluating Time Awareness and Cross-modal Active Perception of Large Models via 4D Escape Room Task
- RoCo Challenge at AAAI 2026: Benchmarking Robotic Collaborative Manipulation for Assembly Towards Industrial Automation
- Automated Counting of Stacked Objects in Industrial Inspection
- Anchor then Polish for Low-light Enhancement
- Agent Lifecycle Toolkit (ALTK): Reusable Middleware Components for Robust AI Agents
- Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation
- ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer
