Papers
- Are Large Language Models Truly Smarter Than Humans?
- Online Semi-infinite Linear Programming: Efficient Algorithms via Function Approximation
- Robust Generative Audio Quality Assessment: Disentangling Quality from Spurious Correlations
- A Scoping Review of AI-Driven Digital Interventions in Mental Health Care: Mapping Applications Across Screening, Support, Monitoring, Prevention, and Clinical Education
- Leveling3D: Leveling Up 3D Reconstruction with Feed-Forward 3D Gaussian Splatting and Geometry-Aware Generation
- CoMAI: A Collaborative Multi-Agent Framework for Robust and Equitable Interview Evaluation
- Generative AI for Quantum Circuits and Quantum Code: A Technical Review and Taxonomy
- SpecSteer: Synergizing Local Context and Global Reasoning for Efficient Personalized Generation
- Dual Consensus: Escaping from Spurious Majority in Unsupervised RLVR via Two-Stage Vote Mechanism
- Ground Reaction Inertial Poser: Physics-based Human Motion Capture from Sparse IMUs and Insole Pressure Sensors
- ReFORM: Review-aggregated Profile Generation via LLM with Multi-Factor Attention for Restaurant Recommendation
- PureCLIP-Depth: Prompt-Free and Decoder-Free Monocular Depth Estimation within CLIP Embedding Space
- Neural Pushforward Samplers for the Fokker-Planck Equation on Embedded Riemannian Manifolds
- Exclusivity-Guided Mask Learning for Semi-Supervised Crowd Instance Segmentation and Counting
- RASLF: Representation-Aware State Space Model for Light Field Super-Resolution
- More Rounds, More Noise: Why Multi-Turn Review Fails to Improve Cross-Context Verification
- How to Utilize Complementary Vision-Text Information for 2D Structure Understanding
- Synergizing Deep Learning and Biological Heuristics for Extreme Long-Tail White Blood Cell Classification
- Visual Prompt Discovery via Semantic Exploration
- Grounding the Score: Explicit Visual Premise Verification for Reliable Vision-Language Process Reward Models
- When Thinking Hurts: Mitigating Visual Forgetting in Video Reasoning via Frame Repetition
- Point-to-Mask: From Arbitrary Point Annotations to Mask-Level Infrared Small Target Detection
- Is Semi-Automatic Transcription Useful in Corpus Creation? Preliminary Considerations on the KIParla Corpus
- Human/AI Collective Intelligence for Deliberative Democracy: A Human-Centred Design Approach
- AW-MoE: All-Weather Mixture of Experts for Robust Multi-Modal 3D Object Detection
- Behavior-Centric Extraction of Scenarios from Highway Traffic Data and their Domain-Knowledge-Guided Clustering using CVQ-VAE
- Adaptive Theory of Mind for LLM-based Multi-Agent Coordination
- FG-SGL: Fine-Grained Semantic Guidance Learning via Motion Process Decomposition for Micro-Gesture Recognition
- CineSRD: Leveraging Visual, Acoustic, and Linguistic Cues for Open-World Visual Media Speaker Diarization
- VIGOR: VIdeo Geometry-Oriented Reward for Temporal Generative Alignment
- MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing
- Physics-integrated neural differentiable modeling for immersed boundary systems
- Laya: A LeJEPA Approach to EEG via Latent Prediction over Reconstruction
- Locate-then-Sparsify: Attribution Guided Sparse Strategy for Visual Hallucination Mitigation
- Persistent Story World Simulation with Continuous Character Customization
- Surrogate-Assisted Genetic Programming with Rank-Based Phenotypic Characterisation for Dynamic Multi-Mode Project Scheduling
- VisBrowse-Bench: Benchmarking Visual-Native Search for Multimodal Browsing Agents
- Attention-guided Evidence Grounding for Spoken Question Answering
- PyPhonPlan: Simulating phonetic planning with dynamic neural fields and task dynamics
- Micro-AU CLIP: Fine-Grained Contrastive Learning from Local Independence to Global Dependency for Micro-Expression Action Unit Detection
- DriveFix: Spatio-Temporally Coherent Driving Scene Restoration
- NeSy-Route: A Neuro-Symbolic Benchmark for Constrained Route Planning in Remote Sensing
- Omnilingual MT: Machine Translation for 1,600 Languages
- Learning to Predict, Discover, and Reason in High-Dimensional Event Sequences
- DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns
- A Human-Centred Architecture for Large Language Models-Cognitive Assistants in Manufacturing within Quality Management Systems
- An Interpretable Machine Learning Framework for Non-Small Cell Lung Cancer Drug Response Analysis
- Decoding the Critique Mechanism in Large Reasoning Models
- Behavioral Steering in a 35B MoE Language Model via SAE-Decoded Probe Vectors: One Agency Axis, Not Five Traits
- SpikeCLR: Contrastive Self-Supervised Learning for Few-Shot Event-Based Vision using Spiking Neural Networks
