Papers
- MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
- FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
- InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions
- Charts Are Not Images: On the Challenges of Scientific Chart Editing
- Neural Texture Splatting: Expressive 3D Gaussian Splatting for View Synthesis, Geometry, and Dynamic Reconstruction
- From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
- Steering MoE LLMs via Expert (De)Activation
- Adobe Researchers present a powerful, unified approach to generative video editing at CVPR 2025
- Splat and Replace: 3D Reconstruction with Repetitive Elements
- Progressive Autoregressive Video Diffusion Models
- PixelFlow: Pixel-Space Generative Models with Flow
- OWLViz: An Open-World Benchmark for Visual Question Answering
- Generative Video Propagation
- ARTIST: Improving the Generation of Text-rich Images with Disentangled Diffusion Models and Large Language Models
- Creative Text-to-Audio Generation via Synthesizer Programming
- Retrieval Augmented Generation for Domain-specific Question Answering
- Distinguishing homolytic versus heterolytic bond dissociation of phenyl sulfonium cations with localized active space methods
- LRM: Large Reconstruction Model for Single Image to 3D
- tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction
