WavFlow

WavFlow is a Meta AI research model for multimodal audio generation in raw waveform space. It generates synchronized, high-fidelity audio from video and text inputs without relying on latent audio compression. Its method uses waveform patchifying and amplitude lifting to make flow matching stable directly on raw audio through direct x-prediction. The project reports evaluation on VGGSound for video-to-audio and AudioCaps for text-to-audio, positioning WavFlow as an end-to-end waveform-generation alternative to latent-based audio generation systems.
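As a rough illustration of two of the ideas named above, the following is a minimal sketch, not WavFlow's actual code. It assumes that "patchifying" means splitting the 1-D waveform into fixed-size, non-overlapping patches, and that flow matching follows a linear path from noise (t = 0) to data (t = 1), with a model that predicts the clean signal x rather than the velocity. The function names, the zero-padding of the last patch, and the time convention are all hypothetical choices made for this example. Amplitude lifting is not defined on this page, so it is left out.

```python
def patchify(waveform, patch_size):
    """Split a 1-D sequence of samples into non-overlapping patches.

    Assumption: the tail is zero-padded so the length is a multiple of patch_size.
    """
    pad = (-len(waveform)) % patch_size
    padded = list(waveform) + [0.0] * pad
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


def unpatchify(patches, length):
    """Flatten patches back into a 1-D list and drop the padding."""
    flat = [sample for patch in patches for sample in patch]
    return flat[:length]


def interpolate(x, noise, t):
    """Linear flow-matching path (assumed): x_t = t * x + (1 - t) * noise."""
    return [t * xi + (1.0 - t) * ni for xi, ni in zip(x, noise)]


def velocity_from_x_prediction(x_pred, x_t, t, eps=1e-5):
    """Convert a clean-signal prediction into a velocity for the sampler.

    On the linear path above the velocity is x - noise, which equals
    (x - x_t) / (1 - t). The eps clamp avoids division by zero near t = 1.
    """
    denom = max(1.0 - t, eps)
    return [(xp - xt) / denom for xp, xt in zip(x_pred, x_t)]


if __name__ == "__main__":
    wave = [0.1, -0.2, 0.3, -0.4, 0.5]
    patches = patchify(wave, patch_size=2)
    print(patches)                          # three patches, last one padded
    print(unpatchify(patches, len(wave)))   # original samples restored
```

With x-prediction, a perfect model would return the clean waveform itself, and the helper above recovers the velocity that an ODE sampler would then integrate from noise to audio.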

Overview

WavFlow is Meta AI’s open-source audio-generation model for creating synchronized high-fidelity audio from video and text directly in raw waveform space.

Tags: Sound effects, Video to audio

About Meta Platforms

We're connecting people to what they care about, powering new, meaningful experiences, and advancing the state-of-the-art through open research and accessible tooling.

Industry: Technology, Information and Media

Company Size: 78865

Location: Menlo Park, California, US

Website: ai.meta.com

Last updated: May 22, 2026
