VoxCPM2

VoxCPM2 is OpenBMB’s latest tokenizer-free text-to-speech (TTS) model. Instead of predicting discrete speech tokens, it generates continuous speech representations directly with an end-to-end diffusion autoregressive architecture. The repository describes it as a 2B-parameter model based on MiniCPM-4, trained on more than 2 million hours of multilingual speech data. It supports 30 languages, voice design from natural-language descriptions, controllable voice cloning from short reference clips, and “ultimate cloning” with transcript-guided continuation, and it outputs 48 kHz audio. The repository also reports real-time streaming with a real-time factor (RTF, synthesis time divided by the duration of the audio produced) as low as about 0.3 on an RTX 4090, or about 0.13 with Nano-VLLM.
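
To make the RTF figures concrete, here is a minimal sketch of how a real-time factor is usually computed: synthesis time divided by the duration of the generated audio, so values below 1.0 mean faster than real time. The function name and the example timings are hypothetical and chosen only for illustration; this is not code from the VoxCPM2 repository, and the only value taken from the listing is the 48 kHz output rate.

```python
# Illustrative only: a generic real-time factor (RTF) calculation,
# not part of the VoxCPM2 codebase.

SAMPLE_RATE_HZ = 48_000  # output sample rate stated for VoxCPM2


def real_time_factor(synthesis_seconds: float,
                     num_samples: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return synthesis time divided by the duration of the audio produced."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("num_samples and sample_rate_hz must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds


# Hypothetical measurement: 10 s of 48 kHz audio (480,000 samples)
# generated in 3 s of wall-clock time.
rtf = real_time_factor(3.0, 480_000)
print(f"RTF = {rtf:.2f}")
```

Under this definition, an RTF of about 0.3 means roughly 3 seconds of compute per 10 seconds of speech, and about 0.13 means roughly 1.3 seconds per 10 seconds.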

Overview

VoxCPM2 is OpenBMB’s open-source, tokenizer-free, multilingual text-to-speech model for natural speech generation, voice design, and controllable voice cloning. It is a 2B-parameter model trained on over 2 million hours of speech, supports 30 languages, and produces 48 kHz audio, described as studio quality, with real-time streaming capability.

Tags: Text to speech, Voice changing, Voice cloning, Audio

About OpenBMB

OpenBMB is short for Open Lab for Big Model Base. The goal of OpenBMB is to build the model base and toolkit for large-scale pre-trained language models.

Company Size: 100

Location: Beijing, CN

Website: openbmb.cn

Last updated: April 7, 2026
