Default

Standard, English, Neutral, VoxCPM

Default is a neutral AI voice powered by the VoxCPM text-to-speech model. This Standard-tier voice speaks English and delivers studio-quality speech. With near-instant generation and a quality rating of 5/5, Default is well suited to high-fidelity audio, audiobooks, and long-form content that needs a consistent voice. VoxCPM is developed by OpenBMB and released under the Apache 2.0 license, which permits commercial use. Key capabilities include 44.1 kHz audio, a tokenizer-free design, cross-lingual cloning, context-aware synthesis, and LoRA fine-tuning. VoxCPM also supports voice cloning: upload a short audio sample to create a custom voice that retains the same quality characteristics.

VoxCPM Model Information

Model: VoxCPM
Developer: OpenBMB
Quality: 5/5
Speed: Fast
License: Apache 2.0
Cloning: Supported
Tier: Standard (2 credits per 1,000 characters)
Parameters: 500M
Architecture: Continuous Space + Flow Matching
Training Data: 1,800,000 hours
Year: 2025

Best Use Cases for Default

Recommended applications based on this voice's characteristics

Audiobooks & Narration

Use Default to narrate long-form content with natural prosody and expression.

Video Voiceovers

Add professional narration to YouTube videos, ads, and social media content.

Apps & Accessibility

Fast generation makes this voice ideal for real-time apps, screen readers, and accessibility tools.

Podcasts & Broadcasting

Studio-quality output suitable for podcasts, radio, and professional broadcasting.

More VoxCPM Voices

Other voices from the same TTS model

Default Chinese

Chinese Neutral

Frequently Asked Questions

VoxCPM 1.5 by OpenBMB is a tokenizer-free TTS model that operates in continuous space rather than on discrete tokens. It produces high-fidelity 44.1 kHz audio, supports zero-shot voice cloning from 3-10 seconds of reference audio, and keeps the voice consistent across paragraphs. Cross-language cloning lets you apply an English voice to Chinese speech and vice versa.

VoxCPM was developed by OpenBMB and is released under the Apache 2.0 license, which permits commercial use of generated audio.

VoxCPM supports two languages: English and Chinese.

VoxCPM is in the Standard tier, which costs 2 credits per 1,000 characters. You can preview any VoxCPM voice for free before generating full audio.
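
To get a rough idea of what a script will cost at the Standard rate of 2 credits per 1,000 characters, the arithmetic can be sketched in a few lines of Python. The function name `estimate_credits` is hypothetical, and the sketch assumes that every character (spaces and punctuation included) is counted and that no rounding is applied; actual billing on TTS.ai may differ.

```python
def estimate_credits(text, credits_per_1000_chars=2.0):
    """Return an unrounded credit estimate for a piece of text.

    Assumption: every character counts, including spaces and
    punctuation, and the platform's own rounding is not modelled.
    """
    return len(text) * credits_per_1000_chars / 1000


if __name__ == "__main__":
    script = "x" * 2500  # stand-in for a 2,500-character script
    print(estimate_credits(script))
```

Treat the result as an estimate only and check your account's credit usage for the real figure.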

VoxCPM generates speech very quickly, running in near real time, which makes it suitable for streaming and interactive applications.

VoxCPM is rated 5/5 for audio quality on TTS.ai. It delivers studio-grade, human-like speech.

Yes, VoxCPM supports zero-shot voice cloning. Upload 5-30 seconds of reference audio to create a custom voice.

Yes, VoxCPM is specifically recommended for high-fidelity audio, audiobooks, and long-form content that needs a consistent voice. Its 44.1 kHz audio, tokenizer-free design, and cross-lingual cloning make it an excellent choice for these use cases.

Yes, VoxCPM is licensed under Apache 2.0, which allows commercial use. Audio generated with VoxCPM voices can be used in videos, podcasts, apps, games, and any other commercial project.

Yes, all voices on TTS.ai use commercially licensed open-source models (MIT, Apache 2.0). The generated audio is yours to use in videos, podcasts, apps, games, and any other commercial application.

Send a POST request to /api/v1/tts/ with the model name and voice ID. See our API Documentation page for code examples in Python, JavaScript, Go, and cURL.
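
As a minimal sketch of what such a request might look like in Python, the example below builds (but does not send) a POST request using only the standard library. Only the path /api/v1/tts/ comes from this page. The host, the JSON field names (`model`, `voice`, `text`), the example values, and the bearer-token header are all assumptions for illustration, so check the API Documentation page for the real schema.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the real API host.
BASE_URL = "https://example.invalid"


def build_tts_request(text, model="voxcpm", voice="default", api_key=None):
    """Build a POST request for the TTS endpoint without sending it.

    Assumption: the JSON field names and default values here are
    hypothetical; only the endpoint path is taken from this page.
    """
    payload = {"model": model, "voice": voice, "text": text}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: bearer-token authentication.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        BASE_URL + "/api/v1/tts/",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request("Hello from VoxCPM.")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send the request, pass the returned object to `urllib.request.urlopen` once the host, fields, and authentication match the documented API.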

Yes, click the play button on this page to hear a sample. You can also type custom text on the Text to Speech page and generate a free preview with any voice.

Try Default Now

Type any text and hear it spoken by Default. Free to use.