VieNeu-TTS-v2 TTS
A Vietnamese-first model that runs on CPU alone, with en-vi code-switching, 7 regional preset voices, and zero-shot voice cloning.
VieNeu-TTS-v2 is a 300M-parameter Vietnamese-first model built on a Qwen3 backbone and trained on more than 10,000 hours of bilingual data. It handles English-Vietnamese code-switching within a single utterance, ships 7 preset voices spanning Northern and Southern accents, and clones a voice zero-shot from 3-5 seconds of reference audio. It runs entirely on CPU, using GGUF Q4 inference and an ONNX audio decoder, and finishes a generation in about 7 seconds. It is purpose-built for Vietnamese content and bilingual en-vi narration, an underserved niche in open TTS.
At a glance
- Developer: Phạm Nguyễn Ngọc Bảo
- License: Apache 2.0
- Tier: standard
- Speed: fast
- Voice cloning: Yes
- Languages: Vietnamese, English
- Max characters: 1000
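Text longer than the 1000-character limit has to be split before synthesis. The following is a minimal sketch of one way to do that, not part of VieNeu-TTS-v2 itself: the helper name `split_for_tts` is hypothetical, and it assumes the limit counts Unicode code points (what Python's `len` measures) and applies per request.

```python
import re

MAX_CHARS = 1000  # "Max characters" value from the At a glance list


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters,
    breaking at sentence boundaries where possible.

    Hypothetical helper; assumes the limit is measured in code points.
    """
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit,
        # which may fall mid-word.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be synthesized separately and the audio concatenated. Splitting at sentence boundaries keeps pauses natural at the joins.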
Best for
Vietnamese content and bilingual en-vi narration