VieNeu-TTS-v2

A Vietnamese-first text-to-speech model that runs without a GPU, with English-Vietnamese (en-vi) code-switching, 7 regional preset voices, and zero-shot voice cloning.

VieNeu-TTS-v2 is a 300M-parameter Vietnamese-first text-to-speech model built on a Qwen3 backbone and trained on more than 10,000 hours of bilingual data. It handles English-Vietnamese code-switching, ships 7 preset voices spanning Northern and Southern accents, and clones a voice zero-shot from 3-5 seconds of reference audio. It runs entirely on CPU, using GGUF Q4 inference plus an ONNX audio decoder, so no GPU is required, and a generation finishes in about 7 seconds. It is purpose-built for Vietnamese content and bilingual en-vi narration, an underserved niche in open TTS.

At a glance

Developer: Phạm Nguyễn Ngọc Bảo
License: Apache 2.0
Tier: standard
Speed: fast
Voice cloning: Yes
Languages: Vietnamese, English
Max characters: 1000
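
The 1000-character limit listed above means longer scripts have to be split before synthesis. The following is a minimal sketch of one way to do that, preferring sentence boundaries. The function name `split_for_tts` is hypothetical and not part of VieNeu-TTS-v2, and the sketch assumes the limit counts Python string characters (Unicode code points), which this page does not confirm.

```python
import re

MAX_CHARS = 1000  # "Max characters" value listed on this page


def split_for_tts(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends.

    Hypothetical helper for illustration; not part of VieNeu-TTS-v2.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at a fixed width,
        # which may break a word; this is a fallback, not a recommendation.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be synthesized separately and the resulting audio concatenated. Splitting at sentence ends rather than at a fixed width is a design choice here: it keeps pauses between chunks at places where a pause already sounds natural.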

VieNeu-TTS-v2 AI Voices

Bích Ngọc (North, Female)

Vietnamese
Default Female

Phạm Tuyên (North, Male)

Vietnamese
Default Male

Thanh Bình (North, Male)

Vietnamese
Default Male

Thái Sơn (South, Male)

Vietnamese
Default Male

Thục Đoan (South, Female)

Vietnamese
Default Female

Trúc Ly (North, Female)

Vietnamese
Default Female

Xuân Vĩnh (South, Male)

Vietnamese
Default Male

Best for

Vietnamese content and bilingual en-vi narration

VieNeu-TTS-v2 FAQ

Can VieNeu-TTS-v2 run without a GPU?
Yes. VieNeu-TTS-v2 runs entirely on CPU via GGUF Q4 inference and an ONNX audio decoder, with no GPU needed, and completes a generation in around 7 seconds.

Which languages and voices does it support?
It is Vietnamese-first with English support and en-vi code-switching. It ships 7 preset voices spanning Northern and Southern Vietnamese accents.

Does it support voice cloning?
Yes. It supports zero-shot voice cloning from 3-5 seconds of reference audio. It is Apache 2.0 licensed and free to use commercially.
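
As a rough pre-check before cloning, a reference clip's length can be compared against the 3-5 second figure quoted above. This sketch uses only Python's standard `wave` module and assumes the clip is an uncompressed PCM WAV file. The helper names and the decision to treat 3-5 seconds as a strict window are assumptions for illustration; this page does not say whether longer clips are rejected.

```python
import wave

# Window taken from the "3-5 seconds of reference audio" figure on this page.
MIN_SECONDS = 3.0
MAX_SECONDS = 5.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_reference(path: str,
                          min_seconds: float = MIN_SECONDS,
                          max_seconds: float = MAX_SECONDS) -> bool:
    """Check whether a clip's length falls inside the assumed window.

    Hypothetical helper; not part of VieNeu-TTS-v2.
    """
    return min_seconds <= wav_duration_seconds(path) <= max_seconds
```

A clip that fails the check can be trimmed or re-recorded before it is passed to the cloning step.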