VieNeu-TTS-v2

A Vietnamese-first model that runs on CPU alone, with en-vi code-switching, 7 regional preset voices, and zero-shot cloning.

VieNeu-TTS-v2 is a 300M-parameter Vietnamese-first model built on a Qwen3 backbone and trained on more than 10,000 hours of bilingual data. It handles seamless English-Vietnamese code-switching, ships 7 preset voices spanning Northern and Southern accents, and clones a voice zero-shot from just 3-5 seconds of reference audio. Notably, it runs entirely on CPU, using GGUF Q4 inference plus an ONNX audio decoder, so no GPU is required, and it finishes a generation in about 7 seconds. It is purpose-built for Vietnamese content and bilingual en-vi narration, an underserved niche in open TTS.

At a glance

Developer: Phạm Nguyễn Ngọc Bảo
License: Apache 2.0
Tier: standard
Speed: fast
Voice cloning: Yes
Languages: Vietnamese, English
Max characters: 1000
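
The 1000-character maximum listed above means longer scripts have to be split before synthesis. The following is a minimal sketch of one way to do that, assuming the limit counts Unicode characters and that breaking at sentence-ending punctuation is acceptable. The function name `split_for_tts` is hypothetical and is not part of VieNeu-TTS-v2.

```python
import re

MAX_CHARS = 1000  # the "Max characters" value from the listing


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries and only cuts
    mid-sentence when a single sentence exceeds the limit.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as a separate generation and the resulting audio concatenated in order.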

VieNeu-TTS-v2 AI Voices

Bích Ngọc (North, Female)

Vietnamese
Standard Female

Phạm Tuyên (North, Male)

Vietnamese
Standard Male

Thanh Bình (North, Male)

Vietnamese
Standard Male

Thái Sơn (South, Male)

Vietnamese
Standard Male

Thục Đoan (South, Female)

Vietnamese
Standard Female

Trúc Ly (North, Female)

Vietnamese
Standard Female

Xuân Vĩnh (South, Male)

Vietnamese
Standard Male

Best for

Vietnamese content and bilingual en-vi narration

VieNeu-TTS-v2 FAQ

Does VieNeu-TTS-v2 run without a GPU?
Yes. VieNeu-TTS-v2 runs entirely on CPU via GGUF Q4 inference and an ONNX audio decoder, with no GPU needed, and completes a generation in around 7 seconds.

Which languages and accents does VieNeu-TTS-v2 support?
It is Vietnamese-first with English support and seamless en-vi code-switching. It ships 7 preset voices spanning Northern and Southern Vietnamese accents.

Does VieNeu-TTS-v2 support voice cloning?
Yes. It supports instant zero-shot voice cloning from just 3-5 seconds of reference audio. It is Apache 2.0 licensed and free to use commercially.
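
Since cloning works from 3-5 seconds of reference audio, it can help to check a clip's length before uploading it. Below is a rough sketch using only Python's standard `wave` module. It assumes the reference clip is an uncompressed WAV file and treats 3-5 seconds as a target window rather than a documented hard limit. The function names are hypothetical.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_reference(path: str, min_s: float = 3.0, max_s: float = 5.0) -> bool:
    """Check whether a clip falls in the assumed 3-5 second reference window."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

A clip outside the window can be trimmed or re-recorded before it is used as the cloning reference.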