Chinese (Mandarin) Text to Speech

Turn Chinese (Mandarin) text into natural speech with AI voices. 25 voices. Free, no signup — download as MP3 or WAV.

Mandarin text-to-speech lives or dies on tone. Mandarin has four lexical tones plus a neutral tone, and getting the contour wrong turns "mā" (mother) into "mǎ" (horse), so the model must predict pitch per syllable, not just per sentence. Tone sandhi adds another layer: for example, when two third tones occur in a row, the first shifts to a rising (second) tone, and "一" (yī) and "不" (bù) change tone depending on the tone of the syllable that follows. Because Chinese is written without spaces between words and many characters are polyphonic (多音字), high-quality Chinese synthesis depends heavily on word segmentation and on grapheme-to-phoneme disambiguation from context.
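To make the sandhi rules above concrete, here is a minimal sketch in Python. It is an illustration only, not the pipeline any of the listed voices uses. The `Syllable` tuple, `apply_sandhi`, and `render` are hypothetical names, tones are written as numbers (1 to 4, with 5 for neutral), and the third-tone rule is applied pairwise on the original tones, which ignores the prosodic grouping a real front end would consider.

```python
from typing import List, Optional, Tuple

# (character, toneless pinyin, tone number 1-5); a hypothetical representation.
Syllable = Tuple[str, str, int]


def apply_sandhi(syllables: List[Syllable]) -> List[Syllable]:
    """Apply simplified Mandarin tone sandhi, looking one syllable ahead."""
    out: List[Syllable] = []
    for i, (char, base, tone) in enumerate(syllables):
        # Lexical tone of the following syllable, if there is one.
        next_tone: Optional[int] = (
            syllables[i + 1][2] if i + 1 < len(syllables) else None
        )
        new_tone = tone
        if char == "一" and tone == 1:
            if next_tone == 4:
                new_tone = 2          # yi1 -> yi2 before a fourth tone
            elif next_tone in (1, 2, 3):
                new_tone = 4          # yi1 -> yi4 before tones 1, 2, 3
        elif char == "不" and tone == 4 and next_tone == 4:
            new_tone = 2              # bu4 -> bu2 before a fourth tone
        elif tone == 3 and next_tone == 3:
            new_tone = 2              # first of two third tones rises
        out.append((char, base, new_tone))
    return out


def render(syllables: List[Syllable]) -> str:
    """Format syllables as numbered pinyin, e.g. 'ni2 hao3'."""
    return " ".join(f"{base}{tone}" for _, base, tone in syllables)
```

With this sketch, 你好 (ni3 hao3) renders as "ni2 hao3", 不要 (bu4 yao4) as "bu2 yao4", 一起 (yi1 qi3) as "yi4 qi3", and 一定 (yi1 ding4) as "yi2 ding4". Cases such as 一 in ordinals or before a neutral tone are deliberately left unchanged.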

Open the Chinese (Mandarin) voice editor

Sample — 中文(普通话)

“今天天气很好，我们一起去公园散步，顺便买点水果回家吧。”

Native name: 中文(普通话)
Speakers: about 1.1 billion (roughly 920 million native)
Language family: Sinitic branch of Sino-Tibetan
Script: Chinese characters (Hanzi), Simplified and Traditional
Spoken in: Mainland China, Taiwan, Singapore, Malaysia, Hong Kong, and the global Chinese diaspora

25 Chinese (Mandarin) AI Voices

Chinese Speaker 1 (Bark): Standard, Neutral
Chinese Speaker 2 (Bark): Standard, Neutral
Chinese Speaker (Bark Small): Standard, Neutral
Chinese Female (CosyVoice 2): Standard, Female
Chinese Male (CosyVoice 2): Standard, Male
Chinese Female (CosyVoice3): Standard, Female
Chinese Male (CosyVoice3): Standard, Male
Default (Chinese) (Darwin TTS): Standard, Neutral
Default (GPT-SoVITS): Standard, Neutral
Chinese Default (IndexTTS-2): Standard, Neutral
Xiaobei (Kokoro): Free, Female
Xiaoni (Kokoro): Free, Female
Xiaoxiao (Kokoro): Free, Female
Yunjian (Kokoro): Free, Male
Chinese (MeloTTS): Free, Female
Default (Chinese) (Ming-Omni TTS): Free, Neutral
Chinese (MOSS-TTS Nano): Standard, Neutral
Default (Chinese) (MOSS-TTSD): Standard, Neutral
Chinese (OpenVoice): Premium, Neutral
Huayan (Chinese) (Piper): Free, Female
Uncle Fu (Qwen3 TTS): Standard, Male
Chinese Default (Spark TTS): Standard, Neutral
Speaker 1 (Chinese) (VibeVoice): Standard, Neutral
Speaker 2 (Chinese) (VibeVoice): Standard, Neutral
Default Chinese (VoxCPM): Standard, Neutral

What people use Chinese (Mandarin) text to speech for

E-learning and Mandarin language-teaching narration
Short-video (Douyin/Bilibili) and livestream voiceover
Navigation and in-car voice prompts
Customer-service IVR and chatbot voices
News and audiobook narration for the diaspora

Chinese (Mandarin) Text to Speech — FAQ

Does it support both Simplified and Traditional Chinese?
Yes. You can paste either Simplified (mainland/Singapore) or Traditional (Taiwan/Hong Kong) text; both are read in Mandarin pronunciation.

How does it handle tones and tone sandhi?
The model predicts each syllable's tone contour from context and applies tone sandhi rules, so sequences like third-tone pairs and the special cases of 一 and 不 come out naturally.

Does it pronounce polyphonic characters correctly?
Mostly yes. Multi-reading characters such as 行 (xíng vs háng) or 长 (cháng vs zhǎng) are disambiguated from surrounding words, though rare proper nouns can still be ambiguous.

Is this the same as Cantonese text to speech?
No. These voices are Standard Mandarin (Putonghua). Cantonese uses a different tone system and pronunciation, so Cantonese TTS is not the same as Mandarin TTS.
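
The idea of disambiguating multi-reading characters from surrounding words, mentioned in the FAQ above, can be sketched with a toy dictionary lookup. This is only an assumption-laden illustration: `LEXICON`, `DEFAULT_READING`, and `to_pinyin` are hypothetical names, the entries are a handful of hand-picked examples, and greedy longest-match lookup stands in for whatever context model a given voice really uses.

```python
from typing import Dict, List

# Hypothetical word-level lexicon: multi-character words fix the reading
# of a polyphonic character by context (numbered pinyin).
LEXICON: Dict[str, List[str]] = {
    "银行": ["yin2", "hang2"],    # bank: 行 read as hang2
    "行走": ["xing2", "zou3"],    # to walk: 行 read as xing2
    "长大": ["zhang3", "da4"],    # to grow up: 长 read as zhang3
    "长城": ["chang2", "cheng2"], # Great Wall: 长 read as chang2
}

# Hypothetical fallback: the most common reading of a single character.
DEFAULT_READING: Dict[str, str] = {
    "行": "xing2", "长": "chang2", "很": "hen3",
}


def to_pinyin(text: str, max_word_len: int = 4) -> List[str]:
    """Greedy longest-match lookup; falls back to per-character defaults."""
    result: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate word first, down to two characters.
        for length in range(min(max_word_len, len(text) - i), 1, -1):
            word = text[i:i + length]
            if word in LEXICON:
                result.extend(LEXICON[word])
                i += length
                break
        else:
            # No word matched: use the default reading, or keep the
            # character itself if it is unknown to this toy table.
            ch = text[i]
            result.append(DEFAULT_READING.get(ch, ch))
            i += 1
    return result
```

Here `to_pinyin("银行")` gives `["yin2", "hang2"]` while `to_pinyin("行走")` gives `["xing2", "zou3"]`, so the same character 行 gets a different reading depending on its neighbors. A name or word missing from the lexicon falls back to the default reading, which is one way the rare-proper-noun ambiguity noted above can arise.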

Related languages