CosyVoice3 TTS

Alibaba FunAudioLLM's latest multilingual model with ~150ms bi-streaming, instruction control, and zero-shot cloning.

SSML Mode (Speech Synthesis Markup Language)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
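When the text comes from user input, characters such as & and < have to be escaped before it is wrapped, or the markup stops being well-formed XML. A minimal sketch of that wrapping step follows; the helper name to_ssml is hypothetical, and only the speak and prosody elements shown above are assumed to be accepted.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...>, escaping XML specials.

    Hypothetical helper: it only reproduces the element layout shown on
    this page and does not call any service.
    """
    # quoteattr() adds the surrounding quotes and escapes the attribute value.
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


print(to_ssml("Slow speech"))
# <speak><prosody rate="slow">Slow speech</prosody></speak>
```

Escaping only the text content keeps the tags themselves intact while making input such as "Q&A" safe to embed.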

Emotion and Style Tags

Tags that the selected model understands and responds to:

Custom Dictionary

Custom pronunciations (word = pronunciation):
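The page does not say how these entries are applied, so the following is only a sketch of one plausible approach: parse one "word = pronunciation" pair per line, then substitute whole words in the text before synthesis. The function names and the case-sensitive, whole-word matching rule are assumptions made for illustration.

```python
import re


def parse_dictionary(source: str) -> dict[str, str]:
    """Parse 'word = pronunciation' lines; lines without '=' are skipped."""
    entries: dict[str, str] = {}
    for line in source.splitlines():
        if "=" not in line:
            continue
        word, pronunciation = (part.strip() for part in line.split("=", 1))
        if word and pronunciation:
            entries[word] = pronunciation
    return entries


def apply_dictionary(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-sensitive matches with their pronunciation."""
    if not entries:
        return text
    # Longest entries first so a longer word wins over a shorter prefix.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(lambda match: entries[match.group(1)], text)


entries = parse_dictionary("SQL = sequel\nnginx = engine x")
print(apply_dictionary("Run SQL behind nginx", entries))
# Run sequel behind engine x
```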

Pitch: 0 (range -12 to +12)

Speed: 1.0x (range 0.5x to 2.0x)
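The pitch and speed controls have fixed bounds (-12 to +12 and 0.5x to 2.0x, with defaults of 0 and 1.0x). If these values are set programmatically, it can help to validate them up front. The sketch below is illustrative only: the class name VoiceSettings is hypothetical, and the page does not state the unit of the pitch value.

```python
from dataclasses import dataclass

PITCH_RANGE = (-12, 12)    # slider bounds shown on this page (unit not stated)
SPEED_RANGE = (0.5, 2.0)   # playback speed multiplier bounds


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container that rejects out-of-range slider values."""

    pitch: int = 0
    speed: float = 1.0

    def __post_init__(self) -> None:
        if not PITCH_RANGE[0] <= self.pitch <= PITCH_RANGE[1]:
            raise ValueError(f"pitch {self.pitch} outside {PITCH_RANGE}")
        if not SPEED_RANGE[0] <= self.speed <= SPEED_RANGE[1]:
            raise ValueError(f"speed {self.speed} outside {SPEED_RANGE}")


settings = VoiceSettings(pitch=-3, speed=1.25)
```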

About CosyVoice3

CosyVoice3 is the newest generation from Alibaba's FunAudioLLM team and a clear step up from CosyVoice 2. It introduces bi-streaming inference with roughly 150ms latency and instruction-based control, letting you steer emotion, speed, and volume through prompts. Speaker similarity for zero-shot voice cloning is improved, and coverage spans 9 languages plus 18 Chinese dialects. An RL-tuned variant pushes prosody to a state-of-the-art level. With a 5,000-character ceiling, fast generation, and strong cloning, it's geared toward multilingual production TTS and real-time applications.

Best for: multilingual production TTS, real-time applications, voice cloning

At a Glance

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Tier: standard
Speed: fast
Voice Cloning: Yes
Languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Russian
Max Characters: 5000
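
Text longer than the 5,000-character maximum has to be split before synthesis. One way to do that, sketched below, is to pack whole sentences into chunks and hard-split only when a single sentence exceeds the limit. The function name split_for_tts and the sentence-boundary rule (whitespace after ., ! or ?) are assumptions for illustration, not documented behavior of the service.

```python
import re

MAX_CHARS = 5000  # per-generation maximum quoted on this page


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack sentences into chunks of at most `limit` characters."""
    chunks: list[str] = []
    current = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two! Three?", limit=10))
# ['One. Two!', 'Three?']
```

Splitting on sentence boundaries keeps each generated clip prosodically complete, at the cost of chunks that are usually shorter than the limit.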

CosyVoice3 Voices

Chinese Female: Chinese, Standard, Female
Chinese Male: Chinese, Standard, Male
English Female: English, Standard, Female
English Male: English, Standard, Male
French Female: French, Standard, Female
German Female: German, Standard, Female
Italian Female: Italian, Standard, Female
Japanese Female: Japanese, Standard, Female
Korean Female: Korean, Standard, Female
Russian Female: Russian, Standard, Female
Spanish Female: Spanish, Standard, Female

CosyVoice3 TTS FAQ

What makes CosyVoice3 different from CosyVoice 2?

CosyVoice3 adds bi-streaming inference at around 150ms latency, instruction-based control over emotion, speed, and volume, improved speaker similarity for cloning, and coverage of 9 languages plus 18 Chinese dialects, with an RL-tuned variant for state-of-the-art prosody.

Does CosyVoice3 support voice cloning?

Yes. It supports zero-shot voice cloning from a reference clip (around 3 seconds minimum) with improved speaker similarity over the previous generation.

Is CosyVoice3 free for commercial use?

Yes. CosyVoice3 is licensed under Apache 2.0, permitting commercial use.
