CosyVoice3 TTS

Alibaba FunAudioLLM's latest multilingual model with ~150ms bi-streaming, instruction control, and zero-shot cloning.

Limit: 500 characters per generation; sign up for 5,000 characters per generation.

SSML mode (Speech Synthesis Markup Language, for fine-grained control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
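The same markup can be generated from plain text. This is a minimal Python sketch and not part of TTS.ai: `wrap_ssml` is a hypothetical helper, and which `rate` values the service accepts is an assumption (the SSML specification itself defines `x-slow`, `slow`, `medium`, `fast`, and `x-fast`). It escapes XML special characters so that text containing `&` or `<` does not break the markup.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> tags (hypothetical helper)."""
    # escape() protects &, < and > in the text; quoteattr() quotes the attribute value.
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


print(wrap_ssml("Slow speech"))
print(wrap_ssml("Tom & Jerry", rate="fast"))
```

The first call reproduces the example line shown above; the second shows the ampersand being escaped to `&amp;`.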

Emotion / style tags

Tags that the selected model understands.

Pronunciation dictionary

Define custom pronunciations (word = pronunciation):
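To illustrate the `word = pronunciation` format, here is a small Python sketch that parses such entries and substitutes them into text before synthesis. How the site applies the dictionary internally is not documented here, so this is an assumption; `parse_pronunciations` and `apply_pronunciations` are hypothetical names, and matching is case-sensitive and whole-word only.

```python
import re


def parse_pronunciations(raw: str) -> dict[str, str]:
    """Parse 'word = pronunciation' lines; lines without '=' are skipped."""
    entries: dict[str, str] = {}
    for line in raw.splitlines():
        if "=" not in line:
            continue
        word, pronunciation = (part.strip() for part in line.split("=", 1))
        if word and pronunciation:
            entries[word] = pronunciation
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word matches (assumes words made of letters and digits)."""
    if not entries:
        return text
    # Longest words first, so a longer entry is tried before a shorter one.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    return pattern.sub(lambda match: entries[match.group(1)], text)


entries = parse_pronunciations("GIF = jif\nnginx = engine x")
print(apply_pronunciations("Serve the GIF with nginx.", entries))
```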

Pitch: 0 (range -12 to +12)

AI model

Voice

Language

Output format

Speed: 1.0x (range 0.5x to 2.0x)

Free with Piper, VITS, and MeloTTS

About CosyVoice3

CosyVoice3 is the newest generation from Alibaba's FunAudioLLM team and a clear step up from CosyVoice 2. It introduces bi-streaming inference with roughly 150ms latency and instruction-based control, letting you steer emotion, speed, and volume through prompts. Speaker similarity for zero-shot voice cloning is improved, and coverage spans 9 languages plus 18 Chinese dialects. An RL-tuned variant pushes prosody to a state-of-the-art level. With a 5,000-character ceiling, fast generation, and strong cloning, it's geared toward multilingual production TTS and real-time applications.

Best for: multilingual production TTS, real-time applications, voice cloning

At a glance

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Tier: standard
Speed: fast
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Russian
Max characters: 5,000
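Because a single generation is capped at 5,000 characters, longer scripts have to be submitted in pieces. The following Python sketch shows one way to split text at sentence boundaries while staying under the cap. It is an illustration only: `split_for_tts` is a hypothetical helper, the sentence detection is a simple punctuation-plus-whitespace rule (so it will not split unspaced Chinese or Japanese text), and sentences longer than the limit are cut at the limit.

```python
import re

MAX_CHARS = 5000  # per-generation cap listed above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single oversized sentence is cut into limit-sized pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two. Three.", limit=10))
```

Each returned chunk can then be sent as its own generation request and the resulting audio files joined in order.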

CosyVoice3 voices

Chinese Female: Chinese, Standard Female
Chinese Male: Chinese, Standard Male
English Female: English, Standard Female
English Male: English, Standard Male
French Female: French, Standard Female
German Female: German, Standard Female
Italian Female: Italian, Standard Female
Japanese Female: Japanese, Standard Female
Korean Female: Korean, Standard Female
Russian Female: Russian, Standard Female
Spanish Female: Spanish, Standard Female

CosyVoice3 TTS FAQ

What makes CosyVoice3 different from CosyVoice 2?

CosyVoice3 adds bi-streaming inference at around 150ms latency, instruction-based control over emotion, speed, and volume, improved speaker similarity for cloning, and coverage of 9 languages plus 18 Chinese dialects, with an RL-tuned variant for state-of-the-art prosody.

Does CosyVoice3 support voice cloning?

Yes. It supports zero-shot voice cloning from a reference clip (around 3 seconds minimum) with improved speaker similarity over the previous generation.

Is CosyVoice3 free for commercial use?

Yes. CosyVoice3 is licensed under Apache 2.0, which permits commercial use.
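Since voice cloning needs a reference clip of roughly 3 seconds or more, it can help to check a clip's length before uploading it. This is a minimal Python sketch using the standard-library `wave` module, so it only handles uncompressed WAV files; the function names are hypothetical, and the 3.0-second threshold is taken from the approximate minimum quoted in the FAQ rather than from a documented hard limit.

```python
import wave

MIN_REFERENCE_SECONDS = 3.0  # approximate minimum quoted in the FAQ


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(path: str, minimum: float = MIN_REFERENCE_SECONDS) -> bool:
    """True if the clip is at least `minimum` seconds long."""
    return wav_duration_seconds(path) >= minimum
```

For example, `is_usable_reference("reference.wav")` returns `False` for a 2-second clip, where `reference.wav` stands in for a local file of your own.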