
CosyVoice 2 TTS

Alibaba Tongyi Lab's streaming TTS model, offering human-parity naturalness, very low latency, and zero-shot voice cloning.


SSML mode (Speech Synthesis Markup Language, for fine-grained control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
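When the text comes from user input, characters such as & or < have to be escaped before they are wrapped in SSML, otherwise the markup is no longer valid XML. The following is a minimal sketch of that step, using only the Python standard library. The function name build_ssml is hypothetical, and which rate values the service actually accepts beyond "slow" is an assumption not confirmed by this page.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> tags.

    The text is XML-escaped so that '&', '<' and '>' cannot break the markup.
    The set of rate values accepted by the service is an assumption.
    """
    # quoteattr adds the surrounding quotes and escapes the attribute value
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(build_ssml("Slow speech"))
    print(build_ssml("Tom & Jerry"))
```

Escaping the text rather than pasting it directly keeps the SSML well-formed no matter what the input contains.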


About CosyVoice 2

CosyVoice 2, from Alibaba's Tongyi Lab, was designed to make high-quality speech viable in real time. It uses a finite scalar quantization approach combined with flow matching to support streaming synthesis at extremely low latency, while reaching human-comparable naturalness that outperforms many commercial systems in subjective tests. Beyond quality, it offers zero-shot voice cloning from about 3 seconds of audio, cross-lingual synthesis, and fine-grained emotion control. Covering 8 languages with a 1,000-character cap, it's a strong fit for voice assistants, streaming TTS, and other real-time applications.
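To make the term concrete, here is a minimal sketch of the general idea behind finite scalar quantization: each coordinate of a vector is squashed into a bounded range and rounded to one of a small, fixed number of integer levels, and the resulting codes are combined into a single token index. This is a generic illustration, not CosyVoice 2's implementation. The function names, the use of tanh as the bounding function, the restriction to odd level counts, and the level counts themselves are all assumptions chosen for the example.

```python
import math


def fsq_quantize(z, levels):
    """Quantize each coordinate of z to one of levels[i] integer values.

    Assumes every entry of levels is odd, so the codes are symmetric
    around zero: with 3 levels a coordinate becomes -1, 0 or 1.
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) // 2
        bounded = half * math.tanh(value)  # squash into (-half, half)
        codes.append(round(bounded))       # snap to the nearest integer level
    return codes


def fsq_index(codes, levels):
    """Combine per-coordinate codes into one integer (mixed-radix encoding)."""
    index, base = 0, 1
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index += (code + half) * base  # shift the code into 0..n_levels-1
        base *= n_levels
    return index


if __name__ == "__main__":
    levels = [3, 3, 3]  # hypothetical: 3 * 3 * 3 = 27 possible tokens
    codes = fsq_quantize([0.0, 10.0, -10.0], levels)
    print(codes, fsq_index(codes, levels))
```

Because the set of levels is fixed in advance, the codebook is implicit: there is no table of learned code vectors to look up, only rounding.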

Best for: real-time applications, streaming TTS, voice assistants


At a glance

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Tier: standard
Speed: medium
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, French, German, Italian, Spanish
Max characters: 1000
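Text longer than the 1000-character limit has to be split before it is submitted. The following is a minimal sketch of one way to do that, breaking at whitespace so words stay intact. The helper name split_for_tts is hypothetical, and whether the service splits long input on its own is not stated on this page.

```python
def split_for_tts(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Breaks at whitespace where possible. Runs of whitespace are collapsed
    to single spaces, and a single word longer than max_chars is hard-split.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Hard-split any word that cannot fit in a chunk on its own
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_text = "word " * 500
    for chunk in split_for_tts(long_text):
        print(len(chunk))
```

Splitting at sentence boundaries instead of arbitrary whitespace would likely give more natural prosody at the joins; the whitespace version is shown only because it is the simplest one that respects the limit.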

CosyVoice 2 voices

Chinese Female: Chinese, Standard Female
Chinese Male: Chinese, Standard Male
English Female: English, Standard Female
English Male: English, Standard Male
French Female: French, Standard Female
German Female: German, Standard Female
Italian Female: Italian, Standard Female
Japanese Female: Japanese, Standard Female
Korean Female: Korean, Standard Female
Spanish Female: Spanish, Standard Female

CosyVoice 2 TTS FAQ

Can CosyVoice 2 stream audio in real time?

Yes. CosyVoice 2 uses finite scalar quantization for streaming synthesis at very low latency, which is what makes it suitable for voice assistants and real-time applications.

Does CosyVoice 2 support voice cloning?

Yes. It offers zero-shot voice cloning from roughly 3 seconds of reference audio, plus cross-lingual synthesis and emotion control.

Is CosyVoice 2 free for commercial use?

Yes. CosyVoice 2 is released under the Apache 2.0 license, which permits commercial use. It supports 8 languages: English, Chinese, Japanese, Korean, French, German, Italian, and Spanish.