
CosyVoice 2 TTS

Alibaba Tongyi Lab's streaming TTS reaching human-parity naturalness with near-zero latency and zero-shot cloning.


SSML mode (Speech Synthesis Markup Language, for fine-grained control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
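The markup above can also be produced programmatically. The following is a minimal sketch, not part of the site itself: the function name `build_ssml` is hypothetical, and it assumes only the `<speak>` and `<prosody rate="...">` elements shown in the example. It escapes the text so that characters such as `&` or `<` do not break the markup.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate="..."> with XML escaping.

    Hypothetical helper: only the two elements from the example are used.
    """
    # quoteattr() adds the surrounding quotes and escapes the attribute value;
    # escape() handles &, < and > in the spoken text.
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(build_ssml("Slow speech"))
    print(build_ssml("Tom & Jerry", rate="fast"))
```

Escaping matters because unescaped special characters in the input would make the SSML malformed before it ever reaches the synthesizer.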

Emotion / Style Tags

Keywords the selected model understands. Click one to insert it into your text:

Pronunciation dictionary

Define custom pronunciations (word = pronunciation):
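As an illustration of the `word = pronunciation` format, here is a minimal sketch of how such entries could be parsed and applied to text before synthesis. How the site actually applies the dictionary is not documented here; the function names, the whole-word matching, and the case-insensitive lookup are all assumptions.

```python
import re


def parse_pronunciations(spec: str) -> dict[str, str]:
    """Parse 'word = pronunciation' lines into a dict, skipping malformed lines."""
    entries: dict[str, str] = {}
    for line in spec.splitlines():
        if "=" not in line:
            continue  # blank or malformed line
        word, _, pron = line.partition("=")
        word, pron = word.strip(), pron.strip()
        if word and pron:
            entries[word] = pron
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word matches (case-insensitive) with their pronunciation."""
    if not entries:
        return text
    # Longest entries first so a longer word wins over a shorter prefix.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {w.lower(): p for w, p in entries.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


if __name__ == "__main__":
    entries = parse_pronunciations("SQL = sequel\nnginx = engine x")
    print(apply_pronunciations("Run sql on Nginx.", entries))
```

The `\b` word boundaries assume entries start and end with letters or digits; entries containing leading or trailing punctuation would need a different matching rule.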

Pitch: 0 (range -12 to +12)

AI model

Voice

Language

Output format

Speed: 1.0x (range 0.5x to 2.0x)

Free with Piper, VITS, MeloTTS

Your generated audio will appear here. Choose a model, enter your text, and click Generate.

About CosyVoice 2

CosyVoice 2, from Alibaba's Tongyi Lab, was designed to make high-quality speech viable in real time. It uses a finite scalar quantization approach combined with flow matching to support streaming synthesis at extremely low latency, while reaching human-comparable naturalness that outperforms many commercial systems in subjective tests. Beyond quality, it offers zero-shot voice cloning from about 3 seconds of audio, cross-lingual synthesis, and fine-grained emotion control. Covering 8 languages with a 1,000-character cap, it's a strong fit for voice assistants, streaming TTS, and other real-time applications.

Best for: Real-time applications, streaming TTS, voice assistants


At a glance

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Tier: standard
Speed: medium
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, French, German, Italian, Spanish
Max characters: 1000
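Because each request is capped at 1,000 characters, longer scripts have to be split before synthesis. The sketch below shows one way to do that at sentence boundaries. It is an assumption-laden illustration: `split_for_tts` and `MAX_CHARS` are hypothetical names, sentences are detected only by `.`, `!`, or `?` followed by whitespace, and a single sentence longer than the limit is cut at the limit, possibly mid-word.

```python
import re

MAX_CHARS = 1000  # per-request cap listed for CosyVoice 2 on this page


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single over-long sentence is hard-split at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_for_tts("One. Two. Three.", limit=10):
        print(repr(chunk))
```

Each chunk can then be submitted as its own generation and the resulting audio files concatenated in order.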

CosyVoice 2 voices

Chinese Female: Chinese, standard tier, female
Chinese Male: Chinese, standard tier, male
English Female: English, standard tier, female
English Male: English, standard tier, male
French Female: French, standard tier, female
German Female: German, standard tier, female
Italian Female: Italian, standard tier, female
Japanese Female: Japanese, standard tier, female
Korean Female: Korean, standard tier, female
Spanish Female: Spanish, standard tier, female

CosyVoice 2 TTS FAQ

Can CosyVoice 2 stream audio in real time?

Yes. CosyVoice 2 uses finite scalar quantization for streaming synthesis at very low latency, which is what makes it suitable for voice assistants and real-time applications.

Does CosyVoice 2 support voice cloning?

Yes. It offers zero-shot voice cloning from roughly 3 seconds of reference audio, plus cross-lingual synthesis and emotion control.

Is CosyVoice 2 free for commercial use?

Yes. CosyVoice 2 is Apache 2.0 licensed, which permits commercial use. It supports 8 languages: English, Chinese, Japanese, Korean, French, German, Italian, and Spanish.