CosyVoice 2 TTS

Alibaba Tongyi Lab's streaming TTS model, offering human-comparable naturalness, very low latency, and zero-shot voice cloning.

SSML Mode (Speech Synthesis Markup Language, for fine-grained control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
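
When the text comes from user input, building the SSML string by hand can produce broken markup if the text contains characters such as & or <. The following is a minimal sketch of generating the same markup programmatically. The helper name wrap_prosody is hypothetical, and which rate values this service accepts beyond "slow" is not stated on the page, so treat the rate argument as an assumption.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET


def wrap_prosody(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> (hypothetical helper)."""
    # Escape &, < and > so user text cannot break the markup.
    body = escape(text)
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


ssml = wrap_prosody("Slow speech")
# Parsing confirms the result is well-formed XML before it is submitted.
root = ET.fromstring(ssml)
print(ssml)
```

Parsing the result locally is only a well-formedness check; it does not confirm that the service supports a given tag or attribute.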

Emotion / Style Tags

Tags the selected model understands. Click a tag to insert it into your text at the cursor position.

Pronunciation Dictionary

Define custom pronunciations in the form word = pronunciation.
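
The page does not say how dictionary entries are applied, so the following is only a sketch of one plausible behavior: parse "word = pronunciation" lines and substitute whole-word, case-insensitive matches before the text is sent for synthesis. The function names parse_pronunciations and apply_pronunciations, and the sample entries, are hypothetical.

```python
import re


def parse_pronunciations(spec: str) -> dict[str, str]:
    """Parse lines of the form 'word = pronunciation' into a dict."""
    entries: dict[str, str] = {}
    for line in spec.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        word, pron = line.split("=", 1)
        word, pron = word.strip(), pron.strip()
        if word and pron:
            entries[word.lower()] = pron
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their respelling."""
    if not entries:
        return text
    # Try longer entries first so they win over shorter ones at the same spot.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    # A function replacement avoids backslash handling in the respelling.
    return pattern.sub(lambda m: entries[m.group(0).lower()], text)


entries = parse_pronunciations("SQL = sequel\nnginx = engine x")
print(apply_pronunciations("Restart nginx, then run the SQL script.", entries))
```

Matching on word boundaries keeps an entry such as SQL from rewriting part of a longer word like SQLite.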

Pitch: 0 (range -12 to +12)

AI model

Voice

Language

Output format

Speed: 1.0x (range 0.5x to 2.0x)
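
The slider bounds above (pitch from -12 to +12, speed from 0.5x to 2.0x) can be checked before a request is made. This is a small sketch of such a check; the SynthesisSettings class is hypothetical and is not part of any documented API for this service.

```python
from dataclasses import dataclass

PITCH_MIN, PITCH_MAX = -12, 12    # pitch slider bounds shown on the page
SPEED_MIN, SPEED_MAX = 0.5, 2.0   # speed slider bounds shown on the page


@dataclass(frozen=True)
class SynthesisSettings:
    """Hypothetical container for the two slider values."""
    pitch: int = 0       # page default
    speed: float = 1.0   # page default

    def __post_init__(self) -> None:
        # Reject values outside the ranges the sliders allow.
        if not PITCH_MIN <= self.pitch <= PITCH_MAX:
            raise ValueError(f"pitch must be in [{PITCH_MIN}, {PITCH_MAX}], got {self.pitch}")
        if not SPEED_MIN <= self.speed <= SPEED_MAX:
            raise ValueError(f"speed must be in [{SPEED_MIN}, {SPEED_MAX}], got {self.speed}")


settings = SynthesisSettings(pitch=-3, speed=1.25)
print(settings)
```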

Free with Piper, VITS, MeloTTS

About CosyVoice 2

CosyVoice 2, from Alibaba's Tongyi Lab, was designed to make high-quality speech viable in real time. It uses a finite scalar quantization approach combined with flow matching to support streaming synthesis at extremely low latency, while reaching human-comparable naturalness that outperforms many commercial systems in subjective tests. Beyond quality, it offers zero-shot voice cloning from about 3 seconds of audio, cross-lingual synthesis, and fine-grained emotion control. Covering 8 languages with a 1,000-character cap, it's a strong fit for voice assistants, streaming TTS, and other real-time applications.

Best for: real-time applications, streaming TTS, voice assistants

At a glance

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Tier: standard
Speed: medium
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, French, German, Italian, Spanish
Maximum characters: 1000
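
Text longer than the 1,000-character maximum has to be split across several generations. The following is a minimal sketch of one way to do that, preferring sentence boundaries and hard-wrapping only when a single sentence is itself over the limit. The function name split_for_tts is hypothetical, and the sketch assumes the limit counts plain characters; how the service counts SSML markup or whitespace is not stated on the page.

```python
import re

MAX_CHARS = 1000  # "Maximum characters" value listed for CosyVoice 2

# A sentence is the shortest run ending in sentence punctuation (Western or CJK)
# plus any trailing whitespace, or whatever remains at the end of the text.
_SENTENCE = re.compile(r".+?(?:[.!?。！？]+\s*|\Z)", re.DOTALL)


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters without losing any."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE.findall(text):
        # Hard-wrap a single sentence that alone exceeds the limit.
        for start in range(0, len(sentence), limit):
            piece = sentence[start:start + limit]
            if len(current) + len(piece) <= limit:
                current += piece
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


long_text = "One. Two! Three? " * 200
parts = split_for_tts(long_text)
print([len(p) for p in parts])
```

Because the pieces are concatenated without modification, joining the chunks reproduces the original text exactly, which makes the split easy to verify.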

CosyVoice 2 voices

Chinese Female: Chinese, Standard, Female
Chinese Male: Chinese, Standard, Male
English Female: English, Standard, Female
English Male: English, Standard, Male
French Female: French, Standard, Female
German Female: German, Standard, Female
Italian Female: Italian, Standard, Female
Japanese Female: Japanese, Standard, Female
Korean Female: Korean, Standard, Female
Spanish Female: Spanish, Standard, Female

CosyVoice 2 TTS — FAQ

Can CosyVoice 2 stream audio in real time?

Yes. CosyVoice 2 uses finite scalar quantization for streaming synthesis at very low latency, which is what makes it suitable for voice assistants and real-time applications.

Does CosyVoice 2 support voice cloning?

Yes. It offers zero-shot voice cloning from roughly 3 seconds of reference audio, plus cross-lingual synthesis and emotion control.

Is CosyVoice 2 free for commercial use?

Yes. CosyVoice 2 is released under the Apache 2.0 license, which permits commercial use. It supports 8 languages: English, Chinese, Japanese, Korean, French, German, Italian, and Spanish.
