CosyVoice 2 TTS

Streaming TTS from Alibaba's Tongyi Lab, offering human-comparable naturalness, very low latency, and zero-shot voice cloning.

SSML mode (Speech Synthesis Markup Language for fine-grained control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
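Text that contains characters such as `&` or `<` will break the markup if it is pasted between the tags as-is. The following is a minimal sketch of building the snippet above safely with Python's standard library. The helper name `wrap_ssml` is hypothetical and not part of any TTS.ai or CosyVoice API, and only the `rate` attribute shown on this page is assumed to be supported.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> markup (hypothetical helper)."""
    # escape() replaces &, < and > so the text cannot break the markup.
    body = escape(text)
    # quoteattr() returns the attribute value already wrapped in quotes.
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


print(wrap_ssml("Slow speech"))
# <speak><prosody rate="slow">Slow speech</prosody></speak>
```

Escaping happens before the tags are added, so the wrapper tags themselves are never altered.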

Emotion / Style Tags

Tags the selected model understands

Pronunciation dictionary

Define custom pronunciations (word = pronunciation):
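The page does not document how dictionary entries are applied, so the following is only a sketch of one plausible reading of the `word = pronunciation` format: each line maps a word to a replacement spelling that is substituted before synthesis. The function names, the case-insensitive whole-word matching, and the sample entries are all assumptions made for illustration.

```python
import re


def parse_pronunciations(raw: str) -> dict[str, str]:
    """Parse 'word = pronunciation' lines into a lowercase-keyed dict."""
    entries = {}
    for line in raw.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        word, _, spoken = line.partition("=")
        word, spoken = word.strip(), spoken.strip()
        if word and spoken:
            entries[word.lower()] = spoken
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their respelling."""
    if not entries:
        return text
    # Longest words first so longer entries win over shorter ones.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: entries[m.group(0).lower()], text)


entries = parse_pronunciations("SQL = sequel\nnginx = engine x")
print(apply_pronunciations("Restart nginx, then run the SQL script.", entries))
# Restart engine x, then run the sequel script.
```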

Pitch: 0 (range -12 to +12)

AI model

Voice

Language

Output format

Speed: 1.0x (range 0.5x to 2.0x)
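When these controls are driven from a script rather than the sliders, it can help to clamp values to the ranges shown on the page (pitch -12 to +12, speed 0.5x to 2.0x) before submitting them. The sketch below assumes integer pitch steps; the `VoiceSettings` class and its field names are hypothetical and do not come from the site.

```python
from dataclasses import dataclass

PITCH_RANGE = (-12, 12)   # slider bounds shown on the page
SPEED_RANGE = (0.5, 2.0)  # slider bounds shown on the page


def clamp(value, low, high):
    """Return value limited to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class VoiceSettings:
    pitch: int = 0      # page default
    speed: float = 1.0  # page default

    @classmethod
    def from_user_input(cls, pitch, speed):
        """Build settings, pulling out-of-range values back to the nearest bound."""
        return cls(
            pitch=int(clamp(pitch, *PITCH_RANGE)),
            speed=float(clamp(speed, *SPEED_RANGE)),
        )


print(VoiceSettings.from_user_input(pitch=20, speed=0.1))
# VoiceSettings(pitch=12, speed=0.5)
```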

Free with Piper, VITS, MeloTTS

Your generated audio will appear here. Choose a model, enter text, and click Generate.

About CosyVoice 2

CosyVoice 2, from Alibaba's Tongyi Lab, was designed to make high-quality speech viable in real time. It uses a finite scalar quantization approach combined with flow matching to support streaming synthesis at extremely low latency, while reaching human-comparable naturalness that outperforms many commercial systems in subjective tests. Beyond quality, it offers zero-shot voice cloning from about 3 seconds of audio, cross-lingual synthesis, and fine-grained emotion control. Covering 8 languages with a 1,000-character cap, it's a strong fit for voice assistants, streaming TTS, and other real-time applications.
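To give a feel for what "finite scalar quantization" means in general, here is a toy sketch: each dimension of a vector is squashed into a bounded range and rounded to one of a few integer levels, so the codebook is simply the grid of all level combinations. This illustrates the general technique only and is not CosyVoice 2's implementation; the three dimensions, the five levels per dimension, and the function names are made up for the example.

```python
import math


def fsq_quantize(vector, levels):
    """Round each dimension to one of levels[i] integer values (odd counts only)."""
    codes = []
    for value, n_levels in zip(vector, levels):
        if n_levels % 2 == 0:
            raise ValueError("this sketch only handles odd level counts")
        half = (n_levels - 1) // 2
        # tanh bounds the value to (-1, 1); scaling maps it to (-half, half).
        bounded = half * math.tanh(value)
        codes.append(round(bounded))
    return codes


def code_to_index(codes, levels):
    """Map a tuple of per-dimension codes to a single token id (mixed radix)."""
    index = 0
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index = index * n_levels + (code + half)
    return index


levels = [5, 5, 5]  # hypothetical: 5 * 5 * 5 = 125 possible tokens
codes = fsq_quantize([0.1, -2.0, 0.9], levels)
print(codes, code_to_index(codes, levels))
# [0, -2, 1] 53
```

Because the grid is fixed, there is no learned codebook to look up: quantizing is a bound-and-round step per dimension.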

Best for: Real-time applications, streaming TTS, voice assistants

At a glance

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Tier: standard
Speed: medium
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, French, German, Italian, Spanish
Max. characters: 1000
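Longer scripts have to be split before synthesis to stay within the 1000-character limit. The sketch below shows one way to do that, packing whole sentences into chunks and hard-splitting only a sentence that is itself too long. The function name `split_for_tts` and the sentence-boundary rule are assumptions for illustration, not something the page specifies.

```python
import re

MAX_CHARS = 1000  # the per-request character limit listed above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


script = "Hello world. " * 200
parts = split_for_tts(script)
print(len(parts), max(len(p) for p in parts) <= MAX_CHARS)
```

Each chunk can then be submitted as its own generation and the resulting audio concatenated.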

CosyVoice 2 voices

Chinese Female: Chinese, Standard Female
Chinese Male: Chinese, Standard Male
English Female: English, Standard Female
English Male: English, Standard Male
French Female: French, Standard Female
German Female: German, Standard Female
Italian Female: Italian, Standard Female
Japanese Female: Japanese, Standard Female
Korean Female: Korean, Standard Female
Spanish Female: Spanish, Standard Female

CosyVoice 2 frequently asked questions

Can CosyVoice 2 stream audio in real time?

Yes. CosyVoice 2 uses finite scalar quantization for streaming synthesis at very low latency, which is what makes it suitable for voice assistants and real-time applications.

Does CosyVoice 2 support voice cloning?

Yes. It offers zero-shot voice cloning from roughly 3 seconds of reference audio, plus cross-lingual synthesis and emotion control.

Is CosyVoice 2 free for commercial use?

Yes. CosyVoice 2 is Apache 2.0 licensed, which permits commercial use. It supports 8 languages: English, Chinese, Japanese, Korean, French, German, Italian, and Spanish.