
CosyVoice3 TTS

Alibaba FunAudioLLM's latest multilingual model with ~150ms bi-streaming, instruction control, and zero-shot cloning.


SSML mode (Speech Synthesis Markup Language, for fine-grained control)

Wrap text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
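A minimal sketch of producing that markup programmatically, assuming the text comes from user input and may contain characters such as & or < that would otherwise break the XML. The helper name wrap_ssml is hypothetical and is not part of any CosyVoice3 or site API; only the tag shape is taken from the example above.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> tags.

    Hypothetical helper: escapes XML special characters so the
    resulting SSML stays well-formed.
    """
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(wrap_ssml("Slow speech"))
    print(wrap_ssml("Tom & Jerry", rate="fast"))
```

Escaping matters because a raw ampersand or angle bracket in the text would make the SSML invalid before it ever reaches the model.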

Emotion / style tags

Tags the selected model understands. Click one to insert it into your text at the cursor position:

Pronunciation dictionary

Define custom pronunciations (word = pronunciation):
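How the page applies these entries is not documented here, so the following is only a sketch of one plausible approach: parse one "word = pronunciation" entry per line, then substitute whole-word, case-insensitive matches before the text is sent for synthesis. The function names parse_pronunciations and apply_pronunciations are hypothetical.

```python
import re


def parse_pronunciations(spec: str) -> dict[str, str]:
    """Parse lines of the form 'word = pronunciation' into a dict.

    Assumption: one entry per line; blank or malformed lines are skipped.
    """
    entries: dict[str, str] = {}
    for line in spec.splitlines():
        word, sep, pron = line.partition("=")
        word, pron = word.strip(), pron.strip()
        if sep and word and pron:
            entries[word.lower()] = pron
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their pronunciation."""
    if not entries:
        return text
    # Longest words first so longer entries win over shorter overlapping ones.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: entries[m.group(0).lower()], text)


if __name__ == "__main__":
    rules = parse_pronunciations("SQL = sequel\nnginx = engine x")
    print(apply_pronunciations("I run Nginx and SQL.", rules))
```

The word-boundary match keeps an entry like "SQL" from rewriting part of a longer token such as "MySQL"; entries that begin or end with punctuation would need a different matching rule.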

Pitch: 0 (range -12 to +12)

AI model

Voice

Language

Output format

Speed: 1.0x (range 0.5x to 2.0x)

Free with Piper, VITS, MeloTTS

The generated audio will appear here. Choose a model, enter your text, and click Generate.

About CosyVoice3

CosyVoice3 is the newest generation from Alibaba's FunAudioLLM team and a clear step up from CosyVoice 2. It introduces bi-streaming inference with roughly 150ms latency and instruction-based control, letting you steer emotion, speed, and volume through prompts. Speaker similarity for zero-shot voice cloning is improved, and coverage spans 9 languages plus 18 Chinese dialects. An RL-tuned variant pushes prosody to a state-of-the-art level. With a 5,000-character ceiling, fast generation, and strong cloning, it's geared toward multilingual production TTS and real-time applications.

Best for: multilingual production TTS, real-time applications, voice cloning


At a glance

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Tier: standard
Speed: fast
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Russian
Maximum characters: 5000
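Longer scripts have to be split to stay under the 5000-character maximum listed above. A minimal sketch of one way to do that, assuming it is acceptable to break at sentence-ending punctuation and to hard-split any single sentence that exceeds the limit. The name split_for_tts is hypothetical, and this is not a documented feature of the service.

```python
import re

MAX_CHARS = 5000  # per-request ceiling taken from the list above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Breaks after '.', '!' or '?' where possible; a single sentence
    longer than `limit` is cut at the limit as a last resort.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split sentences that cannot fit in one chunk on their own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_for_tts("One. Two. Three.", limit=10):
        print(repr(chunk))
```

Splitting at sentence boundaries rather than at a fixed offset avoids cutting a word or phrase in half, which would otherwise produce an audible break in the generated audio.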

CosyVoice3 voices

Chinese Female: Chinese, Standard Female
Chinese Male: Chinese, Standard Male
English Female: English, Standard Female
English Male: English, Standard Male
French Female: French, Standard Female
German Female: German, Standard Female
Italian Female: Italian, Standard Female
Japanese Female: Japanese, Standard Female
Korean Female: Korean, Standard Female
Russian Female: Russian, Standard Female
Spanish Female: Spanish, Standard Female

CosyVoice3 TTS — FAQ

What makes CosyVoice3 different from CosyVoice 2?

CosyVoice3 adds bi-streaming inference at around 150ms latency, instruction-based control over emotion, speed, and volume, improved speaker similarity for cloning, and coverage of 9 languages plus 18 Chinese dialects, with an RL-tuned variant for state-of-the-art prosody.

Does CosyVoice3 support voice cloning?

Yes. It supports zero-shot voice cloning from a reference clip (around 3 seconds minimum) with improved speaker similarity over the previous generation.

Is CosyVoice3 free for commercial use?

Yes. CosyVoice3 is licensed under Apache 2.0, which permits commercial use.