CosyVoice3 Voices

Alibaba FunAudioLLM's latest multilingual model with ~150ms bi-streaming, instruction control, and zero-shot cloning.

Up to 500 characters per generation without an account; sign up to raise the limit to 5,000 characters.

SSML mode (Speech Synthesis Markup Language, for controlling how the text is spoken)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
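If the SSML is assembled in code rather than typed by hand, the text should be XML-escaped before it is wrapped. The following is a minimal sketch in Python; the helper name `wrap_ssml` is hypothetical and is not part of any TTS.ai or CosyVoice API. It only builds the same string shown above.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> tags (hypothetical helper)."""
    # escape() protects &, < and > so the input text cannot break the markup.
    body = escape(text)
    # quoteattr() returns the attribute value already wrapped in quotes.
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


print(wrap_ssml("Slow speech"))
# prints: <speak><prosody rate="slow">Slow speech</prosody></speak>
```

Which `rate` values the selected model honors is not stated on this page, so treat anything other than the `slow` example as untested.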

Pronunciation dictionary

Set custom pronunciations in the form word = pronunciation.
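The page does not say how these entries are applied internally. As an illustration only, the sketch below treats the dictionary as plain text substitution done before synthesis. It assumes one `word = pronunciation` entry per line and whole-word, case-insensitive matching; both function names are hypothetical.

```python
import re


def parse_pronunciations(spec: str) -> dict[str, str]:
    """Parse lines of the form 'word = pronunciation' into a dict."""
    entries = {}
    for line in spec.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        word, pronunciation = line.split("=", 1)
        word, pronunciation = word.strip(), pronunciation.strip()
        if word and pronunciation:
            entries[word] = pronunciation
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their respelling."""
    if not entries:
        return text
    # Longest words first so longer entries win over shorter overlapping ones.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {w.lower(): p for w, p in entries.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


entries = parse_pronunciations("SQL = sequel\nnginx = engine x")
print(apply_pronunciations("Restart nginx, then query SQL.", entries))
# prints: Restart engine x, then query sequel.
```

The `\b` word boundary works for space-delimited languages; text in Chinese or Japanese would need a different matching rule.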

Pitch: 0 (range: -12 to +12)

Speed: 1.0x (range: 0.5x to 2.0x)
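When these settings come from code rather than from the sliders, it can help to clamp them to the ranges shown here before sending a request. This is a minimal sketch: the `VoiceSettings` class and its field names are hypothetical, and only the bounds and defaults (pitch -12 to +12, default 0; speed 0.5x to 2.0x, default 1.0x) come from this page. The unit of the pitch value is not stated.

```python
from dataclasses import dataclass

PITCH_RANGE = (-12, 12)   # slider bounds shown on the page
SPEED_RANGE = (0.5, 2.0)  # slider bounds shown on the page


def clamp(value: float, low: float, high: float) -> float:
    """Return value limited to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical container for the two slider values."""
    pitch: float = 0    # page default
    speed: float = 1.0  # page default

    def clamped(self) -> "VoiceSettings":
        # Build a new instance with both values forced into range.
        return VoiceSettings(
            pitch=clamp(self.pitch, *PITCH_RANGE),
            speed=clamp(self.speed, *SPEED_RANGE),
        )


print(VoiceSettings(pitch=20, speed=0.25).clamped())
# prints: VoiceSettings(pitch=12, speed=0.5)
```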

Free to use with Piper, VITS, and MeloTTS

About CosyVoice3

CosyVoice3 is the newest generation from Alibaba's FunAudioLLM team and a clear step up from CosyVoice 2. It introduces bi-streaming inference with roughly 150ms latency and instruction-based control, letting you steer emotion, speed, and volume through prompts. Speaker similarity for zero-shot voice cloning is improved, and coverage spans 9 languages plus 18 Chinese dialects. An RL-tuned variant pushes prosody to a state-of-the-art level. With a 5,000-character ceiling, fast generation, and strong cloning, it's geared toward multilingual production TTS and real-time applications.

Best for: multilingual production TTS, real-time applications, voice cloning

At a glance

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Tier: standard
Speed: fast
Voice cloning: Yes
Languages: English, Chinese, Japanese, Korean, German, Spanish, French, Italian, Russian
Max characters: 5,000
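Text longer than the 5,000-character maximum has to be split before it is submitted. The sketch below shows one way to do that, preferring sentence boundaries and falling back to hard cuts. The function name `split_for_tts` is hypothetical, and whether SSML markup counts toward the limit is not stated on this page.

```python
import re

MAX_CHARS = 5000  # "Max characters" value listed on the page


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters."""
    # Split after ., ! or ? when followed by whitespace. Text without such
    # whitespace (for example Chinese or Japanese) falls back to hard cuts.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into pieces.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the open chunk
        else:
            chunks.append(current)  # close the chunk, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_for_tts("One. Two. Three.", limit=10))
# prints: ['One. Two.', 'Three.']
```

Each returned chunk can then be synthesized separately and the audio joined afterwards.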

CosyVoice3 Voices

Chinese Female (Chinese, Standard, Female)
Chinese Male (Chinese, Standard, Male)
English Female (English, Standard, Female)
English Male (English, Standard, Male)
French Female (French, Standard, Female)
German Female (German, Standard, Female)
Italian Female (Italian, Standard, Female)
Japanese Female (Japanese, Standard, Female)
Korean Female (Korean, Standard, Female)
Russian Female (Russian, Standard, Female)
Spanish Female (Spanish, Standard, Female)

CosyVoice3 FAQ

What makes CosyVoice3 different from CosyVoice 2?

CosyVoice3 adds bi-streaming inference at around 150ms latency, instruction-based control over emotion, speed, and volume, improved speaker similarity for cloning, and coverage of 9 languages plus 18 Chinese dialects. An RL-tuned variant targets state-of-the-art prosody.

Does CosyVoice3 support voice cloning?

Yes. It supports zero-shot voice cloning from a reference clip of roughly 3 seconds or longer, with improved speaker similarity over the previous generation.

Is CosyVoice3 free for commercial use?

Yes. CosyVoice3 is licensed under Apache 2.0, which permits commercial use.