CosyVoice 2

Chinese Male

Standard Chinese Male CosyVoice 2

Chinese Male is a male AI voice powered by the CosyVoice 2 text-to-speech model. This Standard-tier voice speaks Chinese and delivers studio-quality speech synthesis. With moderate generation speed and a quality rating of 5/5, Chinese Male is well suited to real-time applications, streaming TTS, and voice assistants. The CosyVoice 2 engine is developed by Alibaba (Tongyi Lab) and released under the Apache 2.0 license, which permits commercial use. Key capabilities include streaming, zero-shot cloning, cross-lingual synthesis, emotion control, and human-parity quality. The CosyVoice 2 model also supports voice cloning: upload a short audio sample to create a custom voice that retains the same quality characteristics.

CosyVoice 2 Model Information

Model: CosyVoice 2
Developer: Alibaba (Tongyi Lab)
Quality: 5/5
Speed: Medium
License: Apache 2.0
Voice Cloning: Supported
Tier: Standard (2 credits/1K characters)
Parameters: 300M
Architecture: Finite Scalar Quantization + Flow Matching
Training Data: 200,000 hours
Year: 2024

Best Use Cases for Chinese Male

Recommended applications based on this voice

Audiobooks & Narration

Use Chinese Male to narrate long-form content with natural prosody and expression.

Video Voiceovers

Add professional narration to YouTube videos, ads, and social media content.

Podcasts & Broadcasting

Studio-quality output suitable for podcasts, radio, and professional broadcasting.

Custom Brand Voice

Clone this voice style with your own audio to create a unique branded TTS voice.

More CosyVoice 2 Voices

Other voices from the same TTS model

Chinese Female

English Female

English Male

Japanese Female

Frequently Asked Questions

CosyVoice 2 by Alibaba's Tongyi Lab achieves human-comparable speech quality with extremely low latency, making it ideal for real-time applications. It uses a finite scalar quantization approach for streaming synthesis and supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control. It outperforms many commercial TTS systems in subjective evaluations.

CosyVoice 2 was developed by Alibaba (Tongyi Lab) and is released under the Apache 2.0 license, which permits commercial use of generated audio.

CosyVoice 2 supports 8 languages: English, Chinese, Japanese, Korean, French, German, Italian, Spanish.

CosyVoice 2 is in the Standard tier, which costs 2 credits per 1,000 characters. You can preview any CosyVoice 2 voice for free before generating full audio.
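
To make the pricing concrete, here is a minimal sketch of a cost estimate at the quoted rate. It assumes the cost scales linearly with character count; how partial 1,000-character blocks are rounded is not stated on this page, so the pro-rated result is only an approximation. The function name `estimate_credits` is hypothetical.

```python
# Assumption: cost scales linearly at 2 credits per 1,000 characters.
# The rounding of partial blocks is not documented here, so none is applied.
CREDITS_PER_1K_CHARS = 2  # Standard tier rate quoted above


def estimate_credits(text: str) -> float:
    """Return the pro-rated credit cost for synthesizing `text`."""
    return len(text) * CREDITS_PER_1K_CHARS / 1000


print(estimate_credits("你好" * 500))  # 1,000 characters -> 2.0
```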

CosyVoice 2 has moderate generation speed. Generation typically takes a few seconds depending on text length.

CosyVoice 2 is rated 5/5 for audio quality on TTS.ai. It delivers studio-grade, human-like speech.

Yes, CosyVoice 2 supports zero-shot voice cloning. Upload 5-30 seconds of reference audio to create a custom voice.

Yes, CosyVoice 2 is specifically recommended for real-time applications, streaming TTS, and voice assistants. Its streaming, zero-shot cloning, and cross-lingual capabilities make it an excellent choice for these use cases.

Yes, CosyVoice 2 is licensed under Apache 2.0, which allows commercial use. Audio generated with CosyVoice 2 voices can be used in videos, podcasts, apps, games, and any other commercial project.

Yes, all voices on TTS.ai use commercially licensed open-source models (MIT, Apache 2.0). The generated audio is yours to use in videos, podcasts, apps, games, and any other commercial application.

Send a POST request to /api/v1/tts/ with the model name and voice ID. See our API Documentation page for code examples in Python, JavaScript, Go, and cURL.
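
As a rough illustration of the request described above, the sketch below builds (but does not send) a POST request to `/api/v1/tts/` using Python's standard library. Only the path and the fact that the request carries a model name and voice ID come from this page. The host, the JSON field names, the bearer-token header, and the model and voice identifiers are all assumptions for illustration; the API Documentation page is the authority on the real values.

```python
import json
import urllib.request

# Assumptions (not confirmed by this page): the host, the JSON field names,
# the bearer-token header, and the model/voice identifiers are hypothetical.
BASE_URL = "https://example.invalid"  # placeholder host
ENDPOINT = "/api/v1/tts/"             # path quoted above


def build_tts_request(text: str, model: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with the model name and voice ID."""
    payload = json.dumps({"model": model, "voice": voice, "text": text}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_tts_request("你好，世界", "cosyvoice2", "chinese-male", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Once the placeholders are replaced with documented values, the request can be sent with `urllib.request.urlopen(request)`.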

Yes, click the play button on this page to hear a sample. You can also type custom text on the Text to Speech page and generate a free preview with any voice.

Try Chinese Male Now

Type any text and hear it spoken by Chinese Male. Free to use.