Darwin TTS
A Qwen3-TTS variant whose talker FFN weights are blended from the Qwen3 language model for sharper cross-lingual cloning.
Darwin-TTS-1.7B-Cross by FINAL-Bench is a research variant of Qwen3-TTS-1.7B with an unusual construction: 84 of its talker-FFN tensors (about 8.6% of them) are blended at a 3% ratio with the matching tensors from Qwen3-1.7B-Base, with no retraining. The result is noticeably crisper cross-lingual voice cloning across its four core languages: Korean, English, Japanese, and Chinese. The model runs in zero-shot voice-clone mode and needs only about three seconds of reference audio to capture a speaker. It is best suited to carrying a single reference voice across those four languages, for example in dubbing or multilingual narration where the speaker identity must stay consistent.
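The blending step described above can be sketched in a few lines. This is a minimal illustration, not FINAL-Bench's actual merge script: it assumes that "blended at a 3% ratio" means plain linear interpolation (97% of the TTS tensor plus 3% of the language-model tensor), it uses flat Python lists as stand-ins for real weight tensors, and the tensor names and the `blend_tensors` helper are hypothetical.

```python
from typing import Dict, Iterable, List

Tensor = List[float]  # flat stand-in for a real weight tensor


def blend_tensors(
    tts_weights: Dict[str, Tensor],
    lm_weights: Dict[str, Tensor],
    selected: Iterable[str],
    ratio: float = 0.03,
) -> Dict[str, Tensor]:
    """Return a copy of tts_weights with the selected tensors nudged toward lm_weights."""
    blended = {name: list(values) for name, values in tts_weights.items()}
    for name in selected:
        tts, lm = tts_weights[name], lm_weights[name]
        if len(tts) != len(lm):
            raise ValueError(f"shape mismatch for {name}")
        # Assumed rule: linear interpolation, (1 - ratio) * tts + ratio * lm.
        blended[name] = [(1.0 - ratio) * t + ratio * l for t, l in zip(tts, lm)]
    return blended


# Hypothetical tensor names and toy values, for illustration only.
tts = {
    "talker.layers.0.mlp.up_proj": [1.0, 2.0],
    "talker.layers.0.attn.q_proj": [5.0],
}
lm = {"talker.layers.0.mlp.up_proj": [2.0, 0.0]}
merged = blend_tensors(tts, lm, selected=["talker.layers.0.mlp.up_proj"])
```

Only the tensors named in `selected` change; everything else is copied through untouched, which matches the description of a small subset of FFN tensors being modified while the rest of the model stays as released.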
At a glance
- Developer: FINAL-Bench
- License: Apache 2.0
- Tier: standard
- Speed: medium
- Voice cloning: Yes
- Languages: English, Korean, Japanese, Chinese
- Max characters: 2000
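Longer scripts need to be split before synthesis to stay under the 2000-character limit. The sketch below shows one way to do that, assuming the limit applies per request and is counted the way Python's `len()` counts characters; neither detail is confirmed by this page, and `chunk_text` is a hypothetical helper, not part of any Darwin TTS API. It breaks on sentence boundaries (ASCII and CJK punctuation) and hard-splits any single sentence that is itself too long.

```python
import re
from typing import List

MAX_CHARS = 2000  # limit listed above; assumed here to apply per request

# A sentence ends after ., ! or ? followed by whitespace, after CJK
# punctuation, or at the end of the text. Trailing whitespace is kept.
_SENTENCE = re.compile(r".+?(?:(?<=[.!?])\s+|(?<=[。！？])|\Z)", re.S)


def chunk_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    chunks: List[str] = []
    current = ""
    for sentence in _SENTENCE.findall(text):
        # Hard-split any sentence that is longer than the limit on its own.
        for start in range(0, len(sentence), limit):
            piece = sentence[start:start + limit]
            if len(current) + len(piece) > limit:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Because no characters are added or dropped, joining the chunks reproduces the input exactly, so each chunk can be synthesized with the same reference voice and the audio concatenated in order.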
Best for
Cross-lingual voice cloning between English, Korean, Japanese, and Chinese from a single reference voice.