
Tortoise TTS

A quality-first autoregressive model: slow, but it produces some of the most realistic open-source speech available.

Wrap your text in SSML tags for control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
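When the text comes from user input, characters such as & or < need escaping before they are wrapped in the tags. The following is a minimal Python sketch that builds the same markup as the example above. The helper name wrap_ssml is hypothetical, and whether a given model honors the markup is not stated on this page.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> tags (hypothetical helper)."""
    # escape() converts &, < and > so the text cannot break the markup.
    body = escape(text)
    # quoteattr() returns the attribute value already surrounded by quotes.
    return f"<speak><prosody rate={quoteattr(rate)}>{body}</prosody></speak>"


print(wrap_ssml("Slow speech"))
```

Only the speak and prosody elements shown in the example are used here; other SSML elements are left out on purpose.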

About Tortoise TTS

Tortoise TTS, created by James Betker, deliberately trades speed for quality. It is an autoregressive multi-voice system using a DALL-E-inspired architecture, and it produces some of the most realistic synthetic speech in the open-source ecosystem, with excellent prosody and speaker similarity. The name is a nod to its pace: it is noticeably slower than most alternatives, but the payoff is studio-grade output. It supports multiple voices and voice cloning (which benefits from a longer reference, around fifteen seconds), making it a long-standing favorite for audiobooks and premium narration where wait time is acceptable. Tortoise is English-focused and released under the permissive Apache 2.0 license.
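The fifteen-second guideline for a cloning reference can be checked before a clip is used. The sketch below assumes the reference is an uncompressed PCM WAV file and uses only the Python standard library. The function names and the 15.0 default are illustrative and not part of Tortoise TTS.

```python
import wave


def reference_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the frame count divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def long_enough(path: str, minimum: float = 15.0) -> bool:
    """True if the clip is at least `minimum` seconds long (assumed threshold)."""
    return reference_seconds(path) >= minimum
```

A clip that fails the check is not necessarily unusable; the page only says that longer references tend to give better results.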

Best for: audiobooks, premium content, quality-first applications


Details

Developer: James Betker
License: Apache 2.0
Quality: premium
Speed: slow
Voice cloning: Yes
Languages: English
Max characters: 2000
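Audiobook-length text will exceed the 2000 figure listed above if that figure is a per-request character limit, which is an assumption here. One way to stay under such a limit is to split the text at sentence boundaries, as in this Python sketch. The function name split_for_tts is hypothetical.

```python
import re


def split_for_tts(text: str, limit: int = 2000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the audio joined afterwards. Splitting on sentence ends keeps pauses at natural places.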

Tortoise TTS Voices

Random (English, neutral)

Tortoise TTS FAQ

Why is Tortoise TTS so slow?

It is autoregressive and uses a DALL-E-inspired architecture that deliberately prioritizes quality over speed. The payoff is some of the most realistic open-source speech available, which is why it remains popular for audiobooks despite the wait.

Can Tortoise TTS clone a voice?

Yes. It supports multi-voice synthesis and voice cloning; results improve with a longer reference, around fifteen seconds of clean audio.

What is Tortoise TTS best for?

Quality-first applications, such as audiobooks and premium narration, where its slow but highly realistic output is worth the generation time. It is English-focused and Apache 2.0 licensed.