Free AI Text to Speech

20+ open-source models, 107+ voices, 32+ languages. No account required.

1K+ creators
2K+ generations
20+ AI models
107+ voices

Everything you need for voice AI

30+ tools powered by open-source AI models

20+ AI voice models

The most complete collection of open-source TTS models available on a single platform

Kokoro (Free)

Kokoro is an 82-million-parameter text-to-speech model that performs well above its weight class. Despite its small size, it produces clear, natural speech. Kokoro supports multiple languages, including English, Japanese, Chinese, and Korean, with a variety of expressive voices. It is also very fast, generating audio at roughly 100x real time on a GPU.

Best for: high-quality TTS with low latency, streaming applications

Try free
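To put the "roughly 100x real time" figure in concrete terms, here is a small worked calculation. The function name and the 60-second clip are illustrative assumptions, and actual throughput will depend on the GPU and the text being synthesized.

```python
def estimated_synthesis_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimate wall-clock synthesis time for a clip.

    speedup is how many times faster than real time the model runs
    (100.0 corresponds to the "roughly 100x" figure quoted for Kokoro).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# Illustrative input: a 60-second clip at 100x real time.
clip_estimate = estimated_synthesis_seconds(60.0, 100.0)
print(f"{clip_estimate:.2f} s")
```

At that rate a one-minute clip would take well under a second to generate, which is why the model is suggested for streaming use.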

Piper (Free)

Piper is a lightweight text-to-speech engine developed by Rhasspy that uses the VITS and Larynx architectures. It runs entirely on CPU, which makes it well suited to edge devices, home automation, and tools that need offline TTS. With more than 100 voices across more than 30 languages, Piper delivers natural-sounding speech at real-time speed even on a Raspberry Pi 4.

Best for: edge deployment, accessibility, and embedded applications

Try free

VITS (Free)

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that produces more natural-sounding audio than two-stage systems. It uses variational inference augmented with normalizing flows, together with an adversarial training process, and achieves a notable improvement in naturalness.

Best for: general-purpose text-to-speech with natural prosody

Try free

MeloTTS (Free)

MeloTTS by MyShell.ai is a multilingual TTS library whose supported languages include English (American, British, Indian, Australian), Chinese, Japanese, and Korean. It is very fast, processing text at real-time speed on CPU alone. MeloTTS is built for production use and supports both CPU and GPU inference.

Best for: production applications that need fast, multilingual TTS

Try free

Bark (Standard)

Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.

Developer: Suno · License: MIT

Try

Bark Small (Standard)

A smaller variant of Bark with faster inference and lower memory usage.

Developer: Suno · License: MIT

Try

CosyVoice 2 (Standard)

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Developer: Alibaba (Tongyi Lab) · License: Apache 2.0

Try

Dia TTS (Standard)

Multi-speaker dialogue generation model that produces natural conversation between speakers.

Developer: Nari Labs · License: Apache 2.0

Try

Parler TTS (Standard)

Describe the voice you want in natural language and Parler generates matching speech.

Developer: Hugging Face · License: Apache 2.0

Try

GLM-TTS (Standard)

Achieves the lowest character error rate among open-source TTS models.

Developer: Zhipu AI · License: GLM-4 License

Try

IndexTTS-2 (Standard)

Zero-shot TTS with fine-grained emotional control and high expressiveness.

Developer: Index Team · License: Bilibili Model License

Try

Spark TTS (Standard)

Voice cloning TTS with controllable emotion and speaking style via prompts.

Developer: SparkAudio · License: CC BY-NC-SA 4.0

Try

GPT-SoVITS (Standard)

Few-shot voice cloning TTS that reproduces a voice from just 5 seconds of audio.

Developer: RVC-Boss · License: MIT

Try

Orpheus (Standard)

TTS model with human-level expressiveness, trained on 100K hours of speech data.

Developer: Canopy Labs · License: Llama 3.2 Community

Try

Qwen3 TTS (Standard)

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Developer: Alibaba (Qwen) · License: Apache 2.0

Try

Chatterbox (Premium)

State-of-the-art zero-shot voice cloning with emotion control, from Resemble AI.

Try

Tortoise TTS (Premium)

Multi-voice text-to-speech focused on quality, built on an autoregressive architecture.

Try

StyleTTS 2 (Premium)

Human-level speech synthesis through style diffusion and adversarial training.

Try

OpenVoice (Premium)

Instant voice cloning with granular control over style, emotion, and accent.

Try

Sesame CSM (Premium)

Conversational speech model that generates natural dialogue with appropriate timing and emotion.

Try

CosyVoice 2

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Languages: en, zh, ja, ko, fr, de, it, es

GLM-TTS

Achieves the lowest character error rate among open-source TTS models.

Languages: en, zh

IndexTTS-2

Zero-shot TTS with fine-grained emotional control and high expressiveness.

Languages: en, zh

Spark TTS

Voice cloning TTS with controllable emotion and speaking style via prompts.

Languages: en, zh

GPT-SoVITS

Few-shot voice cloning TTS that reproduces a voice from just 5 seconds of audio.

Languages: en, zh, ja, ko

Chatterbox

State-of-the-art zero-shot voice cloning with emotion control, from Resemble AI.

Languages: en

Tortoise TTS

Multi-voice text-to-speech focused on quality, built on an autoregressive architecture.

Languages: en

OpenVoice

Instant voice cloning with granular control over style, emotion, and accent.

Languages: en, zh, ja, ko, fr, de, es, it

Qwen3 TTS

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Languages: en, zh, ja, ko, de, fr, ru, pt, es, it

Developer-first API

OpenAI-compatible REST API. One endpoint, 22+ models. Real-time speech streaming.

  • OpenAI-compatible format
  • Streaming TTS for real-time applications
  • Batch processing for large workloads
  • Webhook notifications
See the API documentation
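pip install ttsai
npm install @ttsainpm/ttsai
Python
from tts_ai import TTSClient

# Authenticate with your API key.
client = TTSClient(api_key="sk-tts-xxx")
# Generate speech with the Kokoro model and the af_bella voice.
audio = client.generate(
    text="Hello from TTS.ai!",
    model="kokoro",
    voice="af_bella",
)
# Write the generated audio to disk.
client.save(audio, "output.mp3")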
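Because the API is described as OpenAI-compatible, a raw HTTP request would presumably follow the shape of OpenAI's speech endpoint: a JSON body with `model`, `input`, `voice`, and `response_format` fields. The sketch below only builds such a request and does not send it. The base URL is a placeholder and the `/audio/speech` path is assumed from OpenAI's convention; neither is a confirmed TTS.ai value, so check the API documentation for the real endpoint.

```python
import json
import urllib.request

# ASSUMPTION: placeholder base URL, not a real TTS.ai endpoint.
BASE_URL = "https://api.example.invalid/v1"


def build_speech_request(api_key: str, text: str, model: str, voice: str,
                         response_format: str = "mp3") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style speech synthesis request."""
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",  # path assumed from OpenAI's API shape
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speech_request("sk-tts-xxx", "Hello from TTS.ai!", "kokoro", "af_bella")
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the payload before pointing it at a live endpoint.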

Simple, transparent pricing

Start for free. Scale as you grow.

Free

$0

15,000 characters

  • Kokoro, Piper, VITS, MeloTTS
  • 500-character limit
  • 3 generations/hour (no account)
Sign up

Starter

$9/month

500,000 characters/month

  • All 22+ models
  • 100,000 characters per generation
  • Voice cloning
Get started
Most popular

Pro

$29/month

2,000,000 characters/month

  • Everything in Starter
  • API access
  • Priority processing
Get Pro

Enterprise

$99/month

10,000,000 characters/month

  • Everything in Pro
  • Bulk API
  • Priority queue
Get Enterprise

See all plans, including character options →
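As a rough way to match usage to a tier, the sketch below picks the lowest-priced plan whose character allowance covers a given monthly volume. The figures are copied from the pricing list above. Treating the Free tier's 15,000 characters as a monthly allowance, and the `Plan` and `cheapest_plan` names, are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    usd_per_month: int
    characters: int
    api_access: bool


# Figures from the pricing list. The Free allowance is ASSUMED to be monthly.
PLANS = [
    Plan("Free", 0, 15_000, False),
    Plan("Starter", 9, 500_000, False),
    Plan("Pro", 29, 2_000_000, True),
    Plan("Enterprise", 99, 10_000_000, True),
]


def cheapest_plan(monthly_characters: int, need_api: bool = False) -> Optional[Plan]:
    """Return the lowest-priced plan that covers the usage, or None if none does."""
    for plan in PLANS:  # PLANS is ordered from cheapest to most expensive
        covers_usage = plan.characters >= monthly_characters
        covers_api = plan.api_access or not need_api
        if covers_usage and covers_api:
            return plan
    return None


print(cheapest_plan(300_000).name)                 # Starter
print(cheapest_plan(300_000, need_api=True).name)  # Pro
```

Requiring API access moves the same volume up to Pro, since API access first appears in that tier.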

Frequently asked questions

TTS.ai is a comprehensive AI voice platform offering 22+ text-to-speech models, voice cloning, speech-to-text, and audio tools. All models are open source, with no vendor lock-in.

Yes! TTS.ai offers free text-to-speech with the Kokoro, Piper, VITS, and MeloTTS models. No account is required. Sign up to get 15,000 free characters and access to all models. Paid plans start at $9/month.

For speed, use Kokoro or Piper. For quality, use CosyVoice 2 or StyleTTS 2. For voice cloning, use Chatterbox or GPT-SoVITS. For dialogue, use Dia TTS. Try several models on the same text to compare them.

Yes. An OpenAI-compatible REST API covers TTS, STT, voice cloning, and audio tools. It is available on the Pro ($29/mo) and Enterprise ($99/mo) plans. See the documentation at tts.ai/api/.

Audio quality varies by model. Top models such as CosyVoice 2, StyleTTS 2, and Chatterbox produce near-human speech with natural intonation and emotion. Free models such as Kokoro deliver excellent quality for most use cases.

TTS.ai supports more than 30 languages across its model library. English has the broadest model support, but models such as CosyVoice 2 also cover Chinese, Japanese, and Korean; GPT-SoVITS handles Chinese, Japanese, Korean, and English; and MeloTTS supports English, Spanish, French, Chinese, Japanese, and Korean.

Yes. All processing runs on our dedicated GPU servers. We do not store your input text or generated audio after delivery. Audio samples uploaded for voice cloning are used only for the current session and are not retained. We do not share your data with third parties or use it for model training.

Yes. All audio generated on TTS.ai is yours to use commercially, including in YouTube videos, podcasts, audiobooks, apps, ads, and products. Most of our models are open source under permissive licenses such as MIT and Apache 2.0; the license for each model is listed in the catalog. No royalties or attribution are required.

TTS.ai generates audio in WAV format by default for the highest quality. You can convert it to MP3, FLAC, OGG, or M4A with our free Audio Converter tool. The API lets you specify your preferred output format directly in the request.

Upload a short audio sample (as little as 5 seconds) of the voice you want to clone, then type any text to generate speech in that voice. Models such as Chatterbox, GPT-SoVITS, and CosyVoice 2 support voice cloning. The cloned voice captures tone, timbre, and speaking style.

Free models (Kokoro, Piper, VITS, MeloTTS) require no account and cost zero characters. Standard models (2,000 characters per 1K) include Bark, CosyVoice 2, F5-TTS, and Dia. Premium models (4,000 characters per 1K) include OpenVoice, Chatterbox, StyleTTS 2, and Tortoise. Paid models generally offer higher quality, more voices, and extra features such as voice cloning.
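The tier rates above can be turned into a quick estimate of how much quota a generation consumes. This sketch reads "2,000 characters per 1K" as quota characters deducted per 1,000 characters of input text. That reading, the linear scaling, and the rounding up to a whole character are all assumptions, not confirmed billing rules.

```python
import math

# Quota characters deducted per 1,000 input characters, as read from the
# tier description (an interpretation, not a confirmed billing rule).
RATE_PER_1K = {"free": 0, "standard": 2_000, "premium": 4_000}


def quota_cost(text: str, tier: str) -> int:
    """Return the quota characters a generation would consume for this tier."""
    return math.ceil(len(text) * RATE_PER_1K[tier] / 1_000)


sample = "a" * 1_500  # a 1,500-character input
print(quota_cost(sample, "free"))      # 0
print(quota_cost(sample, "standard"))  # 3000
print(quota_cost(sample, "premium"))   # 6000
```

Under this reading, the same text costs twice as much on a premium model as on a standard one, which is worth factoring in when comparing models on long inputs.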

Yes. The API supports batch processing for converting large volumes of text to speech. Submit multiple requests and retrieve the results asynchronously using job UUIDs. Enterprise plans ($99/mo) include priority queue access for faster batch processing. This is well suited to audiobook production, course content, and large voiceover projects.
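The submit-then-poll pattern described above can be sketched without touching the network. `FakeBatchBackend` below is a hypothetical in-memory stand-in, not the TTS.ai API: its `submit` and `status` methods and the "pending"/"done" states are invented here purely to illustrate tracking work by job UUID.

```python
import uuid
from typing import Dict, List


class FakeBatchBackend:
    """In-memory stand-in for a batch TTS service (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}

    def submit(self, text: str) -> str:
        job_id = str(uuid.uuid4())
        # polls_left simulates processing time: the job finishes after two polls.
        self._jobs[job_id] = {"text": text, "polls_left": 2}
        return job_id

    def status(self, job_id: str) -> str:
        job = self._jobs[job_id]
        if job["polls_left"] > 0:
            job["polls_left"] -= 1
            return "pending"
        return "done"


def run_batch(backend: FakeBatchBackend, texts: List[str],
              max_polls: int = 10) -> Dict[str, str]:
    """Submit every text, then poll each job ID until it reports done."""
    job_ids = [backend.submit(text) for text in texts]
    statuses = {job_id: "pending" for job_id in job_ids}
    for _ in range(max_polls):
        for job_id in job_ids:
            if statuses[job_id] != "done":
                statuses[job_id] = backend.status(job_id)
        if all(state == "done" for state in statuses.values()):
            break
    return statuses


results = run_batch(FakeBatchBackend(), ["Chapter one.", "Chapter two."])
print(list(results.values()))
```

A real client would sleep between polling rounds or rely on the webhook notifications mentioned in the API feature list instead of polling.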

Start using AI voice today

Join the creators, developers, and businesses using TTS.ai