Free AI Text to Speech

22+ open-source models, 100+ voices, 32+ languages. No account required.

No credit card · 50 free credits · 32+ languages · Commercial use OK

Everything You Need for Voice AI

26 tools powered by 24+ open-source AI models

22+ AI Voice Models

The complete range of open-source TTS models on one platform

Kokoro · Free

Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.

Best for: High-quality TTS with minimal latency, streaming applications

Try for free

Piper · Free

Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.

Best for: Quick previews, accessibility, and embedded applications

Try for free

VITS · Free

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.

Best for: General-purpose text-to-speech with natural prosody

Try for free

MeloTTS · Free

MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.

Best for: Production applications that need fast, multilingual TTS

Try for free

Bark · Standard

Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.

Developer: Suno · License: MIT

Try now

Bark Small · Standard

Lighter version of Bark with faster inference and lower memory usage.

Developer: Suno · License: MIT

Try now

CosyVoice 2 · Standard

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Developer: Alibaba (Tongyi Lab) · License: Apache 2.0

Try now

Dia TTS · Standard

Multi-speaker dialogue generation model that produces natural-sounding conversations between voices.

Developer: Nari Labs · License: Apache 2.0

Try now

Parler TTS · Standard

Describe the voice you want in natural language and Parler generates matching speech.

Developer: Hugging Face · License: Apache 2.0

Try now

IndexTTS-2 · Standard

Zero-shot TTS with fine-grained emotion control and high expressiveness.

Developer: Index Team · License: Apache 2.0

Try now

Spark TTS · Standard

Voice cloning TTS with controllable emotion and speaking style via prompts.

Developer: SparkAudio · License: Apache 2.0

Try now

GPT-SoVITS · Standard

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

Developer: RVC-Boss · License: MIT

Try now

Orpheus · Standard

Human-level emotional TTS model trained on 100K hours of speech data.

Developer: Canopy Labs · License: Llama 3.2 Community

Try now

Qwen3 TTS · Standard

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Developer: Alibaba (Qwen) · License: Apache 2.0

Try now

Chatterbox · Premium

State-of-the-art zero-shot voice cloning with emotion control, from Resemble AI.

Try now

Tortoise TTS · Premium

Multi-voice text-to-speech focused on quality, built on an autoregressive architecture.

Try now

StyleTTS 2 · Premium

Human-level text-to-speech through style diffusion and adversarial training.

Try now

OpenVoice · Premium

Instant voice cloning with granular control over style, emotion, and accent.

Try now

CosyVoice 2

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Languages: en, zh, ja, ko, fr, de, it, es

Clone Voice

IndexTTS-2

Zero-shot TTS with fine-grained emotion control and high expressiveness.

Languages: en, zh

Clone Voice

Spark TTS

Voice cloning TTS with controllable emotion and speaking style via prompts.

Languages: en, zh

Clone Voice

GPT-SoVITS

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

Languages: en, zh, ja, ko

Clone Voice

Chatterbox

State-of-the-art zero-shot voice cloning with emotion control, from Resemble AI.

Languages: en

Clone Voice

Tortoise TTS

Multi-voice text-to-speech focused on quality, built on an autoregressive architecture.

Languages: en

Clone Voice

OpenVoice

Instant voice cloning with granular control over style, emotion, and accent.

Languages: en, zh, ja, ko, fr, de, es, it

Clone Voice

Qwen3 TTS

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Languages: en, zh, ja, ko, de, fr, ru, pt, es, it

Clone Voice

Developer-First

OpenAI-compatible REST API. One endpoint, 22+ models. Streaming support for real-time applications.

  • OpenAI-compatible format
  • Streaming TTS for real-time applications
  • Batch processing for large workloads
  • Webhook notifications
View API Docs
Python
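import requests

# Replace the placeholder with your own API key.
API_KEY = "sk-tts-xxx"

response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
    timeout=60,
)
# Fail loudly on HTTP errors instead of writing an error body to disk.
response.raise_for_status()

# Save the returned audio bytes.
with open("output.mp3", "wb") as f:
    f.write(response.content)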
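
A request body like the one above can be checked before it is sent. The sketch below is illustrative only: `build_tts_payload` is a hypothetical helper, not part of the TTS.ai API, and it assumes that the per-request character limits listed in the pricing plans (500 on Free, 5,000 on Starter) also apply to API calls. The field names simply mirror the example above.

```python
def build_tts_payload(text: str, model: str = "kokoro",
                      voice: str = "af_bella", char_limit: int = 500) -> dict:
    """Return the JSON body for a TTS request, or raise if the text is unusable.

    `char_limit` is an assumption taken from the plan descriptions, not a
    documented API parameter.
    """
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must not be empty")
    if len(cleaned) > char_limit:
        raise ValueError(
            f"text has {len(cleaned)} characters; the limit is {char_limit}"
        )
    return {"model": model, "text": cleaned, "voice": voice}


payload = build_tts_payload("Hello from TTS.ai!")
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, which keeps validation errors local instead of spending a request on input that would likely be rejected.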

Simple, Transparent Pricing

Start for free. Scale as you grow.

Free

$0

50 credits

  • Kokoro, Piper, VITS, MeloTTS
  • 500-character limit
  • 3 generations/hour (no account)
Sign up free

Starter

$9/month

500 credits/month

  • All 22+ models
  • 5,000-character limit
  • Voice cloning
Get started
Most Popular

Pro

$29/month

2,000 credits/month

  • Everything in Starter
  • API access
  • Priority processing
Get Pro

Enterprise

$99/month

10,000 credits/month

  • Everything in Pro
  • Bulk API
  • Priority queue
Contact Sales

View all plans including credit packs →

Frequently Asked Questions

TTS.ai is a comprehensive AI voice platform offering 22+ models for text-to-speech, voice cloning, speech-to-text, and audio tools. All models are open source, with no vendor lock-in.

Yes! TTS.ai offers free text-to-speech with the Kokoro, Piper, VITS, and MeloTTS models. No account is required. Sign up to get 50 free credits and access to all models. Paid plans start at $9/month.

For speed, use Kokoro or Piper. For quality, try CosyVoice 2 or StyleTTS 2. For voice cloning, use Chatterbox or GPT-SoVITS. For dialogue, use Dia TTS. Try several models on the same text to compare them.

Yes. An OpenAI-compatible REST API covers TTS, STT, voice cloning, and audio tools. It is available on the Pro ($29/mo) and Enterprise ($99/mo) plans. See the documentation at tts.ai/api/.

Audio quality varies by model. Higher-end models such as CosyVoice 2, StyleTTS 2, and Chatterbox produce near-human speech with natural intonation and emotion. Free models such as Kokoro deliver good quality for most use cases.

TTS.ai supports more than 30 languages across its model library. English has the broadest model support, but models such as CosyVoice 2 also cover Chinese, Japanese, and Korean; GPT-SoVITS handles Chinese, Japanese, Korean, and English; and MeloTTS supports English, Spanish, French, Chinese, Japanese, and Korean.

Yes. All processing happens on our dedicated GPU servers. We do not store your text input or the generated audio after delivery. Audio samples uploaded for cloning are used only for the current session and are not stored. We never share your data with third parties or use it to train models.

Yes. All audio generated on TTS.ai is yours to use commercially, including for YouTube videos, podcasts, audiobooks, apps, advertisements, and products. Our models are open source under permissive licenses (MIT, Apache 2.0). No royalties or attribution required.

TTS.ai produces audio in WAV format by default for the highest quality. You can convert it to MP3, FLAC, OGG, or M4A using our free audio converter tool. The API supports specifying your preferred output format directly in the request.

Upload a short audio sample (as little as 5 seconds) of the voice you want to clone, then type any text to generate speech in that voice. Models like Chatterbox, GPT-SoVITS, and CosyVoice 2 support voice cloning. The cloned voice captures tone, accent, and speaking style.

Free models (Kokoro, Piper, VITS, MeloTTS) require no account and cost zero credits. Standard models (2 credits/1K characters) include Bark, CosyVoice 2, F5-TTS, and Dia. Premium models (4 credits/1K characters) include OpenVoice, Chatterbox, StyleTTS 2, and Tortoise. Paid models generally offer higher quality, more voices, and additional features such as voice cloning.
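
As a rough illustration of the per-tier rates quoted above (0, 2, and 4 credits per 1,000 characters), the sketch below estimates the cost of a single generation. The function name and tier keys are hypothetical, and the rounding rule is an assumption: the page does not say whether usage is billed proportionally or in whole 1K blocks, so this version bills proportionally and rounds up to a whole credit.

```python
import math

# Credits per 1,000 characters for each tier, as quoted in the text above.
CREDITS_PER_1K = {"free": 0, "standard": 2, "premium": 4}


def estimate_credits(text: str, tier: str) -> int:
    """Estimate the credits one generation would cost.

    Assumption: proportional billing, rounded up to the next whole credit.
    The actual rounding rule is not stated on the page.
    """
    if tier not in CREDITS_PER_1K:
        raise ValueError(f"unknown tier: {tier!r}")
    return math.ceil(len(text) * CREDITS_PER_1K[tier] / 1000)


script = "x" * 2500  # stand-in for a 2,500-character script
standard_cost = estimate_credits(script, "standard")
premium_cost = estimate_credits(script, "premium")
```

Under these assumptions a 2,500-character script costs 5 credits on a Standard model and 10 on a Premium model, so the 500 monthly credits of the Starter plan would cover roughly 100 such Standard generations.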

Yes. The API supports batch processing for converting large volumes of text to speech. Submit multiple requests and retrieve the results asynchronously using job UUIDs. Enterprise plans ($99/mo) include priority queue access for faster batch processing. It is well suited to audiobook production, course content, and large audio projects.
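
Long texts such as audiobook chapters usually need to be cut into pieces before they are submitted as a batch. The sketch below shows one way to do that on the client side, splitting at sentence boundaries so that every piece stays within a character limit. `split_for_tts` is a hypothetical helper and not part of the TTS.ai API; the default of 5,000 characters is taken from the Starter plan limit and may not match the limit that applies to batch requests.

```python
import re


def split_for_tts(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks break at sentence boundaries where possible; a single sentence
    longer than `limit` is cut at the limit.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


parts = split_for_tts("One. Two. Three.", limit=10)
```

Each chunk can then be sent as its own request, with the resulting audio files joined in order afterwards.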

Start Using AI Voice Today

Join creators, developers, and businesses using TTS.ai