Free AI Text to Speech
22+ open-source models, 100+ voices, 32+ languages. No account required.
Everything you need for AI voice
26 tools powered by 24+ open-source AI models
22+ AI voice models
The most comprehensive collection of open-source TTS models on one platform
Kokoro Free
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
Best for: High-quality TTS with minimal latency, streaming applications
Try free
Piper Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Best for: Quick previews, accessibility, and embedded applications
Try free
VITS Free
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Best for: General-purpose text-to-speech with natural prosody
Try free
MeloTTS Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
Best for: Production applications that need fast, multilingual TTS
Try free
Bark Standard
Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.
Developer: Suno · License: MIT
Try
Bark Small Standard
Lighter version of Bark with faster inference and lower memory usage.
Developer: Suno · License: MIT
Try
CosyVoice 2 Standard
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Developer: Alibaba (Tongyi Lab) · License: Apache 2.0
Try
Dia TTS Standard
Multi-speaker dialog generation model that creates natural conversations between speakers.
Developer: Nari Labs · License: Apache 2.0
Try
Parler TTS Standard
Describe the voice you want in natural language and Parler generates matching speech.
Developer: Hugging Face · License: Apache 2.0
Try
IndexTTS-2 Standard
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Developer: Index Team · License: Apache 2.0
Try
Spark TTS Standard
Voice cloning TTS with controllable emotion and speaking style via prompts.
Developer: SparkAudio · License: Apache 2.0
Try
GPT-SoVITS Standard
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Developer: RVC-Boss · License: MIT
Try
Orpheus Standard
Human-level emotional TTS model trained on 100K hours of speech data.
Developer: Canopy Labs · License: Llama 3.2 Community
Try
Qwen3 TTS Standard
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Developer: Alibaba (Qwen) · License: Apache 2.0
Try
CosyVoice 2
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Languages: en, zh, ja, ko, fr, de, it, es
Clone voice
IndexTTS-2
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Languages: en, zh
Clone voice
Spark TTS
Voice cloning TTS with controllable emotion and speaking style via prompts.
Languages: en, zh
Clone voice
GPT-SoVITS
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Languages: en, zh, ja, ko
Clone voice
Chatterbox
State-of-the-art zero-shot voice cloning with emotion control from Resemble AI.
Languages: en
Clone voice
Tortoise TTS
Multi-voice text-to-speech focused on quality, built on an autoregressive architecture.
Languages: en
Clone voice
OpenVoice
Instant voice cloning with granular control over style, emotion, and accent.
Languages: en, zh, ja, ko, fr, de, es, it
Clone voice
Qwen3 TTS
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Clone voice
Developer-first API
OpenAI-compatible REST API. One endpoint, 22+ models. Streaming support for real-time applications.
- OpenAI-compatible format
- Streaming TTS for real-time applications
- Batch processing for large workloads
- Webhook notifications
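import requests

# Request speech from the Kokoro model with the "af_bella" voice.
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": "Bearer sk-tts-xxx"},  # replace with your API key
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
)
response.raise_for_status()  # stop on an HTTP error instead of saving it as audio

# Save the returned audio bytes.
with open("output.mp3", "wb") as f:
    f.write(response.content)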
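The plans on this page list a per-request character limit (500 on Free, 5,000 on Starter), so longer text has to be split before it is sent. The sketch below is one possible way to do that on the client side. The helper names `split_text` and `build_payloads` are hypothetical and not part of the TTS.ai API, and it assumes the limit is counted the same way as Python's `len()`, which the page does not confirm.

```python
import re


def _pack(pieces: list[str], limit: int) -> list[str]:
    """Greedily join pieces with single spaces into chunks of at most `limit` characters."""
    chunks, current = [], ""
    for piece in pieces:
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= limit:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def split_text(text: str, limit: int = 500) -> list[str]:
    """Split text into chunks no longer than `limit`, preferring sentence boundaries.

    Assumption: the service counts characters the way len() does.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    pieces: list[str] = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if len(sentence) <= limit:
            pieces.append(sentence)
            continue
        # Oversized sentence: fall back to words, hard-cutting any word
        # that is itself longer than the limit.
        words = []
        for word in sentence.split():
            words.extend(word[i:i + limit] for i in range(0, len(word), limit))
        pieces.extend(_pack(words, limit))
    return _pack(pieces, limit)


def build_payloads(text: str, model: str = "kokoro", voice: str = "af_bella",
                   limit: int = 500) -> list[dict]:
    """Return one JSON body per chunk, in the shape used by the request example above."""
    return [{"model": model, "text": chunk, "voice": voice}
            for chunk in split_text(text, limit)]
```

Each dictionary returned by `build_payloads` can be passed as the `json=` argument of the `requests.post` call in the example above, one request per chunk. How the resulting audio files are joined afterwards depends on the output format and is left out of this sketch.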
Simple, transparent pricing
Start for free. Scale as you grow.
Free
50 credits
- Kokoro, Piper, VITS, MeloTTS
- 500 character limit
- 3 generations/hour (no account)
Starter
500 credits/month
- All 22+ models
- 5,000 character limit
- Voice tuning
Pro
2,000 credits/month
- Everything in Starter
- API access
- Priority processing
Business
10,000 credits/month
- Everything in Pro
- Bulk API
- Ifolokhwe yesinqumo
Frequently asked questions
Start using AI voice today
Join the creators, developers, and businesses using TTS.ai