Free AI Text-to-Speech Converter
22+ open-source models, 100+ voices, 32+. No account required.
Everything you need for AI voice
26 tools powered by 24+ open-source AI models
22+ AI Voice Models
A comprehensive collection of open-source TTS models on a single platform
Kokoro Free
Kokoro is an 82-million-parameter text-to-speech model that punches well above its weight class. Despite its small size, it produces natural, expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It is also fast, generating audio nearly 100x faster than real time on a GPU.
Best for: High-quality TTS with minimal latency, streaming applications
Explore
Piper Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Best for: Quick previews, accessibility, and embedded applications
Explore
VITS Free
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Best for: General-purpose text-to-speech with natural prosody
Explore
MeloTTS Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
Best for: Production applications that need fast, multilingual TTS
Explore
Bark Standard
Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.
Developer: Suno · License: MIT
Use it
Bark Small Standard
Lighter version of Bark with faster inference and lower memory usage.
Developer: Suno · License: MIT
Use it
CosyVoice 2 Standard
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Developer: Alibaba (Tongyi Lab) · License: Apache 2.0
Use it
Dia TTS Standard
Multi-speaker dialogue generation model that produces conversations between speakers.
Developer: Nari Labs · License: Apache 2.0
Use it
Parler TTS Standard
Describe the voice you want in natural language and Parler generates matching speech.
Developer: Hugging Face · License: Apache 2.0
Use it
IndexTTS-2 Standard
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Developer: Index Team · License: Apache 2.0
Use it
Spark TTS Standard
Voice cloning TTS with controllable emotion and speaking style via prompts.
Developer: SparkAudio · License: Apache 2.0
Use it
GPT-SoVITS Standard
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Developer: RVC-Boss · License: MIT
Use it
Orpheus Standard
Human-level emotional TTS model trained on 100K hours of speech data.
Developer: Canopy Labs · License: Llama 3.2 Community
Use it
Qwen3 TTS Standard
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Developer: Alibaba (Qwen) · License: Apache 2.0
Use it
CosyVoice 2
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Languages: en, zh, ja, ko, fr, de, it, es
Clone voice
IndexTTS-2
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Languages: en, zh
Clone voice
Spark TTS
Voice cloning TTS with controllable emotion and speaking style via prompts.
Languages: en, zh
Clone voice
GPT-SoVITS
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Languages: en, zh, ja, ko
Clone voice
Chatterbox
State-of-the-art zero-shot voice cloning with emotion control from Resemble AI.
Languages: en
Clone voice
Tortoise TTS
Multi-voice text-to-speech focused on quality with autoregressive architecture.
Languages: en
Clone voice
OpenVoice
Instant voice cloning with granular control over style, emotion, and accent.
Languages: en, zh, ja, ko, fr, de, es, it
Clone voice
Qwen3 TTS
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Clone voice
Developer-First API
OpenAI-compatible REST API. One endpoint, 22+ models. Streaming support for real-time applications.
- OpenAI-compatible format
- Streaming TTS for real-time applications
- Batch processing for large workloads
- Notification messages
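import requests

# Request speech from the TTS endpoint (replace the placeholder key with your own)
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": "Bearer sk-tts-xxx"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
    timeout=60,
)
# Fail on an HTTP error instead of saving an error body as audio
response.raise_for_status()

# Save the returned audio bytes
with open("output.mp3", "wb") as f:
    f.write(response.content)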
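The streaming and batch points in the list above could be combined along the following lines. This is a minimal sketch, not the service's documented client. It reuses the endpoint URL and JSON fields from the sample above, and assumes the server sends audio progressively when the response body is read as a stream. The helper names `build_payload`, `stream_to_file`, and `synthesize_many` are hypothetical, and the batch here is a plain client-side loop rather than a dedicated batch endpoint. The `session` argument is expected to be a `requests.Session`.

```python
API_URL = "https://api.tts.ai/v1/tts/"  # endpoint taken from the sample above


def build_payload(text, model="kokoro", voice="af_bella"):
    """Build the JSON body with the same fields as the sample above."""
    return {"model": model, "text": text, "voice": voice}


def stream_to_file(session, api_key, text, path, chunk_size=8192):
    """POST one request and write the audio to disk chunk by chunk.

    Assumption: the server streams the body; no extra request field is set.
    `session` is a requests.Session (or any object with a compatible post()).
    Returns the number of bytes written.
    """
    written = 0
    with session.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_payload(text),
        stream=True,  # requests: read the body lazily instead of all at once
        timeout=60,
    ) as response:
        response.raise_for_status()
        with open(path, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written


def synthesize_many(session, api_key, texts, prefix="clip"):
    """Client-side batch: one request per text, saved as <prefix>_000.mp3, ..."""
    paths = []
    for i, text in enumerate(texts):
        path = f"{prefix}_{i:03d}.mp3"
        stream_to_file(session, api_key, text, path)
        paths.append(path)
    return paths
```

With `requests` installed, a call might look like `synthesize_many(requests.Session(), "sk-tts-xxx", ["First line.", "Second line."])`. Reusing one session keeps the connection open across requests, which helps when a batch contains many short texts.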
Frequently Asked Questions
Start using AI voice today
Join creators, developers, and businesses using TTS.ai