Free AI Text to Speech

22+ open-source models, 100+ voices, 32+ languages. No account required.

No credit card required · 50 free credits · 32+ languages · Commercial use OK

Everything you need for AI audio

26 tools powered by 24+ open-source AI models

22+ AI Voice Models

A comprehensive collection of open-source TTS models on one platform

Kokoro · Free

Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.

Best for: High-quality TTS with minimal latency, streaming applications

Piper · Free

Piper is a lightweight text-to-speech engine from the Rhasspy project. It is built on the VITS architecture and succeeds the project's earlier Larynx engine. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.

Best for: Quick previews, accessibility, and embedded applications

VITS · Free

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.

Best for: General-purpose text-to-speech with natural prosody

MeloTTS · Free

MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.

Best for: Production applications that need fast, multilingual TTS

Bark · Standard

Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.

Developer: Suno · License: MIT

Bark Small · Standard

Lighter version of Bark with faster inference and lower memory usage.

Developer: Suno · License: MIT

CosyVoice 2 · Standard

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Developer: Alibaba (Tongyi Lab) · License: Apache 2.0

Dia TTS · Standard

Multi-speaker dialogue generation model that produces conversations between speakers.

Developer: Nari Labs · License: Apache 2.0

Parler TTS · Standard

Describe the voice you want in natural language and Parler generates matching speech.

Developer: Hugging Face · License: Apache 2.0

IndexTTS-2 · Standard

Zero-shot TTS with fine-grained emotion control and high expressiveness.

Developer: Index Team · License: Apache 2.0

Spark TTS · Standard

Voice cloning TTS with controllable emotion and speaking style via prompts.

Developer: SparkAudio · License: Apache 2.0

GPT-SoVITS · Standard

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

Developer: RVC-Boss · License: MIT

Orpheus · Standard

Human-level emotional TTS model trained on 100K hours of speech data.

Developer: Canopy Labs · License: Llama 3.2 Community

Qwen3 TTS · Standard

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Developer: Alibaba (Qwen) · License: Apache 2.0

Chatterbox · Premium

State-of-the-art zero-shot voice cloning with emotion control by Resemble AI.

Tortoise TTS · Premium

Multi-voice text-to-speech focused on quality with autoregressive architecture.

StyleTTS 2 · Premium

Human-level text-to-speech through style diffusion and adversarial training.

OpenVoice · Premium

Instant voice cloning with granular control over style, emotion, and accent.

CosyVoice 2

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

Languages: en, zh, ja, ko, fr, de, it, es

IndexTTS-2

Zero-shot TTS with fine-grained emotion control and high expressiveness.

Languages: en, zh

Spark TTS

Voice cloning TTS with controllable emotion and speaking style via prompts.

Languages: en, zh

GPT-SoVITS

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

Languages: en, zh, ja, ko

Chatterbox

State-of-the-art zero-shot voice cloning with emotion control by Resemble AI.

Languages: en

Tortoise TTS

Multi-voice text-to-speech focused on quality with autoregressive architecture.

Languages: en

OpenVoice

Instant voice cloning with granular control over style, emotion, and accent.

Languages: en, zh, ja, ko, fr, de, es, it

Qwen3 TTS

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
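The language codes listed for the voice-cloning models above can be turned into a small lookup. This is only an illustrative sketch: the dictionary copies the codes from the cards, and the helper name `models_for_language` is made up for this example and is not part of any TTS.ai API.

```python
# Language codes per voice-cloning model, copied from the cards above.
CLONING_MODEL_LANGUAGES = {
    "CosyVoice 2": ["en", "zh", "ja", "ko", "fr", "de", "it", "es"],
    "IndexTTS-2": ["en", "zh"],
    "Spark TTS": ["en", "zh"],
    "GPT-SoVITS": ["en", "zh", "ja", "ko"],
    "Chatterbox": ["en"],
    "Tortoise TTS": ["en"],
    "OpenVoice": ["en", "zh", "ja", "ko", "fr", "de", "es", "it"],
    "Qwen3 TTS": ["en", "zh", "ja", "ko", "de", "fr", "ru", "pt", "es", "it"],
}


def models_for_language(code: str) -> list[str]:
    """Return the listed models that support the given language code."""
    code = code.strip().lower()
    return [name for name, langs in CLONING_MODEL_LANGUAGES.items() if code in langs]


if __name__ == "__main__":
    print(models_for_language("ja"))
    print(models_for_language("ru"))
```

For Japanese this returns CosyVoice 2, GPT-SoVITS, OpenVoice, and Qwen3 TTS; for Russian only Qwen3 TTS is listed.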

Developer-First API

OpenAI-compatible REST API. One endpoint, 22+ models. Streaming support for real-time applications.

  • OpenAI-compatible format
  • Streaming TTS for real-time applications
  • Batch processing for large jobs
  • Callback notifications
Python
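import requests

# Request speech from the TTS endpoint; replace the placeholder key with your own.
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": "Bearer sk-tts-xxx"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
    timeout=60,
)
# Fail loudly on an HTTP error instead of saving an error body as audio.
response.raise_for_status()

# Save the returned audio bytes to disk.
with open("output.mp3", "wb") as f:
    f.write(response.content)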
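If the `requests` package is not available, the same call can be sketched with the standard library alone. The endpoint, header, and JSON fields below are copied from the snippet above; the function names `build_tts_request` and `synthesize` are hypothetical helpers for this example, and the response is assumed to be raw audio bytes, as the snippet implies.

```python
import json
import urllib.request

API_URL = "https://api.tts.ai/v1/tts/"  # endpoint taken from the snippet above


def build_tts_request(api_key: str, text: str, model: str = "kokoro",
                      voice: str = "af_bella") -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the snippet above."""
    payload = {"model": model, "text": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, out_path: str) -> None:
    """Send the request and write the returned bytes to out_path."""
    request = build_tts_request(api_key, text)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credits are spent.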

Simple, transparent pricing

Start for free. Scale as you grow.

Free

$0

50 credits

  • Kokoro, Piper, VITS, MeloTTS
  • 500 characters limit
  • 3 generations/day (without an account)

Starter

$9/month

500 credits/month

  • All 22+ models
  • 5,000 characters limit
  • Voice cloning

Pro (most popular)

$29/month

2,000 credits/month

  • Everything in Starter
  • API access
  • Model configuration

Enterprise

$99/month

10,000 credits/month

  • Everything in Pro
  • Bulk API
  • Priority support

View all plans including credit packs →
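To compare the paid tiers, the listed prices can be divided by the monthly credit allowance. The sketch below uses only the figures from the pricing cards ($9 for 500 credits, $29 for 2,000, $99 for 10,000); the keys `starter`, `pro`, and `enterprise` are labels chosen for this example.

```python
# Monthly plans from the pricing cards: (price in USD, credits per month).
PLANS = {
    "starter": (9, 500),
    "pro": (29, 2000),
    "enterprise": (99, 10000),
}


def cost_per_credit(plan: str) -> float:
    """Return the effective USD price of one credit on the given plan."""
    price, credits = PLANS[plan]
    return price / credits


if __name__ == "__main__":
    for name in PLANS:
        print(f"{name}: ${cost_per_credit(name):.4f} per credit")
```

On these numbers a credit costs $0.0180 on the $9 tier, $0.0145 on the $29 tier, and $0.0099 on the $99 tier, before any credit packs are considered.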

Frequently Asked Questions

TTS.ai is a comprehensive AI voice platform offering 22+ text-to-speech models, voice cloning, speech-to-text, and audio tools. All models are open source, with no vendor lock-in.

Yes! TTS.ai offers free text-to-speech with the Kokoro, Piper, VITS, and MeloTTS models. No account is required. Sign up to get 50 free credits and access to all models. Paid plans start at $9/month.

For speed, use Kokoro or Piper. For quality, use CosyVoice 2 or StyleTTS 2. For voice cloning, use Chatterbox or GPT-SoVITS. For dialogue, use Dia TTS. Try different models on the same text to compare them.

Yes. An OpenAI-compatible REST API covers TTS, STT, voice cloning, and audio tools. It is available on the Pro ($29/month) and Enterprise ($99/month) plans. View the documentation at tts.ai/api/.

Voice quality varies by model. Paid models such as CosyVoice 2, StyleTTS 2, and Chatterbox produce human-like speech with natural intonation and emotion. Free models such as Kokoro deliver good quality for most everyday uses.

TTS.ai supports 30+ languages across its model library. English has the broadest model support, but models such as CosyVoice 2 cover Chinese, Japanese, and Korean; GPT-SoVITS handles Chinese, Japanese, Korean, and English; and MeloTTS supports English, Spanish, French, Chinese, Japanese, and Korean.

Yes. All processing happens on our GPU servers. We do not store your text or the generated audio after generation. Audio uploaded for cloning is used only for the duration of the session and is not stored. We do not share your data with third parties or use it to train models.

Yes. All audio generated on TTS.ai is yours to use commercially, including for YouTube videos, podcasts, audiobooks, apps, advertisements, and products. The models are open source, most under permissive licenses (MIT, Apache 2.0); Orpheus is listed under the Llama 3.2 Community license. No royalties or attribution required.

TTS.ai generates audio in WAV format by default for the highest quality. You can convert it to MP3, FLAC, OGG, or M4A with our free Audio Converter tool. The API also lets you specify the output format directly in the request.
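Because the output may be WAV, MP3, FLAC, OGG, or M4A, it can help to check what was actually returned before choosing a file extension. The sketch below guesses the container from its leading bytes using the standard file signatures for those formats. The function name `detect_audio_format` is hypothetical, and nothing here relies on a TTS.ai-specific field.

```python
def detect_audio_format(data: bytes) -> str:
    """Guess a file extension from leading magic bytes; 'bin' if unknown."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"fLaC":
        return "flac"
    if data[:4] == b"OggS":
        return "ogg"
    if data[4:8] == b"ftyp":
        # MP4-family container; M4A is its audio-only flavor.
        return "m4a"
    if data[:3] == b"ID3":
        # MP3 file that starts with an ID3v2 tag.
        return "mp3"
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        # Bare MPEG audio frame sync (11 set bits).
        return "mp3"
    return "bin"


if __name__ == "__main__":
    sample = b"RIFF\x24\x00\x00\x00WAVEfmt "
    print(f"output.{detect_audio_format(sample)}")
```

This is a heuristic on the first few bytes only; it does not validate that the rest of the file is well formed.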

Upload a short audio sample (as little as 5 seconds) of the voice you want to clone, then type any text to generate speech in that voice. Models like Chatterbox, GPT-SoVITS, and CosyVoice 2 support voice cloning. The cloned voice captures tone, accent, and speaking style.

Free models (Kokoro, Piper, VITS, MeloTTS) require no account and no payment. Standard models (2 credits/1K characters) include Bark, CosyVoice 2, F5-TTS, and Dia. Premium models (4 credits/1K characters) include OpenVoice, Chatterbox, StyleTTS 2, and Tortoise. Paid models generally offer higher quality, more voices, and extra features such as voice cloning.
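The per-tier rates above (0, 2, and 4 credits per 1K characters) make it easy to estimate what a piece of text will cost. The sketch below assumes usage is billed in whole 1,000-character blocks, rounded up; the page does not state the rounding rule, so treat that as an assumption. The name `estimate_credits` is made up for this example.

```python
import math

# Credits charged per 1,000 characters, as stated for each tier.
CREDITS_PER_1K = {"free": 0, "standard": 2, "premium": 4}


def estimate_credits(num_chars: int, tier: str) -> int:
    """Estimate credits for num_chars characters on the given tier.

    Assumption: billing uses whole 1K-character blocks, rounded up.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    blocks = math.ceil(num_chars / 1000)
    return blocks * CREDITS_PER_1K[tier]


if __name__ == "__main__":
    for tier in CREDITS_PER_1K:
        print(tier, estimate_credits(2500, tier))
```

Under that rounding assumption, a 2,500-character script costs 6 credits on a Standard model and 12 on a Premium one.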

Yes. The API supports batch processing for converting large volumes of text to speech. Submit multiple requests and retrieve the results asynchronously using job UUIDs. Enterprise plans ($99/month) include priority access for faster batch processing. Ideal for audiobook production, course content, and large-scale voiceover projects.
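Long texts usually have to be split before they can be submitted as separate requests. The sketch below cuts text into chunks that stay under a character limit, preferring sentence boundaries. The default of 5,000 mirrors the character limit shown on the $9 tier; whether the API enforces that limit per request is an assumption, and `split_text` is a hypothetical helper, not a TTS.ai function.

```python
import re


def split_text(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Chunks break at sentence ends where possible; a single sentence
    longer than `limit` is cut at the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(split_text("One. Two. Three.", limit=10))
```

Each chunk can then be sent as its own request and the resulting audio files joined in order.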

Start using AI voices today

Join creators, developers, and businesses using TTS.ai