Qorannoo Bu'aa / Deebii Fa'ii

Text to Speech

Convert text to natural-sounding speech with state-of-the-art AI models. No signup required.

We do not yet have TTS voices for your language. Tell us you would like it added!
Your text (5,000 character limit)

Use SSML tags in your text for finer control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
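The SSML line above can also be produced programmatically. The following is a minimal sketch, assuming you build the markup yourself before pasting or submitting it; `build_ssml` is a hypothetical helper, not part of any TTS.ai API. It escapes the text so that characters such as `&` do not break the XML.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> markup.

    Hypothetical helper for illustration only. The text is XML-escaped
    and the rate value is quoted as an XML attribute.
    """
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(build_ssml("Slow speech"))
```

Whether a given model honors the `rate` attribute depends on that model's SSML support, so it is worth testing with a short sentence first.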

Dabalatti, kan akka:

Haalli fuula

-12 +12
0.5x 2.0x
Free: Piper, VITS, MeloTTS
Your generated audio will appear here. Select a model, enter text, and click Generate.
The audio link expires within 24 hours.

Featured Model

Kokoro

Free

Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.

Developer: Hexgrad
License: Apache 2.0
Speed: Fast
Languages: 8 languages
VRAM: 1.5GB
Voice cloning: Not supported
Features: 82M parameters, Ultra-fast, Expressive voices, Multilingual, Streaming support
Best for: High-quality TTS with minimal latency, streaming applications

Tips for Better Results

  • Fuula ittisa-gaafatamaa akkanumatti fayyadama akkanumatti
  • Eegifnee nootasonni fi itti-ga'iiwwan akka hin mul'atin
  • Add commas to insert short pauses
  • Use an ellipsis (...) for a longer pause
  • Try Kokoro or CosyVoice 2 for the best results
  • Use Dia for dialogue and podcast-style content

Cost

Tier       Cost
Free       1:1 (free)
Standard   2x characters
Premium    4x characters
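As a rough illustration of the tier table, the sketch below estimates how many characters a request would count for. It assumes that "2x characters" means each input character counts twice against your allowance. The page does not spell out the exact accounting, and the names `TIER_MULTIPLIER` and `characters_charged` are hypothetical.

```python
# Multipliers taken from the tier table; the accounting rule itself is an assumption.
TIER_MULTIPLIER = {"free": 1, "standard": 2, "premium": 4}


def characters_charged(text: str, tier: str) -> int:
    """Return the number of characters counted for `text` on a given model tier."""
    return len(text) * TIER_MULTIPLIER[tier.lower()]


if __name__ == "__main__":
    sample = "Hello, world!"
    for tier in ("Free", "Standard", "Premium"):
        print(tier, characters_charged(sample, tier))
```

Under this assumption, the same text costs four times as much on a Premium model as on a Free one.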

How AI Text-to-Speech Works

Convert text to speech in three simple steps. No technical expertise required.

Step 1

Enter Your Text

Type, paste, or upload the text you want to convert to speech. Supports up to 5,000 characters per generation for free accounts, or 100,000 for paid plans. Use plain text or add SSML tags for advanced control over pronunciation, pauses, and emphasis.
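Text longer than the per-generation limit has to be split before it is submitted. The sketch below is one way to do that, assuming you want chunks that end on sentence boundaries where possible. `split_for_limit` is a hypothetical helper and the constants simply mirror the limits quoted above.

```python
import re

FREE_LIMIT = 5_000    # characters per generation, free accounts (from the text above)
PAID_LIMIT = 100_000  # characters per generation, paid plans (from the text above)


def split_for_limit(text: str, limit: int = FREE_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole when they fit; a single sentence longer
    than `limit` is cut at the limit as a last resort.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    parts = split_for_limit("First sentence. Second sentence! Third?", limit=20)
    print(parts)
```

Each chunk can then be generated separately and the resulting audio files joined in order.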

Step 2

Choose a Model and Voice

Choose from 20+ AI models.

Step 3

Generate and Download

Click Generate and your audio will be ready within seconds. Download it in the format you want, or copy a link to share. You can also use the API to integrate it into your own applications.

Text-to-Speech Models

The models below are available on TTS.ai.

Kokoro

Free

Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast, generating audio nearly 100x faster than real-time on a GPU.

Developer: Hexgrad
License: Apache 2.0
Speed: Fast
Languages: en, ja, zh, fr, it, pt, es, hi
VRAM: 1.5GB
Voice cloning: No
Cost: Free
Features: 82M parameters, Ultra-fast, Expressive voices, Multilingual, Streaming support
Best for: High-quality TTS with minimal latency, streaming applications

Piper

Free

Piper is a text-to-speech engine developed by Rhasspy that uses the VITS and Larynx architectures. It is built to run on CPU, which suits home automation and applications that need offline TTS, and it runs on devices such as a Raspberry Pi 4.

Developer: Rhasspy
License: MIT
Speed: Fast
Languages: en, de, fr, es, it, pt, nl, pl, ru, zh, ar, cs, da, fi, el, hu, is, ka, kk, ne, no, ro, sk, sr, sv, sw, tr, uk, vi, ca, cy, fa, lv, sl, lb, eu, id, ku, ml, sq, te, ur
VRAM: 0 (CPU only)
Voice cloning: No
Cost: Free
CPU-friendly Sagantaa Dhaabbilee 35+ afaanii Guutuu SSML
Fakkeenyaaf:: Fuula-argee, cimsiisaa, fi hojii-oolchuu-barreeffamaa

VITS

Free

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a TTS model that combines its components into a single end-to-end pipeline.

Developer: Jaehyeon Kim et al.
License: MIT
Speed: Fast
Languages: en, de, es, fr, pt, nl, fi, hu, bg, ja, pl
VRAM: 1GB
Voice cloning: No
Cost: Free
Sinteesii dabre-dabre Proosidii Nageenyaa Infarektii Dhaabbilee hedduu
Fakkeenyaaf:: Akkaata-hiika-dhaamsa-gaggeeffamaa-kan-gocha-dhaamsa-gaggeeffamaa-kan-gaggeeffamaa-kan-gaggeeffamaa

MeloTTS

Free

MeloTTS by MyShell.ai is a multilingual TTS library.

Developer: MyShell.ai
License: MIT
Speed: Fast
Languages: en, es, fr, zh, ja, ko
VRAM: 0.5GB (GPU optional)
Voice cloning: No
Cost: Free
CPU-optimized Afaan hedduu Akkaataa Sagantaa Dabalataa Laataa
Fakkeenyaaf:: Applikaashinaawwan fooyya'insaa TTS yeroo dheeraaf, afaan hedduuf barbaadan

Bark

Standard

Bark by Suno is a transformer-based text-to-audio model that generates multilingual speech as well as other audio such as music, background noise, and sound effects. It can also produce nonverbal sounds such as laughing, sighing, and crying. Bark supports 100+ voices and 13+ languages.

Developer: Suno
License: MIT
Speed: Slow
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
VRAM: 5GB
Voice cloning: No
Cost: 2x
Sammuu Qilleensa Jijjiiramni Muuziqaa 100+ fuula Afaan hedduu
Fakkeenyaaf:: Fuula awditoo, kitaaba awditoo kan wal-ga'ii, wal-ga'ii sammuu

Bark Small

Standard

Bark Small is a distilled version of the Bark model that trades some audio quality for faster inference and lower memory requirements. It retains Bark's capacity to generate speech with emotions, laughs, and multiple languages.

Developer: Suno
License: MIT
Speed: Medium
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
VRAM: 2GB
Voice cloning: No
Cost: 2x
Fakkii Fakkeenyaaf, Haati-gaafatama Afaan hedduu
Fakkeenyaaf:: Odeeyfannoon salphaa yoo barki guutuun dafee ta'e

CosyVoice 2

Standard

CosyVoice 2 by Alibaba's Tongyi Lab delivers speech quality comparable to a human speaker and is suited to real-time use.

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, fr, de, it, es
VRAM: 4GB
Voice cloning: Yes
Cost: 2x
Daangeessuu Kloonii Zero-shot Afaan-Dhugumatti Irreechi Pariitii namaa
Best for: Real-time applications, streaming TTS, voice assistants

Dia TTS

Standard

Dia by Nari Labs is a 1.6B parameter text-to-speech model that generates dialogue.

Developer: Nari Labs
License: Apache 2.0
Speed: Medium
Languages: en
VRAM: 4GB
Voice cloning: No
Cost: 2x
Haati-dhageessota-aadaa-aadaa Dhimmii dabalataa Jijjiiramni uumamaa Irreechi Parameetra
Best for: Podcasts, audiobook dialogue, conversational content

Parler TTS

Standard

Parler TTS is a text-to-speech model that uses natural language voice descriptions to control the generated speech. Instead of selecting from preset voices, you describe the voice you want (e.g., "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech matching that description. This makes it uniquely flexible for creative applications.

Developer: Hugging Face
License: Apache 2.0
Speed: Medium
Languages: en
VRAM: 4GB
Voice cloning: No
Cost: 2x
Fakkeenyaa Dhaaba Fuula Afaan Oromoo Dhiibaa dhalootaan Dhaamsawwaniiwwanii
Fakkeenyaaf:: Fuulawwan Biroo kan keessatti si'aama akka fuula duraa

Indic Parler TTS

Standard

Indic Parler TTS by AI4Bharat is a Parler TTS model for Indian languages, covering Tamil, Bengali, Marathi, Gujarati, Kannada, Punjabi, Odia, Assamese, Hindi, Telugu, Malayalam, and English. Like Parler TTS, it is controlled with plain-language voice descriptions.

Developer: AI4Bharat
License: Apache 2.0
Speed: Slow
Languages: ta, bn, mr, gu, kn, pa, or, as, hi, te, ml, en
VRAM: 8GB
Voice cloning: No
Cost: 2x
Afaan Oromoo Fakkeenyaa Dhaaba Fuula Afaan Oromoo Yaada Indiyaa
Fakkeenyaaf:: Dhaaba Hindii-dhaabsi, mul'ata naannoo, appilikeeshiniin Hindii afaan-kaaniin

KhanomTan TTS

Standard

KhanomTan TTS is a text-to-speech model for the Thai language.

Developer: Wannaphong Phatthiyaphaibun
License: Apache 2.0
Speed: Fast
Languages: th
VRAM: 2GB
Voice cloning: No
Cost: 2x
Taayilandii TTS Dhaabbilee hedduu Arkkitektuurii YourTTS Lizenzoota amantoota
Fakkeenyaaf:: Dhaabawwan, fakkaattota fi fayyadamtoota Afaan Thai

IndexTTS-2

Standard

IndexTTS-2 is a text-to-speech system.

Developer: Index Team
License: Bilibili Model License
Speed: Medium
Languages: en, zh
VRAM: 4GB
Voice cloning: Yes
Cost: 2x
Irreechi Zero-shot Vektoroota abdii Haata'u malee Ijaarsa
Fakkeenyaaf:: Fakkeenyaaf, kitaaba-audio, assistentoota virtuuellii

Spark TTS

Standard

Spark TTS by SparkAudio is a text-to-speech model.

Developer: SparkAudio
License: CC BY-NC-SA 4.0
Speed: Medium
Languages: en, zh
VRAM: 4GB
Voice cloning: Yes
Cost: 2x
Dhaabbilee Irreechi Ijaarsa Saala Fuula-base 5-sekondii
Fakkeenyaaf:: Dhimmii fi fuula galmee

GPT-SoVITS

Standard

GPT-SoVITS combines a GPT-style language model with SoVITS (from So-VITS-SVC, SoftVC VITS Singing Voice Conversion) for few-shot voice cloning. From about 5 seconds of reference audio, it can generate new speech in that voice.

Developer: RVC-Boss
License: MIT
Speed: Slow
Languages: en, zh, ja, ko
VRAM: 6GB
Voice cloning: Yes
Cost: 2x
5-sekondii Dhaamsa kan ka'e Barnoota High fidelity Afaan-Dhugumatti
Fakkeenyaaf:: Kloonii dhaadannoo, sintesisii kannii, replikii dhaadannoo kannii

Orpheus

Standard

Orpheus is a large-scale text-to-speech model that achieves human-level emotional expression. Trained on over 100,000 hours of diverse speech data, it excels at generating speech with natural emotions, emphasis, and speaking styles. Orpheus can produce speech that is virtually indistinguishable from human recordings.

Developer: Canopy Labs
License: Llama 3.2 Community
Speed: Medium
Languages: en
VRAM: 4GB
Voice cloning: No
Cost: 2x
Jijjiirama sadarkaa namaa 100K sa'aati Natural emphasis Haata'u malee
Fakkeenyaaf:: Haaluma walfakkaatuun, kitaabota awdiyoo, fi fuula fuula

Chatterbox

Premium

Chatterbox by Resemble AI is a zero-shot voice cloning model. From a single reference sample it reproduces not just the timbre of a voice but also its style.

Developer: Resemble AI
License: MIT
Speed: Medium
Languages: en
VRAM: 4GB
Voice cloning: Yes
Cost: 4x
Kloonii Zero-shot Irreechi High fidelity Jijjiiramni saalaa Sammuu tokko
Fakkeenyaaf:: Kloonii dhaadannoo professionalii kan walqabatee kan walqixxummaa, fi uumuu faaydaa

Tortoise TTS

Premium

Tortoise TTS is a text-to-speech system.

Developer: James Betker
License: Apache 2.0
Speed: Slow
Languages: en
VRAM: 8GB
Voice cloning: Yes
Cost: 4x
Qiyaasa olaanaa Dhaaba-galmee Arkitektuurii DALL-E Dhaabbilee Otoorregressive
Fakkeenyaaf:: Kitaaba-audio, faaydaa-qabeenya, hojii-qabeenya-qabeenya

StyleTTS 2

Premium

StyleTTS 2 achieves human-level TTS synthesis using style diffusion and adversarial training with large speech language models. Its output is comparable to human recordings. StyleTTS 2 uses diffusion-based style modeling.

Developer: Columbia University
License: MIT
Speed: Medium
Languages: en
VRAM: 4GB
Voice cloning: No
Cost: 4x
Akka namaatti Fakkeenyaaf Barruu Fuula High fidelity
Fakkeenyaaf:: Sinteesiin dhaadannoo tokko-qullaa-studio, dhaadannoo professional

OpenVoice

Premium

OpenVoice by MyShell.ai is a voice cloning model.

Developer: MyShell.ai / MIT
License: MIT
Speed: Medium
Languages: en, zh, ja, ko, fr, es
VRAM: 4GB
Voice cloning: Yes
Cost: 4x
Kloonii Fuula Irreechi Ijaarsa Aansi Afaan hedduu
Fakkeenyaaf:: Kloonii dhaadataa waliin kan itti fayyadamu, dabalataa dabalataa, dabalataa dabalataa

Qwen3 TTS

Standard

Qwen3-TTS is a 1.7 billion parameter text-to-speech model from Alibaba's Qwen team. It supports two modes: preset voices with emotion control (9 speakers), and a unique voice design mode where you describe the voice you want in natural language. It covers 10 languages with high expressiveness and natural prosody.

Developer: Alibaba (Qwen)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
VRAM: 7GB
Voice cloning: No
Cost: 2x
9 dhalootaan itti-saamaman Dhimmii dhalootaa Irreechi Afaan Oromoo
Fakkeenyaaf:: Fuula Afaan-aadaa-aadaa waliiniifi dhalootaan wal-fakkaatan ykn dhimmoota dhalootaaf adda addaa

VieNeu-TTS-v2

Standard

VieNeu-TTS-v2 is a 300M parameter Vietnamese TTS model trained on 10,000+ hours of speech data. It supports en-vi code-switching, 7 preset voices, and voice cloning from a 3-5 second reference. It runs entirely on CPU using GGUF Q4 inference plus an ONNX decoder. No GPU is required, and generation takes about 7 seconds.

Developer: Phạm Nguyễn Ngọc Bảo
License: Apache 2.0
Speed: Fast
Languages: vi, en
VRAM: CPU
Voice cloning: Yes
Cost: 2x
7 dhalootaan itti-haata'e (Accents Kibbaa + Kibbaa) En-Vi code-switching Kloonii dhaadataa (3-5s referensi) Fuula Podcast / fuula-duratti-ka'ee CPU-dhaan — GPU hin barbaachisu
Fakkeenyaaf:: Fuula fi fuula Viyetnamii fi fuula en-vi afaanii

Sesame CSM

Premium

Sesame CSM (Conversational Speech Model) is a 1 billion parameter model designed for generating conversational speech. It models the dynamics of human conversation, including timing, backchannel responses, and emotional responses. CSM can generate audio that sounds like natural human conversation rather than synthetic speech.

Developer: Sesame
License: Apache 2.0
Speed: Slow
Languages: en
VRAM: 8GB
Voice cloning: No
Cost: 4x
Konvokeshinii Tarree Jijjiiramni Kanaala boodaa Parameetra
Best for: AI assistants, chatbots, conversational AI applications

Chatterbox Turbo

Standard

Chatterbox Turbo by Resemble AI is a 350M parameter version of Chatterbox that runs up to 6x faster than real time, with latency under 200ms. It supports paralinguistic tags such as [laugh], [cough], and [chuckle]. All generated audio carries a Perth watermark.

Developer: Resemble AI
License: MIT
Speed: Fast
Languages: en
VRAM: 2GB
Voice cloning: Yes
Cost: 2x
Laataa'ii gadi-200ms Tarreewwan Paralinguistic 6x yeroo-reellii Dhaabbilee Watermarking
Best for: Voice agents

VoxCPM

Standard

VoxCPM 1.5 by OpenBMB is a tokenizer-free TTS model that works in a continuous space rather than with discrete tokens. It produces 44.1kHz audio and clones a voice from 3-10 seconds of reference audio. Cross-language cloning lets you use an English voice to speak Chinese, and vice versa.

Developer: OpenBMB
License: Apache 2.0
Speed: Fast
Languages: en, zh
VRAM: 4GB
Voice cloning: Yes
Cost: 2x
44.1kHz audio Tokenizer-free Kloonii Afaan-Dhugumatti Akkasumas LoRA finfinne-tuning
Fakkeenyaaf:: Audio-fidelity, audiobooks, long-form content with voice consistency

Kani TTS 2

Free

Kani-TTS-2 by NineNineSix is a 400M parameter model built on Liquid AI's LFM2 backbone with NVIDIA NanoCodec. It needs only 3GB of VRAM and generates ~10 seconds of speech in ~2 seconds on an A100 (RTF 0.2). The public release is the `kani-tts-2-en` checkpoint, which is English-only, and the speaker-embedding hook required for voice cloning is not exposed. For voice cloning use Chatterbox / IndexTTS2 / F5-TTS, or Kokoro / MeloTTS for non-English.

Developer: NineNineSix
License: Apache 2.0
Speed: Fast
Languages: en
VRAM: 3GB
Voice cloning: No
Cost: Free
3GB VRAM Fakkii Fakkii NanoCodec Birrii
Fakkeenyaaf:: Jijjiiramni Afaan Ingiliizii gaggaarii irratti haaraa'aa VRAM-qaba, qabduu gaggaarii

OuteTTS

Free

OuteTTS extends large language models with text-to-speech and speech-to-speech capabilities. It supports backends including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, vLLM, and in-browser inference via Transformers.js.

Developer: OuteAI
License: Apache 2.0
Speed: Slow
Languages: en
VRAM: 2GB
Voice cloning: No
Cost: Free
CPU inference Barruu Bakka-duubbii baay'ee Fuula
Fakkeenyaaf:: Dabalataan, TTS-based browser, dabalataan-low-resource environments

VibeVoice

Standard

VibeVoice from Microsoft generates long-form speech of up to 90 minutes with up to 4 speakers, which suits podcasts and dialogue. The Realtime 0.5B variant reaches ~300ms latency for conversational use. It supports speaker tags.

Developer: Microsoft
License: MIT
Speed: Fast
Languages: en, zh
VRAM: 4GB
Voice cloning: No
Cost: 2x
Haati-dhageessota-aadaa-aadaa Fuula dheeraa (90 min) Jijjiiramni Podcast Dabalataanis Laataa
Fakkeenyaaf:: Podcasts, diyaloogii, narraashinii foorma dheeraa, fakkaattota-kakuu-kakuu

Pocket TTS

Free

Pocket TTS by Kyutai (creators of Moshi) is a compact 100M parameter text-to-speech model that punches well above its weight. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it ideal for edge deployment and low-resource environments.

Developer: Kyutai
License: MIT
Speed: Fast
Languages: en, fr
VRAM: 1GB
Voice cloning: Yes
Credit cost: Free
Features: 100M parameters, CPU inference, Voice cloning, Single-sample cloning
Best for: Lightweight deployment, CPU-only environments, quick voice cloning

Kitten TTS

Free

Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. Features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. Ideal for edge deployment and low-latency applications.

Developer: KittenML
License: Apache 2.0
Speed: Fast
Languages: en
VRAM: 0GB
Voice cloning: No
Credit cost: Free
Features: CPU-only inference, Up to 80MB, 8 voices, Adjustable speed, ONNX-based, 24kHz output
Best for: Fast lightweight TTS, edge deployment, low-latency applications

CosyVoice3

Standard

CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion/speed/volume, and improved speaker similarity for zero-shot cloning. Supports 9 languages plus 18 Chinese dialects. RL-tuned variant delivers state-of-the-art prosody.

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Speed: Fast
Languages: en, zh, ja, ko, de, es, fr, it, ru
VRAM: 4GB
Voice cloning: Yes
Credit cost: 2x
Features: Bi-streaming, Emotion control, Voice cloning, Speed/volume control, Instruction following
Best for: Multilingual production TTS, real-time applications, voice cloning

NAMAA Saudi TTS

Standard

NAMAA Saudi TTS is a Saudi Arabic fine-tune of Resemble AI's ChatterboxMultilingual. Trained by NAMAA Space on authentic Saudi-dialect speech, it produces natural Modern Standard Arabic and Saudi colloquial pronunciation that generic multilingual models cannot match. Inherits Chatterbox's zero-shot voice cloning and emotion control via reference audio prompts. The first open-weights Arabic TTS deployed on TTS.ai.

Developer: NAMAA Space
License: MIT
Speed: Medium
Languages: ar
VRAM: 6GB
Voice cloning: Yes
Credit cost: 2x
Features: Saudi Arabic dialect, Modern Standard Arabic, Zero-shot voice cloning, Emotion control, Native pronunciation
Best for: Arabic content for Saudi audiences, MSA narration, Khaleeji-dialect voice agents, Arabic audiobooks

Darwin TTS

Standard

Darwin-TTS-1.7B-Cross by FINAL-Bench is a research variant of Qwen3-TTS-1.7B where 84 talker-FFN tensors (8.6%) are blended at α=3% with the matching tensors from Qwen3-1.7B-Base. The blend is built without retraining and produces noticeably crisper cross-lingual voice cloning across Korean, English, Japanese, and Chinese.

Developer: FINAL-Bench
License: Apache 2.0
Speed: Medium
Languages: en, ko, ja, zh
VRAM: 7GB
Voice cloning: Yes
Credit cost: 2x
Features: Voice cloning, Cross-lingual, FFN-blended, 4 core languages, Qwen3 backbone
Best for: Cross-lingual voice cloning between English / Korean / Japanese / Chinese with a single reference voice

MOSS-TTSD

Standard

MOSS-TTSD v1.0 from OpenMOSS is a 7B dialogue text-to-speech model that continues conversations from a short audio prompt. Supports up to 5 simultaneous speakers via [S1]/[S2] tags, zero-shot voice cloning from 3-10s reference audio, and up to 60 minutes of coherent multi-turn dialogue across 20 languages. Distinct from MOSS-TTS: TTSD is specialized for podcast/audiobook/dubbing workflows.

Developer: OpenMOSS
License: Apache 2.0
Speed: Medium
Languages: en, zh
VRAM: 12GB
Voice cloning: Yes
Credit cost: 2x
Features: Multi-speaker dialogue, Up to 5 speakers, 60min coherent audio, Voice cloning, Podcast-optimised
Best for: Podcasts, audiobooks, dubbed dialogue, conversational content with multiple voices

Ming-Omni TTS

Free

Ming-omni-tts-0.5B by inclusionAI is a compact omni-modal speech model built on the BailingMM dense backbone with a Patch-by-Patch flow-matching audio decoder. Delivers 44.1kHz output (near CD quality), supports zero-shot voice cloning from a 3+ second reference, and includes built-in emotion / dialect / BGM control via JSON instructions.

Developer: inclusionAI
License: Apache 2.0
Speed: Medium
Languages: en, zh
VRAM: 3GB
Voice cloning: Yes
Credit cost: Free
Features: 44.1kHz output, Voice cloning, Emotion control, Dialect control, BGM control
Best for: High-fidelity bilingual narration, emotion-controlled voice acting, Chinese audiobook content

MOSS-TTS Nano

Free

MOSS-TTS-Nano-100M is OpenMOSS's compact 100M-parameter variant of the MOSS-TTS family, sharing the delay-transformer architecture. Trades the 8B model's peak quality for ~80x smaller weights and dramatically lower per-request VRAM, making it suitable for free-tier and high-throughput deployments.

Developer: OpenMOSS
License: Apache 2.0
Speed: Fast
Languages: en, zh, de, es, fr, ja, it, ko, ru, ar, pt
VRAM: 2GB
Voice cloning: Yes
Credit cost: Free
Features: 100M parameters, Compact, Multilingual, Voice cloning
Best for: Free-tier TTS, high-volume production, low-latency interactive use

Kokoro

Free

Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.

Developer: Hexgrad
License: Apache 2.0
Speed: Fast
Languages: en, ja, zh, fr, it, pt, es, hi
Best for: High-quality TTS with minimal latency, streaming applications

Piper

Free

Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.

Developer: Rhasspy
License: MIT
Speed: Fast
Languages: en, de, fr, es, it, pt, nl, pl, ru, zh, ar, cs, da, fi, el, hu, is, ka, kk, ne, no, ro, sk, sr, sv, sw, tr, uk, vi, ca, cy, fa, lv, sl, lb, eu, id, ku, ml, sq, te, ur
Best for: Quick previews, accessibility, and embedded applications

VITS

Free

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.

Developer: Jaehyeon Kim et al.
License: MIT
Speed: Fast
Languages: en, de, es, fr, pt, nl, fi, hu, bg, ja, pl
Best for: General-purpose text-to-speech with natural prosody

MeloTTS

Free

MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.

Developer: MyShell.ai
License: MIT
Speed: Fast
Languages: en, es, fr, zh, ja, ko
Best for: Production applications needing fast, multilingual TTS

Kani TTS 2

Free

Kani-TTS-2 by NineNineSix is an ultra-lightweight 400M parameter model built on a Liquid AI LFM2 backbone with NVIDIA NanoCodec. It runs in just 3GB VRAM and produces ~10 seconds of speech in ~2 seconds on an A100 (RTF 0.2). The current public release ships an English-only `kani-tts-2-en` checkpoint and does not expose the speaker-embedding hook needed for voice cloning — use Chatterbox / IndexTTS2 / F5-TTS for cloning, or Kokoro / MeloTTS for non-English.

Developer: NineNineSix
License: Apache 2.0
Speed: Fast
Languages: en
Best for: Fast English generation on low-VRAM hardware, quick previews
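
The Kani-TTS-2 figures quoted above (about 10 seconds of speech in about 2 seconds, RTF 0.2) follow from the usual definition of real-time factor as generation time divided by audio duration. A minimal sketch of that arithmetic is shown here; the function names are illustrative and are not part of any Kani or TTS.ai API, and the estimate assumes the RTF stays constant across clip lengths.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Return RTF = time spent generating / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def estimated_generation_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate generation time for a clip, assuming the RTF stays constant."""
    return audio_seconds * rtf


# Figures from the description: ~10 s of speech generated in ~2 s.
rtf = real_time_factor(generation_seconds=2.0, audio_seconds=10.0)
print(f"RTF: {rtf:.2f}")
print(f"Estimated time for 60 s of audio: {estimated_generation_seconds(60.0, rtf):.1f} s")
```

An RTF below 1.0 means the model produces audio faster than it plays back, which is what makes quick previews practical.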

OuteTTS

Free

OuteTTS extends large language models with text-to-speech capabilities while preserving the original architecture. It supports multiple backends including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, VLLM, and even browser inference via Transformers.js. Features zero-shot voice cloning through speaker profiles saved as JSON.

Developer: OuteAI
License: Apache 2.0
Speed: Slow
Languages: en
Best for: Edge deployment, browser-based TTS, low-resource environments

Pocket TTS

Free

Pocket TTS by Kyutai (creators of Moshi) is a compact 100M parameter text-to-speech model that punches well above its weight. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it ideal for edge deployment and low-resource environments.

Developer: Kyutai
License: MIT
Speed: Fast
Languages: en, fr
Best for: Lightweight deployment, CPU-only environments, quick voice cloning

Kitten TTS

Free

Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. Features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. Ideal for edge deployment and low-latency applications.

Developer: KittenML
License: Apache 2.0
Speed: Fast
Languages: en
Best for: Fast lightweight TTS, edge deployment, low-latency applications

Ming-Omni TTS

Free

Ming-omni-tts-0.5B by inclusionAI is a compact omni-modal speech model built on the BailingMM dense backbone with a Patch-by-Patch flow-matching audio decoder. Delivers 44.1kHz output (near CD quality), supports zero-shot voice cloning from a 3+ second reference, and includes built-in emotion / dialect / BGM control via JSON instructions. Excellent stability — 0.83% WER on Chinese benchmarks.

Developer: inclusionAI
License: Apache 2.0
Speed: Medium
Languages: en, zh
Best for: High-fidelity bilingual narration, emotion-controlled voice acting, Chinese audiobook content

MOSS-TTS Nano

Free

MOSS-TTS-Nano-100M is OpenMOSS's compact 100M-parameter variant of the MOSS-TTS family, sharing the delay-transformer architecture. Trades the 8B model's peak quality for ~80x smaller weights and dramatically lower per-request VRAM, making it suitable for free-tier and high-throughput deployments. Same 20-language reach.

Developer: OpenMOSS
License: Apache 2.0
Speed: Fast
Languages: en, zh, de, es, fr, ja, it, ko, ru, ar, pt
Best for: Free-tier TTS, high-volume production, low-latency interactive use

Bark

Standard

Bark by Suno is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio like music, background noise, and sound effects. It can produce nonverbal communications like laughing, sighing, and crying. Bark supports over 100 speaker presets and 13+ languages.

Developer: Suno
License: MIT
Speed: Slow
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
Voice cloning: No
Features: Sound effects, Laughing/sighing, Music generation, 100+ speakers, Multilingual
Best for: Creative audio content, audiobooks with emotion, sound effects

Bark Small

Standard

Bark Small is a distilled version of the Bark model that trades some audio quality for significantly faster inference speeds and lower memory requirements. It retains Bark's ability to generate speech with emotions, laughter, and multiple languages.

Developer: Suno
License: MIT
Speed: Medium
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
Voice cloning: No
Features: Lightweight, Faster than full Bark, Emotional speech, Multilingual
Best for: Quick creative audio when full Bark is too slow

CosyVoice 2

Standard

CosyVoice 2 by Alibaba's Tongyi Lab achieves human-comparable speech quality with extremely low latency, making it ideal for real-time applications. It uses a finite scalar quantization approach for streaming synthesis and supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control. It outperforms many commercial TTS systems in subjective evaluations.

Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, fr, de, it, es
Voice cloning: Yes
Features: Streaming, Zero-shot cloning, Cross-lingual, Emotion control, Human-parity
Best for: Real-time applications, streaming TTS, voice assistants

Dia TTS

Standard

Dia by Nari Labs is a 1.6B parameter text-to-speech model designed specifically for generating multi-speaker dialogue. It can produce natural-sounding conversations between two speakers with appropriate turn-taking, prosody, and emotional expression. Dia is perfect for creating podcast-style content, audiobook dialogues, and interactive conversational AI.

Developer: Nari Labs
License: Apache 2.0
Speed: Medium
Languages: en
Voice cloning: No
Features: Multi-speaker, Dialog generation, Natural turn-taking, Emotional expression, 1.6B parameters
Best for: Podcasts, audiobook dialogues, conversational content

Parler TTS

Standard

Parler TTS is a text-to-speech model that uses natural language voice descriptions to control the generated speech. Instead of selecting from preset voices, you describe the voice you want (e.g., "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech matching that description. This makes it uniquely flexible for creative applications.

Developer: Hugging Face
License: Apache 2.0
Speed: Medium
Languages: en
Voice cloning: No
Features: Voice description, Natural language control, Flexible voice creation, No preset voices needed
Best for: Creative applications where you need custom voice characteristics

Indic Parler TTS

Standard

Indic Parler TTS by AI4Bharat extends the Parler architecture to Indian languages, generating natural speech in Tamil, Bengali, Marathi, Gujarati, Kannada, Punjabi, Odia, Assamese, Hindi, Telugu, Malayalam and English. Like Parler, you describe the voice you want in plain language and the model matches it — no preset voices required. Trained on AI4Bharat speech corpora for authentic pronunciation and prosody across the Indian subcontinent.

Developer: AI4Bharat
License: Apache 2.0
Speed: Slow
Languages: ta, bn, mr, gu, kn, pa, or, as, hi, te, ml, en
Voice cloning: No
Features: 11 Indian languages, Voice description, Natural language control, Authentic Indic pronunciation
Best for: Indian-language voiceovers, regional content, multilingual Indic applications

KhanomTan TTS

Standard

KhanomTan TTS is an open Thai text-to-speech model built on the YourTTS multilingual architecture. Trained on CC0 and permissively-licensed Thai corpora (TSync) alongside several other languages, it delivers natural Thai speech with multiple speaker voices. A clean, commercially-usable option for Thai — the language most open TTS models only cover under non-commercial licenses.

Developer: Wannaphong Phatthiyaphaibun
License: Apache 2.0
Speed: Fast
Languages: th
Voice cloning: No
Features: Thai TTS, Multiple speakers, YourTTS architecture, Commercial-safe license
Best for: Thai voiceovers, Thai-language content and apps

IndexTTS-2

Standard

IndexTTS-2 is an advanced text-to-speech system that excels at zero-shot voice synthesis with fine-grained emotion control. It can generate speech with specific emotional tones like happy, sad, angry, or fearful without requiring emotion-specific training data. The model uses emotion vectors to precisely control the emotional expression of generated speech.

Developer: Index Team
License: Bilibili Model License
Speed: Medium
Languages: en, zh
Voice cloning: Yes
Features: Emotion control, Zero-shot, Emotion vectors, Expressive speech, Fine-grained control
Best for: Emotionally expressive content, audiobooks, virtual assistants

Spark TTS

Standard

Spark TTS by SparkAudio is a text-to-speech model that combines voice cloning with controllable emotion and speaking style. Using just 5 seconds of reference audio, it can clone a voice and then generate speech with different emotions, speeds, and styles while maintaining the cloned voice identity. Spark TTS uses a prompt-based control system.

Developer: SparkAudio
License: CC BY-NC-SA 4.0
Speed: Medium
Languages: en, zh
Voice cloning: Yes
Features: Voice cloning, Emotion control, Style control, Prompt-based, 5-second cloning
Best for: Content creation with cloned voices and emotional control

GPT-SoVITS

Standard

GPT-SoVITS combines GPT-style language modeling with SoVITS (SoftVC VITS, a VITS-based singing voice conversion model) for powerful few-shot voice cloning. With as little as 5 seconds of reference audio, it can accurately clone a voice and generate new speech while preserving the speaker's unique characteristics. It excels at both speaking and singing voice synthesis.

Developer: RVC-Boss
License: MIT
Speed: Slow
Languages: en, zh, ja, ko
Voice cloning: Yes
Features: 5-second cloning, Singing voice, Few-shot learning, High fidelity, Cross-lingual
Best for: Voice cloning, singing synthesis, content creator voice replication

Orpheus

Standard

Orpheus is a large-scale text-to-speech model that achieves human-level emotional expression. Trained on over 100,000 hours of diverse speech data, it excels at generating speech with natural emotions, emphasis, and speaking styles. Orpheus can produce speech that is virtually indistinguishable from human recordings.

Developer: Canopy Labs
License: Llama 3.2 Community
Speed: Medium
Languages: en
Voice cloning: No
Features: Human-level emotion, 100K hours training, Natural emphasis, Expressive speech
Best for: High-quality emotional speech, audiobooks, voice acting

Qwen3 TTS

Standard

Qwen3-TTS is a 1.7 billion parameter text-to-speech model from Alibaba's Qwen team. It supports two modes: preset voices with emotion control (9 speakers), and a unique voice design mode where you describe the voice you want in natural language. It covers 10 languages with high expressiveness and natural prosody.

Developer: Alibaba (Qwen)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Voice cloning: No
Features: 9 preset voices, Voice design from text, Emotion control, 10 languages
Best for: Multilingual content with preset voices or custom voice design

VieNeu-TTS-v2

Standard

VieNeu-TTS-v2 is a 300M parameter Vietnamese-first TTS model trained on 10,000+ hours of bilingual data. It supports seamless en-vi code-switching, 7 preset voices spanning Northern and Southern accents, and instant voice cloning from 3-5 seconds of reference audio. Runs entirely on CPU via GGUF Q4 inference + ONNX audio decoder — no GPU needed, generations finish in ~7 seconds. Built on a Qwen3 backbone.

Developer: Phạm Nguyễn Ngọc Bảo
License: Apache 2.0
Speed: Fast
Languages: vi, en
Voice cloning: Yes
Features: 7 preset voices (North + South accents), En-Vi code-switching, Voice cloning (3-5s reference), Podcast / multi-speaker support, CPU-only (no GPU required)
Best for: Vietnamese content and bilingual en-vi narration

Chatterbox Turbo

Standard

Chatterbox Turbo by Resemble AI is a 350M parameter upgrade to Chatterbox, delivering up to 6x real-time speed with sub-200ms latency. It supports paralinguistic tags like [laugh], [cough], and [chuckle] directly in text. Includes Perth watermarking on all generated audio for provenance tracking.

Developer: Resemble AI
License: MIT
Speed: Fast
Languages: en
Voice cloning: Yes
Features: Sub-200ms latency, Paralinguistic tags, 6x real-time, Voice cloning, Watermarking
Best for: Real-time voice agents, expressive speech with natural sounds

VoxCPM

Standard

VoxCPM 1.5 by OpenBMB is a novel tokenizer-free TTS model that operates in continuous space rather than discrete tokens. It produces high-fidelity 44.1kHz audio, supports zero-shot voice cloning from 3-10 seconds, and maintains consistency across paragraphs. Cross-language cloning lets you apply an English voice to Chinese speech and vice versa.

Developer: OpenBMB
License: Apache 2.0
Speed: Fast
Languages: en, zh
Voice cloning: Yes
Features: 44.1kHz audio, Tokenizer-free, Cross-lingual cloning, Context-aware, LoRA fine-tuning
Best for: High-fidelity audio, audiobooks, long-form content with voice consistency

VibeVoice

Standard

VibeVoice from Microsoft generates long-form speech up to 90 minutes with support for 4 simultaneous speakers, making it ideal for podcasts and dialogues. The Realtime 0.5B variant achieves ~300ms latency for interactive use. Supports speaker tags for multi-turn dialogue generation.

Developer: Microsoft
License: MIT
Speed: Fast
Languages: en, zh
Voice cloning: No
Features: Multi-speaker, Long-form (90 min), Podcast generation, Dialogue, Low latency
Best for: Podcasts, dialogues, long-form narration, multi-speaker content

CosyVoice3

Standard

CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion/speed/volume, and improved speaker similarity for zero-shot cloning. Supports 9 languages plus 18 Chinese dialects. RL-tuned variant delivers state-of-the-art prosody.

Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Speed: Fast
Languages: en, zh, ja, ko, de, es, fr, it, ru
Voice cloning: Yes
Features: Bi-streaming, Emotion control, Voice cloning, Speed/volume control, Instruction following
Best for: Multilingual production TTS, real-time applications, voice cloning

NAMAA Saudi TTS

Standard

NAMAA Saudi TTS is a Saudi Arabic fine-tune of Resemble AI's ChatterboxMultilingual. Trained by NAMAA Space on authentic Saudi-dialect speech, it produces natural Modern Standard Arabic and Saudi colloquial pronunciation that generic multilingual models cannot match. Inherits Chatterbox's zero-shot voice cloning and emotion control via reference audio prompts. The first open-weights Arabic TTS deployed on TTS.ai.

Developer: NAMAA Space
License: MIT
Speed: Medium
Languages: ar
Voice cloning: Yes
Features: Saudi Arabic dialect, Modern Standard Arabic, Zero-shot voice cloning, Emotion control, Native pronunciation
Best for: Arabic content for Saudi audiences, MSA narration, Khaleeji-dialect voice agents, Arabic audiobooks

Darwin TTS

Standard

Darwin-TTS-1.7B-Cross by FINAL-Bench is a research variant of Qwen3-TTS-1.7B where 84 talker-FFN tensors (8.6%) are blended at α=3% with the matching tensors from Qwen3-1.7B-Base. The blend is built without retraining and produces noticeably crisper cross-lingual voice cloning across Korean, English, Japanese, and Chinese. Operates in zero-shot voice-clone mode (3 seconds reference audio).

Developer: FINAL-Bench
License: Apache 2.0
Speed: Medium
Languages: en, ko, ja, zh
Voice cloning: Yes
Features: Voice cloning, Cross-lingual, FFN-blended, 4 core languages, Qwen3 backbone
Best for: Cross-lingual voice cloning between English / Korean / Japanese / Chinese with a single reference voice

MOSS-TTSD

Standard

MOSS-TTSD v1.0 from OpenMOSS is a 7B dialogue text-to-speech model that continues conversations from a short audio prompt. Supports up to 5 simultaneous speakers via [S1]/[S2] tags, zero-shot voice cloning from 3-10s reference audio, and up to 60 minutes of coherent multi-turn dialogue across 20 languages. Distinct from MOSS-TTS — TTSD is specialized for podcast/audiobook/dubbing workflows.

Developer: OpenMOSS
License: Apache 2.0
Speed: Medium
Languages: en, zh
Voice cloning: Yes
Features: Multi-speaker dialogue, Up to 5 speakers, 60min coherent audio, Voice cloning, Podcast-optimised
Best for: Podcasts, audiobooks, dubbed dialogue, conversational content with multiple voices
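
The MOSS-TTSD description mentions [S1]/[S2] speaker tags and a limit of 5 speakers. As a rough sketch of how a multi-speaker script could be assembled, the helper below prefixes each turn with its tag and rejects speaker numbers outside 1-5. The helper name, the one-turn-per-line layout, and the newline separator are assumptions made for illustration; the exact input format the model expects is not specified here.

```python
from typing import Iterable, Tuple

MAX_SPEAKERS = 5  # upper bound taken from the model description


def format_dialogue(turns: Iterable[Tuple[int, str]]) -> str:
    """Build a tagged script such as "[S1]Hello" from (speaker, text) pairs.

    Hypothetical helper: one turn per line is an assumed layout.
    """
    lines = []
    for speaker, text in turns:
        if not 1 <= speaker <= MAX_SPEAKERS:
            raise ValueError(f"speaker must be between 1 and {MAX_SPEAKERS}, got {speaker}")
        cleaned = text.strip()
        if not cleaned:
            continue  # skip empty turns rather than emitting a bare tag
        lines.append(f"[S{speaker}]{cleaned}")
    return "\n".join(lines)


script = format_dialogue([
    (1, "Welcome back to the show."),
    (2, "Thanks, glad to be here."),
    (1, "Let's get started."),
])
print(script)
```

Validating speaker numbers before submission avoids spending credits on a script the model cannot render as intended.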

Chatterbox

Premium

Chatterbox by Resemble AI is a cutting-edge zero-shot voice cloning model. It can replicate any voice from a single audio sample with remarkable accuracy, capturing not just the timbre but also the speaking style and emotional nuances. Chatterbox also features fine-grained emotion control, allowing you to adjust the emotional tone of the generated speech independently from the voice identity.

Developer: Resemble AI
License: MIT
Speed: Medium
Languages: en
Voice cloning: Yes
VRAM: 4GB
Credit cost: 4x
Features: Zero-shot cloning, Emotion control, High fidelity, Style transfer, Single sample cloning
Best for: Professional voice cloning with emotional control, content creation

Tortoise TTS

Premium

Tortoise TTS is an autoregressive multi-voice text-to-speech system that prioritizes audio quality over speed. It uses DALL-E-inspired architecture to generate highly natural speech with excellent prosody and speaker similarity. While slower than many alternatives, Tortoise produces some of the most realistic synthetic speech available in the open-source ecosystem.

Developer: James Betker
License: Apache 2.0
Speed: Slow
Languages: en
Voice cloning: Yes
VRAM: 8GB
Credit cost: 4x
Features: Highest quality, Multi-voice, DALL-E architecture, Voice cloning, Autoregressive
Best for: Audiobooks, premium content, quality-first applications

StyleTTS 2

Premium

StyleTTS 2 achieves human-level TTS synthesis by combining style diffusion with adversarial training using large speech language models. It generates the most natural sounding speech among single-speaker models, rivaling human recordings. StyleTTS 2 uses diffusion-based style modeling to capture the full range of human speech variation.

Developer: Columbia University
License: MIT
Speed: Medium
Languages: en
Voice cloning: No
VRAM: 4GB
Credit cost: 4x
Features: Human-level, Style diffusion, Adversarial training, Natural variation, High fidelity
Best for: Studio-quality single-speaker synthesis, professional narration

OpenVoice

Premium

OpenVoice by MyShell.ai enables instant voice cloning with granular control over voice style, emotion, accent, rhythm, pauses, and intonation. It can clone a voice from a short audio clip and generate speech in multiple languages while maintaining the speaker identity. OpenVoice also functions as a voice converter, allowing real-time voice transformation.

Developer: MyShell.ai / MIT
License: MIT
Speed: Medium
Languages: en, zh, ja, ko, fr, es
Voice cloning: Yes
VRAM: 4GB
Credit cost: 4x
Features: Instant cloning, Voice conversion, Emotion control, Accent control, Multilingual
Best for: Voice cloning with fine-grained style control, voice conversion

Sesame CSM

Premium

Sesame CSM (Conversational Speech Model) is a 1 billion parameter model designed specifically for generating conversational speech. It models the natural patterns of human conversation including turn-taking timing, backchannel responses, emotional reactions, and conversational flow. CSM generates audio that sounds like a natural human conversation rather than synthetic speech.

Developer: Sesame
License: Apache 2.0
Speed: Slow
Languages: en
Voice cloning: No
VRAM: 8GB
Credit cost: 4x
Features: Conversational, Natural timing, Turn-taking, Backchannel, 1B parameters
Best for: AI assistants, chatbots, conversational AI applications

Model comparison

Model | Developer | Tier | Speed | Languages | VRAM | License | Credit cost
Kokoro | Hexgrad | Free | Fast | 8 | 1.5GB | Apache 2.0 | Free
Piper | Rhasspy | Free | Fast | 42 | 0 (CPU only) | MIT | Free
VITS | Jaehyeon Kim et al. | Free | Fast | 11 | 1GB | MIT | Free
MeloTTS | MyShell.ai | Free | Fast | 6 | 0.5GB (GPU optional) | MIT | Free
Bark | Suno | Standard | Slow | 13 | 5GB | MIT | 2x
Bark Small | Suno | Standard | Medium | 13 | 2GB | MIT | 2x
CosyVoice 2 | Alibaba (Tongyi Lab) | Standard | Medium | 8 | 4GB | Apache 2.0 | 2x
Dia TTS | Nari Labs | Standard | Medium | 1 | 4GB | Apache 2.0 | 2x
Parler TTS | Hugging Face | Standard | Medium | 1 | 4GB | Apache 2.0 | 2x
Indic Parler TTS | AI4Bharat | Standard | Slow | 12 | 8GB | Apache 2.0 | 2x
KhanomTan TTS | Wannaphong Phatthiyaphaibun | Standard | Fast | 1 | 2GB | Apache 2.0 | 2x
IndexTTS-2 | Index Team | Standard | Medium | 2 | 4GB | Bilibili Model License | 2x
Spark TTS | SparkAudio | Standard | Medium | 2 | 4GB | CC BY-NC-SA 4.0 | 2x
GPT-SoVITS | RVC-Boss | Standard | Slow | 4 | 6GB | MIT | 2x
Orpheus | Canopy Labs | Standard | Medium | 1 | 4GB | Llama 3.2 Community | 2x
Chatterbox | Resemble AI | Premium | Medium | 1 | 4GB | MIT | 4x
Tortoise TTS | James Betker | Premium | Slow | 1 | 8GB | Apache 2.0 | 4x
StyleTTS 2 | Columbia University | Premium | Medium | 1 | 4GB | MIT | 4x
OpenVoice | MyShell.ai / MIT | Premium | Medium | 6 | 4GB | MIT | 4x
Qwen3 TTS | Alibaba (Qwen) | Standard | Medium | 10 | 7GB | Apache 2.0 | 2x
VieNeu-TTS-v2 | Phạm Nguyễn Ngọc Bảo | Standard | Fast | 2 | CPU | Apache 2.0 | 2x
Sesame CSM | Sesame | Premium | Slow | 1 | 8GB | Apache 2.0 | 4x
Chatterbox Turbo | Resemble AI | Standard | Fast | 1 | 2GB | MIT | 2x
VoxCPM | OpenBMB | Standard | Fast | 2 | 4GB | Apache 2.0 | 2x
Kani TTS 2 | NineNineSix | Free | Fast | 1 | 3GB | Apache 2.0 | Free
OuteTTS | OuteAI | Free | Slow | 1 | 2GB | Apache 2.0 | Free
VibeVoice | Microsoft | Standard | Fast | 2 | 4GB | MIT | 2x
Pocket TTS | Kyutai | Free | Fast | 2 | 1GB | MIT | Free
Kitten TTS | KittenML | Free | Fast | 1 | 0GB | Apache 2.0 | Free
CosyVoice3 | Alibaba (FunAudioLLM) | Standard | Fast | 9 | 4GB | Apache 2.0 | 2x
NAMAA Saudi TTS | NAMAA Space | Standard | Medium | 1 | 6GB | MIT | 2x
Darwin TTS | FINAL-Bench | Standard | Medium | 4 | 7GB | Apache 2.0 | 2x
MOSS-TTSD | OpenMOSS | Standard | Medium | 2 | 12GB | Apache 2.0 | 2x
Ming-Omni TTS | inclusionAI | Free | Medium | 2 | 3GB | Apache 2.0 | Free
MOSS-TTS Nano | OpenMOSS | Free | Fast | 11 | 2GB | Apache 2.0 | Free
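
The credit costs listed for each tier (Free, 2x, 4x) suggest a per-tier multiplier on the text length. A minimal sketch of that estimate follows, assuming credits are counted per character and that Free-tier models consume no credits; both points are assumptions for illustration, and the function and dictionary names are hypothetical.

```python
# Multipliers as listed for each tier; treating "Free" as 0 is an assumption.
TIER_MULTIPLIER = {"free": 0, "standard": 2, "premium": 4}


def credit_cost(text: str, tier: str) -> int:
    """Estimate credits for synthesizing `text` with a model of the given tier."""
    try:
        multiplier = TIER_MULTIPLIER[tier.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None
    return len(text) * multiplier


sample = "Hello, world!"  # 13 characters
for tier in ("free", "standard", "premium"):
    print(tier, credit_cost(sample, tier))
```

Under these assumptions, the same passage costs twice as much on a Premium model as on a Standard one, which is worth weighing against the quality difference for long texts.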


Why choose TTS.ai for text to speech?

TTS.ai brings together text-to-speech models from around the world.

Every model is open source under MIT, Apache 2.0, or similar permissive licenses, ensuring you have full commercial rights to use the generated audio in your projects. Whether you need fast, lightweight synthesis for real-time applications or premium studio-quality output for audiobooks and podcasts, TTS.ai has the right model for every use case.

Free Models, No Account Required

Get started right away with three free TTS models: Piper (ultra-fast, lightweight), VITS (high-quality neural synthesis), and MeloTTS (multilingual coverage). No credit card and no sign-up are required. All three models support English and many other languages.

GPU-Accelerated

Every TTS model runs on NVIDIA GPUs, which keeps generation times short. Free models typically produce audio in under 2 seconds. Standard models such as Kokoro, CosyVoice 2, and Bark take 3-5 seconds. Premium models such as Tortoise and Chatterbox take 5-15 seconds, depending on the length of the text.

30+ Languages

The platform supports more than 30 languages, including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, and others. Many models support cross-lingual synthesis, in which a voice speaks a language other than the one it was recorded in. CosyVoice 2 and GPT-SoVITS are among the models with cross-lingual capabilities.

API Ready

TTS.ai provides an OpenAI-compatible REST API that covers all 20+ models, with SDKs and examples for Python, JavaScript, cURL, and Go. Streaming is available for low-latency playback, batch processing for large jobs, and webhooks for asynchronous notifications.

Frequently Asked Questions

Text to speech (TTS) is an AI technology that converts written text into natural-sounding spoken audio. Modern neural TTS models like Kokoro, Chatterbox, and CosyVoice 2 use deep learning to produce speech that sounds remarkably human, with natural prosody, emotion, and rhythm.

It depends on what you need. For speed, use Piper or MeloTTS (free, fast). For higher quality, use Kokoro or CosyVoice 2 (standard tier). For voice cloning, use Chatterbox or GPT-SoVITS (premium). For dialogue and podcasts, use Dia TTS. Each model has different strengths, so try a few to find the best fit.

Yes! TTS.ai offers free text-to-speech with Kokoro, Piper, VITS, and MeloTTS models. No account required for up to 500 characters and 3 generations per hour. Sign up for a free account to get 15,000 characters and access all models.

Our TTS models support 30+ languages, including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Russian, Hindi, and more.

Yes, you can use audio generated with TTS.ai commercially. All of our models use permissive open-source licenses (MIT, Apache 2.0). Check the license of each individual model for its specific terms.

TTS.ai supports MP3, WAV, OGG, and FLAC output formats. MP3 is the default for web playback. WAV is recommended for further audio processing. You can convert between formats using our Audio Converter tool.

Voice cloning uses AI to replicate a specific voice from a short audio sample (typically 5-30 seconds). Upload a clear recording of the target voice, and models like Chatterbox, GPT-SoVITS, or OpenVoice will generate new speech in that voice. The quality improves with cleaner, longer reference audio.

Free users can generate up to 500 characters per request. Registered users get up to 5,000 characters per request. For longer texts, the audio is generated in chunks and stitched together automatically. API users can process up to 10,000 characters per request.
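The chunking described above happens on the service side, and its exact rules are not documented here. As a rough illustration only, the sketch below shows one way a client could pre-split long text at sentence boundaries so that each piece stays under the 5,000-character limit. The function name `split_text` and the sentence-boundary rule are assumptions made for this example.

```python
import re

MAX_CHARS = 5000  # per-request limit for registered users, as stated above


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a single sentence longer
    than the limit is cut at the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit in one chunk by itself.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own request and the resulting audio files joined in order.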

Piper and several other models support basic SSML tags for pauses, emphasis, and prosody. For models that do not support SSML, you can use punctuation and line breaks to get a similar effect.
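When building SSML from user-supplied text, characters such as `&` and `<` need to be escaped so that the markup stays well-formed. The sketch below wraps plain text in the same `<speak><prosody rate="...">` structure shown in the editor example on this page. The helper name `to_ssml` is made up for illustration, and which `rate` values a given model accepts is not confirmed by this page.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...>, escaping XML special characters."""
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"
```

For example, `to_ssml("Slow speech")` reproduces the snippet from the editor hint, and `to_ssml("Tom & Jerry")` emits `Tom &amp; Jerry` inside the tags.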

Yes, most models support speed adjustment from 0.5x to 2.0x. Some models, such as Bark and Parler-TTS, also offer control over pitch and speaking style.

Yes, batch processing is available through our API. You can submit multiple texts in a single API call or script, and each one is processed and returned as a separate audio file. This is useful for audiobooks, e-learning modules, or dialogue scripts with multiple speakers.
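A script-driven batch could be organized along the following lines. This is a minimal sketch, not the service's batch API: `synthesize` is a hypothetical callable that you supply (for example, a function that calls the REST API and returns audio bytes), and the `item_001.mp3` naming scheme is an arbitrary choice.

```python
from pathlib import Path
from typing import Callable


def run_batch(
    texts: list[str],
    synthesize: Callable[[str], bytes],  # hypothetical: text in, audio bytes out
    out_dir: Path,
    ext: str = "mp3",
) -> list[Path]:
    """Synthesize each text and save it as its own numbered audio file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: list[Path] = []
    for index, text in enumerate(texts, start=1):
        path = out_dir / f"item_{index:03d}.{ext}"
        path.write_bytes(synthesize(text))
        paths.append(path)
    return paths
```

Keeping the synthesis step as a parameter lets the same loop work with any model or client, and makes it easy to test with a stub.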

Generate an API key from your account dashboard, then send POST requests to our REST API endpoint with your text, model, and voice parameters. We provide code examples in Python, JavaScript, and cURL. The API is OpenAI-compatible, so existing integrations work with minimal changes.
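Because the API is described as OpenAI-compatible, a request can plausibly be shaped like a call to OpenAI's speech endpoint. The sketch below only builds the POST request and does not send it. The base URL is a placeholder, and the path `/audio/speech`, the field names `input`, `voice`, and `response_format`, and the model and voice identifiers are all assumptions borrowed from the OpenAI convention, so check the dashboard documentation for the actual values.

```python
import json
import urllib.request

# Placeholder base URL (assumption); use the endpoint shown in your dashboard.
BASE_URL = "https://api.example.com/v1"


def build_speech_request(
    api_key: str,
    text: str,
    model: str,
    voice: str,
    response_format: str = "mp3",
) -> urllib.request.Request:
    """Build (but do not send) a POST request in the OpenAI speech-API shape."""
    payload = {
        "model": model,            # assumed field name
        "input": text,             # assumed field name
        "voice": voice,            # assumed field name
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the request to `urllib.request.urlopen` and write the response body to a file with the matching extension.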

Get Started

Join the millions of people using TTS.ai. Get 15,000 free characters when you create an account. No credit card required.