Text to Speech
Convert text to speech with open-source AI models. Free to use, no account required.
Wrap your text in SSML tags for finer control:
<speak><prosody rate="slow">Slow speech</prosody></speak>
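The SSML snippet above can also be produced programmatically. The following is a minimal sketch using only the Python standard library; the helper name `slow_ssml` is hypothetical and the only SSML elements assumed are the `<speak>` and `<prosody rate>` tags shown above.

```python
from xml.sax.saxutils import escape, quoteattr


def slow_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...>, escaping XML special characters."""
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


print(slow_ssml("Slow speech"))
print(slow_ssml("Salt & pepper"))  # the ampersand is escaped so the markup stays well-formed
```

Escaping matters because characters such as `&` or `<` in the input would otherwise break the SSML document.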
Model details
Kokoro
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
| Developer | Hexgrad |
| License | Apache 2.0 |
| Speed | Fast |
| Languages | 8 languages |
| VRAM | 1.5GB |
| Voice cloning | Not supported |
Tips for best results
- Use correct punctuation so the model can place pauses and intonation naturally.
- Spell out numbers and abbreviations.
- Use commas to create short pauses between phrases.
- Use an ellipsis (...) for a longer pause.
- Try Kokoro or CosyVoice 2.
- Use Dia for multi-speaker dialogue and podcast-style content.
Credit costs
| Tier | Cost per 1K characters |
|---|---|
| Free | 1:1 (free) |
| Standard | 2x credits |
| Premium | 4x credits |
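To make the tier multipliers concrete, here is a small sketch of a credit estimate. It assumes that the Free tier costs zero credits and that partial thousands of characters are pro-rated; neither rule is stated on the page, and the function name `estimate_credits` is hypothetical.

```python
# Multipliers read off the table above. Treating "Free" as zero credits and
# pro-rating partial thousands are assumptions, not documented billing rules.
CREDITS_PER_1K = {"free": 0.0, "standard": 2.0, "premium": 4.0}


def estimate_credits(text: str, tier: str) -> float:
    """Estimate credits for one generation: multiplier x (characters / 1000)."""
    return CREDITS_PER_1K[tier.lower()] * len(text) / 1000


sample = "a" * 2500  # a 2,500-character input
for tier in ("free", "standard", "premium"):
    print(tier, estimate_credits(sample, tier))
```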
How AI Text to Speech works
Turn text into AI-generated speech in three simple steps.
Enter your text
Type, paste, or upload the text you want to convert. Each generation accepts up to 5,000 characters. Use plain text, or add SSML tags for finer control over pronunciation, pauses, and emphasis.
Choose a model and voice
Choose from 20+ AI models across 3 tiers. Pick a voice that suits your content, select the language, adjust the playback speed from 0.5x to 2.0x, and choose an output format (MP3, WAV, OGG, or FLAC).
Generate
Generate the audio for your text with the selected model and voice.
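The limits mentioned in the steps above (5,000 characters per generation, speed between 0.5x and 2.0x, and four output formats) can be checked on the client before submitting text. This is a sketch only: `SpeechOptions` and `split_text` are hypothetical names, and splitting on sentence boundaries is an assumed strategy rather than anything the page prescribes.

```python
import re
from dataclasses import dataclass

MAX_CHARS = 5000                          # per-generation limit stated above
FORMATS = {"mp3", "wav", "ogg", "flac"}   # output formats listed above


@dataclass
class SpeechOptions:
    speed: float = 1.0
    audio_format: str = "mp3"

    def __post_init__(self):
        if not 0.5 <= self.speed <= 2.0:
            raise ValueError("speed must be between 0.5 and 2.0")
        if self.audio_format.lower() not in FORMATS:
            raise ValueError(f"unsupported format: {self.audio_format}")


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(split_text("One. Two. Three.", limit=10))
print(SpeechOptions(speed=1.25, audio_format="wav"))
```

Each chunk can then be sent as its own generation and the resulting audio files concatenated afterwards.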
Text to Speech
AI-powered text-to-speech is changing the way people create, consume, and interact with audio.
Models
TTS.ai
Kokoro
Free
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its small size, it produces natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It generates audio nearly 100x faster than real time on a GPU.
Hexgrad
Apache 2.0
Fast
en, ja, zh, fr, it, pt, es, hi
1.5GB
No
Free
Piper
Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it well suited to edge devices, home automation, and applications that need offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Rhasspy
MIT
Fast
en, de, fr, es, it, pt, nl, pl, ru, zh, ar, cs, da, fi, el, hu, is, ka, kk, ne, no, ro, sk, sr, sv, sw, tr, uk, vi, ca, cy, fa, lv, sl, lb, eu, id, ku, ml, sq, te, ur
0 (CPU only)
No
Free
VITS
Free
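VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural-sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Jaehyeon Kim et al.
MIT
Fast
en, de, es, fr, pt, nl, fi, hu, bg, ja, pl
1GB
No
Free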
MeloTTS
Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is very fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
MyShell.ai
MIT
Fast
en, es, fr, zh, ja, ko
0.5GB (GPU optional)
No
Free
Bark
Standard
Bark by Suno is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio such as music, background noise, and sound effects. It can produce nonverbal sounds like laughing, sighing, and crying. Bark supports over 100 speaker presets and 13+ languages.
Suno
MIT
Slow
en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
5GB
No
2x
Bark Small
Standard
Bark Small is a distilled version of the Bark model that trades some audio quality for significantly faster inference and lower memory requirements. It retains Bark's ability to generate speech with emotions, laughter, and multiple languages.
Suno
MIT
Medium
en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
2GB
No
2x
CosyVoice 2
Standard
CosyVoice 2 by Alibaba's Tongyi Lab achieves human-comparable speech quality with very low latency, making it well suited to real-time applications. It uses a finite scalar quantization approach for streaming synthesis and supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control. It outperforms many commercial TTS systems in subjective evaluations.
Alibaba (Tongyi Lab)
Apache 2.0
Medium
en, zh, ja, ko, fr, de, it, es
4GB
Yes
2x
Dia TTS
Standard
Dia by Nari Labs is a 1.6B parameter text-to-speech model designed specifically for generating multi-speaker dialogue. It can produce natural-sounding conversations between two speakers with appropriate turn-taking, prosody, and emotional expression. Dia is a good fit for podcast-style content, audiobook dialogues, and interactive conversational AI.
Nari Labs
Apache 2.0
Medium
en
4GB
No
2x
Parler TTS
Standard
Parler TTS is a text-to-speech model that uses natural language voice descriptions to control the generated speech. Instead of selecting from preset voices, you describe the voice you want (e.g., "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech matching that description. This makes it especially flexible for creative applications.
Hugging Face
Apache 2.0
Medium
en
4GB
No
2x
Indic Parler TTS
Standard
Indic Parler TTS by AI4Bharat extends the Parler architecture to Indian languages, generating natural speech in Tamil, Bengali, Marathi, Gujarati, Kannada, Punjabi, Odia, Assamese, Hindi, Telugu, Malayalam, and English. Like Parler, you describe the voice you want in plain language and the model matches it, with no preset voices required. It is trained on AI4Bharat speech corpora for authentic pronunciation and prosody across the Indian subcontinent.
AI4Bharat
Apache 2.0
Slow
ta, bn, mr, gu, kn, pa, or, as, hi, te, ml, en
8GB
No
2x
KhanomTan TTS
Standard
KhanomTan TTS is an open Thai text-to-speech model built on the YourTTS multilingual architecture. Trained on CC0 and permissively licensed Thai corpora (TSync) alongside several other languages, it delivers natural Thai speech with multiple speaker voices. It is a clean, commercially usable option for Thai, a language that most open TTS models only cover under non-commercial licenses.
Wannaphong Phatthiyaphaibun
Apache 2.0
Fast
th
2GB
No
2x
IndexTTS-2
Standard
IndexTTS-2 is an advanced text-to-speech system that excels at zero-shot voice synthesis with fine-grained emotion control. It can generate speech with specific emotional tones such as happy, sad, angry, or fearful without requiring emotion-specific training data. The model uses emotion vectors to control the emotional expression of generated speech.
Index Team
Bilibili Model License
Medium
en, zh
4GB
Yes
2x
Spark TTS
Standard
Spark TTS by SparkAudio is a text-to-speech model that combines voice cloning with controllable emotion and speaking style. Using just 5 seconds of reference audio, it can clone a voice and then generate speech with different emotions, speeds, and styles while maintaining the cloned voice identity. Spark TTS uses a prompt-based control system.
SparkAudio
CC BY-NC-SA 4.0
Medium
en, zh
4GB
Yes
2x
GPT-SoVITS
Standard
GPT-SoVITS combines GPT-style language modeling with SoVITS for few-shot voice cloning. With as little as 5 seconds of reference audio, it can clone a voice and generate new speech while preserving the speaker's characteristics. It handles both speaking and singing voice synthesis.
RVC-Boss
MIT
Slow
en, zh, ja, ko
6GB
Yes
2x
Orpheus
Standard
Orpheus is a large-scale text-to-speech model that achieves human-level emotional expression. Trained on over 100,000 hours of diverse speech data, it excels at generating speech with natural emotions, emphasis, and speaking styles. Orpheus can produce speech that is virtually indistinguishable from human recordings.
Canopy Labs
Llama 3.2 Community
Medium
en
4GB
No
2x
Chatterbox
Premium
Chatterbox by Resemble AI is a voice cloning model. It can reproduce a voice from a single audio sample, capturing expressive and emotional nuance rather than only the timbre. Chatterbox also lets you steer the emotional variation of the output.
Resemble AI
MIT
Medium
en
4GB
Yes
4x
Tortoise TTS
Premium
Tortoise TTS is a multi-voice text-to-speech system with a strong emphasis on output quality.
James Betker
Apache 2.0
Slow
en
8GB
Yes
4x
StyleTTS 2
Premium
StyleTTS 2 targets human-level TTS synthesis through style diffusion and adversarial training with large speech language models.
Columbia University
MIT
Medium
en
4GB
No
4x
OpenVoice
Premium
OpenVoice by MyShell.ai clones a voice with fine-grained control over style, including emotion, accent, rhythm, pauses, and intonation. It can reproduce a voice from a short audio clip and generate speech in different languages while the speaker identity stays the same. OpenVoice can also be used to convert one voice into another.
MyShell.ai / MIT
MIT
Medium
en, zh, ja, ko, fr, es
4GB
Yes
4x
Qwen3 TTS
Standard
Qwen3-TTS is a 1.7 billion parameter text-to-speech model from Alibaba's Qwen team. It supports two modes: preset voices with emotion control (9 speakers), and a voice design mode where you describe the voice you want in natural language. It covers 10 languages with high expressiveness and natural prosody.
Alibaba (Qwen)
Apache 2.0
Medium
en, zh, ja, ko, de, fr, ru, pt, es, it
7GB
No
2x
VieNeu-TTS-v2
Standard
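VieNeu-TTS-v2 is a 300M parameter Vietnamese-first TTS model trained on 10,000+ hours of bilingual data. It supports en-vi code-switching, 7 preset voices spanning Northern and Southern accents, and instant voice cloning from 3-5 seconds of reference audio. It runs entirely on CPU via GGUF Q4 inference plus an ONNX audio decoder, so no GPU is needed, and generations finish in ~7 seconds. It is built on a Qwen3 backbone.
Phạm Nguyễn Ngọc Bảo
Apache 2.0
Fast
vi, en
CPU
Yes
2x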
Sesame CSM
Premium
Sesame CSM (Conversational Speech Model) is a 1 billion parameter model built specifically for conversational speech. It models the natural patterns of human conversation, including timing, pauses, emotional responses, and the flow of speech. CSM produces audio that resembles natural human conversation.
Sesame
Apache 2.0
Slow
en
8GB
No
4x
Chatterbox Turbo
Standard
Chatterbox Turbo by Resemble AI is a 350M parameter upgrade to Chatterbox, delivering up to 6x real-time speed with sub-200ms latency. It supports paralinguistic tags like [laugh], [cough], and [chuckle] directly in text. It includes Perth watermarking on all generated audio for provenance tracking.
Resemble AI
MIT
Fast
en
2GB
Yes
2x
VoxCPM
Standard
VoxCPM 1.5 by OpenBMB is a tokenizer-free TTS model that operates in continuous space rather than on discrete tokens. It produces high-fidelity 44.1kHz audio, supports zero-shot voice cloning from 3-10 seconds of audio, and maintains consistency across paragraphs. Cross-language cloning lets you apply an English voice to Chinese speech and vice versa.
OpenBMB
Apache 2.0
Fast
en, zh
4GB
Yes
2x
Kani TTS 2
Free
Kani-TTS-2 by NineNineSix is an ultra-lightweight 400M parameter model built on a Liquid AI LFM2 backbone with NVIDIA NanoCodec. It runs in just 3GB VRAM and produces ~10 seconds of speech in ~2 seconds on an A100 (RTF 0.2). The current public release ships an English-only `kani-tts-2-en` checkpoint and does not expose the speaker-embedding hook needed for voice cloning. Use Chatterbox / IndexTTS2 / F5-TTS for cloning, or Kokoro / MeloTTS for non-English.
NineNineSix
Apache 2.0
Fast
en
3GB
No
Free
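As a side note on the speed figures quoted in these cards, the real-time factor (RTF) is generation time divided by the duration of the audio produced. The sketch below works through the Kani TTS 2 numbers (about 2 seconds to produce about 10 seconds of speech); the function names are illustrative only.

```python
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the audio produced (lower is faster)."""
    return generation_seconds / audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than real time a given RTF corresponds to."""
    return 1.0 / rtf


kani_rtf = real_time_factor(generation_seconds=2.0, audio_seconds=10.0)
print(kani_rtf)                          # matches the RTF 0.2 quoted for Kani TTS 2
print(speedup_over_real_time(kani_rtf))  # about 5x faster than real time
```

By the same arithmetic, a model described as roughly 100x faster than real time has an RTF of about 0.01.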
OuteTTS
Free
OuteTTS extends large language models with text-to-speech capabilities while preserving the original architecture. It supports multiple backends including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, VLLM, and browser inference via Transformers.js. It features zero-shot voice cloning through speaker profiles saved as JSON.
OuteAI
Apache 2.0
Slow
en
2GB
No
Free
VibeVoice
Standard
VibeVoice from Microsoft generates long-form speech up to 90 minutes with support for 4 simultaneous speakers, making it well suited to podcasts and dialogues. The Realtime 0.5B variant achieves ~300ms latency for interactive use. It supports speaker tags for multi-turn dialogue generation.
Microsoft
MIT
Fast
en, zh
4GB
No
2x
Pocket TTS
Free
Pocket TTS by Kyutai (creators of Moshi) is a compact 100M parameter text-to-speech model that punches well above its weight. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it well suited to edge deployment and low-resource environments.
Kyutai
MIT
Fast
en, fr
1GB
Yes
Free
Kitten TTS
Free
Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. It features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. It is well suited to edge deployment and low-latency applications.
KittenML
Apache 2.0
Fast
en
0GB
No
Free
CosyVoice3
Standard
CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion, speed, and volume, and improved speaker similarity for zero-shot cloning. It supports 9 languages plus 18 Chinese dialects. The RL-tuned variant delivers state-of-the-art prosody.
Alibaba (FunAudioLLM)
Apache 2.0
Fast
en, zh, ja, ko, de, es, fr, it, ru
4GB
Yes
2x
NAMAA Saudi TTS
Standard
NAMAA Saudi TTS is a Saudi Arabic fine-tune of Resemble AI's ChatterboxMultilingual. Trained by NAMAA Space on authentic Saudi-dialect speech, it produces natural Modern Standard Arabic and Saudi colloquial pronunciation that generic multilingual models cannot match. It inherits Chatterbox's zero-shot voice cloning and emotion control via reference audio prompts. It is the first open-weights Arabic TTS deployed on TTS.ai.
NAMAA Space
MIT
Medium
ar
6GB
Yes
2x
Darwin TTS
Standard
Darwin-TTS-1.7B-Cross by FINAL-Bench is a research variant of Qwen3-TTS-1.7B in which 84 talker-FFN tensors (8.6%) are blended at α=3% with the matching tensors from Qwen3-1.7B-Base. The blend is built without retraining and produces noticeably crisper cross-lingual voice cloning across Korean, English, Japanese, and Chinese. It operates in zero-shot voice-clone mode (3 seconds of reference audio).
FINAL-Bench
Apache 2.0
Medium
en, ko, ja, zh
7GB
Yes
2x
MOSS-TTSD
Standard
MOSS-TTSD v1.0 from OpenMOSS is a 7B dialogue text-to-speech model that continues conversations from a short audio prompt. It supports up to 5 simultaneous speakers via [S1]/[S2] tags, zero-shot voice cloning from 3-10s of reference audio, and up to 60 minutes of coherent multi-turn dialogue across 20 languages. It is distinct from MOSS-TTS: TTSD is specialized for podcast, audiobook, and dubbing workflows.
OpenMOSS
Apache 2.0
Medium
en, zh
12GB
Yes
2x
Ming-Omni TTS
Free
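Ming-omni-tts-0.5B by inclusionAI is a compact omni-modal speech model built on the BailingMM dense backbone with a Patch-by-Patch flow-matching audio decoder. It delivers 44.1kHz output (near CD quality), supports zero-shot voice cloning from a 3+ second reference, and includes built-in emotion / dialect / BGM control via JSON instructions. It is highly stable, with 0.83% WER on Chinese benchmarks.
inclusionAI
Apache 2.0
Medium
en, zh
3GB
Yes
Free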
MOSS-TTS Nano
Free
MOSS-TTS-Nano-100M is OpenMOSS's compact 100M-parameter variant of the MOSS-TTS family, sharing the delay-transformer architecture. It trades the 8B model's peak quality for ~80x smaller weights and dramatically lower per-request VRAM, making it suitable for free-tier and high-throughput deployments. It has the same 20-language reach.
OpenMOSS
Apache 2.0
Fast
en, zh, de, es, fr, ja, it, ko, ru, ar, pt
2GB
Yes
Free
Kokoro
Free
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
Hexgrad
Apache 2.0
Fast
Piper
Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Rhasspy
MIT
Fast
VITS
Free
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Jaehyeon Kim et al.
MIT
Fast
MeloTTS
Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
MyShell.ai
MIT
Fast
Kani TTS 2
Free
Kani-TTS-2 by NineNineSix is an ultra-lightweight 400M parameter model built on a Liquid AI LFM2 backbone with NVIDIA NanoCodec. It runs in just 3GB VRAM and produces ~10 seconds of speech in ~2 seconds on an A100 (RTF 0.2). The current public release ships an English-only `kani-tts-2-en` checkpoint and does not expose the speaker-embedding hook needed for voice cloning — use Chatterbox / IndexTTS2 / F5-TTS for cloning, or Kokoro / MeloTTS for non-English.
NineNineSix
Apache 2.0
Fast
OuteTTS
Free
OuteTTS extends large language models with text-to-speech capabilities while preserving the original architecture. It supports multiple backends including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, VLLM, and even browser inference via Transformers.js. Features zero-shot voice cloning through speaker profiles saved as JSON.
OuteAI
Apache 2.0
Slow
Pocket TTS
Free
Pocket TTS by Kyutai (creators of Moshi) is a compact 100M parameter text-to-speech model that punches well above its weight. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it ideal for edge deployment and low-resource environments.
Kyutai
MIT
Fast
Kitten TTS
Free
Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. Features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. Ideal for edge deployment and low-latency applications.
KittenML
Apache 2.0
Fast
Ming-Omni TTS
Free
Ming-omni-tts-0.5B by inclusionAI is a compact omni-modal speech model built on the BailingMM dense backbone with a Patch-by-Patch flow-matching audio decoder. Delivers 44.1kHz output (near CD quality), supports zero-shot voice cloning from a 3+ second reference, and includes built-in emotion / dialect / BGM control via JSON instructions. Excellent stability — 0.83% WER on Chinese benchmarks.
inclusionAI
Apache 2.0
Medium
MOSS-TTS Nano
Free
MOSS-TTS-Nano-100M is OpenMOSS's compact 100M-parameter variant of the MOSS-TTS family, sharing the delay-transformer architecture. Trades the 8B model's peak quality for ~80x smaller weights and dramatically lower per-request VRAM, making it suitable for free-tier and high-throughput deployments. Same 20-language reach.
OpenMOSS
Apache 2.0
Fast
Bark
Standard
Bark by Suno is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio like music, background noise, and sound effects. It can produce nonverbal communications like laughing, sighing, and crying. Bark supports over 100 speaker presets and 13+ languages.
Suno
MIT
Slow
en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
No
Bark Small
Standard
Bark Small is a distilled version of the Bark model that trades some audio quality for significantly faster inference speeds and lower memory requirements. It retains Bark's ability to generate speech with emotions, laughter, and multiple languages.
Suno
MIT
Medium
en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
No
CosyVoice 2
Standard
CosyVoice 2 by Alibaba's Tongyi Lab achieves human-comparable speech quality with extremely low latency, making it ideal for real-time applications. It uses a finite scalar quantization approach for streaming synthesis and supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control. It outperforms many commercial TTS systems in subjective evaluations.
Alibaba (Tongyi Lab)
Apache 2.0
Medium
en, zh, ja, ko, fr, de, it, es
Yes
Dia TTS
Standard
Dia by Nari Labs is a 1.6B parameter text-to-speech model designed specifically for generating multi-speaker dialogue. It can produce natural-sounding conversations between two speakers with appropriate turn-taking, prosody, and emotional expression. Dia is perfect for creating podcast-style content, audiobook dialogues, and interactive conversational AI.
Nari Labs
Apache 2.0
Medium
en
No
Parler TTS
Standard
Parler TTS is a text-to-speech model that uses natural language voice descriptions to control the generated speech. Instead of selecting from preset voices, you describe the voice you want (e.g., "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech matching that description. This makes it uniquely flexible for creative applications.
Hugging Face
Apache 2.0
Medium
en
No
Indic Parler TTS
Standard
Indic Parler TTS by AI4Bharat extends the Parler architecture to Indian languages, generating natural speech in Tamil, Bengali, Marathi, Gujarati, Kannada, Punjabi, Odia, Assamese, Hindi, Telugu, Malayalam and English. Like Parler, you describe the voice you want in plain language and the model matches it — no preset voices required. Trained on AI4Bharat speech corpora for authentic pronunciation and prosody across the Indian subcontinent.
AI4Bharat
Apache 2.0
Slow
ta, bn, mr, gu, kn, pa, or, as, hi, te, ml, en
No
KhanomTan TTS
Standard
KhanomTan TTS is an open Thai text-to-speech model built on the YourTTS multilingual architecture. Trained on CC0 and permissively-licensed Thai corpora (TSync) alongside several other languages, it delivers natural Thai speech with multiple speaker voices. A clean, commercially-usable option for Thai — the language most open TTS models only cover under non-commercial licenses.
Wannaphong Phatthiyaphaibun
Apache 2.0
Fast
th
No
IndexTTS-2
Standard
IndexTTS-2 is an advanced text-to-speech system that excels at zero-shot voice synthesis with fine-grained emotion control. It can generate speech with specific emotional tones like happy, sad, angry, or fearful without requiring emotion-specific training data. The model uses emotion vectors to precisely control the emotional expression of generated speech.
Index Team
Bilibili Model License
Medium
en, zh
Yes
Spark TTS
Standard
Spark TTS by SparkAudio is a text-to-speech model that combines voice cloning with controllable emotion and speaking style. Using just 5 seconds of reference audio, it can clone a voice and then generate speech with different emotions, speeds, and styles while maintaining the cloned voice identity. Spark TTS uses a prompt-based control system.
SparkAudio
CC BY-NC-SA 4.0
Medium
en, zh
Yes
GPT-SoVITS
Standard
GPT-SoVITS combines GPT-style language modeling with SoVITS for few-shot voice cloning. With as little as 5 seconds of reference audio, it can clone a voice and generate new speech while preserving the speaker's characteristics. It handles both speaking and singing voice synthesis.
RVC-Boss
MIT
Slow
en, zh, ja, ko
Yes
Orpheus
Standard
Orpheus is a large-scale text-to-speech model that achieves human-level emotional expression. Trained on over 100,000 hours of diverse speech data, it excels at generating speech with natural emotions, emphasis, and speaking styles. Orpheus can produce speech that is virtually indistinguishable from human recordings.
Canopy Labs
Llama 3.2 Community
Medium
en
No
Qwen3 TTS
Standard
Qwen3-TTS is a 1.7 billion parameter text-to-speech model from Alibaba's Qwen team. It supports two modes: preset voices with emotion control (9 speakers), and a unique voice design mode where you describe the voice you want in natural language. It covers 10 languages with high expressiveness and natural prosody.
Alibaba (Qwen)
Apache 2.0
Medium
en, zh, ja, ko, de, fr, ru, pt, es, it
No
VieNeu-TTS-v2
Standard
VieNeu-TTS-v2 is a 300M parameter Vietnamese-first TTS model trained on 10,000+ hours of bilingual data. It supports seamless en-vi code-switching, 7 preset voices spanning Northern and Southern accents, and instant voice cloning from 3-5 seconds of reference audio. Runs entirely on CPU via GGUF Q4 inference + ONNX audio decoder — no GPU needed, generations finish in ~7 seconds. Built on a Qwen3 backbone.
Phạm Nguyễn Ngọc Bảo
Apache 2.0
Fast
vi, en
Yes
Chatterbox Turbo
Standard
Chatterbox Turbo by Resemble AI is a 350M parameter upgrade to Chatterbox, delivering up to 6x real-time speed with sub-200ms latency. It supports paralinguistic tags like [laugh], [cough], and [chuckle] directly in text. Includes Perth watermarking on all generated audio for provenance tracking.
Resemble AI
MIT
Fast
en
Yes
VoxCPM
Standard
VoxCPM 1.5 by OpenBMB is a novel tokenizer-free TTS model that operates in continuous space rather than discrete tokens. It produces high-fidelity 44.1kHz audio, supports zero-shot voice cloning from 3-10 seconds, and maintains consistency across paragraphs. Cross-language cloning lets you apply an English voice to Chinese speech and vice versa.
OpenBMB
Apache 2.0
Fast
en, zh
Yes
VibeVoice
Standard
VibeVoice from Microsoft generates long-form speech up to 90 minutes with support for 4 simultaneous speakers, making it ideal for podcasts and dialogues. The Realtime 0.5B variant achieves ~300ms latency for interactive use. Supports speaker tags for multi-turn dialogue generation.
Microsoft
MIT
Fast
en, zh
No
CosyVoice3
Standard
CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion/speed/volume, and improved speaker similarity for zero-shot cloning. Supports 9 languages plus 18 Chinese dialects. RL-tuned variant delivers state-of-the-art prosody.
Alibaba (FunAudioLLM)
Apache 2.0
Fast
en, zh, ja, ko, de, es, fr, it, ru
Yes
NAMAA Saudi TTS
Standard
NAMAA Saudi TTS is a Saudi Arabic fine-tune of Resemble AI's ChatterboxMultilingual. Trained by NAMAA Space on authentic Saudi-dialect speech, it produces natural Modern Standard Arabic and Saudi colloquial pronunciation that generic multilingual models cannot match. Inherits Chatterbox's zero-shot voice cloning and emotion control via reference audio prompts. The first open-weights Arabic TTS deployed on TTS.ai.
NAMAA Space
MIT
Medium
ar
Yes
Darwin TTS
Standard
Darwin-TTS-1.7B-Cross by FINAL-Bench is a research variant of Qwen3-TTS-1.7B where 84 talker-FFN tensors (8.6%) are blended at α=3% with the matching tensors from Qwen3-1.7B-Base. The blend is built without retraining and produces noticeably crisper cross-lingual voice cloning across Korean, English, Japanese, and Chinese. Operates in zero-shot voice-clone mode (3 seconds reference audio).
FINAL-Bench
Apache 2.0
Medium
en, ko, ja, zh
Yes
MOSS-TTSD
Standard
MOSS-TTSD v1.0 from OpenMOSS is a 7B dialogue text-to-speech model that continues conversations from a short audio prompt. Supports up to 5 simultaneous speakers via [S1]/[S2] tags, zero-shot voice cloning from 3-10s reference audio, and up to 60 minutes of coherent multi-turn dialogue across 20 languages. Distinct from MOSS-TTS — TTSD is specialized for podcast/audiobook/dubbing workflows.
OpenMOSS
Apache 2.0
Medium
en, zh
Yes
Model comparison
| Model | Developer | Tier | Speed | Languages | Voice cloning | VRAM | License | Cost |
|---|---|---|---|---|---|---|---|---|
| Kokoro | Hexgrad | Free | Fast | 8 | No | 1.5GB | Apache 2.0 | Free |
| Piper | Rhasspy | Free | Fast | 42 | No | 0 (CPU only) | MIT | Free |
| VITS | Jaehyeon Kim et al. | Free | Fast | 11 | No | 1GB | MIT | Free |
| MeloTTS | MyShell.ai | Free | Fast | 6 | No | 0.5GB (GPU optional) | MIT | Free |
| Bark | Suno | Standard | Slow | 13 | No | 5GB | MIT | 2x |
| Bark Small | Suno | Standard | Medium | 13 | No | 2GB | MIT | 2x |
| CosyVoice 2 | Alibaba (Tongyi Lab) | Standard | Medium | 8 | Yes | 4GB | Apache 2.0 | 2x |
| Dia TTS | Nari Labs | Standard | Medium | 1 | No | 4GB | Apache 2.0 | 2x |
| Parler TTS | Hugging Face | Standard | Medium | 1 | No | 4GB | Apache 2.0 | 2x |
| Indic Parler TTS | AI4Bharat | Standard | Slow | 12 | No | 8GB | Apache 2.0 | 2x |
| KhanomTan TTS | Wannaphong Phatthiyaphaibun | Standard | Fast | 1 | No | 2GB | Apache 2.0 | 2x |
| IndexTTS-2 | Index Team | Standard | Medium | 2 | Yes | 4GB | Bilibili Model License | 2x |
| Spark TTS | SparkAudio | Standard | Medium | 2 | Yes | 4GB | CC BY-NC-SA 4.0 | 2x |
| GPT-SoVITS | RVC-Boss | Standard | Slow | 4 | Yes | 6GB | MIT | 2x |
| Orpheus | Canopy Labs | Standard | Medium | 1 | No | 4GB | Llama 3.2 Community | 2x |
| Chatterbox | Resemble AI | Premium | Medium | 1 | Yes | 4GB | MIT | 4x |
| Tortoise TTS | James Betker | Premium | Slow | 1 | Yes | 8GB | Apache 2.0 | 4x |
| StyleTTS 2 | Columbia University | Premium | Medium | 1 | No | 4GB | MIT | 4x |
| OpenVoice | MyShell.ai / MIT | Premium | Medium | 6 | Yes | 4GB | MIT | 4x |
| Qwen3 TTS | Alibaba (Qwen) | Standard | Medium | 10 | No | 7GB | Apache 2.0 | 2x |
| VieNeu-TTS-v2 | Phạm Nguyễn Ngọc Bảo | Standard | Fast | 2 | Yes | CPU | Apache 2.0 | 2x |
| Sesame CSM | Sesame | Premium | Slow | 1 | No | 8GB | Apache 2.0 | 4x |
| Chatterbox Turbo | Resemble AI | Standard | Fast | 1 | Yes | 2GB | MIT | 2x |
| VoxCPM | OpenBMB | Standard | Fast | 2 | Yes | 4GB | Apache 2.0 | 2x |
| Kani TTS 2 | NineNineSix | Free | Fast | 1 | No | 3GB | Apache 2.0 | Free |
| OuteTTS | OuteAI | Free | Slow | 1 | No | 2GB | Apache 2.0 | Free |
| VibeVoice | Microsoft | Standard | Fast | 2 | No | 4GB | MIT | 2x |
| Pocket TTS | Kyutai | Free | Fast | 2 | Yes | 1GB | MIT | Free |
| Kitten TTS | KittenML | Free | Fast | 1 | No | 0GB | Apache 2.0 | Free |
| CosyVoice3 | Alibaba (FunAudioLLM) | Standard | Fast | 9 | Yes | 4GB | Apache 2.0 | 2x |
| NAMAA Saudi TTS | NAMAA Space | Standard | Medium | 1 | Yes | 6GB | MIT | 2x |
| Darwin TTS | FINAL-Bench | Standard | Medium | 4 | Yes | 7GB | Apache 2.0 | 2x |
| MOSS-TTSD | OpenMOSS | Standard | Medium | 2 | Yes | 12GB | Apache 2.0 | 2x |
| Ming-Omni TTS | inclusionAI | Free | Medium | 2 | Yes | 3GB | Apache 2.0 | Free |
| MOSS-TTS Nano | OpenMOSS | Free | Fast | 11 | Yes | 2GB | Apache 2.0 | Free |
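For readers who want to shortlist models from the comparison table, for example by available VRAM and license, a sketch along these lines may help. It copies only a handful of rows from the table; the `Model` type and `shortlist` function are hypothetical helpers, not part of any TTS.ai tooling.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    tier: str
    speed: str
    languages: int
    vram_gb: float
    license: str


# A few rows copied from the comparison table (not the full catalogue).
MODELS = [
    Model("Kokoro", "Free", "Fast", 8, 1.5, "Apache 2.0"),
    Model("Piper", "Free", "Fast", 42, 0.0, "MIT"),
    Model("Bark", "Standard", "Slow", 13, 5.0, "MIT"),
    Model("Spark TTS", "Standard", "Medium", 2, 4.0, "CC BY-NC-SA 4.0"),
    Model("Tortoise TTS", "Premium", "Slow", 1, 8.0, "Apache 2.0"),
    Model("MOSS-TTSD", "Standard", "Medium", 2, 12.0, "Apache 2.0"),
]


def shortlist(models, max_vram_gb, licenses=("MIT", "Apache 2.0")):
    """Keep models that fit the VRAM budget and license list, widest language coverage first."""
    fits = [m for m in models if m.vram_gb <= max_vram_gb and m.license in licenses]
    return sorted(fits, key=lambda m: (-m.languages, m.vram_gb))


for m in shortlist(MODELS, max_vram_gb=6):
    print(f"{m.name}: {m.languages} languages, {m.vram_gb} GB, {m.license}")
```

With a 6 GB budget this keeps Piper, Bark, and Kokoro, and drops Spark TTS because its license is not in the allowed list.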
The leading text-to-speech platform
What does TTS.ai do?
TTS.ai brings leading open-source text-to-speech models and voice tools together in one simple web app. It gives you access to 20+ models from leading research labs, including Coqui, MyShell, Amphion, NVIDIA, Suno, Hugging Face, and Tsinghua University.
Most models are released under MIT, Apache 2.0, or a similarly permissive license, which allows commercial use of the generated audio in your projects. A few are more restrictive (Spark TTS, for example, is listed under CC BY-NC-SA 4.0), so check the license shown for each model.
Free models, no account required
Get started right away with three free TTS models: Piper (ultra-fast, lightweight), VITS (high-quality neural synthesis), and MeloTTS (multi-language support). No sign-up, no credit card, no limits on generations. The free models support English and other languages, with natural-sounding output suitable for many applications.
GPU-accelerated inference
All TTS models run on dedicated NVIDIA GPUs for fast, consistent generation times. Free models generate audio in under 2 seconds. Standard models such as Kokoro, CosyVoice 2, and Bark generate audio in 3-5 seconds. Premium models such as Tortoise and Chatterbox generate audio in 5-15 seconds, depending on the length of the text.
30+ languages supported
Generate speech in over 30 languages, including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, Russian, and many more.
API access
Integrate TTS.ai into your applications with a simple REST API. One endpoint for 20+ models. Python, JavaScript, cURL, and Go SDKs. Streaming support for long-form content. Webhooks for async notifications.
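The page does not document the API schema, so the following is only an illustration of what a single-endpoint JSON request could look like. Everything specific here is hypothetical: the URL, the bearer-token header, and the field names `model`, `text`, `speed`, and `format` are placeholders. The sketch builds the request object without sending it.

```python
import json
import urllib.request

# HYPOTHETICAL: the URL, auth header, and JSON field names are placeholders;
# the real TTS.ai API schema is not described on this page.
API_URL = "https://api.example.invalid/v1/tts"


def build_request(text, model="kokoro", speed=1.0, audio_format="mp3", api_key="YOUR_KEY"):
    """Build (but do not send) a POST request carrying a JSON synthesis payload."""
    payload = {"model": model, "text": text, "speed": speed, "format": audio_format}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_request("Hello from a sketch.")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Consult the official API documentation for the actual endpoint, authentication scheme, and parameters before adapting this.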
Frequently asked questions
What can we improve? Your feedback helps us resolve open issues.
Start converting text to speech with TTS.ai