TTS Arena - AI Voice Model Leaderboard

Compare 20+ text-to-speech models: official benchmarks, community ratings, and side-by-side comparison.

Side-by-Side Comparison

Enter text, pick two models, and compare the results. No particular model tier is required to get started.

Free models work without an account. Upgrade to compare premium models.

Model Leaderboard

| # | Model | Description | Params | Training data | Year | Official | Community | Speed | Tier |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Kokoro | Lightweight 82M parameter model delivering studio-quality speech with blazing-fast inference. | 82M | 1200h | 2024 | 4.8/5 | 5.0/5 (1 vote) | fast | Free |
| 2 | CosyVoice 2 | Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency. | 300M | 200000h | 2024 | 4.26/5 | No votes | medium | Standard |
| 3 | Chatterbox | State-of-the-art zero-shot voice cloning with emotion control from Resemble AI. | 300M | - | 2025 | 4.25/5 | No votes | medium | Premium |
| 4 | StyleTTS 2 | Human-level text-to-speech through style diffusion and adversarial training. | 100M | 585h | 2024 | 4.23/5 | No votes | medium | Premium |
| 5 | Piper | A fast, local neural text to speech system optimized for Raspberry Pi and embedded devices. | 15M | - | 2023 | 4.15/5 | No votes | fast | Free |
| 6 | MeloTTS | High-quality multilingual text-to-speech that runs on CPU with minimal latency. | 25M | - | 2024 | 4.13/5 | No votes | fast | Free |
| 7 | Dia TTS | Multi-speaker dialog generation model that creates natural conversations between speakers. | 1.6B | - | 2024 | 4.09/5 | No votes | medium | Standard |
| 8 | VITS | Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech. | 25M | 585h | 2021 | 4.0/5 | No votes | fast | Free |
| 9 | Orpheus | Human-level emotional TTS model trained on 100K hours of speech data. | 3B | 100000h | 2025 | 4.0/5 | No votes | medium | Standard |
| 10 | OpenVoice | Instant voice cloning with granular control over style, emotion, and accent. | 300M | - | 2024 | 4.0/5 | No votes | medium | Premium |
| 11 | IndexTTS-2 | Zero-shot TTS with fine-grained emotion control and high expressiveness. | 300M | - | 2025 | 3.91/5 | No votes | medium | Standard |
| 12 | Spark TTS | Voice cloning TTS with controllable emotion and speaking style via prompts. | 500M | - | 2025 | 3.9/5 | No votes | medium | Standard |
| 13 | Parler TTS | Describe the voice you want in natural language and Parler generates matching speech. | 880M | 45000h | 2024 | 3.83/5 | No votes | medium | Standard |
| 14 | Tortoise TTS | Multi-voice text-to-speech focused on quality with autoregressive architecture. | 400M | 50000h | 2022 | 3.7/5 | No votes | slow | Premium |
| 15 | Bark | Transformer-based text-to-audio model that generates realistic speech, music, and sound effects. | 350M | 100000h | 2023 | 3.57/5 | No votes | slow | Standard |
| 16 | Bark Small | Lighter version of Bark with faster inference and lower memory usage. | 150M | 100000h | 2023 | - | No votes | medium | Standard |
| 17 | GPT-SoVITS | Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio. | 200M | - | 2024 | - | No votes | slow | Standard |
| 18 | Qwen3 TTS | Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text. | 1.7B | - | 2025 | - | No votes | medium | Standard |

Detailed Benchmark Scores

Official TTS.ai benchmark scores across three dimensions: naturalness, accuracy, and speed.

| Model | Tier | Naturalness | Accuracy | Speed | Overall |
|---|---|---|---|---|---|
| Kokoro | Free | 4.8/5 | 4.7/5 | 4.9/5 | 4.8/5 |
| CosyVoice 2 | Standard | 4.5/5 | 4.4/5 | 3.8/5 | 4.26/5 |
| Chatterbox | Premium | 4.7/5 | 4.5/5 | 3.4/5 | 4.25/5 |
| StyleTTS 2 | Premium | 4.5/5 | 4.3/5 | 3.8/5 | 4.23/5 |
| Piper | Free | 3.5/5 | 4.2/5 | 4.95/5 | 4.15/5 |
| MeloTTS | Free | 3.8/5 | 4.1/5 | 4.6/5 | 4.13/5 |
| Dia TTS | Standard | 4.6/5 | 4.3/5 | 3.2/5 | 4.09/5 |
| VITS | Free | 3.4/5 | 4.0/5 | 4.8/5 | 4.0/5 |
| Orpheus | Standard | 4.3/5 | 4.1/5 | 3.5/5 | 4.0/5 |
| OpenVoice | Premium | 4.0/5 | 4.1/5 | 3.9/5 | 4.0/5 |
| IndexTTS-2 | Standard | 4.3/5 | 4.1/5 | 3.2/5 | 3.91/5 |
| Spark TTS | Standard | 4.2/5 | 4.0/5 | 3.4/5 | 3.9/5 |
| Parler TTS | Standard | 4.1/5 | 3.9/5 | 3.4/5 | 3.83/5 |
| Tortoise TTS | Premium | 4.6/5 | 4.4/5 | 1.8/5 | 3.7/5 |
| Bark | Standard | 4.2/5 | 3.8/5 | 2.5/5 | 3.57/5 |

Benchmark Methodology

Test Setup

  • Hardware: 4x NVIDIA Tesla P40 (24GB VRAM each), 96GB total
  • Test text: 5 standard passages covering different speech styles (including dramatic, conversational, technical, and emotional).
  • Evaluation: Automated metrics (MOS estimates, WER, RTF) are combined with human listening tests.
  • Runs: Each model was tested 10 times per passage, and the scores were averaged
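
As an illustration of one of the automated metrics named above, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognized transcript, divided by the number of reference words. The sketch below is a generic implementation under that definition; the function name and the lowercase/whitespace normalization are assumptions, and the source does not say how TTS Arena computes WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: simple lowercase + whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the quick brown fox", "the quick fox")` counts one deleted word out of four reference words. A lower WER means the synthesized speech was transcribed more faithfully.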

Scoring Criteria

  • Naturalness (40%): Prosody, intonation, rhythm, emotion - how human-like does it sound?
  • Accuracy (30%): Pronunciation accuracy, word error rate, intelligibility
  • Speed (30%): Weighted real-time factor (audio duration / generation time). Higher = faster.
  • Overall: Weighted average: 0.4 x Naturalness + 0.3 x Accuracy + 0.3 x Speed
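
The weighted average above can be sketched as follows. The function names, the use of `Decimal`, and the half-up rounding to two decimals are assumptions made for illustration. The real-time factor helper follows the ratio given in the list (audio duration divided by generation time); how that ratio is mapped onto the 5-point speed score is not stated in the source, so it is not modeled here.

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights taken from the scoring criteria: 40% / 30% / 30%.
WEIGHTS = {
    "naturalness": Decimal("0.4"),
    "accuracy": Decimal("0.3"),
    "speed": Decimal("0.3"),
}


def overall_score(naturalness: str, accuracy: str, speed: str) -> Decimal:
    """Weighted average of the three 5-point scores, rounded to 2 decimals."""
    scores = {"naturalness": naturalness, "accuracy": accuracy, "speed": speed}
    total = sum(WEIGHTS[name] * Decimal(value) for name, value in scores.items())
    # Assumption: scores are rounded half up for display.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Audio duration divided by generation time; higher means faster."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds
```

Using the published Piper scores, `overall_score("3.5", "4.2", "4.95")` evaluates 1.40 + 1.26 + 1.485 = 4.145, which rounds to the listed 4.15. Scores are passed as strings so that `Decimal` avoids binary floating-point error at that rounding boundary.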

Note: Benchmark scores reflect performance on our hardware with specific test texts. Real-world performance may vary depending on the input text, language, and voice selection. Community ratings provide an additional signal based on a range of real-world use.

Frequently Asked Questions

TTS Arena is a leaderboard platform that ranks AI text-to-speech models based on official benchmark tests and community ratings. Compare models side by side, listen to samples, and vote for the ones you prefer.

We run standardized tests on each model using identical text passages, hardware, and evaluation criteria. Scores combine naturalness (how human-like the speech sounds), accuracy (pronunciation and intelligibility), and speed (generation time). All tests run on our GPU system with NVIDIA Tesla P40 GPUs.

Yes! Click the stars next to a model to rate it from 1 to 5. You must be logged in to vote. Your rating contributes to the community score shown on the leaderboard. You can change your rating at any time.

Enter some text, select two models, and click Compare. Both models generate speech from the same text at the same time. Then vote for the one that sounds better. This blind comparison helps identify the best model for your specific needs.

Naturalness measures how human-like the speech sounds (prosody, intonation, rhythm). Accuracy measures the correctness and clarity of pronunciation. Speed measures how quickly the model generates audio relative to real time. Overall is a weighted average of all the metrics.

Models without benchmark scores were added recently and are awaiting testing, or they require special setup (such as gated access tokens) that is still pending.

Official benchmarks are updated when models receive major updates or when new models are added. Community ratings update in real time as users vote. Leaderboard data is cached for 5 minutes for performance.

Free models (Kokoro, Piper, VITS, MeloTTS) cost 0 credits. Standard models cost 2 credits per 1,000 characters. Premium models cost 4 credits per 1,000 characters and typically offer the highest quality or unique features such as voice cloning.
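
The per-tier rates above lend themselves to a quick cost estimate. The sketch below is a hypothetical helper, not a TTS.ai API: the function name is invented, and it assumes usage is billed in proportion to character count and rounded up to a whole credit, which the source does not confirm.

```python
# Credits per 1,000 characters, as stated for each tier.
CREDITS_PER_1000_CHARS = {"free": 0, "standard": 2, "premium": 4}


def estimate_credits(text: str, tier: str) -> int:
    """Estimate the credits needed to synthesize `text` with a model tier."""
    rate = CREDITS_PER_1000_CHARS[tier.lower()]
    # Assumption: proportional billing, rounded up to the next whole credit.
    # Integer ceiling division avoids floating-point rounding surprises.
    return (len(text) * rate + 999) // 1000
```

Under these assumptions, a 2,500-character script would cost 5 credits on a Standard model and 10 credits on a Premium model, and nothing on a Free model.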

For most use cases, Kokoro (free tier) delivers excellent quality. For voice cloning, try Chatterbox or CosyVoice 2. For multilingual content, try MeloTTS or CosyVoice 2. For conversational speech, try Bark or Dia. Use the comparison tool to test with your own text.

Yes, you can generate and compare audio from two models without an account by using the free models. Voting on models requires a free account. Premium model comparisons require credits.

We strive for fairness by using standardized test texts, identical hardware, and consistent evaluation criteria across all models. Community ratings provide an additional independent signal. Our methodology is described in the Benchmark Methodology section on this page.

Models are ranked first by overall benchmark score, then by community rating. Models without benchmark scores are ranked below those that have them and are ordered by community rating.
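
The ordering rule described above can be expressed as a composite sort key. This is a minimal sketch, not the site's code: the `Model` type and `rank_models` function are hypothetical, and it assumes that a model with no votes sorts as if it had the lowest community rating.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    name: str
    benchmark: Optional[float]  # overall benchmark score, None if not benchmarked
    community: Optional[float]  # average community rating, None if no votes


def rank_models(models: List[Model]) -> List[Model]:
    """Benchmarked models first (highest score first), then by community rating."""

    def key(m: Model):
        has_benchmark = m.benchmark is not None
        return (
            0 if has_benchmark else 1,                # unbenchmarked models go last
            -(m.benchmark if has_benchmark else 0.0), # higher benchmark first
            # Assumption: no votes is treated as the lowest community rating.
            -(m.community if m.community is not None else 0.0),
        )

    # sorted() is stable, so full ties keep their input order.
    return sorted(models, key=key)
```

With the leaderboard values `Model("Bark Small", None, None)`, `Model("CosyVoice 2", 4.26, None)`, and `Model("Kokoro", 4.8, 5.0)`, the function returns Kokoro, then CosyVoice 2, then Bark Small.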

Find Your Perfect Voice

Try a free model such as Kokoro, Piper, VITS, or MeloTTS. No account required.