AI Text to Speech

Convert text into natural-sounding speech with open-source AI models. Free to use, no account required.

Enter your text. Input is limited to 500 characters without an account; sign up to raise the limit to 5,000 characters per generation.

SSML mode (Speech Synthesis Markup Language, for precise control)

Wrap your text in SSML tags for precise control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
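The SSML wrapper shown above can also be generated programmatically, so that special characters in the input are escaped correctly. The following is a minimal sketch using only the Python standard library; the function name `build_ssml` is hypothetical and is not part of any TTS.ai API.

```python
import xml.etree.ElementTree as ET


def build_ssml(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate="..."> and return it as a string."""
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    # ElementTree escapes &, < and > when the tree is serialized.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Slow speech")
print(ssml)
```

Building the markup through an XML library rather than string concatenation keeps the document well-formed even when the text contains characters such as `&`.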

Emotion / style tags

Add emotion tags to influence delivery (support varies by model).

Instruction text

Set free-form instructions (field = instruction).

Pitch: 0 (range -12 to +12)

Dialogue mode: use the [S1] and [S2] tags to mark different speakers. For example:

[S1] Hey there! [S2] Hey, how are you?
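To check a script before submitting it, the speaker tags can be split into individual turns. This is a small sketch under the assumption that tags always have the form `[S<number>]`; the helper name `split_dialogue` is hypothetical.

```python
import re

# Matches speaker tags such as [S1] or [S2] and captures the label.
SPEAKER_TAG = re.compile(r"\[(S\d+)\]")


def split_dialogue(script: str) -> list[tuple[str, str]]:
    """Split a tagged script into (speaker, line) pairs."""
    # With one capture group, re.split returns:
    # [text before first tag, tag, text, tag, text, ...]
    parts = SPEAKER_TAG.split(script)
    turns = []
    for speaker, line in zip(parts[1::2], parts[2::2]):
        line = line.strip()
        if line:  # skip tags that have no text after them
            turns.append((speaker, line))
    return turns


turns = split_dialogue("[S1] Hey there! [S2] Hey, how are you?")
print(turns)
```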



                
                
                    
                    
AI model
Language
Voice
Output format
Speed: 1.0x (range 0.5x to 2.0x)

Free with Piper, VITS, and MeloTTS
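The settings above can be validated on the client side before a request is made. The sketch below is illustrative only: the class name `TTSSettings`, its field names, and the default format are assumptions. The speed range comes from the slider above and the format list (MP3, WAV, OGG, FLAC) from the how-it-works section of this page.

```python
from dataclasses import dataclass

ALLOWED_FORMATS = {"mp3", "wav", "ogg", "flac"}  # formats named on this page
MIN_SPEED, MAX_SPEED = 0.5, 2.0                  # slider range shown above


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container for one generation request's settings."""
    model: str
    voice: str
    output_format: str = "mp3"  # assumed default, not stated on the page
    speed: float = 1.0

    def __post_init__(self):
        if self.output_format.lower() not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {self.output_format!r}")
        if not MIN_SPEED <= self.speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")


settings = TTSSettings(model="piper", voice="example-voice", speed=1.25)
print(settings)
```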



        
        
            
Your generated audio will appear here. Select a model, enter your text, and click Generate.

Download audio. The link expires in 24 hours.



    
    
        
        
            
Model details

Piper (Free)

Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.

Developer: Rhasspy
License: MIT
Speed: Fast
Languages: 31
VRAM: 0 (CPU only)
Voice cloning: Not supported

Features: CPU-friendly, offline capable, 100+ voices, 30+ languages, SSML support

Best for: quick previews, accessibility, and embedded applications
                
                
            
        

        
        
            
Tips for better results

Use proper punctuation for natural pauses and intonation.
Spell out numbers and abbreviations for clearer speech.
Add commas to create short pauses between phrases.
Use ellipses (...) for longer pauses.
Try Kokoro or CosyVoice 2 for the most natural results.
Use Dia for multi-speaker dialogue and podcast content.
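The tip about spelling out numbers and abbreviations can be automated with a small preprocessing step. The sketch below is a simplified illustration, not what TTS.ai does internally: the abbreviation table is a made-up sample, and only standalone integers from 0 to 999 are converted (numbers such as 1,000 or 3.14 are left untouched).

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Illustrative sample only; extend for your own content.
ABBREVIATIONS = {"Dr.": "Doctor", "approx.": "approximately"}

# Standalone 1-3 digit integers that are not part of 1,000 or 3.14 style numbers.
SMALL_INT = re.compile(r"(?<![\d,.])\b\d{1,3}\b(?![,.]?\d)")


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0-999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def normalize_for_tts(text: str) -> str:
    """Expand known abbreviations, then spell out small standalone integers."""
    for abbr, full in ABBREVIATIONS.items():
        text = text.replace(abbr, full)
    return SMALL_INT.sub(lambda m: number_to_words(int(m.group())), text)


print(normalize_for_tts("Dr. Lee ordered 3 tests, approx. 120 pages."))
```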
                
            
        

        
        
            
Credit costs

Tier	Cost per 1K characters
Free	0 credits (unlimited)
Standard	2 credits / 1K characters
Premium	4 credits / 1K characters
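As a worked example of the rates in the table, the sketch below estimates the credits a piece of text would consume. The per-tier rates come from the table; rounding up to a whole credit is an assumption, since the page does not say how partial thousands are billed, and the function name `credits_for` is hypothetical.

```python
# Credits per 1,000 characters, taken from the table above.
CREDITS_PER_1K = {"free": 0, "standard": 2, "premium": 4}


def credits_for(text_length: int, tier: str) -> int:
    """Estimate credits for `text_length` characters on the given tier.

    Assumption: fractional credits are rounded up to the next whole credit.
    """
    rate = CREDITS_PER_1K[tier.lower()]
    # Integer ceiling division avoids floating-point rounding issues.
    return (text_length * rate + 999) // 1000


print(credits_for(2500, "standard"))  # 2,500 characters at 2 credits per 1K
print(credits_for(5000, "premium"))   # 5,000 characters at 4 credits per 1K
```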






    
        
            
                
                
                    
                    
No ads
Unlimited usage
Priority access
Access to new features

Get more credits






    
How AI Text-to-Speech works

Three simple steps to generate lifelike speech. No technical knowledge required.

Step 1
Enter your text
Type, paste, or upload the text you want to convert to speech. Signed-in users can submit up to 5,000 characters per generation. Use plain text, or add SSML tags for fine-grained control over pronunciation, pauses, and emphasis.

Step 2
Choose a model and voice
Choose from 20+ AI models across three tiers. Pick a voice that suits your content, select your target language, adjust the playback speed from 0.5x to 2.0x, and choose your preferred output format (MP3, WAV, OGG, or FLAC).

Step 3
Generate and download
Preview the result in the built-in player, download it in your chosen format, or copy a shareable link. Use the API for batch processing and integration into your workflow.
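Text longer than the 5,000-character limit mentioned in Step 1 has to be split before it is submitted, for example when batch processing. The following is one possible sketch, not the service's own logic: it prefers sentence boundaries and falls back to a hard cut only when a single sentence exceeds the limit. The function name `chunk_text` is hypothetical.

```python
import re


def chunk_text(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentences are kept whole where possible; a sentence longer than
    `limit` is cut into fixed-size pieces.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that cannot fit in one chunk.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("One. Two. Three.", limit=10))
```

Each returned chunk can then be submitted as its own generation and the resulting audio files joined afterwards.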
                    
                
            
        
    






    
Text to Speech use cases

AI text-to-speech is changing how people create, consume, and interact with audio content across many industries.

Audiobooks
Convert entire books into natural-sounding audiobooks with expressive narration. Dia provides multi-speaker support for character dialogue.

Video voiceovers
Create professional voiceovers for YouTube, TikTok, Instagram Reels, and Shorts. Choose from 100+ voices, or clone your own.

Podcasts
Generate podcast episodes from scripts with multiple AI voices. Use Dia for natural two-person conversations.

Gaming
AI voice acting for indie games, visual novels, and interactive fiction. NPC dialogue, cutscene voices, 30+ languages.

E-learning
Convert course materials, tutorials, and educational content to audio. Multi-language support for global audiences.

Accessibility
Make websites, documents, and apps accessible. Screen reader integration through the API and text-to-audio conversion.

IVR and phone systems
Power IVR systems, phone menus, and customer service with natural AI voices.

Social media
TikTok narration, Instagram Reels, Twitter/X posts, YouTube Shorts. Fast generation with free models.

Streaming
Twitch TTS alerts, chat-to-speech, AI co-hosts, and Discord bots. Low latency, 100+ voices, compatible with StreamElements.

Marketing
Ad voiceovers, explainer videos, product demos, and sales presentations. Scale audio content production across campaigns.

Dubbing and localization
Translate and dub videos into 30+ languages with AI voice matching.

Meditation & Wellness
Guided meditations, sleep stories, breathing exercises, and affirmations with calm, soothing AI voices.

View all use cases and tools
        
    






    
All text-to-speech models

Detailed specifications for every AI model available on TTS.ai. Compare quality, speed, language support, and features to find the right model for your project.

All (32)
Free (7)
Standard (18)
Premium (7)
        

        
            
            
                
                    
                    
                        
                            
Kokoro (Free)

Kokoro is an 82-million-parameter text-to-speech model that punches above its weight class. Despite its small size, it produces highly natural, expressive speech. Kokoro supports multiple languages, including English, Japanese, Chinese, and Korean, with multiple voice styles. It is extremely fast, generating audio at roughly 100x real-time speed on a GPU.

Developer: Hexgrad
License: Apache 2.0
Speed: Fast
Languages: en, ja, zh, ko, fr, de, it, pt, es, hi, ru
VRAM: 1.5GB
Voice cloning: No
Cost per 1K characters: Free

Features: 82M parameters, very fast, multiple voice styles, multilingual, streaming support

Best for: high-throughput TTS with very low latency, streaming applications
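A claim such as "roughly 100x real-time speed" in the Kokoro card is usually expressed as a speed-up ratio or its inverse, the real-time factor (RTF). The sketch below shows the arithmetic with made-up timings; it does not measure any actual model.

```python
def speedup(audio_seconds: float, synthesis_seconds: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    return audio_seconds / synthesis_seconds


def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Compute time per second of audio; values below 1.0 are faster than real time."""
    return synthesis_seconds / audio_seconds


# Hypothetical timing: 50 s of audio synthesized in 0.5 s of compute.
print(speedup(50.0, 0.5))           # 100.0
print(real_time_factor(50.0, 0.5))  # 0.01
```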
                                
                            
                        
                    
                    
                    
                        
                            
Piper (Free)

Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications that require offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.

Developer: Rhasspy
License: MIT
Speed: Fast
Languages: en, de, fr, es, it, pt, nl, pl, ru, zh, ja, ko, ar, cs, da, fi, el, hu, is, ka, kk, ne, no, ro, sk, sr, sv, sw, tr, uk, vi
VRAM: 0 (CPU only)
Voice cloning: No
Cost per 1K characters: Free

Features: CPU-friendly, offline capable, 100+ voices, 30+ languages, SSML support

Best for: quick previews, accessibility, and embedded applications
                                
                            
                        
                    
                    
                    
                        
                            
VITS (Free)

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is an end-to-end TTS method that generates more natural-sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in quality.

Developer: Jaehyeon Kim et al.
License: MIT
Speed: Fast
Languages: en, zh, ja, ko
VRAM: 1GB
Voice cloning: No
Cost per 1K characters: Free

Features: end-to-end synthesis, natural prosody, fast inference, multi-speaker

Best for: general-purpose text-to-speech with natural prosody
                                
                            
                        
                    
                    
                    
                        
                            
MeloTTS (Free)

MeloTTS by MyShell.ai is a multilingual TTS library that supports English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is very fast, processing text at real-time speed on CPU alone. MeloTTS is built for production use and supports both CPU and GPU.

Developer: MyShell.ai
License: MIT
Speed: Fast
Languages: en, es, fr, zh, ja, ko
VRAM: 0.5GB (GPU optional)
Voice cloning: No
Cost per 1K characters: Free

Features: CPU-optimized, multilingual, multiple accents, production-ready, lightweight

Best for: production applications that need fast, multilingual TTS
                                
                            
                        
                    
                    
                    
                        
                            
Bark (Standard)

Bark by Suno is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio such as music, background noise, and sound effects. It can also produce nonverbal sounds such as laughing, sighing, and crying. Bark supports more than 100 speaker presets and 13+ languages.

Developer: Suno
License: MIT
Speed: Slow
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
VRAM: 5GB
Voice cloning: No
Cost per 1K characters: 2x

Features: sound effects, laughter and sighs, music generation, 100+ speakers, multilingual

Best for: creative audio content, audiobooks with emotion, sound effects
                                
                            
                        
                    
                    
                    
                        
                            
                                Bark Small
                                Standard
                            
                            
                                He putanga iti ake o te tauira Bark ko Bark e whakawhiti ana i ētahi o ngā āhuatanga oro mō ngā tere whakahau tere ake me ngā hiahia pūmahara iti iho, e pupuri ana i te kaha o Bark ki te whakanao i te kōrero me ngā āhuatanga, te māharahara, me ngā reo maha.

                                
                                    
                                        Developer: Suno
                                        License: MIT
                                        Speed: Medium
                                        Quality:
                                        Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
                                        VRAM: 2GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Lightweight
                                        Faster than full Bark
                                        Emotional expression
                                        Multilingual
                                        
                                    
                                
                                

                                
                                Best for: Quick audio generation when full Bark is too slow
                                
                            
                            
                                
                                    Try Bark Small
                                
                            
                        
                    
                    
                    
                        
                            
                                CosyVoice 2
                                Standard
                            
                            
                                CosyVoice 2 from Alibaba's Tongyi Lab achieves human-parity speech quality with very low latency, which makes it well suited to real-time applications. It uses a scalar quantization approach for speech synthesis, and it supports zero-shot voice cloning, cross-lingual voice cloning, and emotion control.

                                
                                    
                                        Developer: Alibaba (Tongyi Lab)
                                        License: Apache 2.0
                                        Speed: Medium
                                        Quality:
                                        Languages: en, zh, ja, ko, fr, de, it, es
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Streaming
                                        Zero-shot cloning
                                        Cross-language
                                        Emotion control
                                        Human-parity
                                        
                                    
                                
                                

                                
                                Best for: Real-time applications, streaming TTS, voice assistants
                                
                            
                            
                                
                                    Try CosyVoice 2
                                
                            
                        
                    
                    
                    
                        
                            
                                Dia TTS
                                Standard
                            
                            
                                Dia by Nari Labs is a 1.6B-parameter text-to-speech model designed for multi-speaker dialogue generation. It can generate natural-sounding conversations between two speakers with appropriate turn-taking and emotional expression. Dia is well suited to creating podcast-style content, audiobook dialogue, and conversational AI.

                                
                                    
                                        Developer: Nari Labs
                                        License: Apache 2.0
                                        Speed: Medium
                                        Quality:
                                        Languages: en
                                        VRAM: 4GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Multi-speaker
                                        Dialogue generation
                                        Natural turn-taking
                                        Emotional expression
                                        1.6B parameters
                                        
                                    
                                
                                

                                
                                Best for: Podcasts, audiobook dialogue, conversational content
                                
                            
                            
                                
                                    Try Dia TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                Parler TTS
                                Standard
                            
                            
                                Parler TTS is a text-to-speech model that uses natural-language voice descriptions to control the generated speech. Rather than choosing from preset voices, you describe the voice you want (for example, "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech that matches the description. This makes it flexible for creative applications.

                                
                                    
                                        Developer: Hugging Face
                                        License: Apache 2.0
                                        Speed: Medium
                                        Quality:
                                        Languages: en
                                        VRAM: 4GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Voice description
                                        Natural-language control
                                        Flexible voice design
                                        No reference audio needed
                                        
                                    
                                
                                

                                
                                Best for: Creative applications that need custom voice characteristics
                                
                            
                            
                                
                                    Try Parler TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                GLM-TTS
                                Standard
                            
                            
                                GLM-TTS from Zhipu AI is a text-to-speech system built on the Llama architecture with flow matching. It achieves the lowest character error rate among open-source TTS models, which means it produces highly accurate speech. GLM-TTS supports English and Chinese, with voice cloning from audio samples of 3-10 seconds.

                                
                                    
                                        Developer: Zhipu AI
                                        License: GLM-4 License
                                        Speed: Medium
                                        Quality:
                                        Languages: en, zh
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Lowest error rate
                                        Voice cloning
                                        Flow matching
                                        
                                        Natural-sounding output
                                        
                                    
                                
                                

                                
                                Best for: Applications that require the highest speech accuracy
                                
                            
                            
                                
                                    Try GLM-TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                IndexTTS-2
                                Standard
                            
                            
                                IndexTTS-2 is an advanced text-to-speech system that excels at zero-shot voice synthesis with zero-shot emotion control. It can generate speech with specific emotions without requiring emotion-specific training data. The model uses emotion vectors to precisely control the emotional expression of the generated speech.

                                
                                    
                                        Developer: Index Team
                                        License: Bilibili Model License
                                        Speed: Medium
                                        Quality:
                                        Languages: en, zh
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Emotion control
                                        Zero-shot
                                        Emotion vectors
                                        Expressive speech
                                        Precise control
                                        
                                    
                                
                                

                                
                                Best for: Expressive content, audiobooks, virtual assistants
                                
                            
                            
                                
                                    Try IndexTTS-2
                                
                            
                        
                    
                    
                    
                        
                            
                                Spark TTS
                                Standard
                            
                            
                                Spark TTS from SparkAudio is a text-to-speech model that combines voice cloning with emotion and style control.

                                
                                    
                                        Developer: SparkAudio
                                        License: CC BY-NC-SA 4.0
                                        Speed: Medium
                                        Quality:
                                        Languages: en, zh
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Voice cloning
                                        Emotion control
                                        Style control
                                        Prompt-based
                                        5-second cloning
                                        
                                    
                                
                                

                                
                                Best for: Content creation with cloned voices and emotion control
                                
                            
                            
                                
                                    Try Spark TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                GPT-SoVITS
                                Standard
                            
                            
                                GPT-SoVITS combines a GPT-style language model with SoVITS (a VITS-based singing voice conversion model) for few-shot voice cloning. With as little as 5 seconds of reference audio, it can accurately clone a voice and generate new speech while preserving the speaker's distinctive characteristics. It performs well at both speech and singing voice synthesis.

                                
                                    
                                        Developer: RVC-Boss
                                        License: MIT
                                        Speed: Slow
                                        Quality:
                                        Languages: en, zh, ja, ko
                                        VRAM: 6GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        5-second cloning
                                        Singing voice
                                        Few-shot learning
                                        High fidelity
                                        Cross-language
                                        
                                    
                                
                                

                                
                                Best for: Voice cloning, singing synthesis, voice replication for content creators
                                
                            
                            
                                
                                    Try GPT-SoVITS
                                
                            
                        
                    
                    
                    
                        
                            
                                Orpheus
                                Standard
                            
                            
                                Orpheus is a large text-to-speech model that achieves human-level emotional expression. Trained on more than 100,000 hours of speech data, it excels at generating speech with natural emotion, intonation, and speaking styles. Orpheus can produce speech that is difficult to distinguish from human recordings.

                                
                                    
                                        Developer: Canopy Labs
                                        License: Llama 3.2 Community
                                        Speed: Medium
                                        Quality:
                                        Languages: en
                                        VRAM: 4GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Human-level emotion
                                        100K hours of training data
                                        Natural intonation
                                        Expressive speech
                                        
                                    
                                
                                

                                
                                Best for: Highly expressive speech, audiobooks, voice acting
                                
                            
                            
                                
                                    Try Orpheus
                                
                            
                        
                    
                    
                    
                        
                            
                                Chatterbox
                                Premium
                            
                            
                                Chatterbox from Resemble AI is a zero-shot voice cloning model. It can clone a voice from a single audio sample with high accuracy, capturing not only the timbre but also the speaking style and emotional characteristics.

                                
                                    
                                        Developer: Resemble AI
                                        License: MIT
                                        Speed: Medium
                                        Quality:
                                        Languages: en
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 4x
                                    
                                

                                
                                
                                    
                                        
                                        Zero-shot cloning
                                        Emotion control
                                        High fidelity
                                        Style transfer
                                        Single-sample cloning
                                        
                                    
                                
                                

                                
                                Best for: Professional voice cloning with emotion control, content creation
                                
                            
                            
                                
                                    Try Chatterbox
                                
                            
                        
                    
                    
                    
                        
                            
                                Tortoise TTS
                                Premium
                            
                            
                                Tortoise TTS is a multi-voice text-to-speech system that prioritizes voice quality over speed. It uses a DALL-E-inspired architecture to generate highly natural speech with strong prosody and speaker similarity. Although slower than many alternatives, Tortoise produces some of the most realistic speech available in open source.

                                
                                    
                                        Developer:

                                        James Betker
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Slow
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        8GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        4x
                                    
                                

                                
                                
                                    
                                        
                                        Highest quality
                                        
                                        Multi-voice
                                        
                                        DALL-E-style architecture
                                        
                                        Voice cloning
                                        
                                        Voice customization
                                        
                                    
                                
                                

                                
                                Best for: 
                                Audiobooks, premium content, quality-critical applications
                                
                            
                            
                                
                                    Try Tortoise TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                StyleTTS 2
                                Premium
                            
                            
                                StyleTTS 2 achieves human-level TTS synthesis by combining style diffusion with adversarial training that uses large speech language models. It produces highly natural speech on single-speaker datasets, rivaling human recordings. StyleTTS 2 models style as a latent variable through diffusion, which lets it capture the full range of variation in human speech.

                                
                                    
                                        Developer:

                                        Columbia University
                                    
                                    
                                        License:

                                        MIT
                                    
                                    
                                        Speed:

                                        Medium
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        4GB
                                    
                                    
                                        Voice cloning:

                                        No
                                    
                                    
                                        Cost per 1K characters:

                                        4x
                                    
                                

                                
                                
                                    
                                        
                                        Human-level
                                        
                                        Style diffusion
                                        
                                        Adversarial training
                                        
                                        Natural variation
                                        
                                        High fidelity
                                        
                                    
                                
                                

                                
                                Best for: 
                                High-quality single-speaker synthesis, professional narration
                                
                            
                            
                                
                                    Try StyleTTS 2
                                
                            
                        
                    
                    
                    
                        
                            
                                OpenVoice
                                Premium
                            
                            
                                OpenVoice by MyShell.ai enables instant voice cloning with granular control over voice style, including emotion, accent, rhythm, pauses, and intonation. It can clone a voice from a short audio clip and generate speech in multiple languages while preserving the speaker's identity. OpenVoice also works as a voice converter, enabling real-time voice conversion.

                                
                                    
                                        Developer:

                                        MyShell.ai / MIT
                                    
                                    
                                        License:

                                        MIT
                                    
                                    
                                        Speed:

                                        Medium
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en, zh, ja, ko, fr, de, es, it
                                    
                                    
                                        VRAM:

                                        4GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        4x
                                    
                                

                                
                                
                                    
                                        
                                        Instant cloning
                                        
                                        Voice conversion
                                        
                                        Emotion control
                                        
                                        Style control
                                        
                                        Multilingual
                                        
                                    
                                
                                

                                
                                Best for: 
                                Voice cloning with fine-grained style control, voice conversion
                                
                            
                            
                                
                                    Try OpenVoice
                                
                            
                        
                    
                    
                    
                        
                            
                                Qwen3 TTS
                                Standard
                            
                            
                                Qwen3-TTS is a 1.7-billion-parameter text-to-speech model from Alibaba's Qwen team. It supports three modes: preset voices with emotion control (9 speakers), voice cloning from as little as 3 seconds of audio, and a voice design mode in which you describe the voice you want in natural language.

                                
                                    
                                        Developer:

                                        Alibaba (Qwen)
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Medium
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en, zh, ja, ko, de, fr, ru, pt, es, it
                                    
                                    
                                        VRAM:

                                        7GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        2x
                                    
                                

                                
                                
                                    
                                        
                                        Voice cloning
                                        
                                        9 preset voices
                                        
                                        Voice design from text
                                        
                                        Emotion control
                                        
                                        Multilingual
                                        
                                    
                                
                                

                                
                                Best for: 
                                Multilingual content with voice cloning or voice design
                                
                            
                            
                                
                                    Try Qwen3 TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                Sesame CSM
                                Premium
                            
                            
                                Sesame CSM (Conversational Speech Model) is a 1-billion-parameter model built to generate conversational speech. It models the natural patterns of human conversation, including turn-taking timing, backchannel responses, emotional reactions, and conversational flow.

                                
                                    
                                        Developer:

                                        Sesame
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Slow
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        8GB
                                    
                                    
                                        Voice cloning:

                                        No
                                    
                                    
                                        Cost per 1K characters:

                                        4x
                                    
                                

                                
                                
                                    
                                        
                                        Conversational
                                        
                                        Natural timing
                                        
                                        Turn-taking
                                        
                                        Backchanneling
                                        
                                        1B parameters
                                        
                                    
                                
                                

                                
                                Best for: 
                                AI assistants, chatbots, conversational AI applications
                                
                            
                            
                                
                                    Try Sesame CSM
                                
                            
                        
                    
                    
                    
                        
                            
                                Chatterbox Turbo
                                Standard
                            
                            
                                Chatterbox Turbo by Resemble AI is a 350M-parameter upgrade to Chatterbox, delivering up to 6x real-time speed and sub-200ms latency. It supports paralinguistic tags such as [laugh], [cough], and [chuckle] in the input text. Perth watermarking is embedded in all generated audio for provenance tracking.

                                
                                    
                                        Developer:

                                        Resemble AI
                                    
                                    
                                        License:

                                        MIT
                                    
                                    
                                        Speed:

                                        Fast
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        2GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        2x
                                    
                                

                                
                                
                                    
                                        
                                        Sub-200ms latency
                                        
                                        Paralinguistic tags
                                        
                                        6x real-time
                                        
                                        Voice cloning
                                        
                                        Watermarking
                                        
                                    
                                
                                

                                
                                Best for: 
                                Real-time voice agents, expressive narration with natural sounds
                                
                            
                            
                                
                                    Try Chatterbox Turbo
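To make the inline tag syntax on this card concrete, here is a minimal Python sketch that separates spoken text from bracketed tags such as [laugh] before the text is sent to a model. It is not part of Chatterbox Turbo or any Resemble AI API. The function name `split_paralinguistic` and the rule that a tag is a lowercase word in square brackets are assumptions made for illustration.

```python
import re

# Assumption: a tag is a lowercase word in square brackets, matching the
# examples on this card ([laugh], [cough], [chuckle]).
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def split_paralinguistic(text):
    """Split text into ("speech", str) and ("tag", str) segments, in order."""
    segments = []
    pos = 0
    for match in TAG_PATTERN.finditer(text):
        spoken = text[pos:match.start()].strip()
        if spoken:
            segments.append(("speech", spoken))
        segments.append(("tag", match.group(1)))
        pos = match.end()
    # Keep any spoken text that follows the last tag.
    tail = text[pos:].strip()
    if tail:
        segments.append(("speech", tail))
    return segments


if __name__ == "__main__":
    for kind, value in split_paralinguistic("That was close [laugh] let's try again."):
        print(kind, value)
```

A pre-pass like this can be handy for checking that a script only uses tags a given model is documented to support.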
                                
                            
                        
                    
                    
                    
                        
                            
                                Zonos
                                Standard
                            
                            
                                Zonos v0.1 by Zyphra is a 1.6B-parameter model with fine-grained emotion control, including sliders for emotions such as happiness, sadness, fear, and anger. It is offered as both a Transformer and a novel SSM (state-space model) hybrid variant. It was trained on 200K+ hours of multilingual speech and supports zero-shot voice cloning from 10-30 seconds of reference audio.

                                
                                    
                                        Developer:

                                        Zyphra
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Medium
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en, ja, zh, fr, de
                                    
                                    
                                        VRAM:

                                        6GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        2x
                                    
                                

                                
                                
                                    
                                        
                                        Emotion control
                                        
                                        Voice cloning
                                        
                                        SSM architecture
                                        
                                        Multilingual
                                        
                                        Pitch/rate control
                                        
                                    
                                
                                

                                
                                Best for: 
                                Expressive speech with emotion control, voice design studios
                                
                            
                            
                                
                                    Try Zonos
                                
                            
                        
                    
                    
                    
                        
                            
                                Dia 2
                                Standard
                            
                            
                                Dia2 by Nari Labs is a streaming upgrade to Dia, available in 1B and 2B parameter sizes. It starts generating audio from the first few tokens, which makes it well suited to real-time voice agents and speech-to-speech pipelines. It supports multi-speaker dialogue with [S1] / [S2] tags and paralinguistic cues such as (laughs) and (coughs).

                                
                                    
                                        Developer:

                                        Nari Labs
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Fast
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        4GB
                                    
                                    
                                        Voice cloning:

                                        No
                                    
                                    
                                        Cost per 1K characters:

                                        2x
                                    
                                

                                
                                
                                    
                                        
                                        Streaming output
                                        
                                        Multi-speaker
                                        
                                        Low latency
                                        
                                        Nonverbal tags
                                        
                                        Up to 2 min output
                                        
                                    
                                
                                

                                
                                Best for: 
                                Real-time voice agents, dialogue generation, conversational applications
                                
                            
                            
                                
                                    Try Dia 2
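The [S1] / [S2] convention on this card can be illustrated with a small Python sketch that splits a tagged script into speaker turns, for example to preview or validate a dialogue before synthesis. This is not Dia2's own parser. The helper name `parse_dialogue` is hypothetical, and the sketch assumes only the two tags shown on this page.

```python
import re

# Assumption: only the two speaker tags shown on this page are used.
SPEAKER_TAG = re.compile(r"\[(S[12])\]")


def parse_dialogue(script):
    """Return a list of (speaker, text) turns from an [S1]/[S2]-tagged script."""
    # With one capture group, re.split yields:
    # [text before first tag, tag, text, tag, text, ...]
    parts = SPEAKER_TAG.split(script)
    turns = []
    for i in range(1, len(parts) - 1, 2):
        text = parts[i + 1].strip()
        if text:
            turns.append((parts[i], text))
    return turns


if __name__ == "__main__":
    for speaker, text in parse_dialogue("[S1] Hello! [S2] Hi, how are you? (laughs)"):
        print(speaker, "->", text)
```

Parenthesized cues such as (laughs) are left inside the turn text here, on the assumption that the model consumes them as written.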
                                
                            
                        
                    
                    
                    
                        
                            
                                VoxCPM
                                Standard
                            
                            
                                VoxCPM 1.5 by OpenBMB is a tokenizer-free TTS model that operates in continuous space rather than on discrete tokens. It outputs 44.1kHz audio, supports zero-shot voice cloning from 3-10 seconds of audio, and maintains consistency across segments. Cloning also works across languages: an English voice can be carried over to Chinese speech, and vice versa.

                                
                                    
                                        Developer:

                                        OpenBMB
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Fast
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en, zh
                                    
                                    
                                        VRAM:

                                        4GB
                                    
                                    
                                        Voice cloning:

                                        Yes
                                    
                                    
                                        Cost per 1K characters:

                                        2x
                                    
                                

                                
                                
                                    
                                        
                                        44.1kHz audio
                                        
                                        Tokenizer-free
                                        
                                        Cross-language cloning
                                        
                                        Context-aware
                                        
                                        LoRA fine-tuning
                                        
                                    
                                
                                

                                
                                Best for: 
                                High-fidelity audio, audiobooks, long-form content with voice consistency
                                
                            
                            
                                
                                    Try VoxCPM
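Since cloning on this card is described as working from 3-10 seconds of audio, it can help to check a reference clip's length before uploading it. The Python sketch below does that for WAV data using only the standard library. The function names and the idea of enforcing the range as a hard limit are assumptions for illustration, not VoxCPM behavior.

```python
import io
import wave


def wav_duration_seconds(wav_bytes):
    """Return the duration of in-memory WAV data in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_usable_reference(wav_bytes, min_s=3.0, max_s=10.0):
    """Check that a clip falls inside the 3-10 second range quoted on the card."""
    return min_s <= wav_duration_seconds(wav_bytes) <= max_s


def make_silence(seconds, rate=16000):
    """Build a mono 16-bit silent WAV clip, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


if __name__ == "__main__":
    print(is_usable_reference(make_silence(5)))
    print(is_usable_reference(make_silence(1)))
```

The same check can be reused with a different range for other cards, for example the 10-30 seconds quoted for Zonos.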
                                
                            
                        
                    
                    
                    
                        
                            
                                OuteTTS
                                Free
                            
                            
                                OuteTTS extends large language models with text-to-speech capabilities while preserving the original architecture. It supports multiple backends, including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, vLLM, and in-browser inference via Transformers.js. It features zero-shot voice cloning through speaker profiles saved as JSON.

                                
                                    
                                        Developer:

                                        OuteAI
                                    
                                    
                                        License:

                                        Apache 2.0
                                    
                                    
                                        Speed:

                                        Fast
                                    
                                    
                                        Quality:

                                        
                                    
                                    
                                        Languages:

                                        en
                                    
                                    
                                        VRAM:

                                        2GB
                                    
                                    
                                        Ko te tāruatanga reo:

                                         He
                                    
                                    
                                        Ko te utu mō ia pūāhua 1K:

                                        Waihoki
                                    
                                

                                
                                
                                    
                                        
                                        CPU inference
                                        Browser inference
                                        Voice cloning
                                        Multiple backends
                                        Saved speaker profiles
                                        
                                    
                                
                                

                                
                                Best for: Edge deployment, browser-based TTS, low-resource environments
                                
                            
                            
                                
                                    Try OuteTTS
                                
                            
                        
                    
                    
                    
                        
                            
                                TADA
                                Standard
                            
                            
                                TADA (Text-Acoustic Dual Alignment) by Hume AI is a TTS model that eliminates hallucinations through a novel text-acoustic alignment design built on Llama 3.2. It is available in 1B (English) and 3B (multilingual) variants and reaches a real-time factor (RTF) of 0.09, which is 5x faster than other LLM-based TTS models. It supports 700 seconds of audio context and produces expressive speech.
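
To make the RTF figure concrete, here is a minimal sketch of the arithmetic. It assumes the usual definition of real-time factor (generation time divided by the duration of the audio produced); the function names are illustrative and not part of any TADA API, and actual timings depend on hardware.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_time(rtf: float, audio_seconds: float) -> float:
    """Estimated wall-clock seconds needed to generate audio_seconds of speech."""
    return rtf * audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than playback speed the model generates audio."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


if __name__ == "__main__":
    rtf = 0.09  # figure quoted in the card above
    print(f"60 s of audio takes about {synthesis_time(rtf, 60):.1f} s to generate")
    print(f"roughly {speedup_over_real_time(rtf):.1f}x faster than real time")
```

An RTF below 1.0 means audio is generated faster than it plays back, which is the threshold that matters for streaming use.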

                                
                                    
                                        Developer: Hume AI
                                        License: MIT
                                        Speed: Fast
                                        Languages: en
                                        VRAM: 5GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Hallucination-free
                                        5x faster than LLM TTS
                                        Emotional expression
                                        700s audio context
                                        Dual alignment
                                        
                                    
                                
                                

                                
                                Best for:
                                High-quality hallucination-free speech, emotional expression, fast inference
                                
                            
                            
                                
                                    Try TADA
                                
                            
                        
                    
                    
                    
                        
                            
                                VibeVoice
                                Standard
                            
                            
                                Microsoft's VibeVoice comes in two variants: a 1.5B model for long-form content (up to about 90 minutes, 4 speakers) and a 0.5B real-time model for streaming with ~200ms first-audio latency. The 1.5B variant excels at podcasts and audiobooks, keeping each speaker consistent across long conversations. Note: Microsoft removed the TTS code from the repository, and generated audio includes AI disclaimers.

                                
                                    
                                        Developer: Microsoft
                                        License: MIT
                                        Speed: Fast
                                        Languages: en, zh
                                        VRAM: 4GB
                                        Voice cloning: No
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Multi-speaker
                                        Up to 90 min
                                        Podcast generation
                                        Speaker consistency
                                        200ms latency
                                        
                                    
                                
                                

                                
                                Best for: Podcasts, audiobooks, long-form multi-speaker content
                                
                            
                            
                                
                                    Try VibeVoice
                                
                            
                        
                    
                    
                    
                        
                            
                                Pocket TTS
                                Free
                            
                            
                                Pocket TTS by Kyutai (creators of Moshi) is a compact 100M-parameter text-to-speech model that performs well above what its size suggests. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it well suited to edge deployment and low-resource environments.

                                
                                    
                                        Developer: Kyutai
                                        License: MIT
                                        Speed: Fast
                                        Languages: en, fr
                                        VRAM: 1GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: Free
                                    
                                

                                
                                
                                    
                                        
                                        100M parameters
                                        CPU inference
                                        Voice cloning
                                        Single-sample cloning
                                        Edge-friendly
                                        
                                    
                                
                                

                                
                                Best for: Lightweight deployment, CPU-only environments, quick voice cloning
                                
                            
                            
                                
                                    Try Pocket TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                Kitten TTS
                                Free
                            
                            
                                Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. Features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. Ideal for edge deployment and low-latency applications.

                                
                                    
                                        Developer: KittenML
                                        License: Apache 2.0
                                        Speed: Fast
                                        Languages: en
                                        VRAM: 0GB
                                        Voice cloning: No
                                        Cost per 1K characters: Free
                                    
                                

                                
                                
                                    
                                        
                                        CPU-only inference
                                        
                                        Under 80MB model size
                                        
                                        8 built-in voices
                                        
                                        Speed control
                                        
                                        ONNX-based
                                        
                                        24kHz output
                                        
                                    
                                
                                

                                
                                Best for:
                                Fast lightweight TTS, edge deployment, low-latency applications
                                
                            
                            
                                
                                    Try Kitten TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                CosyVoice3
                                Standard
                            
                            
                                CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion/speed/volume, and improved speaker similarity for zero-shot cloning. Supports 9 languages plus 18 Chinese dialects. RL-tuned variant delivers state-of-the-art prosody.

                                
                                    
                                        Developer: Alibaba (FunAudioLLM)
                                        License: Apache 2.0
                                        Speed: Fast
                                        Languages: en, zh, ja, ko, de, es, fr, it, ru
                                        VRAM: 4GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 2x
                                    
                                

                                
                                
                                    
                                        
                                        Bi-streaming
                                        
                                        Emotion control
                                        
                                        Voice cloning
                                        
                                        Speed/volume control
                                        
                                        Instruction following
                                        
                                    
                                
                                

                                
                                Best for:
                                Multilingual production TTS, real-time applications, voice cloning
                                
                            
                            
                                
                                    Try CosyVoice3
                                
                            
                        
                    
                    
                    
                        
                            
                                MOSS-TTS
                                Premium
                            
                            
                                MOSS-TTS from OpenMOSS supports generation of up to 1 hour of continuous speech across 20 languages. Features token-level duration control, phoneme-level pronunciation control via IPA/Pinyin, and code-switching between languages. The 8B production model delivers state-of-the-art quality with zero-shot voice cloning from reference audio.

                                
                                    
                                        Developer: OpenMOSS
                                        License: Apache 2.0
                                        Speed: Medium
                                        Languages: en, zh, de, es, fr, ja, it, hu, ko, ru, fa, ar, pl, pt, cs, da, sv, el, tr
                                        VRAM: 16GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 4x
                                    
                                

                                
                                
                                    
                                        
                                        Ultra-long generation
                                        
                                        20 languages
                                        
                                        Voice cloning
                                        
                                        Duration control
                                        
                                        Pronunciation control
                                        
                                        Code-switching
                                        
                                    
                                
                                

                                
                                Best for:
                                Audiobooks, long-form content, multilingual production
                                
                            
                            
                                
                                    Try MOSS-TTS
                                
                            
                        
                    
                    
                    
                        
                            
                                MegaTTS3
                                Premium
                            
                            
                                MegaTTS3 from ByteDance uses a novel sparse alignment mechanism combined with a latent diffusion transformer. Features adjustable trade-off between speech intelligibility and speaker similarity for zero-shot voice cloning.

                                
                                    
                                        Developer: ByteDance
                                        License: Apache 2.0
                                        Speed: Slow
                                        Languages: en, zh
                                        VRAM: 8GB
                                        Voice cloning: Yes
                                        Cost per 1K characters: 4x
                                    
                                

                                
                                
                                    
                                        
                                        Voice cloning
                                        
                                        Adjustable similarity
                                        
                                        Cross-lingual
                                        
                                    
                                
                                

                                
                                Best for:
                                High-fidelity voice cloning
                                
                            
                            
                                
                                    Try MegaTTS3
                                
                            
                        
                    
                    
                
            

            
            
                
                    
                    
                        
                            
                                Kokoro
                                Free
                            
                            
                                Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
                                
                                    Developer: Hexgrad
                                    License: Apache 2.0
                                    Speed: Fast
                                    Languages: en, ja, zh, ko, fr, de, it, pt, es, hi, ru
                                Best for: High-quality TTS with minimal latency, streaming applications
                            
                            
                                Try free
                            
                        
                    
                    
                    
                        
                            
                                Piper
                                Free
                            
                            
                                Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
                                
                                    Developer: Rhasspy
                                    License: MIT
                                    Speed: Fast
                                    Languages: en, de, fr, es, it, pt, nl, pl, ru, zh, ja, ko, ar, cs, da, fi, el, hu, is, ka, kk, ne, no, ro, sk, sr, sv, sw, tr, uk, vi
                                Best for: Quick previews, accessibility, and embedded applications
                            
                            
                                Try free
                            
                        
                    
                    
                    
                        
                            
                                VITS
                                Free
                            
                            
                                VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
                                
                                    Developer: Jaehyeon Kim et al.
                                    License: MIT
                                    Speed: Fast
                                    Languages: en, zh, ja, ko
                                Best for: General-purpose text-to-speech with natural prosody
                            
                            
                                Try free
                            
                        
                    
                    
                    
                        
                            
                                MeloTTS
                                Free
                            
                            
                                MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
                                
                                    Developer: MyShell.ai
                                    License: MIT
                                    Speed: Fast
                                    Languages: en, es, fr, zh, ja, ko
                                Best for: Production applications needing fast, multilingual TTS
                            
                            
                                Try free
                            
                        
                    
                    
                    
                        
                            
                                OuteTTS
                                Free
                            
                            
                                OuteTTS extends large language models with text-to-speech capabilities while preserving the original architecture. It supports multiple backends including llama.cpp (CPU/GPU), Hugging Face Transformers, ExLlamaV2, VLLM, and even browser inference via Transformers.js. Features zero-shot voice cloning through speaker profiles saved as JSON.
                                
                                    Developer: OuteAI
                                    License: Apache 2.0
                                    Speed: Fast
                                    Languages: en
                                Best for: Edge deployment, browser-based TTS, low-resource environments
                            
                            
                                Try free
                            
                        
                    
                    
                    
                        
                            
                                Pocket TTS
                                Free
                            
                            
                                Pocket TTS by Kyutai (creators of Moshi) is a compact 100M parameter text-to-speech model that punches well above its weight. It runs efficiently on CPU, supports zero-shot voice cloning from a single audio sample, and produces natural-sounding speech. The small model size makes it ideal for edge deployment and low-resource environments.
                                
Developer: Kyutai
License: MIT
Speed: Fast
Languages: en, fr
Best for: Lightweight deployment, CPU-only environments, quick voice cloning

Try free
                            
                        
                    
                    
                    
                        
                            
                                Kitten TTS
                                Free
                            
                            
                                Kitten TTS by KittenML is an ultra-lightweight text-to-speech model built on ONNX. With variants from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU. Features 8 built-in voices, adjustable speech speed, and built-in text preprocessing for numbers, currencies, and units. Ideal for edge deployment and low-latency applications.
                                
Developer: KittenML
License: Apache 2.0
Speed: Fast
Languages: en
Best for: Fast lightweight TTS, edge deployment, low-latency applications

Try free
                            
                        
                    
                    
                
            

            
            
                
                    
                    
                        
                            
                                Bark
                                Standard
                            
                            
                                Bark by Suno is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio like music, background noise, and sound effects. It can produce nonverbal communications like laughing, sighing, and crying. Bark supports over 100 speaker presets and 13+ languages.
                                
Developer: Suno
License: MIT
Speed: Slow
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
Voice cloning: No

Sound effects, Laughing/sighing, Music generation, 100+ speakers, Multilingual
Best for: Creative audio content, audiobooks with emotion, sound effects

Try Bark
                            
                        
                    
                    
                    
                        
                            
                                Bark Small
                                Standard
                            
                            
                                Bark Small is a distilled version of the Bark model that trades some audio quality for significantly faster inference speeds and lower memory requirements. It retains Bark's ability to generate speech with emotions, laughter, and multiple languages.
                                
Developer: Suno
License: MIT
Speed: Medium
Languages: en, zh, fr, de, hi, it, ja, ko, pl, pt, ru, es, tr
Voice cloning: No

Lightweight, Faster than full Bark, Emotional speech, Multilingual
Best for: Quick creative audio when full Bark is too slow

Try Bark Small
                            
                        
                    
                    
                    
                        
                            
                                CosyVoice 2
                                Standard
                            
                            
                                CosyVoice 2 by Alibaba's Tongyi Lab achieves human-comparable speech quality with extremely low latency, making it ideal for real-time applications. It uses a finite scalar quantization approach for streaming synthesis and supports zero-shot voice cloning, cross-lingual synthesis, and fine-grained emotion control. It outperforms many commercial TTS systems in subjective evaluations.
                                
Developer: Alibaba (Tongyi Lab)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, fr, de, it, es
Voice cloning: Yes

Streaming, Zero-shot cloning, Cross-lingual, Emotion control, Human-parity
Best for: Real-time applications, streaming TTS, voice assistants

Try CosyVoice 2
                            
                        
                    
                    
                    
                        
                            
                                Dia TTS
                                Standard
                            
                            
                                Dia by Nari Labs is a 1.6B parameter text-to-speech model designed specifically for generating multi-speaker dialogue. It can produce natural-sounding conversations between two speakers with appropriate turn-taking, prosody, and emotional expression. Dia is perfect for creating podcast-style content, audiobook dialogues, and interactive conversational AI.
                                
Developer: Nari Labs
License: Apache 2.0
Speed: Medium
Languages: en
Voice cloning: No

Multi-speaker, Dialog generation, Natural turn-taking, Emotional expression, 1.6B parameters
Best for: Podcasts, audiobook dialogues, conversational content

Try Dia TTS
                            
                        
                    
                    
                    
                        
                            
                                Parler TTS
                                Standard
                            
                            
                                Parler TTS is a text-to-speech model that uses natural language voice descriptions to control the generated speech. Instead of selecting from preset voices, you describe the voice you want (e.g., "a warm female voice with a slight British accent, speaking slowly and clearly") and Parler generates speech matching that description. This makes it uniquely flexible for creative applications.
                                
Developer: Hugging Face
License: Apache 2.0
Speed: Medium
Languages: en
Voice cloning: No

Voice description, Natural language control, Flexible voice creation, No preset voices needed
Best for: Creative applications where you need custom voice characteristics

Try Parler TTS
                            
                        
                    
                    
                    
                        
                            
                                GLM-TTS
                                Standard
                            
                            
                                GLM-TTS by Zhipu AI is a text-to-speech system built on the Llama architecture with flow matching. It achieves the lowest character error rate among open-source TTS models, meaning it produces the most accurate pronunciation. GLM-TTS supports English and Chinese with voice cloning from 3-10 second audio samples.
                                
Developer: Zhipu AI
License: GLM-4 License
Speed: Medium
Languages: en, zh
Voice cloning: Yes

Lowest error rate, Voice cloning, Flow matching, Natural prosody
Best for: Applications requiring maximum pronunciation accuracy

Try GLM-TTS
                            
                        
                    
                    
                    
                        
                            
                                IndexTTS-2
                                Standard
                            
                            
                                IndexTTS-2 is an advanced text-to-speech system that excels at zero-shot voice synthesis with fine-grained emotion control. It can generate speech with specific emotional tones like happy, sad, angry, or fearful without requiring emotion-specific training data. The model uses emotion vectors to precisely control the emotional expression of generated speech.
                                
Developer: Index Team
License: Bilibili Model License
Speed: Medium
Languages: en, zh
Voice cloning: Yes

Emotion control, Zero-shot, Emotion vectors, Expressive speech, Fine-grained control
Best for: Emotionally expressive content, audiobooks, virtual assistants

Try IndexTTS-2
                            
                        
                    
                    
                    
                        
                            
                                Spark TTS
                                Standard
                            
                            
                                Spark TTS by SparkAudio is a text-to-speech model that combines voice cloning with controllable emotion and speaking style. Using just 5 seconds of reference audio, it can clone a voice and then generate speech with different emotions, speeds, and styles while maintaining the cloned voice identity. Spark TTS uses a prompt-based control system.
                                
Developer: SparkAudio
License: CC BY-NC-SA 4.0
Speed: Medium
Languages: en, zh
Voice cloning: Yes

Voice cloning, Emotion control, Style control, Prompt-based, 5-second cloning
Best for: Content creation with cloned voices and emotional control

Try Spark TTS
                            
                        
                    
                    
                    
                        
                            
                                GPT-SoVITS
                                Standard
                            
                            
                                GPT-SoVITS combines GPT-style language modeling with SoVITS (SoftVC VITS, a singing voice conversion model) for few-shot voice cloning. With as little as 5 seconds of reference audio, it can clone a voice and generate new speech while preserving the speaker's unique characteristics. It handles both speaking and singing voice synthesis.
                                
Developer: RVC-Boss
License: MIT
Speed: Slow
Languages: en, zh, ja, ko
Voice cloning: Yes

5-second cloning, Singing voice, Few-shot learning, High fidelity, Cross-lingual
Best for: Voice cloning, singing synthesis, content creator voice replication

Try GPT-SoVITS
                            
                        
                    
                    
                    
                        
                            
                                Orpheus
                                Standard
                            
                            
                                Orpheus is a large-scale text-to-speech model that achieves human-level emotional expression. Trained on over 100,000 hours of diverse speech data, it excels at generating speech with natural emotions, emphasis, and speaking styles. Orpheus can produce speech that is virtually indistinguishable from human recordings.
                                
Developer: Canopy Labs
License: Llama 3.2 Community
Speed: Medium
Languages: en
Voice cloning: No

Human-level emotion, 100K hours training, Natural emphasis, Expressive speech
Best for: High-quality emotional speech, audiobooks, voice acting

Try Orpheus
                            
                        
                    
                    
                    
                        
                            
                                Qwen3 TTS
                                Standard
                            
                            
                                Qwen3-TTS is a 1.7 billion parameter text-to-speech model from Alibaba's Qwen team. It supports three modes: preset voices with emotion control (9 speakers), voice cloning from just 3 seconds of audio, and a unique voice design mode where you describe the voice you want in natural language. It covers 10 languages with high expressiveness and natural prosody.
                                
Developer: Alibaba (Qwen)
License: Apache 2.0
Speed: Medium
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Voice cloning: Yes

Voice cloning, 9 preset voices, Voice design from text, Emotion control, 10 languages
Best for: Multilingual content with voice cloning or custom voice design

Try Qwen3 TTS
                            
                        
                    
                    
                    
                        
                            
                                Chatterbox Turbo
                                Standard
                            
                            
                                Chatterbox Turbo by Resemble AI is a 350M parameter upgrade to Chatterbox, delivering up to 6x real-time speed with sub-200ms latency. It supports paralinguistic tags such as [laugh], [cough], and [chuckle] directly in the input text. All generated audio carries a Perth watermark for provenance tracking.
                                
Developer: Resemble AI
License: MIT
Speed: Fast
Languages: en
Voice cloning: Yes

Sub-200ms latency, Paralinguistic tags, 6x real-time, Voice cloning, Watermarking
Best for: Real-time voice agents, expressive speech with natural sounds

Try Chatterbox Turbo
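
Because paralinguistic tags are written inline, it can help to check a script for unrecognized tags before sending it for synthesis. The sketch below is illustrative only: `KNOWN_TAGS` contains just the three tags named in the description above (the model may accept more), and `find_unknown_tags` is a hypothetical helper, not part of any Resemble AI API.

```python
import re

# Assumption: only the tags named in the description; the real list may be longer.
KNOWN_TAGS = {"laugh", "cough", "chuckle"}

# Matches lowercase bracketed tags such as [laugh]; speaker tags like [S1] do not match.
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def find_unknown_tags(text):
    """Return bracketed tags in `text` that are not in KNOWN_TAGS, in order of appearance."""
    return [tag for tag in TAG_PATTERN.findall(text) if tag not in KNOWN_TAGS]


script = "That was great [laugh] but the ending [sneeze] surprised me."
print(find_unknown_tags(script))
```

An empty result means every tag in the script is one of the known ones; anything returned is worth checking against the model's own documentation.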
                            
                        
                    
                    
                    
                        
                            
                                Zonos
                                Standard
                            
                            
                                Zonos v0.1 by Zyphra is a 1.6B parameter model featuring fine-grained emotion control with sliders for happiness, anger, sadness, fear, and surprise. It is offered in both a Transformer variant and an SSM (state-space model) variant. It was trained on 200K+ hours of multilingual speech and supports zero-shot voice cloning from 10-30 seconds of reference audio.
                                
Developer: Zyphra
License: Apache 2.0
Speed: Medium
Languages: en, ja, zh, fr, de
Voice cloning: Yes

Emotion control, Voice cloning, SSM architecture, Multilingual, Pitch/rate control
Best for: Expressive speech with emotion control, voice design studio

Try Zonos
                            
                        
                    
                    
                    
                        
                            
                                Dia 2
                                Standard
                            
                            
                                Dia2 by Nari Labs is a streaming-first upgrade to Dia, available in 1B and 2B parameter variants. It begins synthesizing audio from the first few tokens, making it well suited to real-time voice agents and speech-to-speech pipelines. It supports multi-speaker dialogue with [S1]/[S2] tags and paralinguistic cues such as (laughs) and (coughs).
                                
Developer: Nari Labs
License: Apache 2.0
Speed: Fast
Languages: en
Voice cloning: No

Streaming output, Multi-speaker, Low latency, Paralinguistic cues, Up to 2 min output
Best for: Real-time voice agents, dialogue generation, streaming applications

Try Dia 2
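
As a rough illustration of the [S1]/[S2] convention, the sketch below turns a list of speaker turns into a single tagged script. `format_dialogue` is a hypothetical helper written for this page, not a function from the Dia2 codebase; it only builds the input text and performs no synthesis.

```python
def format_dialogue(turns):
    """Build a tagged script from (speaker, text) pairs.

    `speaker` must be 1 or 2, matching the [S1]/[S2] tags described above.
    Paralinguistic cues such as (laughs) are passed through unchanged.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError(f"speaker must be 1 or 2, got {speaker!r}")
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


script = format_dialogue([
    (1, "Hello!"),
    (2, "Hi, how are you? (laughs)"),
])
print(script)
```

Keeping turns as structured pairs until the last step makes it easier to reorder or edit a conversation before the tagged text is generated.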
                            
                        
                    
                    
                    
                        
                            
                                VoxCPM
                                Standard
                            
                            
                                VoxCPM 1.5 by OpenBMB is a novel tokenizer-free TTS model that operates in continuous space rather than discrete tokens. It produces high-fidelity 44.1kHz audio, supports zero-shot voice cloning from 3-10 seconds, and maintains consistency across paragraphs. Cross-language cloning lets you apply an English voice to Chinese speech and vice versa.
                                
Developer: OpenBMB
License: Apache 2.0
Speed: Fast
Languages: en, zh
Voice cloning: Yes

44.1kHz audio, Tokenizer-free, Cross-lingual cloning, Context-aware, LoRA fine-tuning
Best for: High-fidelity audio, audiobooks, long-form content with voice consistency

Try VoxCPM
                            
                        
                    
                    
                    
                        
                            
                                TADA
                                Standard
                            
                            
                                TADA (Text-Acoustic Dual Alignment) by Hume AI is a TTS model built on Llama 3.2 that uses a dual alignment architecture to eliminate hallucinations. Available in 1B (English) and 3B (multilingual) variants, TADA achieves a real-time factor (RTF) of 0.09, 5x faster than comparable LLM-based TTS models. It supports up to 700 seconds of audio context and produces emotionally expressive speech with zero hallucinations on standard benchmarks.
                                
Developer: Hume AI
License: MIT
Speed: Fast
Languages: en
Voice cloning: No

Zero hallucinations, 5x faster than LLM TTS, Emotional expression, 700s audio context, Dual alignment
Best for: High-quality hallucination-free speech, emotional expression, fast inference

Try TADA
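
To put the RTF figure in concrete terms: real-time factor is conventionally the time spent synthesizing divided by the duration of the audio produced, so values below 1 mean faster than real time. The sketch below uses that standard definition; the function names are illustrative, and actual timings depend on hardware that the description does not specify.

```python
def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def estimated_synthesis_seconds(audio_seconds, rtf):
    """Time needed to generate `audio_seconds` of speech at a given RTF."""
    return audio_seconds * rtf


# At the quoted RTF of 0.09, one minute of speech takes roughly 5.4 seconds to generate.
print(round(estimated_synthesis_seconds(60.0, 0.09), 2))
```

By the same arithmetic, the "5x faster" claim implies that the comparison models run at an RTF of about 0.45.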
                            
                        
                    
                    
                    
                        
                            
                                VibeVoice
                                Standard
                            
                            
                                VibeVoice from Microsoft generates long-form speech of up to 90 minutes with support for up to 4 speakers, making it well suited to podcasts and dialogues. The Realtime 0.5B variant achieves ~300ms latency for interactive use. It supports speaker tags for multi-turn dialogue generation.
                                
Developer: Microsoft
License: MIT
Speed: Fast
Languages: en, zh
Voice cloning: No

Multi-speaker, Long-form (90 min), Podcast generation, Dialogue, Low latency
Best for: Podcasts, dialogues, long-form narration, multi-speaker content

Try VibeVoice
                            
                        
                    
                    
                    
                        
                            
                                CosyVoice3
                                Standard
                            
                            
                                CosyVoice3 is the latest evolution from Alibaba's FunAudioLLM team. It features bi-streaming inference with ~150ms latency, instruction-based control for emotion, speed, and volume, and improved speaker similarity for zero-shot cloning. It supports 9 languages plus 18 Chinese dialects. An RL-tuned variant delivers state-of-the-art prosody.
                                
Developer: Alibaba (FunAudioLLM)
License: Apache 2.0
Speed: Fast
Languages: en, zh, ja, ko, de, es, fr, it, ru
Voice cloning: Yes

Bi-streaming, Emotion control, Voice cloning, Speed/volume control, Instruction following
Best for: Multilingual production TTS, real-time applications, voice cloning

Try CosyVoice3
                            
                        
                    
                    
                
            

            
            
                
                    
                    
                        
                            
                                Chatterbox
                                Premium
                            
                            
                                Chatterbox by Resemble AI is a cutting-edge zero-shot voice cloning model. It can replicate any voice from a single audio sample with remarkable accuracy, capturing not just the timbre but also the speaking style and emotional nuances. Chatterbox also features fine-grained emotion control, allowing you to adjust the emotional tone of the generated speech independently from the voice identity.
                                
Developer: Resemble AI
License: MIT
Speed: Medium
Languages: en
Voice cloning: Yes
VRAM: 4 GB
Cost per 1K characters: 4x

Zero-shot cloning, Emotion control, High fidelity, Style transfer, Single sample cloning
Best for: Professional voice cloning with emotional control, content creation

Try Chatterbox
                            
                        
                    
                    
                    
                        
                            
                                Tortoise TTS
                                Premium
                            
                            
                                Tortoise TTS is an autoregressive multi-voice text-to-speech system that prioritizes audio quality over speed. It uses a DALL-E-inspired architecture to generate highly natural speech with excellent prosody and speaker similarity. While slower than many alternatives, Tortoise produces some of the most realistic synthetic speech available in the open-source ecosystem.
                                
Developer: James Betker
License: Apache 2.0
Speed: Slow
Languages: en
Voice cloning: Yes
VRAM: 8 GB
Cost per 1K characters: 4x

Highest quality, Multi-voice, DALL-E architecture, Voice cloning, Autoregressive
Best for: Audiobooks, premium content, quality-first applications

Try Tortoise TTS
                            
                        
                    
                    
                    
                        
                            
                                StyleTTS 2
                                Premium
                            
                            
                                StyleTTS 2 achieves human-level TTS synthesis by combining style diffusion with adversarial training using large speech language models. It generates the most natural sounding speech among single-speaker models, rivaling human recordings. StyleTTS 2 uses diffusion-based style modeling to capture the full range of human speech variation.
                                
                                    Developer: Columbia University
                                    License: MIT
                                    Speed: Medium
                                    Languages: en
                                    Voice cloning: No
                                    VRAM: 4GB
                                    Cost per 1K characters: 4x
                                
                                Human-level, Style diffusion, Adversarial training, Natural variation, High fidelity
                                Best for: Studio-quality single-speaker synthesis, professional narration
                            
                            
                                Try StyleTTS 2
                            
                        
                    
                    
                    
                        
                            
                                OpenVoice
                                Premium
                            
                            
                                OpenVoice by MyShell.ai enables instant voice cloning with granular control over voice style, emotion, accent, rhythm, pauses, and intonation. It can clone a voice from a short audio clip and generate speech in multiple languages while maintaining the speaker identity. OpenVoice also functions as a voice converter, allowing real-time voice transformation.
                                
                                    Developer: MyShell.ai / MIT
                                    License: MIT
                                    Speed: Medium
                                    Languages: en, zh, ja, ko, fr, de, es, it
                                    Voice cloning: Yes
                                    VRAM: 4GB
                                    Cost per 1K characters: 4x
                                
                                Instant cloning, Voice conversion, Emotion control, Accent control, Multilingual
                                Best for: Voice cloning with fine-grained style control, voice conversion
                            
                            
                                Try OpenVoice
                            
                        
                    
                    
                    
                        
                            
                                Sesame CSM
                                Premium
                            
                            
                                Sesame CSM (Conversational Speech Model) is a 1 billion parameter model designed specifically for generating conversational speech. It models the natural patterns of human conversation including turn-taking timing, backchannel responses, emotional reactions, and conversational flow. CSM generates audio that sounds like a natural human conversation rather than synthetic speech.
                                
                                    Developer: Sesame
                                    License: Apache 2.0
                                    Speed: Slow
                                    Languages: en
                                    Voice cloning: No
                                    VRAM: 8GB
                                    Cost per 1K characters: 4x
                                
                                Conversational, Natural timing, Turn-taking, Backchannel, 1B parameters
                                Best for: AI assistants, chatbots, conversational AI applications
                            
                            
                                Try Sesame CSM
                            
                        
                    
                    
                    
                        
                            
                                MOSS-TTS
                                Premium
                            
                            
                                MOSS-TTS from OpenMOSS supports generation of up to 1 hour of continuous speech across 20 languages. It features token-level duration control, phoneme-level pronunciation control via IPA/Pinyin, and code-switching between languages. The 8B production model delivers state-of-the-art quality with zero-shot voice cloning from reference audio.
                                
                                    Developer: OpenMOSS
                                    License: Apache 2.0
                                    Speed: Medium
                                    Languages: en, zh, de, es, fr, ja, it, hu, ko, ru, fa, ar, pl, pt, cs, da, sv, el, tr
                                    Voice cloning: Yes
                                    VRAM: 16GB
                                    Cost per 1K characters: 4x
                                
                                Ultra-long generation, 20 languages, Voice cloning, Duration control, Pronunciation control, Code-switching
                                Best for: Audiobooks, long-form content, multilingual production
                            
                            
                                Try MOSS-TTS
                            
                        
                    
                    
                    
                        
                            
                                MegaTTS3
                                Premium
                            
                            
                                MegaTTS3 from ByteDance uses a novel sparse alignment mechanism combined with a latent diffusion transformer. It offers an adjustable trade-off between speech intelligibility and speaker similarity for zero-shot voice cloning.
                                
                                    Developer: ByteDance
                                    License: Apache 2.0
                                    Speed: Slow
                                    Languages: en, zh
                                    Voice cloning: Yes
                                    VRAM: 8GB
                                    Cost per 1K characters: 4x
                                
                                Voice cloning, Adjustable similarity, Cross-lingual
                                Best for: High-fidelity voice cloning
                            
                            
                                Try MegaTTS3
                            
                        
                    
                    
                
            
        

        
        
            Model comparison table

            Model             Developer              Tier      Speed   Languages  VRAM                  License                 Credits
            Kokoro            Hexgrad                Free      Fast    11         1.5GB                 Apache 2.0              Free
            Piper             Rhasspy                Free      Fast    31         0 (CPU only)          MIT                     Free
            VITS              Jaehyeon Kim et al.    Free      Fast    4          1GB                   MIT                     Free
            MeloTTS           MyShell.ai             Free      Fast    6          0.5GB (GPU optional)  MIT                     Free
            Bark              Suno                   Standard  Slow    13         5GB                   MIT                     2
            Bark Small        Suno                   Standard  Medium  13         2GB                   MIT                     2
            CosyVoice 2       Alibaba (Tongyi Lab)   Standard  Medium  8          4GB                   Apache 2.0              2
            Dia TTS           Nari Labs              Standard  Medium  1          4GB                   Apache 2.0              2
            Parler TTS        Hugging Face           Standard  Medium  1          4GB                   Apache 2.0              2
            GLM-TTS           Zhipu AI               Standard  Medium  2          4GB                   GLM-4 License           2
            IndexTTS-2        Index Team             Standard  Medium  2          4GB                   Bilibili Model License  2
            Spark TTS         SparkAudio             Standard  Medium  2          4GB                   CC BY-NC-SA 4.0         2
            GPT-SoVITS        RVC-Boss               Standard  Slow    4          6GB                   MIT                     2
            Orpheus           Canopy Labs            Standard  Medium  1          4GB                   Llama 3.2 Community     2
            Chatterbox        Resemble AI            Premium   Medium  1          4GB                   MIT                     4
            Tortoise TTS      James Betker           Premium   Slow    1          8GB                   Apache 2.0              4
            StyleTTS 2        Columbia University    Premium   Medium  1          4GB                   MIT                     4
            OpenVoice         MyShell.ai / MIT       Premium   Medium  8          4GB                   MIT                     4
            Qwen3 TTS         Alibaba (Qwen)         Standard  Medium  10         7GB                   Apache 2.0              2
            Sesame CSM        Sesame                 Premium   Slow    1          8GB                   Apache 2.0              4
            Chatterbox Turbo  Resemble AI            Standard  Fast    1          2GB                   MIT                     2
            Zonos             Zyphra                 Standard  Medium  5          6GB                   Apache 2.0              2
            Dia 2             Nari Labs              Standard  Fast    1          4GB                   Apache 2.0              2
            VoxCPM            OpenBMB                Standard  Fast    2          4GB                   Apache 2.0              2
            OuteTTS           OuteAI                 Free      Fast    1          2GB                   Apache 2.0              Free
            TADA              Hume AI                Standard  Fast    1          5GB                   MIT                     2
            VibeVoice         Microsoft              Standard  Fast    2          4GB                   MIT                     2
            Pocket TTS        Kyutai                 Free      Fast    2          1GB                   MIT                     Free
            Kitten TTS        KittenML               Free      Fast    1          0GB                   Apache 2.0              Free
            CosyVoice3        Alibaba (FunAudioLLM)  Standard  Fast    9          4GB                   Apache 2.0              2
            MOSS-TTS          OpenMOSS               Premium   Medium  19         16GB                  Apache 2.0              4
            MegaTTS3          ByteDance              Premium   Slow    2          8GB                   Apache 2.0              4
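
            The rows of the comparison table can be treated as data when shortlisting a model. The sketch below copies a handful of rows into a hypothetical `Model` record and filters them by a VRAM budget. The `Model` class, the `pick_models` helper, and the choice to represent "CPU only" as 0 GB are illustrative assumptions, not part of any TTS.ai API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    tier: str        # "Free", "Standard", or "Premium", as in the table
    speed: str       # "Fast", "Medium", or "Slow"
    languages: int   # number of supported languages
    vram_gb: float   # listed VRAM; 0 stands in for "CPU only"


# A few rows copied from the comparison table (not the full list).
MODELS = [
    Model("Kokoro", "Free", "Fast", 11, 1.5),
    Model("Piper", "Free", "Fast", 31, 0.0),
    Model("CosyVoice 2", "Standard", "Medium", 8, 4.0),
    Model("Chatterbox", "Premium", "Medium", 1, 4.0),
    Model("Tortoise TTS", "Premium", "Slow", 1, 8.0),
    Model("MOSS-TTS", "Premium", "Medium", 19, 16.0),
]


def pick_models(models, max_vram_gb, min_languages=1, tier=None):
    """Return names of models within the VRAM budget, most languages first."""
    matches = [
        m for m in models
        if m.vram_gb <= max_vram_gb
        and m.languages >= min_languages
        and (tier is None or m.tier == tier)
    ]
    # sorted() is stable, so ties keep their table order.
    return [m.name for m in sorted(matches, key=lambda m: -m.languages)]


if __name__ == "__main__":
    print(pick_models(MODELS, max_vram_gb=4, min_languages=8))
```

            With a 4 GB budget and at least 8 languages, only Piper, Kokoro, and CosyVoice 2 from this subset qualify.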
                        
                        
                    
                
            
        
    




    
        
            
                The most comprehensive AI text-to-speech platform

                
                    
                        Why choose TTS.ai for text-to-speech?
                        TTS.ai brings the world together
                        Models are open source under MIT, Apache 2.0, or similar licenses, and most of these permit commercial use of the generated audio in your projects; a few entries in the comparison table, such as Spark TTS (CC BY-NC-SA 4.0), carry more restrictive terms. Whether you need fast, lightweight synthesis for real-time applications or studio-quality output for audiobooks and podcasts, TTS.ai has a suitable model for each use case.

                        Free models, no account required
                        Start right away with three free TTS models: Piper (fast, lightweight), VITS (high-quality neural synthesis), and MeloTTS (multilingual synthesis). No sign-up, no credit card, no limits on generations. The free models support English and many other languages, with natural-sounding audio output suitable for most applications.
                    
                    
                        GPU-accelerated processing
                        All TTS models run on dedicated NVIDIA GPUs for fast, consistent generation times. Free models produce speech in under 2 seconds. Standard models such as Kokoro, CosyVoice 2, and Bark typically take 3-5 seconds. Premium models with the highest quality, such as Tortoise and Chatterbox, finish in 5-15 seconds, depending on the length of the text.

                        30+ languages supported
                        Generate speech in more than 30 languages, including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Hindi, and Russian. Many models support cross-lingual synthesis, meaning you can generate speech in a language the original voice was never trained on. CosyVoice 2 and GPT-SoVITS are particularly strong at cross-lingual voice cloning.

                        Developer-ready
                        Integrate TTS.ai into your applications with our OpenAI-compatible REST API. One endpoint serves all 20+ models. SDKs are available for Python, JavaScript, cURL, and Go. Streaming is supported for real-time applications, batch processing for high-volume content generation, and webhooks for async notifications. Available on Pro and Enterprise plans.
                    
                
            
        
    









    



    
        
        
        
    










    
        Frequently asked questions
        
            
                
                    
                    
                        
                            
                        
                        
                            
                                Text-to-speech (TTS) is an AI technology that converts written text into natural-sounding spoken audio. Modern neural TTS models such as Kokoro, Chatterbox, and CosyVoice 2 use deep learning to generate speech that sounds remarkably human, reproducing several characteristics of natural delivery.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                It depends on your needs. For quick results, use Piper or MeloTTS (free, fast). For higher quality, try Kokoro or CosyVoice 2 (standard tier). For voice cloning, use Chatterbox or GPT-SoVITS (premium). For dialogue or podcast content, try Dia TTS. Each model has different strengths, so experiment to find the best fit.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Yes! TTS.ai offers free text-to-speech with the Kokoro, Piper, VITS, and MeloTTS models. No account is required for up to 500 characters and 3 generations per usage period. Sign up for a free account to receive 50 credits and access to all models.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Our TTS models support 30+ languages, including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Russian, and Hindi.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Yes, audio generated with TTS.ai can be used in your projects. The models we use carry open-source licenses, mainly MIT and Apache 2.0. Requirements differ between models, so we recommend reviewing the license of the specific model you use for your project.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                TTS.ai supports MP3, WAV, OGG, and FLAC output formats. MP3 is the default for web playback. WAV is recommended for audio processing. You can convert between formats using our Audio Converter tool.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Voice cloning uses AI to replicate a specific voice from a short audio sample (typically 5 to 30 seconds). You upload a clear recording of the target voice, and models such as Chatterbox, GPT-SoVITS, or OpenVoice generate new speech in that voice. Quality improves with clean reference audio.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Free users can generate up to 500 characters per request. Registered users get up to 5,000 characters per request. For longer texts, the audio is generated in chunks and joined automatically. API users can process up to 10,000 characters per request.
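
                                The chunk-and-join handling of long texts can be approximated on the client side. The following is a minimal sketch, assuming sentences end in `.`, `!`, or `?` and using the 5,000-character limit quoted above as the default. `split_into_chunks` is a hypothetical helper and does not describe how the service itself splits text.

```python
import re


def split_into_chunks(text, max_chars=5000):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("One. Two. Three.", max_chars=9):
        print(repr(chunk))
```

                                Each chunk can then be submitted as its own request and the resulting audio concatenated in order.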
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                SSML (Speech Synthesis Markup Language) support varies by model. Piper and some other models support basic SSML tags for pauses, pronunciation, and speech control. For models without native SSML support, you can use natural punctuation and line breaks to influence the delivery.
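
                                When a model does accept SSML, the input has to be well-formed XML. The sketch below wraps plain text in the `<speak>` and `<prosody rate="...">` form used in this page's own SSML sample, escaping characters that would otherwise break the markup. `to_ssml` is a hypothetical helper, and which tags a given model honours is not something this example can confirm.

```python
from xml.sax.saxutils import escape, quoteattr


def to_ssml(text, rate=None):
    """Wrap plain text in <speak>, optionally inside <prosody rate=...>."""
    body = escape(text)  # escapes &, <, and > so the markup stays well-formed
    if rate is not None:
        body = f"<prosody rate={quoteattr(rate)}>{body}</prosody>"
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    print(to_ssml("Slow speech", rate="slow"))
    # prints <speak><prosody rate="slow">Slow speech</prosody></speak>
```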
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Yes, most models support speed adjustment from 0.5x to 2.0x. Some models, such as Bark and Parler, also accept style and emotion controls. You can set the speed in the advanced settings panel or through the API speed parameter.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Yes, batch processing is available through our API. You can submit multiple text segments in a single API call or script, and each one is processed and returned as a separate audio file. This is well suited to audiobook chapters, e-learning modules, or game dialogue scripts.
                            
                        
                    
                    
                    
                        
                            
                        
                        
                            
                                Generate an API key from your account dashboard, then send POST requests to our REST API endpoint with your text, model, and voice parameters. We provide code samples in Python, JavaScript, and cURL. The API is OpenAI-compatible, so existing integrations work with minimal changes.
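
                                As a rough illustration of such a request, the sketch below builds a POST request without sending it. Everything specific here is an assumption: the base URL is a placeholder, the `/v1/audio/speech` path and the `model`, `input`, `voice`, `response_format`, and `speed` fields are borrowed from the OpenAI speech API on the strength of the "OpenAI-compatible" claim, and the model and voice identifiers are made up. The 0.5 to 2.0 speed check follows the range quoted in this FAQ. Consult the official API documentation for the real endpoint and parameters.

```python
import json
import urllib.request

# Placeholder base URL (an assumption); use the endpoint from the official docs.
BASE_URL = "https://api.example.com"


def build_speech_request(api_key, text, model, voice, speed=1.0, fmt="mp3"):
    """Build (but do not send) a POST request in the OpenAI speech-API shape."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    payload = {
        "model": model,            # hypothetical model identifier
        "input": text,
        "voice": voice,            # hypothetical voice identifier
        "response_format": fmt,
        "speed": speed,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_speech_request("YOUR_API_KEY", "Hello there.", "kokoro", "default")
    print(req.get_method(), req.full_url)
    # Sending is left out on purpose; urllib.request.urlopen(req) would do it.
```

                                Keeping request construction separate from sending makes the payload easy to inspect and test before any credits are spent.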
                            
                        
                    
                    
                
            
        
    








    
        
            
                
                
                
                
                
                
                
                
                
                
                
                
                
            
            
                
                
                
                
            
            
                
                
            
        
    







    
        Start converting text to speech now
        Join thousands of creators using TTS.ai. Get 15,000 free characters with a new account. Free models are available without registration.

            Sign up free
            View pricing

Developer:	Rhasspy
License:	MIT
Speed:	Fast
Tier:
Languages:	31
VRAM:	0 (CPU only)
Voice cloning:	Not supported

Model	Developer	Tier	Speed	Languages	VRAM	License	Credits
Kokoro	Hexgrad	Free	Fast	11	1.5GB	Apache 2.0	Free
Piper	Rhasspy	Free	Fast	31	0 (CPU only)	MIT	Free
VITS	Jaehyeon Kim et al.	Free	Fast	4	1GB	MIT	Free
MeloTTS	MyShell.ai	Free	Fast	6	0.5GB (GPU optional)	MIT	Free
Bark	Suno	Standard	Slow	13	5GB	MIT	2
Bark Small	Suno	Standard	Medium	13	2GB	MIT	2
CosyVoice 2	Alibaba (Tongyi Lab)	Standard	Medium	8	4GB	Apache 2.0	2
Dia TTS	Nari Labs	Standard	Medium	1	4GB	Apache 2.0	2
Parler TTS	Hugging Face	Standard	Medium	1	4GB	Apache 2.0	2
GLM-TTS	Zhipu AI	Standard	Medium	2	4GB	GLM-4 License	2
IndexTTS-2	Index Team	Standard	Medium	2	4GB	Bilibili Model License	2
Spark TTS	SparkAudio	Standard	Medium	2	4GB	CC BY-NC-SA 4.0	2
GPT-SoVITS	RVC-Boss	Standard	Slow	4	6GB	MIT	2
Orpheus	Canopy Labs	Standard	Medium	1	4GB	Llama 3.2 Community	2
Chatterbox	Resemble AI	Premium	Medium	1	4GB	MIT	4
Tortoise TTS	James Betker	Premium	Slow	1	8GB	Apache 2.0	4
StyleTTS 2	Columbia University	Premium	Medium	1	4GB	MIT	4
OpenVoice	MyShell.ai / MIT	Premium	Medium	8	4GB	MIT	4
Qwen3 TTS	Alibaba (Qwen)	Standard	Medium	10	7GB	Apache 2.0	2
Sesame CSM	Sesame	Premium	Slow	1	8GB	Apache 2.0	4
Chatterbox Turbo	Resemble AI	Standard	Fast	1	2GB	MIT	2
Zonos	Zyphra	Standard	Medium	5	6GB	Apache 2.0	2
Dia 2	Nari Labs	Standard	Fast	1	4GB	Apache 2.0	2
VoxCPM	OpenBMB	Standard	Fast	2	4GB	Apache 2.0	2
OuteTTS	OuteAI	Free	Fast	1	2GB	Apache 2.0	Free
TADA	Hume AI	Standard	Fast	1	5GB	MIT	2
VibeVoice	Microsoft	Standard	Fast	2	4GB	MIT	2
Pocket TTS	Kyutai	Free	Fast	2	1GB	MIT	Free
Kitten TTS	KittenML	Free	Fast	1	0GB	Apache 2.0	Free
CosyVoice3	Alibaba (FunAudioLLM)	Standard	Fast	9	4GB	Apache 2.0	2
MOSS-TTS	OpenMOSS	Premium	Medium	19	16GB	Apache 2.0	4
MegaTTS3	ByteDance	Premium	Slow	2	8GB	Apache 2.0	4