VITS

Baker (Chinese)

Free Chinese (Mandarin) Neutral VITS

{igama} is a neutral AI voice powered by the VITS text-to-speech model. This free-tier voice speaks {ulwimi} and delivers good-quality speech synthesis. With near-instant generation speed and a quality rating of 3/5, Baker (Chinese) is suited to general-purpose text-to-speech with natural prosody. The VITS engine was developed by Jaehyeon Kim et al. and released under the MIT license, which permits commercial use. Key features include: {izici}.

No ratings

VITS Model Info

Model: VITS
Developer: Jaehyeon Kim et al.
Quality: 3/5
Speed: Fast
License: MIT
Voice cloning: Not supported
Tier: Free (no credits used)
Parameters: 25M
Architecture: VAE + Normalizing Flows + GAN
Training data: 585 hours
Year: 2021

Best Use Cases for Baker (Chinese)

Recommended uses based on this voice's characteristics

Audiobooks & Narration

Use Baker (Chinese) to narrate long-form content with natural prosody and expression.

Video Voiceovers

Add professional-quality voiceovers to YouTube videos, ads, and social media content.

Apps & Accessibility

Low-latency generation makes this voice suitable for real-time applications, screen readers, and accessibility tools.

Learning & Education

Create training materials, courses, and educational content with AI narration.

More VITS Voices

Other voices from the same TTS model

Default

English Neutral

Frequently Asked Questions

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural-sounding audio than two-stage systems. It uses variational inference augmented with normalizing flows and an adversarial training process, which yields a notable improvement in naturalness.
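For readers who want the math behind that description, the training objective can be sketched as follows. The notation follows the VITS paper as best recalled and should be checked against it: $x$ is the audio, $c$ the text condition, $z$ the latent variable, and $f_\theta$ the normalizing flow applied to the prior.

```latex
% Variational inference: evidence lower bound on the conditional log-likelihood
\log p_\theta(x \mid c) \ge
  \mathbb{E}_{q_\phi(z \mid x)}\left[
    \log p_\theta(x \mid z) - \log \frac{q_\phi(z \mid x)}{p_\theta(z \mid c)}
  \right]

% Normalizing flow: a more expressive prior via change of variables
p_\theta(z \mid c) =
  \mathcal{N}\left(f_\theta(z);\, \mu_\theta(c),\, \sigma_\theta(c)\right)
  \left| \det \frac{\partial f_\theta(z)}{\partial z} \right|

% Adversarial training: generator loss combines reconstruction, KL, duration,
% adversarial, and feature-matching terms
L_{\mathrm{vae}} = L_{\mathrm{recon}} + L_{\mathrm{kl}} + L_{\mathrm{dur}}
  + L_{\mathrm{adv}}(G) + L_{\mathrm{fm}}(G)
```

The three lines map onto the "VAE + Normalizing Flows + GAN" architecture listed in the model info.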

VITS was developed by Jaehyeon Kim et al. and released under the MIT license, which permits commercial use of the generated audio.

VITS supports 4 languages: English, Chinese, Japanese, and Korean.

VITS is on the free tier, so no credits are required. You can preview any VITS voice for free before generating the full audio.

VITS has very fast generation speed. It runs in real time, which makes it suitable for streaming and application integration.

VITS is rated 3/5 for audio quality on TTS.ai. It delivers good-quality speech suitable for most applications.

No, VITS uses a fixed set of built-in voices. For voice cloning, try models such as CosyVoice 2, GPT-SoVITS, or Chatterbox.

Yes, VITS is specifically recommended for general-purpose text-to-speech with natural prosody. Its end-to-end synthesis, natural prosody, and fast generation make it well suited to this use case.

Yes, VITS is licensed under MIT, which permits commercial use. Audio generated with VITS voices can be used in videos, podcasts, apps, games, and other commercial projects.

Yes, all voices on TTS.ai use open-source models with commercially permissive licenses (MIT, Apache 2.0). The generated audio is yours to use in videos, podcasts, apps, games, and any other commercial project.

Send a POST request to /api/v1/tts/ with the model name and voice ID. See our API Documentation page for code examples in Python, JavaScript, Go, and cURL.
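As a rough illustration of that request, here is a minimal Python sketch using only the standard library. Only the POST method, the /api/v1/tts/ path, and the presence of a model name and voice ID come from the text above. The host (`https://tts.ai`), the JSON field names (`model`, `voice`, `text`), the voice ID `baker`, and the bearer-token header are assumptions made for illustration; check the API Documentation page for the real values.

```python
import json
import urllib.request

# Assumption: the text gives only the path, not the host.
BASE_URL = "https://tts.ai"
ENDPOINT = "/api/v1/tts/"


def build_tts_request(text, model="vits", voice="baker", api_key=None,
                      base_url=BASE_URL):
    """Build (but do not send) a POST request for the TTS endpoint.

    The field names "model", "voice", and "text" and the voice ID "baker"
    are hypothetical placeholders, not confirmed API parameters.
    """
    payload = {"model": model, "voice": voice, "text": text}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: bearer-token authentication.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def synthesize(text, **kwargs):
    """Send the request and return the raw response body as bytes."""
    request = build_tts_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Building the request separately from sending it makes it easy to inspect the URL and payload before any network call is made. Calling `synthesize("你好，世界")` would then perform the actual POST, assuming the placeholders above match the real API.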

Yes, click the play button on this page to listen to a sample. You can also type custom text on the Text to Speech page and generate a free preview with any voice.

Try Baker (Chinese) Now

Type any text and hear it spoken by Baker (Chinese). Free to use, no credits required.