Speech to Speech

Transform spoken audio: change the voice, accent, language, and style while preserving the original content.

No TTS voice is available for your region yet. Please help by contributing yours! Voice list

Source audio

Drag and drop your file here, or browse

Upload your speech recording. MP3, WAV, FLAC, OGG. Max 50MB.

or record your own audio

Transformation

Drag and drop your file here, or browse

Upload a reference of the target voice. 10-30 sec recommended.

Results

Upload your speech audio, choose a transformation, then click Transform to start.

Transforming speech... This may take a moment.

Original

Transformed

How it works

1. Upload speech

Record or upload the audio you want to transform.

2. Choose a transformation

Choose voice conversion, style transfer, or language translation.

3. AI transformation

The AI processes the audio end to end while preserving the speech content.

4. Download

Preview the results and download your transformed audio.

Use cases

Speech to speech for content creation, accessibility, and new applications.

Video dubbing

Dub videos into different languages while preserving the original speaker's vocal characteristics.

Emotion editing

Change the emotional tone of a recording, for example making hesitant speech sound confident and relaxed.

Voice acting

Turn rough voice recordings into polished performances with a different voice and character.

Voice anonymization

Conceal the speaker's identity while keeping every word, for source protection in journalism or for privacy.

Speech to Speech Models

OpenVoice

Voice conversion with fine-grained style control. Change timbre, pace, and emotion in minutes.

  • Fast processing
  • Style transfer
  • Multilingual

Chatterbox

Zero-shot voice cloning with fine-grained emotion control, from Resemble AI.

  • Emotion control
  • Zero-shot cloning
  • High fidelity

CosyVoice 2

Cross-language voice cloning across 8 languages, with natural prosody and streaming support.

  • 8 languages
  • Voice tuning
  • Streaming

Frequently asked questions

Speech-to-speech (STS) AI transforms spoken audio into a different spoken output, changing the voice, style, accent, or language while preserving the original words and timing. It combines speech recognition, processing, and synthesis in a single pipeline.

Text-to-speech converts written text into audio. Speech-to-speech takes existing audio as input and converts it directly into new audio, preserving the natural rhythm, pauses, emphasis, and emotional nuance of the original performance rather than generating speech from plain text.

Common uses include dubbing videos into other languages, changing the voice in a recording, editing the emotion or tone of existing audio, improving recorded speech, and anonymizing recordings while preserving their content.

Voice conversion models such as OpenVoice and RVC handle voice-to-voice conversion. For cross-language speech, CosyVoice 2 and GPT-SoVITS can clone a voice and re-synthesize it in a different language. Chatterbox also supports synthesis based on a speaker reference.

Yes. Using voice cloning models, you can convert your speech into another language while keeping your vocal characteristics. The AI extracts your voice profile and re-synthesizes the audio in the target language or style.

The pipeline first transcribes your speech, then translates the text into the target language, then uses voice cloning to synthesize the translated text in your own voice. Models such as CosyVoice 2 support 8 languages for cross-language synthesis.
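The three stages described above can be sketched as a chain of function calls. This is only an illustration of the data flow: `TranslationPipeline` and its three stage callables are hypothetical names, not part of any model or API mentioned on this page, and a real setup would plug in actual recognition, translation, and voice cloning components.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationPipeline:
    """Hypothetical wiring of the three stages; each stage is caller-supplied."""

    transcribe: Callable[[bytes], str]          # audio -> source-language text
    translate: Callable[[str, str], str]        # (text, target language) -> text
    synthesize: Callable[[str, bytes], bytes]   # (text, voice reference) -> audio

    def run(self, source_audio: bytes, target_language: str) -> bytes:
        text = self.transcribe(source_audio)
        translated = self.translate(text, target_language)
        # The source recording doubles as the voice reference for cloning.
        return self.synthesize(translated, source_audio)
```

Keeping the stages as plain callables makes it easy to swap one model for another without touching the rest of the chain.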

For best results, upload clean audio with minimal background noise. WAV or FLAC at 16 kHz or higher works best. MP3, OGG, M4A, and WEBM are also accepted. Clear speech produces the most accurate conversion.
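If you want to check a recording against the 16 kHz recommendation before uploading, a local check along these lines may help. It is a minimal sketch that uses Python's standard `wave` module, so it only handles uncompressed PCM WAV files; the function names and the `make_silent_wav` helper are illustrative and not part of this service.

```python
import io
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # recommendation quoted above


def wav_sample_rate(data: bytes) -> int:
    """Return the sample rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        return wav_file.getframerate()


def meets_recommended_rate(data: bytes) -> bool:
    """True when the WAV data is sampled at 16 kHz or higher."""
    return wav_sample_rate(data) >= MIN_SAMPLE_RATE_HZ


def make_silent_wav(sample_rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit silent WAV in memory, for trying the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(sample_rate * seconds))
    return buffer.getvalue()
```

For a file on disk, the same check would read the bytes first, for example `meets_recommended_rate(open("recording.wav", "rb").read())`.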

Near-real-time processing is available through our API using fast models such as Kokoro for synthesis and Faster Whisper for recognition. Latency depends on the model and the audio length, but sub-3-second turnarounds are achievable for short utterances.

Yes. Models such as Chatterbox, Spark TTS, and IndexTTS-2 support emotion and tone control. You can shift the emotional tone of speech, for example from tense to relaxed or from flat to expressive, while keeping the same words and speaker identity.

Speech-to-speech conversion uses characters for both transcription and synthesis. A typical 1-minute conversion uses 3,000-8,000 characters, depending on the model selected. Free-tier models such as Kokoro can be used for the synthesis step at no cost.
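As a rough planning aid, the per-minute range above can be scaled to other durations. The sketch below assumes usage grows linearly with audio length, which this page does not state, so treat the result as an estimate only; `estimate_characters` is an illustrative name.

```python
CHARS_PER_MINUTE_LOW = 3_000   # lower bound quoted above
CHARS_PER_MINUTE_HIGH = 8_000  # upper bound quoted above


def estimate_characters(duration_seconds: float) -> tuple[int, int]:
    """Estimate (low, high) character usage, assuming linear scaling."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    minutes = duration_seconds / 60
    return (
        round(minutes * CHARS_PER_MINUTE_LOW),
        round(minutes * CHARS_PER_MINUTE_HIGH),
    )
```

Under that assumption, a 90-second clip would fall between 4,500 and 12,000 characters.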

Free users can process audio up to 1 minute long. Paid plans support files up to 10 minutes. For longer input, split the audio into segments or use our batch processing API, which has no length limit.
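When splitting a long recording yourself, the segment boundaries can be planned up front. The sketch below only computes start and end times in seconds and leaves the actual cutting to your audio tool of choice; `split_into_segments` is an illustrative helper, and the 60-second default mirrors the free-tier limit mentioned above.

```python
def split_into_segments(
    total_seconds: float, max_segment_seconds: float = 60.0
) -> list[tuple[float, float]]:
    """Return (start, end) times that cover the recording in order."""
    if total_seconds < 0 or max_segment_seconds <= 0:
        raise ValueError("durations must be non-negative and the limit positive")
    segments = []
    start = 0.0
    while start < total_seconds:
        # Each segment is as long as allowed, except possibly the last one.
        end = min(start + max_segment_seconds, total_seconds)
        segments.append((start, end))
        start = end
    return segments
```

Fixed-time cuts can land in the middle of a word, so in practice you may prefer to move each boundary to a nearby pause.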

Yes. All uploaded audio is processed on our secure GPU servers and automatically deleted within 24 hours. We do not use your audio to train models. All transfers use encrypted connections, and communication between servers is authenticated.

Transform any speech with AI

Change voice, accent, language, and style. Sign up for free and get 15,000 characters to get started.