
Real-Time TTS

Streaming text-to-speech with sub-second time-to-first-audio. Built for voice agents and live applications.



How Streaming TTS Works

1. Send Text

POST your text to /v1/tts/stream/ as a Server-Sent Events (SSE) request.

2. Model Generates

Kokoro tokenizes the text and generates audio sample by sample on the GPU.

3. Chunks Stream

Base64-encoded WAV chunks arrive over SSE and start playing immediately.

4. Listen Live

The user hears the start of the phrase in under a second, even for long inputs.
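
The client side of the four steps above can be sketched as follows. This is a minimal illustration rather than the service's documented client: it assumes each SSE event carries one base64-encoded WAV chunk directly in its `data:` field, which should be checked against a real response. The helper names `iter_sse_data` and `iter_audio_chunks` are hypothetical.

```python
import base64
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event found in text lines."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            data_parts.append(value)
        # Other SSE fields (event:, id:, retry:) are ignored in this sketch.
    if data_parts:
        # Lenient: flush a trailing event if the stream ends without a blank line.
        yield "\n".join(data_parts)


def iter_audio_chunks(lines: Iterable[str]) -> Iterator[bytes]:
    """Decode each event into raw bytes.

    ASSUMPTION: the event data is the base64 WAV chunk itself, not a JSON wrapper.
    """
    for payload in iter_sse_data(lines):
        yield base64.b64decode(payload)
```

Each decoded chunk can be handed to an audio player as soon as it is yielded, which is what lets playback begin before generation has finished.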

Use Cases

Where sub-second latency unlocks new experiences.

Voice Agents

Conversational bots that respond as quickly as a person would.

Live Dubbing

Capture and translate source audio in real time, without buffering.

Games

NPC dialog that responds dynamically to player choices, with no pre-recorded voice-over.

Accessibility

Screen readers and assistive tools that start speaking the moment the user clicks.

Real-Time TTS Plans

Start free, upgrade when you need more

Free
  • Kokoro streaming (free model)
  • 500 characters per request
  • 10 free streams/day per anonymous user
  • Sub-second time-to-first-audio
  • SSE streaming over HTTPS
Most popular
Free Account
  • 15,000 characters on signup
  • 5,000 characters per request
  • API key for programmatic access
  • Generation history
  • No cap on the number of streams
Sign up
Pro
  • MOSS-TTS-Realtime (when live)
  • 100,000 characters per request
  • Priority GPU
  • Voice agent + Twilio integration
  • Higher rate limits
Upgrade

Frequently Asked Questions

Real-time text-to-speech streams audio chunks as they are generated, instead of waiting for the full utterance to finish. The first audio chunk arrives in under one second, which makes it suitable for live voice agents, dubbing, and interactive applications where latency matters.

Standard TTS generates the complete audio file before returning anything: you wait, then hear the full utterance all at once. Real-time TTS uses Server-Sent Events (SSE) to stream small audio chunks as the model generates them. The user hears the start of the utterance almost immediately, even for long inputs.

Kokoro is the default backend. It generates audio about 100x faster than real time on a modern GPU. We are adding MOSS-TTS-Realtime as a higher-quality option; users will be able to choose a model per request once it ships.

Typical time-to-first-audio on Kokoro is 300-800 ms, plus network latency. Once generation has started, the network round trip dominates. The page shows the measured time-to-first-audio in the UI so you can see exactly how long each request takes.
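
Time-to-first-audio can also be measured on the client. The sketch below assumes `chunks` is a lazy iterable that only sends the request when iteration begins (for example, a generator wrapping the HTTP call), so the timer covers both the network round trip and generation. The function name `first_audio_latency` is hypothetical.

```python
import time
from typing import Callable, Iterable, Optional


def first_audio_latency(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Return seconds from the start of iteration to the first chunk.

    Returns None if the stream produced no audio at all.
    """
    start = clock()
    for _chunk in chunks:
        # The first chunk has arrived: stop timing here.
        return clock() - start
    return None
```

A monotonic clock is used so the measurement is not affected by system clock adjustments during the request.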

Good fits include conversational voice agents, live media dubbing, games with interactive NPCs, accessibility readers that start speaking the moment the user clicks, and any use case where waiting two or three seconds for audio would feel slow.

Yes, streaming is available over the API. POST to https://api.tts.ai/v1/tts/stream/ with the same request body as the standard /v1/tts/ endpoint. The response is an SSE stream of base64-encoded WAV chunks. The free tier supports 10 generations per day per anonymous user; authenticated users get their full per-account allowance.
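
As a rough illustration of the request described above, the sketch below builds the POST with Python's standard library. Only the URL comes from this page. The JSON field names `text` and `model`, the `"kokoro"` identifier, and the Bearer authorization scheme are assumptions made for the example and should be replaced with whatever the API reference specifies.

```python
import json
import urllib.request
from typing import Iterator, Optional

STREAM_URL = "https://api.tts.ai/v1/tts/stream/"  # URL as given on this page


def build_stream_request(
    text: str, api_key: Optional[str] = None, model: str = "kokoro"
) -> urllib.request.Request:
    """Build (but do not send) a streaming TTS request."""
    # ASSUMPTION: the body is JSON with "text" and "model" fields.
    body = json.dumps({"text": text, "model": model}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "text/event-stream",  # ask for an SSE response
    }
    if api_key:
        # ASSUMPTION: API keys are sent as a Bearer token.
        headers["Authorization"] = "Bearer " + api_key
    return urllib.request.Request(
        STREAM_URL, data=body, headers=headers, method="POST"
    )


def iter_response_lines(request: urllib.request.Request) -> Iterator[str]:
    """Send the request and yield the SSE response line by line."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            yield raw.decode("utf-8")
```

Anonymous use would omit `api_key`; the decoded lines can then be passed to any SSE parser.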

Kokoro uses pretrained voices and does not clone voices. MOSS-TTS-Realtime (once released) supports zero-shot voice cloning from a 3-second reference. For full voice cloning today, use the standard /text-to-speech/ page with Chatterbox or GPT-SoVITS; these cannot stream, but they produce custom voices.

Streaming costs the same per character as the standard TTS endpoint. Kokoro is free-tier (1x cost). MOSS-TTS-Realtime will run on the standard tier (2x cost) once it is live. The streaming protocol itself adds no extra cost.
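
The tier multipliers above can be turned into a quick estimate. This sketch assumes the multiplier is applied to the character count deducted from an account's quota, which the page does not state explicitly, and the model identifier strings are hypothetical.

```python
# Multipliers as described on this page; the dictionary keys are hypothetical IDs.
COST_MULTIPLIER = {
    "kokoro": 1,             # free tier, 1x cost
    "moss-tts-realtime": 2,  # standard tier, 2x cost (once live)
}


def characters_charged(text: str, model: str) -> int:
    """Estimate quota characters used by one request.

    ASSUMPTION: charge = number of characters * tier multiplier.
    """
    return len(text) * COST_MULTIPLIER[model]
```

Under that assumption, a 500-character request would use 500 characters of quota on Kokoro and 1,000 on MOSS-TTS-Realtime.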

Yes, this works for phone calls: combine the streaming endpoint with a Twilio voice webhook to deliver live audio to a call. Our voice agent platform already does this for IVR and outbound calling. End-to-end latency on a phone call is typically 1-2 seconds, including STT and the LLM response.

If your network drops a chunk, the streaming player skips ahead and leaves a gap in the audio. For applications that cannot tolerate gaps, fall back to the non-streaming endpoint, or buffer 500 ms of audio before starting playback.
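
The 500 ms pre-buffer suggested above can be sketched as a small generator that holds audio back until enough has been queued. The audio format used for the byte calculation (24 kHz, mono, 16-bit) is an assumption and should be read from the WAV header of a real chunk. The sketch also counts every byte as audio, so any per-chunk header bytes make the buffer slightly shorter than requested.

```python
from typing import Iterable, Iterator, List


def bytes_for_ms(
    ms: int, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2
) -> int:
    """Number of PCM bytes that represent `ms` milliseconds of audio.

    ASSUMPTION: defaults describe 24 kHz mono 16-bit audio.
    """
    return sample_rate * channels * sample_width * ms // 1000


def prebuffer(chunks: Iterable[bytes], min_bytes: int) -> Iterator[bytes]:
    """Hold chunks back until `min_bytes` are queued, then pass them through."""
    pending: List[bytes] = []
    queued = 0
    started = False
    for chunk in chunks:
        if started:
            yield chunk  # playback has begun: forward immediately
            continue
        pending.append(chunk)
        queued += len(chunk)
        if queued >= min_bytes:
            started = True
            yield from pending  # release everything buffered so far
            pending = []
    # The stream ended before the threshold was reached: play what there is.
    yield from pending
```

Wrapping the chunk stream as `prebuffer(chunks, bytes_for_ms(500))` trades about half a second of extra start-up latency for more tolerance of late chunks.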

Free for the first 10 generations each day. Sign up to unlock the full character quota and API access.