Realtime TTS

Streaming text-to-speech with sub-second first-audio latency. Built for voice agents and live applications.

How Streaming TTS Works

1. Send text

POST your text to /v1/tts/stream/ as a Server-Sent Events request.

2. Model generates

Kokoro takes chunks of text and generates audio frames on the GPU.

3. Stream chunks

Base64-encoded WAV chunks arrive over SSE and start playing immediately.

4. Live playback

The user hears the start of the utterance within moments, even for long texts.
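
The four steps above can be sketched from the client side. This is a minimal illustration, not the service's official client: it assumes each SSE `data:` field carries one raw base64-encoded WAV chunk (the page does not specify the event payload, so a JSON wrapper is equally possible), and the function names `iter_sse_data` and `decode_wav_chunks` are hypothetical.

```python
import base64
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event.

    An event ends at a blank line; several `data:` lines inside one
    event are joined with newlines, as the SSE format describes.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:  # blank line closes the current event
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one optional space after the colon
            buffer.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if buffer:
        # Lenient: flush a final event that was not closed by a blank line.
        yield "\n".join(buffer)


def decode_wav_chunks(lines: Iterable[str]) -> Iterator[bytes]:
    """Decode each event payload as base64 audio (ASSUMED payload shape)."""
    for payload in iter_sse_data(lines):
        yield base64.b64decode(payload)
```

Each decoded chunk can be handed to an audio player as soon as it is yielded, which is what lets playback begin before the full text has been synthesized.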

Use Cases

Sub-second latency unlocks new experiences.

Voice Agents

Conversational bots that respond as quickly as a human would.

Live Dubbing

Translate and dub streams in real time, with no buffering during playback.

Games

NPC dialog that reacts to player choices on the fly, with no pre-recorded VO.

Accessibility

Screen readers and extensions that start speaking the moment the user clicks.

Realtime TTS Plans

Start free, upgrade when you need more

Free
  • Kokoro streaming (free model)
  • 500 characters per generation
  • 10 free streams / day per anonymous user
  • Sub-second first-audio latency
  • SSE streaming over HTTPS
Most Popular
Free Account
  • 15,000 characters to start
  • 5,000 chars per stream
  • API key for programmatic use
  • Generation history
  • No daily stream cap
Sign Up Free
Pro
  • MOSS-TTS-Realtime (when live)
  • 100,000 chars per stream
  • GPU Priority Queue
  • Voice agent + Twilio integration
  • Higher rate limits
Upgrade
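
Because each plan caps the characters per generation or per stream, a client may want to split long text before sending it. The sketch below is one possible approach, not something the service prescribes: the helper name `split_for_limit` is hypothetical, the default of 500 mirrors the free tier, and how the API treats over-limit text (rejection or truncation) is not stated on this page.

```python
import re
from typing import List


def split_for_limit(text: str, limit: int = 500) -> List[str]:
    """Split text into pieces of at most `limit` characters.

    Pieces break at sentence boundaries where possible; a single
    sentence longer than the limit is hard-wrapped.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    parts: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any sentence that cannot fit in one piece.
        while len(sentence) > limit:
            if current:
                parts.append(current)
                current = ""
            parts.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the current piece
        else:
            parts.append(current)
            current = sentence
    if current:
        parts.append(current)
    return parts
```

Each piece can then be sent as its own generation; note that every piece would count against any per-day stream quota.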

Frequently Asked Questions

Realtime text-to-speech streams audio chunks as they are generated, rather than waiting for the whole text to be synthesized. The first audio sample arrives in under one second, which makes it suitable for voice agents, live dubbing, and interactive applications where latency matters most.

Regular TTS generates the entire audio file before returning anything: you wait, then hear the whole utterance at once. Realtime TTS uses Server-Sent Events (SSE) to send audio chunks as the model generates them. The user hears the start of the speech almost immediately, even for long texts.

Kokoro is the default backend. It generates speech roughly 100x faster than real time on a modern GPU. MOSS-TTS-Realtime is offered as a higher-quality option; users will be able to select it once it is live.

Typical time to first audio on Kokoro is 300-800 ms on a consumer connection. Network round-trip time dominates after that. The page shows a live time-to-first-audio measurement in the UI so you can see how long each individual request takes.
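
The time-to-first-audio figure can also be measured in client code. The wrapper below is a hedged sketch: the class name `FirstAudioTimer` is hypothetical, and it assumes the timer is created immediately before the request is sent so that the measurement includes network round-trip time.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


class FirstAudioTimer:
    """Wrap an iterable of audio chunks and record when the first one arrives."""

    def __init__(self, chunks: Iterable[bytes],
                 clock: Callable[[], float] = time.perf_counter) -> None:
        self._chunks = chunks
        self._clock = clock
        self.started_at = clock()  # create the timer right before sending
        self.first_audio_s: Optional[float] = None

    def __iter__(self) -> Iterator[bytes]:
        for chunk in self._chunks:
            if self.first_audio_s is None:
                # Elapsed seconds between construction and the first chunk.
                self.first_audio_s = self._clock() - self.started_at
            yield chunk
```

After the stream has been consumed, `first_audio_s` holds the latency in seconds, or `None` if no chunk ever arrived.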

Voice agents that answer in conversation, live video dubbing, game NPCs, readers that start speaking when a button is pressed, and any application where a delay of a few seconds before speech starts would hurt the experience.

Yes. POST to https://api.tts.ai/v1/tts/stream/ with the same body as the regular /v1/tts/ endpoint. The response is an SSE stream of base64-encoded WAV chunks. The free tier supports 10 generations per day per anonymous user; authenticated users get the full per-account character allowance.
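
A request to the streaming endpoint might be built as follows. Only the URL and the POST method come from this page; the JSON field names `text` and `model`, the `Bearer` authorization scheme, and the helper name `build_stream_request` are assumptions made for illustration, so check the API reference for the real request body.

```python
import json
import urllib.request
from typing import Optional

STREAM_URL = "https://api.tts.ai/v1/tts/stream/"


def build_stream_request(text: str, api_key: Optional[str] = None,
                         model: str = "kokoro") -> urllib.request.Request:
    """Build (but do not send) a POST request for the streaming endpoint."""
    # ASSUMPTION: the body is JSON with "text" and "model" fields.
    body = {"text": text, "model": model}
    headers = {
        "Content-Type": "application/json",
        "Accept": "text/event-stream",  # ask for an SSE response
    }
    if api_key:
        # ASSUMPTION: API keys are sent as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        STREAM_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the request to `urllib.request.urlopen` returns a response object that can be iterated line by line, so each decoded line can be fed to an SSE parser as it arrives.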

Kokoro uses preset voices and does not support cloning. MOSS-TTS-Realtime (when live) supports zero-shot voice cloning from a 3-second reference. For full voice cloning, use the /text-to-speech/ page with Chatterbox or GPT-SoVITS; those models do not stream, but they produce cloned voices.

Streaming costs the same as the regular TTS endpoint. Kokoro is in the free tier (1x cost). MOSS-TTS-Realtime will be billed at the standard tier (2x cost) once it is live. The streaming protocol adds no extra charge.

Yes. Pair the streaming endpoint with a Twilio voice webhook to feed live audio into a phone call. Our voice agent platform already does this for IVR and outbound calling. End-to-end latency on a phone call is typically 1-2 seconds, including STT and LLM response.

If your network drops a chunk in transit, the streaming player skips ahead to the next one. For applications that cannot tolerate gaps, fall back to the regular non-streaming endpoint, or buffer 500 ms of audio before starting playback.
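
The 500 ms pre-buffer can be sketched as a small generator that holds back chunks until enough audio has accumulated. This is an illustration under a stated assumption: each chunk is taken to be a complete WAV file with its own header, which the page implies but does not confirm. The names `wav_duration_ms` and `prebuffered` are hypothetical.

```python
import io
import wave
from typing import Iterable, Iterator, List


def wav_duration_ms(chunk: bytes) -> float:
    """Duration of one self-contained WAV chunk, in milliseconds."""
    with wave.open(io.BytesIO(chunk), "rb") as wav:
        return 1000.0 * wav.getnframes() / wav.getframerate()


def prebuffered(chunks: Iterable[bytes],
                min_buffer_ms: float = 500.0) -> Iterator[bytes]:
    """Hold chunks until min_buffer_ms of audio is queued, then pass through."""
    pending: List[bytes] = []
    buffered_ms = 0.0
    started = False
    for chunk in chunks:
        if started:
            yield chunk  # playback already running: no extra delay
            continue
        pending.append(chunk)
        buffered_ms += wav_duration_ms(chunk)
        if buffered_ms >= min_buffer_ms:
            started = True
            yield from pending  # release everything queued so far
            pending = []
    # Stream ended before the threshold was reached: play what arrived.
    yield from pending
```

The trade-off is that first-audio latency grows by roughly the time needed to receive the buffered audio, in exchange for headroom that can absorb a late chunk.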

Stream Speech in Real Time

Free for the first 10 generations a day. Sign up to unlock the full character allowance and API access.