
Realtime TTS

Streaming text-to-speech with sub-second first-audio latency. Built for voice agents and live applications.



How Streaming TTS Works

1. Request

POST your text to /v1/tts/stream/ as a Server-Sent Events request.

2. Model Generates

Kokoro tokenizes the text and generates audio sample by sample on the GPU.

3. Stream Chunks

Base64-encoded WAV chunks arrive over SSE and start playing immediately.

4. Listen Live

The user hears the start of the speech in under a second, even for long inputs.
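
The four steps above can be sketched from the client side. This is a minimal illustration, not the service's documented wire format: it assumes each SSE event carries a single base64 string in its `data:` field and nothing else, which the page does not confirm. The function name `iter_audio_chunks` is hypothetical.

```python
import base64
from typing import Iterable, Iterator


def iter_audio_chunks(sse_lines: Iterable[str]) -> Iterator[bytes]:
    """Yield decoded audio bytes from the text lines of an SSE response.

    Assumption (not confirmed by the source): every event's `data:`
    field holds one base64-encoded WAV chunk and no JSON wrapper.
    """
    data_parts = []
    for raw in sse_lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield base64.b64decode("".join(data_parts))
                data_parts = []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if data_parts:
        # Lenient choice: flush a final event that lacks a trailing blank line.
        yield base64.b64decode("".join(data_parts))
```

Each yielded `bytes` object can be handed to an audio player as soon as it arrives, which is what lets playback begin before generation has finished.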

Use Cases

Sub-second latency unlocks new experiences.

Voice Agents

Conversational bots that respond as quickly as a human would.

Live Dubbing

Translate and dub a stream in real time, with no buffering pauses.

Games

NPC dialogue that reacts instantly to player choices, not pre-recorded VO.

Accessibility

Screen readers and assistive tools that start speaking the moment the user clicks.

Realtime TTS Plans

Start free, upgrade when you need more

Free
  • Kokoro streaming (free model)
  • 500 characters per generation
  • 10 free streams per day per anonymous user
  • Sub-second first-audio latency
  • SSE streaming over HTTPS
Free Account (most popular)
  • 15,000 characters on signup
  • 5,000 characters per stream
  • API key for programmatic access
  • Generation history
  • No daily stream limit
Pro
  • MOSS-TTS-Realtime (once it is live)
  • 100,000 characters per stream
  • Priority GPU queue
  • Voice agent + Twilio integration
  • Higher rate limits

Frequently Asked Questions

Realtime text-to-speech streams audio chunks as they are generated instead of waiting for the whole utterance to finish. First audio typically arrives in under a second, which makes it a good fit for live voice agents, dubbing, and interactive applications where latency matters.

Standard TTS generates the complete audio file before returning anything: you wait, then hear the whole utterance at once. Realtime TTS uses Server-Sent Events (SSE) to stream small audio chunks as the model produces them. The user hears the start of the speech almost immediately, even on long inputs.

Kokoro is the default backend; it generates audio roughly 100x faster than real time on a modern GPU. We are adding MOSS-TTS-Realtime as a higher-quality option; once it is live, users will be able to choose a backend per request.

Time to first audio with Kokoro is 300-800 ms over a public internet connection. Beyond that, the network round trip dominates. The page shows time-to-first-audio in the UI so you can see exactly how long each request took.
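
To reproduce the same measurement in your own client, one option is to time the chunk iterator directly. The sketch below is an illustration only; `measure_stream` and `StreamStats` are hypothetical names, and the numbers are meaningful only if the iterable is lazy, so that the request is actually sent after the clock starts.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class StreamStats:
    first_chunk_s: Optional[float]  # None if the stream produced no chunks
    total_chunks: int
    total_s: float


def measure_stream(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamStats:
    """Consume a chunk stream and record first-chunk and total timings."""
    start = clock()
    first = None
    count = 0
    for _ in chunks:
        if first is None:
            first = clock() - start  # time to first audio
        count += 1
    return StreamStats(first, count, clock() - start)
```

The injectable `clock` parameter is a design choice for testability; in normal use the default monotonic timer is enough.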

Voice agents that respond conversationally, live dubs of streaming media, game NPCs, accessibility readers that start speaking the moment a user clicks, and any application where waiting two or three seconds for audio would feel sluggish.

Yes, there is an API. POST to https://api.tts.ai/v1/tts/stream/ with the same body as the standard /v1/tts/ endpoint. The response is an SSE stream of base64-encoded WAV chunks. The free tier supports 10 streams per day per anonymous user; authenticated users get their full per-account character quota.
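
A request to that endpoint might be built as follows. This is a sketch under stated assumptions: only the URL and the use of POST and SSE come from this page. The JSON field names `text` and `voice` and the `Authorization: Bearer` header are guesses for illustration, so check them against the body accepted by the standard /v1/tts/ endpoint.

```python
import json
import urllib.request

STREAM_URL = "https://api.tts.ai/v1/tts/stream/"  # endpoint named on this page


def build_stream_request(text, voice=None, api_key=None):
    """Build (but do not send) a POST request for the streaming endpoint."""
    payload = {"text": text}  # hypothetical field name
    if voice is not None:
        payload["voice"] = voice  # hypothetical field name
    headers = {
        "Content-Type": "application/json",
        "Accept": "text/event-stream",
    }
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    return urllib.request.Request(
        STREAM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def stream_lines(request, timeout=30):
    """Send the request and yield response lines as text while they arrive."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        for raw in response:
            yield raw.decode("utf-8")
```

The lines yielded by `stream_lines` are raw SSE text; they still need to be split into events and base64-decoded before playback.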

Kokoro uses pre-trained voices and does not clone voices. MOSS-TTS-Realtime (once integrated) supports zero-shot voice cloning from a 3-second reference. For full voice cloning today, use the standard /text-to-speech/ page with Chatterbox or GPT-SoVITS; these do not support streaming but do produce custom voices.

Kokoro is a free-tier model (1x cost). MOSS-TTS-Realtime will run at the standard tier (2x cost) once it is available. The streaming protocol itself adds no extra cost.

Yes, it works for phone calls: combine the streaming endpoint with a Twilio voice webhook to deliver live speech on a call. The voice agent platform already does this for IVR and incoming calls. End-to-end latency on a phone call is usually 1-2 seconds, including STT and the LLM response.

If your network drops a chunk in transit, the streaming player skips ahead to the next one. For applications that cannot tolerate gaps, fall back to the standard non-streaming endpoint, or buffer 500 ms of audio before starting playback.
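
The 500 ms buffering option can be sketched as a small generator that holds chunks back until enough audio has arrived. The byte threshold here assumes 24 kHz, 16-bit, mono PCM, which this page does not state, and it ignores the WAV header bytes in each chunk, so treat the figure as approximate. Both function names are hypothetical.

```python
from typing import Iterable, Iterator


def pcm_bytes_for(ms: int, sample_rate: int = 24000,
                  channels: int = 1, sample_width: int = 2) -> int:
    """Bytes of raw PCM covering `ms` milliseconds (format is an assumption)."""
    return sample_rate * channels * sample_width * ms // 1000


def prebuffer(chunks: Iterable[bytes], min_bytes: int) -> Iterator[bytes]:
    """Hold chunks until `min_bytes` have arrived, then pass everything through."""
    held = []
    held_bytes = 0
    released = False
    for chunk in chunks:
        if released:
            yield chunk
            continue
        held.append(chunk)
        held_bytes += len(chunk)
        if held_bytes >= min_bytes:
            released = True
            yield from held  # release the backlog in arrival order
            held = []
    if not released:
        # The stream ended before the threshold: play what arrived.
        yield from held
```

For example, `prebuffer(chunks, pcm_bytes_for(500))` delays playback until about half a second of audio is queued, trading a little first-audio latency for fewer audible gaps.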

Hear speech in real time

Your first 10 streams each day are free. Sign up to unlock your full character quota and API access.