API Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API | OpenAI-compatible | JSON requests | Streaming support

Overview

The TTS.ai API provides programmatic access to all platform tools: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API follows standard REST conventions with JSON request and response bodies.

API Key

Get your API key from Account Settings. Available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Auth

Bearer token via the Authorization header

Authentication

Free tier: no key required. Anonymous POSTs to /v1/tts/ work without any auth, up to 5,000 characters/day per IP, using any of our free models (piper, vits, melotts, kokoro). Sign up for a free account to get 15,000 bonus characters and access to premium models.

For premium models and higher rate limits, include an Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not share it in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
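
One way to keep the key out of source code is to read it from the environment at runtime. The sketch below builds the Authorization header that way; the variable name TTS_AI_API_KEY is a hypothetical choice, not something the API defines.

```python
import os


def auth_headers(env_var: str = "TTS_AI_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The default variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dict can be passed as the headers argument of whichever HTTP client you use.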

SDKs

Official SDKs make it easier to integrate TTS.ai into your application. They are open source and available on GitHub.

Python

pip install ttsai
from tts_ai import TTSClient

client = TTSClient(api_key="sk-tts-...")
audio = client.generate(
    text="Hello world!",
    model="kokoro"
)
client.save(audio, "output.wav")

JavaScript / Node.js

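npm install @ttsainpm/ttsai
const { TTSClient } = require('@ttsainpm/ttsai');

const client = new TTSClient({
  apiKey: 'sk-tts-...'
});

// CommonJS modules do not allow top-level await, so the calls run inside an async function.
async function main() {
  const audio = await client.generate({
    input: 'Hello world!',
    model: 'kokoro'
  });
  await client.saveToFile(audio, 'output.wav');
}

main().catch(console.error);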

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to the base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan | Requests/minute | Concurrency | Max text length
Free | 10 | 2 | 500 chars
Starter | 30 | 3 | 1,000,000 chars
Pro | 60 | 5 | 1,000,000 chars
Enterprise | 300 | 20 | 50,000 chars

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
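
A client can use these headers to decide when to pause. The sketch below is a minimal helper that assumes X-RateLimit-Reset carries a Unix timestamp in seconds; the documentation does not state the unit, so verify it against a real response before relying on it. The function name is hypothetical.

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to pause before sending the next request.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        # Requests are still available in the current window.
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    # Never return a negative delay if the reset time has already passed.
    return max(0.0, reset_at - now)
```

Calling time.sleep(seconds_to_wait(response.headers)) before the next request keeps a simple loop inside the limit.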

Credit Costs

Service | Cost | Unit
TTS (Free models: Piper, VITS, MeloTTS) | 1,000 characters | per 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.) | 2,000 characters | per 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.) | 4,000 characters | per 1,000 characters
Speech to Text | 2,000 characters | per minute of audio
Voice Cloning | 4,000 characters | per 1,000 characters
Voice Changer | 3,000 characters | per minute of audio
Audio Enhancement | 2,000 characters | per minute of audio
Vocal Removal / Stem Splitting | 3,000-4,000 characters | per minute of audio
Speech Translation | 5,000 characters | per minute of audio
Voice Chat | 3,000 characters | per minute
Key & BPM Finder | Free | --
Audio Converter | Free | --
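
To budget a TTS job ahead of time, the per-1,000-character rates in the table can be applied to the input length. The sketch below assumes billing is proportional to length and rounded up; the table does not say how partial blocks are charged, so treat the result as an estimate. The tier names are labels chosen for this example.

```python
import math

# Rates copied from the table above: characters charged per 1,000 input characters.
TTS_COST_PER_1K = {"free": 1000, "standard": 2000, "premium": 4000}


def estimate_tts_cost(text: str, tier: str) -> int:
    """Estimate the characters deducted for one TTS request.

    Assumption: cost scales linearly with input length and rounds up.
    """
    rate = TTS_COST_PER_1K[tier]
    return math.ceil(len(text) * rate / 1000)
```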

Text to Speech

POST /v1/tts/

Convert text to speech. Returns an audio file in the requested format.

Request Body

Parameter | Type | Required | Description
model | string | No | Model ID (e.g., kokoro, chatterbox, piper). If omitted, a model matching the requested language is selected automatically: kokoro for en/ja/zh/ko/fr/de/it/pt/es/hi/ru, piper for other supported languages (ar/pl/nl/cs/da/fi/el/hu/tr/uk/vi/etc.).
text | string | Yes | Text to convert to speech. Character limits: 500 (anonymous), 5,000 (free account), 1,000,000 (paid plan). Long inputs are split into chunks by the server.
voice | string | Yes | Voice ID (use /v1/voices/ to list available voices)
format | string | No | Output format: mp3 (default), wav, flac, ogg
speed | float | No | Speech rate. Default: 1.0. Range: 0.5 to 2.0
language | string | No | Language code (e.g., en, es). Auto-detected if omitted.
instructions | string | No | Acting / delivery cues (≤500 chars). e.g. \
pronunciations | object or array | No | Per-request pronunciation overrides.
stream | boolean | No | Enable a streaming response. Default: false

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3
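
Validating a request locally avoids spending a round trip on a payload the server will reject. The sketch below checks the documented format list and speed range and returns a JSON-ready dict; build_tts_payload is a hypothetical helper, not part of any SDK.

```python
import json

ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}


def build_tts_payload(text, voice, model=None, fmt="mp3", speed=1.0, language=None):
    """Validate arguments against the documented ranges and return a request body."""
    if not text:
        raise ValueError("text is required")
    if not voice:
        raise ValueError("voice is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    payload = {"text": text, "voice": voice, "format": fmt, "speed": speed}
    # Optional fields are only sent when set, so server-side defaults apply otherwise.
    if model is not None:
        payload["model"] = model
    if language is not None:
        payload["language"] = language
    return payload


print(json.dumps(build_tts_payload("Hello from TTS.ai!", "af_bella", model="kokoro")))
```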

SSML tags

Wrap numbers, dates, currency, phone numbers, and acronyms in <say-as> tags:

interpret-as | Input | Spoken as
cardinal | 1234 | one thousand two hundred thirty-four
ordinal | 21 | twenty-first
date | 1999-12-31 | December thirty-one, nineteen ninety-nine
time | 14:30 | two thirty PM
telephone | +1-555-867-5309 | plus one five five five eight six seven…
currency | $1,234.56 | one thousand two hundred thirty-four dollars and fifty-six cents
spell-out | NASA | N A S A

Date ordering defaults to mdy for English and dmy elsewhere; override with format=\

Example
{
  "model": "kokoro",
  "voice": "af_bella",
  "text": "Your appointment is on <say-as interpret-as=\"date\">2026-04-26</say-as> at <say-as interpret-as=\"time\">14:30</say-as>. Please call <say-as interpret-as=\"telephone\">+1-555-867-5309</say-as> if you need to reschedule."
}
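
Building the tags by hand is error-prone once the text is assembled from variables. The sketch below is a small hypothetical helper that emits a say-as element for the interpret-as values listed above. XML-escaping the inner value is an assumption about how the server parses the text, not documented behavior.

```python
from xml.sax.saxutils import escape, quoteattr

INTERPRET_AS = {"cardinal", "ordinal", "date", "time", "telephone", "currency", "spell-out"}


def say_as(value: str, interpret_as: str) -> str:
    """Wrap value in a <say-as> element for one of the listed interpret-as types."""
    if interpret_as not in INTERPRET_AS:
        raise ValueError(f"unsupported interpret-as value: {interpret_as}")
    # quoteattr adds the surrounding quotes; escape protects &, < and > in the value.
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(value)}</say-as>"


text = f"Your appointment is on {say_as('2026-04-26', 'date')} at {say_as('14:30', 'time')}."
```

Serializing the final payload with json.dumps takes care of escaping the double quotes inside the text field.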

Response

The TTS endpoint queues your request and returns a JSON response with a job UUID. You then poll for the result.

Step 1: Submit request

Response (JSON)
{
  "uuid": "77b71db532874ce98e84a69a2d740d4c",
  "job_id": "f21316bb-aefa-480d-8523-701d1e3184ce",
  "status": "queued",
  "credits_used": 11,
  "credits_remaining": 15000
}

Step 2: Poll for result

GET /v1/speech/results/?uuid=<job_uuid>

Poll this endpoint every 1-2 seconds until status is completed or failed.

Polling response (completed)
{
  "status": "completed",
  "result_url": "https://api.tts.ai/static/downloads/77b71db5.../output.mp3"
}
Polling response (still processing)
{
  "status": "processing"
}

Step 3: Download audio

Fetch the result_url from the completed response to download the audio file.

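Full Example

Python
import requests, time

API_KEY = "sk-tts-your-key"
BASE = "https://api.tts.ai"

# 1. Submit TTS request
resp = requests.post(f"{BASE}/v1/tts/", json={
    "model": "kokoro",
    "text": "Hello from TTS.ai!",
    "voice": "af_bella"
}, headers={"Authorization": f"Bearer {API_KEY}"}, timeout=30)
resp.raise_for_status()  # Surface HTTP errors (401, 429, ...) early
job_uuid = resp.json()["uuid"]

# 2. Poll for result; the 120-second cap is an arbitrary choice, adjust as needed
deadline = time.monotonic() + 120
while time.monotonic() < deadline:
    result = requests.get(f"{BASE}/v1/speech/results/",
        params={"uuid": job_uuid}, timeout=30).json()
    if result["status"] == "completed":
        # 3. Download audio
        audio = requests.get(result["result_url"], timeout=60)
        audio.raise_for_status()
        with open("output.mp3", "wb") as f:
            f.write(audio.content)
        break
    elif result["status"] == "failed":
        raise Exception(result.get("error", "Generation failed"))
    time.sleep(1.5)
else:
    # The loop ended without a break, so the job never completed in time
    raise TimeoutError("TTS job did not finish within 120 seconds")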

Streaming alternative: For supported models (Kokoro, MeloTTS), use POST /v1/tts/stream/ for real-time Server-Sent Events (SSE) streaming — no polling needed.
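
The payload format of the events sent by /v1/tts/stream/ is not described here, so the sketch below only handles generic Server-Sent Events framing: it collects data: lines and yields one payload per event. How each payload should be decoded (for example as JSON or encoded audio) is left open and must be checked against real responses. The function name is hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event from an iterable of text lines.

    Comment lines and other fields (event:, id:) are ignored in this sketch.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield "\n".join(buffer)
            buffer = []
    # An event that is not terminated by a blank line is dropped, as the SSE format specifies.
```

The function accepts any iterable of decoded lines, such as the line iterator of a streaming HTTP response.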

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with auto-detection.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model | string | No | STT model: whisper (default), faster-whisper, sensevoice
language | string | No | Language code. Use auto for automatic detection (default).
timestamps | boolean | No | Include word-level timestamps. Default: false
diarize | boolean | No | Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
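
When diarization is enabled, the segments can be folded into a readable transcript. The sketch below merges consecutive segments from the same speaker, using only the fields shown in the sample response; format_transcript is a hypothetical helper.

```python
def format_transcript(response: dict) -> str:
    """Merge consecutive segments from the same speaker into labelled lines."""
    turns = []
    for seg in response.get("segments", []):
        speaker = seg.get("speaker", "UNKNOWN")
        text = seg["text"].strip()
        if turns and turns[-1][0] == speaker:
            # Same speaker as the previous segment: extend the current turn.
            turns[-1][1].append(text)
        else:
            turns.append((speaker, [text]))
    return "\n".join(f"{spk}: {' '.join(parts)}" for spk, parts in turns)
```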

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload reference audio and the text to synthesize.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
reference_audio | file | Yes | Reference audio recording (10-30 seconds recommended). Max 20MB.
text | string | Yes | Text to speak in the cloned voice.
model | string | No | Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format | string | No | Output format: mp3 (default), wav, flac
language | string | No | Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, like the TTS endpoint.

Voice Changer

POST /v1/voice-convert/

Convert audio so that it sounds like a different voice. Upload the source audio and select a target voice.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice | string | Yes | Target voice ID to convert to (use /v1/voices/ to list available voices)
model | string | No | Voice conversion model: openvoice (default), knn-vc
format | string | No | Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file in the original language. Max 100MB.
target_language | string | Yes | Target language code (e.g., es, fr, de, ja)
voice | string | No | Voice for the translated output. Selected automatically if not specified.
preserve_voice | boolean | No | Attempt to preserve the original speaker's voice characteristics. Default: false

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transform the style, emotion, or delivery of speech while keeping its content. Useful for adjusting tone, pacing, and expression.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source speech audio file. Max 50MB.
voice | string | Yes | Target voice ID for the output speech
model | string | No | Model: openvoice (default), chatterbox
emotion | string | No | Target emotion: neutral, happy, sad, angry, excited
speed | float | No | Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Improve audio quality: denoising, clarity enhancement, super resolution.

Parameter | Type | Description
file | file | Audio file to enhance
denoise | boolean | Enable denoising (default: true)
enhance_clarity | boolean | Enhance speech clarity (default: true)
super_resolution | boolean | Upscale audio quality (default: false)
strength | integer | 1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split audio into stems.

Parameter | Type | Description
file | file | Audio file to separate
model | string | demucs (default) or spleeter
stems | integer | Number of stems: 2, 4, 5, or 6 (default: 2)
format | string | Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio.

Parameter | Type | Description
file | file | Audio file to process
type | string | echo or reverb (default: both)
intensity | integer | 1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

Parameter | Type | Description
file | file | Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}

POST /v1/audio/convert/ (Free)

Convert audio between formats.

Parameter | Type | Description
file | file | Audio file to convert
format | string | Target format: mp3, wav, flac, ogg, m4a, aac
bitrate | integer | Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate | integer | Sample rate: 22050, 44100, 48000
channels | string | mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter | Type | Required | Description
audio | file | No* | Audio input (either audio or text is required)
text | string | No* | Text input (either audio or text is required)
voice | string | No | Voice for the AI response. Default: af_bella
tts_model | string | No | TTS model used for the response. Default: kokoro
system_prompt | string | No | Custom system prompt for the AI
conversation_id | string | No | Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}

Batch TTS

POST /v1/tts/batch/

Submit multiple texts for TTS generation in parallel. Receive a webhook callback when all jobs are complete.

Parameters

Parameter | Type | Description
texts | array | Array of objects: {text, model, voice}. Max 50 items.
webhook_url | string | Required URL to POST the results to when the batch completes.

Response

JSON Response
{
  "batch_id": "abc123",
  "total": 3,
  "completed": 0,
  "status": "processing"
}

Poll progress with GET /v1/tts/batch/result/?batch_id=abc123
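
Because a batch accepts at most 50 items, longer jobs have to be split across several requests. The sketch below builds one request body per chunk; build_batches is a hypothetical helper, and reusing the same webhook_url for every chunk is a choice made for this example.

```python
def build_batches(items, webhook_url, max_items=50):
    """Split a list of {text, model, voice} dicts into request bodies of at most max_items."""
    bodies = []
    for start in range(0, len(items), max_items):
        bodies.append({
            "texts": items[start:start + max_items],
            "webhook_url": webhook_url,
        })
    return bodies
```

Each body can then be sent as JSON to POST /v1/tts/batch/, and the returned batch_id values tracked separately.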

Voice Embedding

POST /v1/voice-embed/

Pre-compute a voice embedding from reference audio. Pass the returned embed_id in subsequent voice cloning requests for faster generation.

Parameters

Parameter | Type | Description
file | file | Reference audio file (WAV, MP3, FLAC).
model | string | Cloning model (default: chatterbox). Supported: chatterbox, cosyvoice2, openvoice, gpt-sovits, spark, indextts2, qwen3-tts.

Response

JSON Response
{
  "embed_id": "emb_abc123",
  "model": "chatterbox",
  "duration_ms": 450
}

Health Check

GET /v1/health/

Check GPU server status, loaded models, and queue size. No authentication required. Cached for 30 seconds.

Response

JSON Response
{
  "status": "online",
  "latency_ms": 45,
  "queue_size": 3,
  "models_loaded": ["kokoro", "chatterbox", "cosyvoice2"]
}
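
A client can inspect this response before submitting latency-sensitive work. The sketch below is a minimal readiness check using only the fields in the sample; the max_queue threshold of 10 is an arbitrary illustrative value, and a model missing from models_loaded may still be served after loading on demand, which this check does not account for.

```python
def is_ready(health: dict, model: str, max_queue: int = 10) -> bool:
    """Return True if the server is online, the model is loaded, and the queue is short."""
    return (
        health.get("status") == "online"
        and model in health.get("models_loaded", [])
        and health.get("queue_size", 0) <= max_queue
    )
```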

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
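
The capability flags make it possible to choose a model programmatically. The sketch below filters the catalog by language and optional features, relying only on the fields shown in the sample response; pick_models is a hypothetical helper.

```python
def pick_models(catalog: dict, language: str, streaming: bool = False, cloning: bool = False):
    """Return the IDs of models that support the language and the requested features."""
    ids = []
    for model in catalog.get("models", []):
        if language not in model.get("languages", []):
            continue
        if streaming and not model.get("supports_streaming"):
            continue
        if cloning and not model.get("supports_cloning"):
            continue
        ids.append(model["id"])
    return ids
```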

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter | Type | Description
model | string | Filter by model ID (e.g., kokoro)
language | string | Filter by language code (e.g., en)
gender | string | Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
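
The filters are ordinary query parameters, so the request URL can be assembled with the standard library. The sketch below builds it from the documented base URL and omits any filter that is not set; voices_url is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tts.ai/v1/"


def voices_url(model=None, language=None, gender=None) -> str:
    """Build the /v1/voices/ URL, including only the filters that were provided."""
    filters = (("model", model), ("language", language), ("gender", gender))
    query = urlencode({key: value for key, value in filters if value})
    return BASE_URL + "voices/" + (f"?{query}" if query else "")
```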

Subtitles (SRT / VTT) (New)

GET /v1/speech/subtitles/?uuid=<job_uuid>&format=srt|vtt&download=1

Generate synchronized subtitles for any completed TTS job. The endpoint runs Whisper alignment on the audio and emits SRT or WebVTT. Results are cached on disk, so repeat calls for the same uuid are served from the cache.

Query Parameters

Parameter | Required | Description
uuid | Yes | Job UUID returned by /v1/tts/ or /v1/voice-clone/.
format | No | srt (default) or vtt.
download | No | Set to 1 to send Content-Disposition: attachment so the browser downloads the file instead of displaying it.
language | No | Language hint for alignment (auto-detected if omitted).
cURL
curl "https://api.tts.ai/v1/speech/subtitles/?uuid=$UUID&format=srt&download=1" -o subtitles.srt
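
Once the SRT file is downloaded, the cue timings can be read back for further processing. The sketch below parses the standard SRT layout (index line, timing line, text lines) into tuples of start seconds, end seconds, and text. It assumes plain SRT without positioning data after the timestamps; the function names are hypothetical.

```python
import re

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    hours, minutes, seconds, millis = (int(g) for g in TIME_RE.fullmatch(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str):
    """Return a list of (start, end, text) tuples, one per subtitle cue."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # Skip anything that is not a well-formed cue.
        start, end = (to_seconds(part) for part in lines[1].split("-->"))
        cues.append((start, end, " ".join(lines[2:])))
    return cues
```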

Pronunciation Dictionary (New)

GET POST DELETE /api/v1/pronunciations/

Tell the TTS engine how to pronounce specific words. Saved entries are applied automatically to every TTS request you make. 200-entry per-account limit.

Request Body (POST)

Parameter | Type | Description
word | string | Source word to match (e.g. GIF, Anthropic). Matched as a whole word.
replacement | string | How to spell it for the model (e.g. jiff, ann THROP ick).
language | string | Optional ISO code. Empty = applies to all languages.
case_sensitive | boolean | Default false. Set to true for exact-case matching.
cURL
# Save an entry
curl -X POST https://tts.ai/api/v1/pronunciations/ \
  -H "Authorization: Bearer sk-tts-..." \
  -H "Content-Type: application/json" \
  -d '{"word": "GIF", "replacement": "jiff"}'

# List your entries
curl https://tts.ai/api/v1/pronunciations/ -H "Authorization: Bearer sk-tts-..."

# Delete entry by id
curl -X DELETE "https://tts.ai/api/v1/pronunciations/?id=42" -H "Authorization: Bearer sk-tts-..."

You can also pass per-request overrides without saving them: include pronunciations in any /v1/tts/ request as an object or array (see the TTS endpoint parameters).
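
The exact shape of the per-request pronunciations value is not spelled out here. The sketch below assumes the array form is a list of objects with the same word and replacement fields used by the saved-entry POST body; that mapping is a guess and should be verified against the API. with_pronunciations is a hypothetical helper.

```python
def with_pronunciations(payload: dict, overrides: dict) -> dict:
    """Return a copy of a TTS payload with per-request pronunciation overrides added.

    Assumption: the array form mirrors the saved-entry fields (word, replacement).
    """
    entries = [{"word": word, "replacement": spoken} for word, spoken in overrides.items()]
    # Build a new dict so the caller's payload is left untouched.
    return {**payload, "pronunciations": entries}
```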

Article Narrator (New)

Drop a single <script> tag onto any article page and readers get a fixed player bar that narrates the page on click. It auto-detects the article body and supports a custom voice, model, position, and accent color.

HTML
<script src="https://tts.ai/narrator.js"
    data-pk="pk-tts-your-publishable-key"
    data-voice="af_bella"
    data-model="kokoro"
    data-extract="auto"
    data-position="bottom"
    data-color="#e60000"
    data-locale="en"></script>

Options

Parameter | Description
data-pk | Publishable key (pk-tts-…). Domain restrictions are enforced through the key's allowed_domains.
data-voice | Voice ID. Default af_bella.
data-model | TTS model ID. Default kokoro.
data-extract | auto (default): tries the article, main, .post-content, and .entry-content selectors, then falls back to the densest block of text on the page.
data-position | bottom (default) or top.
data-color | Accent color as a hex value (or any CSS color). Default #e60000.
data-min-chars / data-max-chars | Skip rendering the bar if the article is shorter than min-chars (default 200). Truncate the input at max-chars (default 50,000).

Spanish Wikipedia pages:

Response

Embed with a compact button appearance. It appears next to its tag of