API Documentation

Integrate TTS.ai into your application with the REST API. OpenAI-compatible format for easy migration.

REST API · OpenAI-compatible · JSON responses · Streaming support

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request and response bodies.

API Key

Get your API key from your account settings. Available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authentication

Bearer token via the Authorization header

Authentication

Free tier: no key required. Anonymous POSTs to /v1/tts/ work without any auth, up to 5,000 characters per day per IP, using any of our free models (piper, vits, melotts, kokoro). Sign up for a free account to get 15,000 bonus characters and access to premium models.

For premium models and higher rate limits, authenticate with a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not share it in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
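
One way to keep the key out of source code is to read it from an environment variable and attach the header only when a key is present, so the same helper also covers anonymous free-tier calls. This is a minimal sketch; the helper name and the TTS_AI_API_KEY variable name are assumptions made for illustration, not part of the API.

```python
import os


def auth_headers(api_key=None):
    """Build request headers; omit Authorization for anonymous free-tier calls."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


# TTS_AI_API_KEY is a hypothetical variable name chosen for this sketch.
headers = auth_headers(os.environ.get("TTS_AI_API_KEY"))
```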

SDKs

Official SDKs make it easy to integrate TTS.ai into your application. Both are open source and available on GitHub.

Python

pip install ttsai
from tts_ai import TTSClient

client = TTSClient(api_key="sk-tts-...")
audio = client.generate(
    text="Hello world!",
    model="kokoro"
)
client.save(audio, "output.wav")

JavaScript / Node.js

npm install @ttsainpm/ttsai
const { TTSClient } = require('@ttsainpm/ttsai');

const client = new TTSClient({
  apiKey: 'sk-tts-...'
});
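// await is not allowed at the top level of a CommonJS module, so wrap the calls
async function main() {
  const audio = await client.generate({
    input: 'Hello world!',
    model: 'kokoro'
  });
  await client.saveToFile(audio, 'output.wav');
}

main().catch(console.error);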

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan        Requests/min  Concurrency  Max Text Length
Free        10            2            500 characters
Starter     30            3            1,000,000 characters
Pro         60            5            1,000,000 characters
Enterprise  300           20           50,000 characters

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
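
A client can use these headers to decide whether to pause before its next request. The sketch below is an illustration only: it assumes X-RateLimit-Reset holds a Unix timestamp in seconds, which this page does not confirm, and the function name is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, or 0.0 if quota remains.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)
```

With the requests library, `resp.headers` can be passed in directly because it behaves like a dictionary.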

Credit Costs

Service                                               Cost                    Unit
TTS (Free models: Piper, VITS, MeloTTS)               1,000 characters        per 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.)      2,000 characters        per 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.)      4,000 characters        per 1,000 characters
Speech-to-Text                                        2,000 characters        per minute of audio
Voice Cloning                                         4,000 characters        per 1,000 characters
Voice Changer                                         3,000 characters        per minute of audio
Audio Enhancement                                     2,000 characters        per minute of audio
Vocal Removal / Stem Separation                       3,000-4,000 characters  per minute of audio
Speech Translation                                    5,000 characters        per minute of audio
Voice Chat                                            3,000 characters        per turn
Key & BPM Finder                                      Free                    --
Audio Converter                                       Free                    --
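
To budget a job before submitting it, the TTS rows of the table can be turned into a small estimator. This is a rough sketch: it assumes the cost scales linearly with input length and rounds up to a whole character, neither of which the table states, and the tier names are labels chosen for this example.

```python
# Characters charged per 1,000 input characters, taken from the TTS rows above.
TTS_RATE_PER_1K = {"free": 1000, "standard": 2000, "premium": 4000}


def estimate_tts_cost(text, tier):
    """Estimate the characters deducted for synthesizing `text` on a model tier.

    Assumption: linear pro-rating, rounded up; actual billing may differ.
    """
    rate = TTS_RATE_PER_1K[tier]
    # Integer ceiling division avoids floating-point rounding surprises.
    return (len(text) * rate + 999) // 1000
```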

Text-to-Speech

POST /v1/tts/

Convert text to speech audio. Returns an audio file in your requested format.

Request body

Parameter Type Required Description
model string No Model ID (e.g., kokoro, chatterbox, piper). If omitted, we auto-select a model that supports your requested language: kokoro for en/ja/zh/ko/fr/de/it/pt/es/hi/ru, piper for other supported languages (ar/pl/nl/cs/da/fi/el/hu/tr/uk/vi/etc.).
text string Yes Text to convert to speech. Per-request cap: 500 chars (anonymous), 5,000 (free account), 1,000,000 (paid plan). Long inputs are auto-chunked server-side.
voice string Yes Voice ID (use /v1/voices/ to list available voices)
format string No Output format: mp3 (default), wav, flac, ogg
speed float No Speech-rate multiplier. Default: 1.0. Range: 0.5 to 2.0
language string No Language code (e.g., en, es). Auto-detected if omitted.
instructions string No Acting / delivery cues (≤500 chars). e.g. \
pronunciations object | array No {\
stream boolean No Enable streaming audio delivery. Default: false
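
Checking the documented constraints on the client side can catch mistakes before a request counts against the rate limit. The helper below is a sketch with a hypothetical name; it enforces only the limits listed in the table and leaves optional fields out so the server can apply its own defaults.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}


def build_tts_payload(text, voice, model=None, audio_format="mp3", speed=1.0,
                      language=None, instructions=None, stream=False):
    """Validate inputs against the parameter table and return a JSON-ready dict."""
    if not text:
        raise ValueError("text is required")
    if not voice:
        raise ValueError("voice is required")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if instructions is not None and len(instructions) > 500:
        raise ValueError("instructions must be 500 characters or fewer")

    payload = {"text": text, "voice": voice, "format": audio_format,
               "speed": speed, "stream": stream}
    # Optional fields are omitted when unset so server-side defaults apply.
    for key, value in (("model", model), ("language", language),
                       ("instructions", instructions)):
        if value is not None:
            payload[key] = value
    return payload
```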

Example request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

SSML tags

Wrap numbers, dates, currency, phone numbers, and acronyms in <say-as interpret-as="..."> tags.

interpret-as  Input            Spoken as
cardinal      1234             one thousand two hundred thirty-four
ordinal       21               twenty-first
date          1999-12-31       December thirty-one, nineteen ninety-nine
time          14:30            two thirty PM
telephone     +1-555-867-5309  plus one five five five eight six seven…
currency      $1,234.56        one thousand two hundred thirty-four dollars and fifty-six cents
spell-out     NASA             N A S A

Date format defaults to mdy for English and dmy elsewhere; override with format=\

Example
{
  "model": "kokoro",
  "voice": "af_bella",
  "text": "Your appointment is on <say-as interpret-as=\"date\">2026-04-26</say-as> at <say-as interpret-as=\"time\">14:30</say-as>. Please call <say-as interpret-as=\"telephone\">+1-555-867-5309</say-as> if you need to reschedule."
}
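
When the text is assembled in code, a small helper keeps the tags well-formed and lets the JSON encoder handle quote escaping. This is a sketch under stated assumptions: the helper name is hypothetical, and XML-escaping the wrapped value is standard SSML practice rather than something this page confirms the server requires.

```python
from xml.sax.saxutils import escape, quoteattr

# interpret-as values listed in the table above.
INTERPRET_AS = {"cardinal", "ordinal", "date", "time",
                "telephone", "currency", "spell-out"}


def say_as(value, interpret_as):
    """Wrap `value` in a say-as tag, escaping XML special characters."""
    if interpret_as not in INTERPRET_AS:
        raise ValueError(f"unsupported interpret-as value: {interpret_as}")
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(value)}</say-as>"


text = f"Your appointment is on {say_as('2026-04-26', 'date')} at {say_as('14:30', 'time')}."
```

Passing `text` through `json.dumps` produces the backslash-escaped quotes seen in the example body.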

Response

The TTS endpoint queues your request and returns a JSON response with a job UUID. You then poll for the result.

Step 1: Submit request

Response (JSON)
{
  "uuid": "77b71db532874ce98e84a69a2d740d4c",
  "job_id": "f21316bb-aefa-480d-8523-701d1e3184ce",
  "status": "queued",
  "credits_used": 11,
  "credits_remaining": 15000
}

Step 2: Poll for result

GET /v1/speech/results/?uuid=<job_uuid>

Poll this endpoint every 1-2 seconds until status is completed or failed.

Polling response (completed)
{
  "status": "completed",
  "result_url": "https://api.tts.ai/static/downloads/77b71db5.../output.mp3"
}
Polling response (still processing)
{
  "status": "processing"
}

Step 3: Download audio

Fetch the result_url from the completed response to download the audio file.

Full example

Python
import requests, time

API_KEY = "sk-tts-your-key"
BASE = "https://api.tts.ai"

# 1. Submit TTS request
resp = requests.post(f"{BASE}/v1/tts/", json={
    "model": "kokoro",
    "text": "Hello from TTS.ai!",
    "voice": "af_bella"
}, headers={"Authorization": f"Bearer {API_KEY}"})
data = resp.json()
uuid = data["uuid"]

# 2. Poll for result
while True:
    result = requests.get(f"{BASE}/v1/speech/results/",
        params={"uuid": uuid}).json()
    if result["status"] == "completed":
        # 3. Download audio
        audio = requests.get(result["result_url"])
        with open("output.mp3", "wb") as f:
            f.write(audio.content)
        break
    elif result["status"] == "failed":
        raise Exception(result.get("error", "Generation failed"))
    time.sleep(1.5)
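
The loop above runs until the job finishes, so a job that never completes would block forever. A variant with a deadline is sketched here; the function name and the injectable `fetch_status`, `sleep`, and `clock` parameters are choices made for this example (they make the loop testable without network access) and are not part of the API.

```python
import time


def poll_until_done(fetch_status, interval=1.5, timeout=120.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job completes, fails, or the timeout passes.

    fetch_status must return the decoded JSON dict from the results endpoint.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(result.get("error", "Generation failed"))
        if clock() >= deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        sleep(interval)
```

In the full example, `fetch_status` could be `lambda: requests.get(f"{BASE}/v1/speech/results/", params={"uuid": uuid}).json()`.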

Streaming alternative: For supported models (Kokoro, MeloTTS), use POST /v1/tts/stream/ for real-time Server-Sent Events (SSE) streaming — no polling needed.

Speech-to-Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with auto-detection.

Request body (multipart/form-data)

Parameter Type Required Description
file file Yes Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model string No STT model: whisper (default), faster-whisper, sensevoice
language string No Language code. Use auto for automatic detection (default).
timestamps boolean No Include word-level timestamps. Default: false
diarize boolean No Enable speaker diarization. Default: false

Response

JSON response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
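
When diarization is enabled, the segments can be folded into a speaker-labelled transcript. The sketch below relies only on the fields shown in the sample response; the function name and the UNKNOWN fallback label are assumptions for segments that carry no speaker field.

```python
def format_transcript(response):
    """Merge consecutive segments from the same speaker into 'SPEAKER: text' lines."""
    turns = []  # list of (speaker, [text parts]) in order of appearance
    for seg in response.get("segments", []):
        speaker = seg.get("speaker", "UNKNOWN")
        text = seg["text"].strip()
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(text)
        else:
            turns.append((speaker, [text]))
    return [f"{speaker}: {' '.join(parts)}" for speaker, parts in turns]
```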

Voice Cloning

POST /v1/tts/clone/

Create speech in a cloned voice. Upload a reference audio and text.

Request body (multipart/form-data)

Parameter Type Required Description
reference_audio file Yes Reference voice audio (10-30 seconds recommended). Max 20MB.
text string Yes Text to speak in the cloned voice.
model string No Clone model: chatterbox (default), cosyvoice2, gpt-sovits
format string No Output format: mp3 (default), wav, flac
language string No Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, as with the TTS endpoint.

Voice Changer

POST /v1/voice-convert/

Convert audio to sound like a different voice. Upload source audio and choose a target voice.

Request body (multipart/form-data)

Parameter Type Required Description
file file Yes Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice string Yes Target voice ID to convert to (use /v1/voices/ to list available voices)
model string No Voice conversion model: openvoice (default), knn-vc
format string No Output format: wav (default), mp3, flac

Example request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request body (multipart/form-data)

Parameter Type Required Description
file file Yes Audio file in the source language. Max 100MB.
target_language string Yes Target language code (e.g., es, fr, de, ja)
voice string No Voice for the translated output. Auto-selected if omitted.
preserve_voice boolean No Whether to preserve the original speaker's voice characteristics. Default: false

Response

JSON response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech-to-Speech

POST /v1/speech-to-speech/

Transform the style, emotion, or delivery of speech while preserving its content. Well suited to adjusting tone, pacing, and expressiveness.

Request body (multipart/form-data)

Parameter Type Required Description
file file Yes Source speech audio file. Max 50MB.
voice string Yes Target voice ID for the output speech
model string No Model: openvoice (default), chatterbox
emotion string No Target emotion: neutral, happy, sad, angry, excited
speed float No Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the converted audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Enhance audio quality: denoising, clarity enhancement, super resolution.

file file Audio file to enhance
denoise boolean Enable noise removal (default: true)
enhance_clarity boolean Enhance speech clarity (default: true)
super_resolution boolean Upscale audio quality (default: false)
strength integer 1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split audio into stems.

file file Audio file to separate
model string demucs (default) or spleeter
stems integer Number of stems: 2, 4, 5, or 6 (default: 2)
format string Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio recordings.

file file Audio file to process
type string echo or reverb (default: both)
intensity integer 1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

file file Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}

POST /v1/audio/convert/ (Free)

Convert audio between formats.

file file Audio file to convert
format string Target output format
bitrate integer Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate integer Sample rate: 22050, 44100, 48000
channels string mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request body (multipart/form-data or JSON)

Parameter Type Required Description
audio file No* Audio input (either audio or text is required)
text string No* Text input (either audio or text is required)
voice string No Voice for the AI response. Default: af_bella
tts_model string No TTS model for the response. Default: kokoro
system_prompt string No Custom system prompt for the AI
conversation_id string No Continue an existing conversation

Response

JSON response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
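
A multi-turn exchange carries the conversation_id from one reply into the next request. The sketch below builds JSON bodies for text turns only; the helper name is hypothetical and the defaults mirror the parameter table.

```python
def voice_chat_payload(text, conversation_id=None, voice="af_bella",
                       tts_model="kokoro", system_prompt=None):
    """Build a JSON body for a text turn, continuing a conversation when an ID is given."""
    if not text:
        raise ValueError("text is required when no audio is sent")
    payload = {"text": text, "voice": voice, "tts_model": tts_model}
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if conversation_id is not None:
        payload["conversation_id"] = conversation_id
    return payload


first = voice_chat_payload("What is the capital of France?")
# Suppose the reply carried "conversation_id": "conv_abc123", as in the sample response.
follow_up = voice_chat_payload("And of Germany?", conversation_id="conv_abc123")
```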

Batch TTS

POST /v1/tts/batch/

Submit multiple texts for TTS generation in a single request. Optionally, receive a webhook callback when all jobs have finished.

Parameters

Parameter Type Description
texts array Array of objects: {text, model, voice}. Max 50 items.
webhook_url string Optional URL to POST results to when the batch completes.

Response

JSON response
{
  "batch_id": "abc123",
  "total": 3,
  "completed": 0,
  "status": "processing"
}

Poll progress with GET /v1/tts/batch/result/?batch_id=abc123
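
Because a batch accepts at most 50 items, a longer list has to be split across several requests. A minimal sketch of that split follows; the function name is an assumption, and each returned dict is meant to be sent as the JSON body of its own POST /v1/tts/batch/ call.

```python
def batch_payloads(items, webhook_url=None, max_items=50):
    """Split `items` into request bodies of at most `max_items` entries each."""
    payloads = []
    for start in range(0, len(items), max_items):
        payload = {"texts": items[start:start + max_items]}
        if webhook_url:
            payload["webhook_url"] = webhook_url
        payloads.append(payload)
    return payloads


items = [{"text": f"Line {i}", "model": "kokoro", "voice": "af_bella"}
         for i in range(120)]
bodies = batch_payloads(items)  # three bodies holding 50, 50, and 20 items
```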

Voice Embedding

POST /v1/voice-embed/

Extract a voice embedding from reference audio. Use the returned embed_id in subsequent voice cloning requests for faster generation.

Parameters

Parameter Type Description
file file Reference audio file (WAV, MP3, FLAC).
model string Cloning model (default: chatterbox). Supported: chatterbox, cosyvoice2, openvoice, gpt-sovits, spark, indextts2, qwen3-tts.

Response

JSON response
{
  "embed_id": "emb_abc123",
  "model": "chatterbox",
  "duration_ms": 450
}

Health Check

GET /v1/health/

Check GPU server status, loaded models, and queue size. No authentication required. Responses are cached for 30 seconds.

Response

JSON response
{
  "status": "online",
  "latency_ms": 45,
  "queue_size": 3,
  "models_loaded": ["kokoro", "chatterbox", "cosyvoice2"]
}

List Models

GET /v1/models/

Returns a list of all available models with their capabilities.

Response

JSON response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
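
The capability fields make it possible to pick a model programmatically. The sketch below filters the decoded response by language, streaming, and cloning support; the function name is hypothetical, and only fields shown in the sample response are used.

```python
def find_models(response, language=None, streaming=None, cloning=None):
    """Return the IDs of models matching every filter that is not None."""
    matches = []
    for model in response.get("models", []):
        if language is not None and language not in model.get("languages", []):
            continue
        if streaming is not None and model.get("supports_streaming") != streaming:
            continue
        if cloning is not None and model.get("supports_cloning") != cloning:
            continue
        matches.append(model["id"])
    return matches
```

For the sample response, `find_models(data, language="ja")` selects kokoro and `find_models(data, cloning=True)` selects chatterbox.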

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Parameters

Parameter Type Description
model string Filter by model ID (e.g., kokoro)
language string Filter by language code (e.g., en)
gender string Filter by gender: male, female, neutral

Response

JSON response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}

Subtitles (SRT / VTT) new

GET /v1/speech/subtitles/?uuid=<job_uuid>&format=srt|vtt&download=1

Generate time-aligned subtitles for any completed TTS job. Runs Whisper alignment on the generated speech and returns SRT or WebVTT. The result is cached on disk, so later calls for the same uuid are served from a disk read.

Parameters

Parameter Required Description
uuid Yes Job UUID returned by /v1/tts/ or /v1/voice-clone/.
format No srt (default) or vtt.
download No Set to 1 to send Content-Disposition: attachment so the browser saves the file instead of displaying it.
language No Language hint for alignment (auto-detected if left blank).
cURL
curl "https://api.tts.ai/v1/speech/subtitles/?uuid=$UUID&format=srt&download=1" -o subtitles.srt
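
The same URL can be assembled in Python with proper query-string encoding. This is a small sketch; the function name is an assumption, and the base URL is the one used in the cURL call.

```python
from urllib.parse import urlencode

BASE = "https://api.tts.ai"


def subtitles_url(job_uuid, fmt="srt", download=False, language=None):
    """Build the subtitles URL for a completed job."""
    if fmt not in ("srt", "vtt"):
        raise ValueError("fmt must be 'srt' or 'vtt'")
    params = {"uuid": job_uuid, "format": fmt}
    if download:
        params["download"] = "1"
    if language:
        params["language"] = language
    return f"{BASE}/v1/speech/subtitles/?{urlencode(params)}"
```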

Pronunciation Dictionary new

GET POST DELETE /api/v1/pronunciations/

Tell the TTS engine how to pronounce specific words. Saved entries are applied automatically to every TTS request you make. 200-entry per-account limit.

Request body (POST)

Parameter Type Description
word string The text to replace (e.g. GIF, Anthropic). Matched as a whole word.
replacement string Phonetic respelling to speak instead (e.g. jiff, ann THROP ick).
language string ISO code. Empty = applies to all languages.
case_sensitive boolean false by default. Set to true for case-sensitive matching.
cURL
# Save an entry
curl -X POST https://tts.ai/api/v1/pronunciations/ \
  -H "Authorization: Bearer sk-tts-..." \
  -H "Content-Type: application/json" \
  -d '{"word": "GIF", "replacement": "jiff"}'

# List your entries
curl https://tts.ai/api/v1/pronunciations/ -H "Authorization: Bearer sk-tts-..."

# Delete entry by id
curl -X DELETE "https://tts.ai/api/v1/pronunciations/?id=42" -H "Authorization: Bearer sk-tts-..."

You can also send per-request overrides without saving them: include pronunciations on any /v1/tts/ call as an object or as an array (see TTS endpoint params).
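
For saved entries, the POST body can be built and checked locally before sending. The sketch below uses only the four documented fields; the helper names are hypothetical, and the count check simply mirrors the 200-entry limit stated above.

```python
MAX_ENTRIES = 200  # per-account limit stated in this section


def pronunciation_entry(word, replacement, language="", case_sensitive=False):
    """Build the JSON body for saving one pronunciation entry."""
    if not word or not replacement:
        raise ValueError("word and replacement are both required")
    return {
        "word": word,
        "replacement": replacement,
        "language": language,  # empty string applies the entry to all languages
        "case_sensitive": case_sensitive,
    }


def can_add_entry(existing_count):
    """Return True while the account is still under the entry limit."""
    return existing_count < MAX_ENTRIES
```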

Article Narrator new

Drop a single <script> tag onto any article page and visitors get a fixed reader bar that narrates the page on click. It auto-detects the article body and supports a custom voice, model, position, and accent color.

HTML
<script src="https://tts.ai/narrator.js"
    data-pk="pk-tts-your-publishable-key"
    data-voice="af_bella"
    data-model="kokoro"
    data-extract="auto"
    data-position="bottom"
    data-color="#e60000"
    data-locale="en"></script>

Options

Parameter Description
data-pk Publishable key (pk-tts-…). Domain restriction is enforced through the key's allowed_domains field.
data-voice Voice ID. Default af_bella.
data-model TTS model ID. Default kokoro.
data-extract auto (default): tries the article/main/.post-content/.entry-content selectors, then falls back to the largest paragraph cluster. Alternatively, pass any CSS selector to target a specific element.
data-position bottom (default) or top.
data-color Accent color (any CSS color), e.g. #e60000.
data-min-chars / data-max-chars Minimum and maximum article length in characters: 200 (default minimum) and 50,000 (maximum).

Source on GitHub:

Listen Button Widget

Inline button-style embed. Renders next to its