Documentation

Integrate TTS.ai into your applications with our REST API. OpenAI-compatible format for easy migration.

REST API | OpenAI-compatible | JSON responses | Streaming support

Overview

The TTS.ai API provides programmatic access to all platform tools: text-to-speech, speech-to-text, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request and response bodies.

API Key

Get your API key from Account Settings. Available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authentication

Bearer token via the Authorization header

Authentication

All API requests require authentication via a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not expose it in client-side code, public repositories, or logs. You can rotate keys at any time from your account settings.
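
One way to keep the key out of source code is to read it from an environment variable at runtime. The sketch below assumes a variable named TTS_API_KEY; that name and the auth_headers helper are hypothetical choices for illustration, not something TTS.ai defines.

```python
import os


def auth_headers(env_var: str = "TTS_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of any HTTP client call.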

SDKs

Official SDKs make it easy to integrate TTS.ai into your applications. Both are open source on GitHub.

Python

pip install ttsai
from tts_ai import TTSClient

client = TTSClient(api_key="sk-tts-...")
audio = client.generate(
    text="Hello world!",
    model="kokoro"
)
client.save(audio, "output.wav")

JavaScript / Node.js

npm install @ttsainpm/ttsai
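const { TTSClient } = require('@ttsainpm/ttsai');

// Top-level await is not available in CommonJS, so the calls run inside an async function.
async function main() {
  const client = new TTSClient({
    apiKey: 'sk-tts-...'
  });
  const audio = await client.generate({
    input: 'Hello world!',
    model: 'kokoro'
  });
  await client.saveToFile(audio, 'output.wav');
}

main().catch(console.error);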

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan | Requests/min | Concurrent | Max text length
Free | 10 | 2 | 500 characters
Nhazi | 30 | 3 | 100,000 characters
Nhazi | 60 | 5 | 100,000 characters
Enterprise | 300 | 20 | 50,000 characters

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
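
A client can read these headers to decide when to pause. The following is a minimal sketch, assuming X-RateLimit-Reset carries a Unix timestamp in seconds; the documentation does not state the format, so check a real response before relying on it. The function names are illustrative.

```python
import time
from typing import Optional


def parse_rate_limit(headers: dict) -> dict:
    """Extract the three rate-limit headers as integers (None if missing or malformed)."""
    def as_int(name):
        try:
            return int(headers[name])
        except (KeyError, TypeError, ValueError):
            return None

    return {
        "limit": as_int("X-RateLimit-Limit"),
        "remaining": as_int("X-RateLimit-Remaining"),
        "reset": as_int("X-RateLimit-Reset"),
    }


def seconds_to_wait(headers: dict, now: Optional[float] = None) -> float:
    """Return how long to pause before sending the next request.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    info = parse_rate_limit(headers)
    # Only wait when the quota is known to be exhausted and a reset time is given.
    if info["remaining"] is None or info["remaining"] > 0 or info["reset"] is None:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, info["reset"] - now)
```

A plain dict lookup is case-sensitive, so header names must match exactly; the headers object returned by the requests library is case-insensitive and can be passed in directly.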

Pricing

Operation | Cost | Per
TTS (Free models: Piper, VITS, MeloTTS) | 1,000 characters | 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.) | 2,000 characters | 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.) | 4,000 characters | 1,000 characters
Speech to Text | 2,000 characters | minute of audio
Voice Cloning | 4,000 characters | 1,000 characters
Voice Conversion | 3,000 characters | minute of audio
Audio Enhancement | 2,000 characters | minute of audio
Vocal Removal / Stem Splitting | 3,000-4,000 characters | minute of audio
Speech Translation | 5,000 characters | minute of audio
Voice Chat | 3,000 characters | turn
Key Finder | Free | --
Audio Converter | Free | --
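
To budget a job before submitting it, the rates in the table can be turned into a small estimator. This is a rough sketch that assumes costs are prorated linearly; whether the service rounds up to the next 1,000 characters or the next full minute is not stated here. The dictionary keys and function names are made up for the example.

```python
# Rates copied from the pricing table, expressed in characters of balance.
COST_PER_1K_CHARS = {
    "tts_free": 1000,
    "tts_standard": 2000,
    "tts_premium": 4000,
    "voice_cloning": 4000,
}
COST_PER_AUDIO_MINUTE = {
    "speech_to_text": 2000,
    "voice_conversion": 3000,
    "audio_enhancement": 2000,
    "speech_translation": 5000,
}


def estimate_text_cost(operation: str, text: str) -> float:
    """Estimate the cost of a text-based operation (linear proration is an assumption)."""
    return COST_PER_1K_CHARS[operation] * len(text) / 1000


def estimate_audio_cost(operation: str, seconds: float) -> float:
    """Estimate the cost of an audio-based operation from its duration in seconds."""
    return COST_PER_AUDIO_MINUTE[operation] * seconds / 60
```

For instance, estimate_text_cost("tts_standard", text) on a 2,500-character script gives 5000.0 under the linear assumption.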

Text to Speech

POST /v1/tts/

Convert text to speech audio. Returns an audio file in the requested format.

Request Body

Parameter | Type | Required | Description
model | string | Yes | Model ID (e.g., kokoro, chatterbox, piper)
text | string | Yes | Text to convert to speech (maximum 100,000 characters per request)
voice | string | Yes | Voice ID (use /v1/voices/ to list available voices)
format | string | No | Output format: mp3 (default), wav, flac, ogg
speed | float | No | Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language | string | No | Language code (e.g., en, es). Auto-detected if omitted.
stream | boolean | No | Enable streaming response. Default: false
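
Checking these limits on the client avoids a round trip that would end in a 400 response. The helper below is an illustrative sketch: build_tts_payload is a hypothetical name, and rejecting empty text is an assumption layered on top of the table.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}
MAX_TEXT_LENGTH = 100_000  # characters per request, from the parameter table


def build_tts_payload(model, text, voice, audio_format="mp3", speed=1.0,
                      language=None, stream=False):
    """Validate arguments against the documented limits and return the JSON body."""
    if not text:
        raise ValueError("text must not be empty")  # assumption, see lead-in
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {
        "model": model,
        "text": text,
        "voice": voice,
        "format": audio_format,
        "speed": speed,
        "stream": stream,
    }
    # language is optional; omitting it lets the API auto-detect.
    if language is not None:
        payload["language"] = language
    return payload
```

The result can be sent as the JSON body of POST /v1/tts/.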

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with the appropriate Content-Type (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with automatic detection.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model | string | No | STT model: whisper (default), faster-whisper, sensevoice
language | string | No | Language code. Use auto for automatic detection (default).
timestamps | boolean | No | Include word-level timestamps. Default: false
diarize | boolean | No | Enable speaker diarization. Default: false

Response

JSON
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
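
When diarization is on, consecutive segments from the same speaker are often easier to read as one turn. The sketch below merges them using only the fields shown in the sample response; it tolerates a missing segments or speaker field, since the document does not say whether they are always present. speaker_turns is an illustrative name.

```python
def speaker_turns(result: dict) -> list:
    """Merge consecutive segments from the same speaker into single turns."""
    turns = []
    for seg in result.get("segments", []):
        speaker = seg.get("speaker")
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker as the previous segment: extend that turn.
            turns[-1]["end"] = seg["end"]
            turns[-1]["text"] += " " + seg["text"]
        else:
            turns.append({
                "speaker": speaker,
                "start": seg["start"],
                "end": seg["end"],
                "text": seg["text"],
            })
    return turns
```

Applied to the sample response, the two SPEAKER_00 segments collapse into one turn spanning 0.0 to 3.5 seconds.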

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload a reference audio clip and the text to speak.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
reference_audio | file | Yes | Reference voice sample (10-30 seconds recommended). Max 20MB.
text | string | Yes | Text to speak in the cloned voice.
model | string | No | Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format | string | No | Output format: mp3 (default), wav, flac
language | string | No | Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, same as the TTS endpoint.
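
Before uploading, it can help to confirm that a reference clip fits the recommended 10-30 seconds and the 20MB cap. This sketch handles WAV files only, using Python's standard wave module, and assumes "20MB" means 20 x 1024 x 1024 bytes. check_reference_wav is a hypothetical helper.

```python
import os
import wave

MAX_REFERENCE_BYTES = 20 * 1024 * 1024  # "Max 20MB"; binary megabytes are an assumption


def check_reference_wav(path: str) -> list:
    """Return a list of warnings for a WAV reference clip (empty if it looks fine)."""
    warnings = []
    if os.path.getsize(path) > MAX_REFERENCE_BYTES:
        warnings.append("file is larger than 20MB")
    with wave.open(path, "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    if not 10 <= duration <= 30:
        warnings.append(f"duration is {duration:.1f}s; 10-30 seconds is recommended")
    return warnings
```

The duration range is a recommendation rather than a hard limit, so the function reports warnings instead of raising.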

Voice Conversion

POST /v1/voice-convert/

Convert audio to sound like a different voice. Upload the source audio and select a target voice.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice | string | Yes | Target voice ID to convert to (use /v1/voices/ to list available voices)
model | string | No | Voice conversion model: openvoice (default), knn-vc
format | string | No | Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file in the original language. Max 100MB.
target_language | string | Yes | Target language code (e.g., es, fr, de, ja)
voice | string | No | Voice for the translated output. Selected automatically if omitted.
preserve_voice | boolean | No | Whether to preserve the original speaker's voice. Default: false

Response

JSON
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Change the delivery style of recorded speech while preserving its content. Useful for adjusting style, pacing, and emphasis.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source speech audio file. Max 50MB.
voice | string | Yes | Target voice ID for the output audio
model | string | No | Model: openvoice (default), chatterbox
emotion | string | No | Emotion: neutral, happy, sad, angry, excited
speed | float | No | Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the converted audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Improve audio quality: denoise, enhance clarity, super resolution.

Parameter | Type | Description
file | file | Audio file to enhance
denoise | boolean | Enable denoising (default: true)
enhance_clarity | boolean | Boost speech clarity (default: true)
super_resolution | boolean | Upscale audio resolution (default: false)
strength | integer | 1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split audio into stems.

Parameter | Type | Description
file | file | Audio file to separate
model | string | demucs (default) or spleeter
stems | integer | Number of stems: 2, 4, 5, or 6 (default: 2)
format | string | Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio recordings.

Parameter | Type | Description
file | file | Audio file to process
type | string | echo or reverb (default: both)
intensity | integer | 1-5 (default: 3)

POST /v1/audio/analyze/ (New)

Analyze audio to detect key, BPM, and time signature.

Parameter | Type | Description
file | file | Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}

POST /v1/audio/convert/ (New)

Convert audio between formats.

Parameter | Type | Description
file | file | Audio file to convert
format | string | Target format: mp3, wav, flac, ogg, m4a, aac
bitrate | integer | Bitrate in kbps: 64, 128, 192, 256, 320
sample_rate | integer | Output sample rate
channels | string | mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive a spoken AI response.

Request Body (multipart/form-data or JSON)

Parameter | Type | Required | Description
audio | file | No* | Audio input (either audio or text is required)
text | string | No* | Text input (either audio or text is required)
voice | string | No | Voice for the AI response. Default: af_bella
tts_model | string | No | TTS model for the response. Default: kokoro
system_prompt | string | No | Custom system prompt for the AI
conversation_id | string | No | Continue an existing conversation

Response

JSON
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
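
A multi-turn exchange works by passing the conversation_id from one response into the next request. The helpers below sketch that flow for the request fields only; the names are hypothetical, and the rule is read here as "at least one of audio or text", which the table does not spell out.

```python
def build_chat_fields(text=None, has_audio=False, conversation_id=None,
                      system_prompt=None):
    """Return the non-file fields for a /v1/voice-chat/ request.

    has_audio signals that an audio file will be attached separately.
    """
    if text is None and not has_audio:
        raise ValueError("either audio or text is required")
    fields = {}
    if text is not None:
        fields["text"] = text
    if system_prompt is not None:
        fields["system_prompt"] = system_prompt
    if conversation_id is not None:
        fields["conversation_id"] = conversation_id
    return fields


def next_turn_fields(previous_response: dict, text: str) -> dict:
    """Carry the conversation_id from the previous response into the next turn."""
    return build_chat_fields(
        text=text,
        conversation_id=previous_response["conversation_id"],
    )
```

Omitting conversation_id on the first call and reusing the returned value afterwards keeps all turns in one conversation.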

Batch TTS

POST /v1/tts/batch/

Submit multiple texts for TTS generation in a single request. Optionally receive a webhook callback when all jobs complete.

Parameters

Parameter | Type | Description
texts | array | Array of objects: {text, model, voice}. Max 50 items.
webhook_url | string | Optional URL to POST results when batch completes.

Response

JSON
{
  "batch_id": "abc123",
  "total": 3,
  "completed": 0,
  "status": "processing"
}

Check progress with GET /v1/tts/batch/result/?batch_id=abc123
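
Larger jobs need to be split into requests of at most 50 items, and each batch then has to be polled until it finishes. The sketch below assumes the result endpoint returns a JSON object with the same status field as the submit response; "processing" is the only status value documented, so the loop simply waits until the status is anything else. The HTTP call is injected as a callable so the logic stays independent of any client library.

```python
import time

MAX_BATCH_ITEMS = 50  # "Max 50 items" per request


def chunk_items(items, size=MAX_BATCH_ITEMS):
    """Split a list of {text, model, voice} objects into request-sized batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def wait_for_batch(fetch_status, batch_id, interval=2.0, max_polls=30,
                   sleep=time.sleep):
    """Poll until the batch leaves the "processing" state.

    fetch_status is any callable that performs
    GET /v1/tts/batch/result/?batch_id=<id> and returns the parsed JSON.
    """
    for _ in range(max_polls):
        status = fetch_status(batch_id)
        if status.get("status") != "processing":
            return status
        sleep(interval)
    raise TimeoutError(f"batch {batch_id} still processing after {max_polls} polls")
```

The interval and max_polls defaults are arbitrary; supplying a webhook_url avoids polling altogether.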

Voice Embedding

POST /v1/voice-embed/

Precompute a voice embedding from reference audio. Use the returned embed_id in later requests for faster cloning.

Parameters

Parameter | Type | Description
file | file | Reference audio file (WAV, MP3, FLAC).
model | string | Cloning model (default: chatterbox). Supported: chatterbox, cosyvoice2, openvoice, gpt-sovits, spark, indextts2, qwen3-tts.

Response

JSON
{
  "embed_id": "emb_abc123",
  "model": "chatterbox",
  "duration_ms": 450
}

Health Check

GET /v1/health/

Check GPU server status, loaded models, and queue size. No authentication required. Cached for 30 seconds.

Response

JSON
{
  "status": "online",
  "latency_ms": 45,
  "queue_size": 3,
  "models_loaded": ["kokoro", "chatterbox", "cosyvoice2"]
}
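
A client might consult this endpoint before sending latency-sensitive work. The sketch below uses only the standard library; the max_queue threshold of 10 is an arbitrary value picked for illustration, and is_ready is a hypothetical helper.

```python
import json
import urllib.request

HEALTH_URL = "https://api.tts.ai/v1/health/"


def fetch_health(url=HEALTH_URL, timeout=5.0):
    """GET the health endpoint; no Authorization header is needed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def is_ready(health: dict, model: str, max_queue: int = 10) -> bool:
    """Decide whether to send a request now, based on a health response."""
    return (
        health.get("status") == "online"
        and model in health.get("models_loaded", [])
        and health.get("queue_size", 0) <= max_queue
    )
```

Because the response is cached for 30 seconds, checking more often than that yields no new information.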

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
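
The capability flags make it straightforward to pick a model programmatically. Here is a small sketch that filters a parsed /v1/models/ response using only the fields shown in the sample; find_models is an illustrative name.

```python
def find_models(models_response: dict, language=None, cloning=None, streaming=None):
    """Return the IDs of models that match every filter that is not None."""
    matches = []
    for model in models_response.get("models", []):
        if language is not None and language not in model.get("languages", []):
            continue
        if cloning is not None and model.get("supports_cloning") != cloning:
            continue
        if streaming is not None and model.get("supports_streaming") != streaming:
            continue
        matches.append(model["id"])
    return matches
```

With the sample response, find_models(data, cloning=True) returns only chatterbox.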

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Parameters

Parameter | Type | Description
model | string | Filter by model ID (e.g., kokoro)
language | string | Filter by language code (e.g., en)
gender | string | Filter by gender: male, female, neutral

Response

JSON
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
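
Since this is a GET endpoint, the filters are presumably sent as query-string parameters; that is an assumption, as the transport is not stated explicitly. Under it, the request URL could be built like this (voices_url is a hypothetical helper):

```python
from urllib.parse import urlencode

VOICES_URL = "https://api.tts.ai/v1/voices/"


def voices_url(model=None, language=None, gender=None) -> str:
    """Build the GET /v1/voices/ URL, leaving out filters that are not set."""
    filters = {"model": model, "language": language, "gender": gender}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{VOICES_URL}?{query}" if query else VOICES_URL
```

urlencode also escapes any characters that are not safe in a URL, so filter values can be passed through unchanged.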

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
)
response.raise_for_status()  # stop on an HTTP error instead of saving the error body as audio

with open("output.mp3", "wb") as f:
    f.write(response.content)

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Speech to Text: upload the recording as multipart/form-data
with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "faster-whisper", "timestamps": "true"}
    )
response.raise_for_status()

result = response.json()
print(result["text"])

Voice Cloning

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Voice Cloning: send the reference clip and the text to speak
with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )
response.raise_for_status()

# The default output format is mp3
with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

Speech to Text

JavaScript - fetch
// Speech to Text
// audioFile is a File or Blob, for example from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

All errors return a JSON response with an error object.

Error Response Format
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough characters for this request.",
    "characters_required": 4000,
    "characters_available": 2000
  }
}

HTTP Status | Error Code | Description
400 | bad_request | Required parameters are missing or invalid. Read the error message for details.
401 | unauthorized | Missing or invalid API key.
402 | insufficient_credits | Insufficient balance. Purchase more at /pricing/.
403 | forbidden | API access is not available on your plan.
404 | not_found | Model or voice not found.
413 | file_too_large | Uploaded file exceeds the maximum size limit.
429 | rate_limited | Too many requests. Slow down and stay within your rate limit.
500 | internal_error | Server error. Try again later.
503 | model_loading | Model is loading. Retry in a few seconds.
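
Client code usually wants to turn these responses into exceptions and know which ones are worth retrying. The sketch below treats 429, 500, and 503 as retryable, which is a judgment call based on the "try again" wording in the table rather than a documented rule. TTSAPIError and error_from_response are hypothetical names.

```python
RETRYABLE_STATUSES = {429, 500, 503}  # assumption: the statuses whose description suggests retrying


class TTSAPIError(Exception):
    """Error built from the JSON error body returned by the API."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def error_from_response(status: int, body) -> TTSAPIError:
    """Turn a parsed error body ({"error": {...}}) into a TTSAPIError."""
    err = body.get("error", {}) if isinstance(body, dict) else {}
    return TTSAPIError(status, err.get("code", "unknown"), err.get("message", ""))
```

A caller could raise the returned error when the HTTP status is 400 or above and retry with a delay only when retryable is true.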

Webhooks

For long-running tasks (stem splitting, batch TTS), you can provide a webhook_url parameter. When the task completes, we will POST the result to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}
Webhook results are available for download for 24 hours after completion. Be sure to download them in time.
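
A receiver for these callbacks can be quite small. The sketch below assumes the payload arrives as the JSON body of a POST request, as in the sample. Any signature or authenticity check is left out because none is described here, so a production endpoint would need its own safeguards. The handler and function names are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler


def result_url_from_webhook(raw_body: bytes):
    """Return result_url for a successful task.completed event, else None."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return None  # body was not valid JSON
    if not isinstance(payload, dict):
        return None
    if payload.get("event") == "task.completed" and payload.get("status") == "success":
        return payload.get("result_url")
    return None


class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver; pass it to http.server.HTTPServer to run it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        url = result_url_from_webhook(self.rfile.read(length))
        if url:
            # Results expire after 24 hours, so fetch them promptly.
            print("Result ready:", url)
        self.send_response(204)
        self.end_headers()
```

To try it locally, create HTTPServer(("", 8000), WebhookHandler) and call serve_forever(); the port is arbitrary.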

Ready to build?

Get your API key and start integrating TTS.ai into your applications.