Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API, OpenAI-compatible, JSON responses, streaming support

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request/response bodies.

API Key

Get your API key from Account Settings. Available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authentication

Bearer token via the Authorization header

Authentication

All API requests require authentication via a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here

Keep your API key secret. Never expose it in client-side code, public repositories, or logs. Rotate your keys periodically from your account settings.
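One way to follow this advice is to read the key from the environment at runtime rather than writing it into source files. The sketch below is illustrative only: the environment variable name TTS_API_KEY and the helper auth_headers are assumptions made for this example, not part of the TTS.ai API.

```python
import os


def auth_headers(env_var: str = "TTS_API_KEY") -> dict[str, str]:
    """Build the Authorization header from an environment variable.

    ASSUMPTION: the variable name TTS_API_KEY is chosen for this sketch only.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to your TTS.ai API key")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.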

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/
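When building endpoint URLs in code, it may help to know that joining a path with a leading slash against the base URL silently drops the /v1/ prefix. A minimal sketch using the standard library's urljoin; the helper name endpoint_url is hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.tts.ai/v1/"


def endpoint_url(path: str) -> str:
    """Resolve an endpoint path such as "tts/" against the base URL."""
    # urljoin treats "/tts/" as relative to the host root, which would discard
    # the /v1/ prefix, so any leading slash is stripped first.
    return urljoin(BASE_URL, path.lstrip("/"))
```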

Rate Limits

API rate limits vary by plan:

Plan | Requests/min | Concurrent | Max text length
Pro | 60 | 5 | 5,000 characters
Enterprise | 300 | 20 | 50,000 characters

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
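A client can read these headers to decide when to pause. The sketch below rests on two assumptions that this page does not confirm: that all three values are integers, and that X-RateLimit-Reset is a Unix timestamp in seconds (some APIs send seconds-until-reset instead). The names RateLimit, parse_rate_limit, and seconds_until_reset are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimit:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset: int      # X-RateLimit-Reset (ASSUMED to be a Unix timestamp)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    """Read the three rate limit headers from a response's headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset=int(lowered["x-ratelimit-reset"]),
    )


def seconds_until_reset(info: RateLimit, now: float) -> float:
    """Return how long to pause once the quota is used up (0 if requests remain)."""
    if info.remaining > 0:
        return 0.0
    return max(0.0, info.reset - now)
```

Passing time.time() as now gives the delay to sleep before the next request, if the timestamp assumption holds.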

Credit Costs

Service | Cost | Unit
TTS (Free models: Piper, VITS, MeloTTS) | 1 credit | per 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.) | 2 credits | per 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.) | 4 credits | per 1,000 characters
Speech to Text | 2 credits | per minute of audio
Voice Cloning | 4 credits | per 1,000 characters
Voice Changer | 3 credits | per minute of audio
Audio Enhancement | 2 credits | per minute of audio
Vocal Removal / Stem Separation | 3-4 credits | per minute of audio
Speech Translation | 5 credits | per minute of audio
Voice Chat | 3 credits | per turn
Key Finder | Free | --
Audio Converter | Free | --
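To budget before sending a request, the TTS rows of the table can be turned into a rough estimator. This is a sketch under one stated assumption: that usage is rounded up to whole 1,000-character blocks. The sample response headers later on this page show 2 credits for a short Kokoro request, which is consistent with rounding up, but the billing rule itself is not stated. The function name estimate_tts_credits is hypothetical.

```python
import math

# Credits per 1,000 characters for each TTS tier, copied from the table above.
CREDITS_PER_1K_CHARS = {"free": 1, "standard": 2, "premium": 4}


def estimate_tts_credits(text: str, tier: str) -> int:
    """Estimate the credit cost of synthesising `text` with a model in `tier`.

    ASSUMPTION: usage is rounded up to whole 1,000-character blocks.
    """
    if tier not in CREDITS_PER_1K_CHARS:
        raise ValueError(f"unknown tier: {tier!r}")
    blocks = math.ceil(len(text) / 1000)
    return blocks * CREDITS_PER_1K_CHARS[tier]
```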

Text to Speech

POST /v1/tts/

Convert text to speech audio. Returns an audio file in the requested format.

Request Body

Parameter | Type | Required | Description
model | string | Yes | Model ID (e.g., kokoro, chatterbox, piper)
text | string | Yes | Text to convert to speech (maximum 5,000 characters for Pro, 50,000 for Enterprise)
voice | string | Yes | Voice ID (use /v1/voices/ to list available voices)
format | string | No | Output format: mp3 (default), wav, flac, ogg
speed | float | No | Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language | string | No | Language code (e.g., en, es). Auto-detected if omitted.
stream | boolean | No | Enable streaming response. Default: false
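The limits in this table can be checked on the client before a request is sent, which avoids spending a round trip on an avoidable 400 error. The following is a minimal sketch, not an official client: build_tts_payload and its plan argument are hypothetical, and the Python parameter is named audio_format only to avoid shadowing the built-in format.

```python
from typing import Optional

ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}
MAX_TEXT_LENGTH = {"pro": 5_000, "enterprise": 50_000}


def build_tts_payload(
    model: str,
    text: str,
    voice: str,
    *,
    plan: str = "pro",
    audio_format: str = "mp3",
    speed: float = 1.0,
    language: Optional[str] = None,
    stream: bool = False,
) -> dict:
    """Validate the documented limits locally and return the JSON request body."""
    if plan not in MAX_TEXT_LENGTH:
        raise ValueError(f"unknown plan: {plan!r}")
    if len(text) > MAX_TEXT_LENGTH[plan]:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH[plan]} characters")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format!r}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    payload = {
        "model": model,
        "text": text,
        "voice": voice,
        "format": audio_format,
        "speed": speed,
        "stream": stream,
    }
    if language is not None:  # leaving it out lets the API auto-detect
        payload["language"] = language
    return payload
```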

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with appropriate Content-Type header (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498
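A client can use these headers to name the saved file and track credit usage. In the sketch below, only audio/mpeg and audio/wav come from this page; the MIME types for flac and ogg are assumptions, as are the helper names extension_for and credits_from_headers.

```python
from typing import Mapping, Optional

# audio/mpeg and audio/wav are named in the text above; the other two
# entries are ASSUMED MIME types for the flac and ogg output formats.
EXTENSION_BY_CONTENT_TYPE = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/flac": ".flac",
    "audio/ogg": ".ogg",
}


def extension_for(content_type: str) -> str:
    """Pick a file extension for a Content-Type value, ignoring any parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return EXTENSION_BY_CONTENT_TYPE.get(media_type, ".bin")


def credits_from_headers(headers: Mapping[str, str]) -> tuple[Optional[int], Optional[int]]:
    """Return (used, remaining) from the X-Credits-* headers, or None where absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    used = lowered.get("x-credits-used")
    remaining = lowered.get("x-credits-remaining")
    return (
        int(used) if used is not None else None,
        int(remaining) if remaining is not None else None,
    )
```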

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with automatic detection.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model | string | No | STT model: whisper (default), faster-whisper, sensevoice
language | string | No | Language code. Use auto for automatic detection (default).
timestamps | boolean | No | Include word-level timestamps. Default: false
diarize | boolean | No | Enable speaker diarization. Default: false

Response

Example response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
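The segments array can be flattened into readable transcript lines. This sketch works from the sample response above; it is not stated which fields appear when timestamps or diarize are disabled, so the code treats segments and speaker as optional. The function name format_transcript is hypothetical.

```python
def format_transcript(result: dict) -> list[str]:
    """Turn a parsed /v1/stt/ response into one line per segment."""
    lines = []
    for segment in result.get("segments", []):
        prefix = f"[{segment['start']:.1f}s - {segment['end']:.1f}s]"
        speaker = segment.get("speaker")
        if speaker:  # only present when diarization output is included
            prefix += f" {speaker}:"
        lines.append(f"{prefix} {segment['text']}")
    return lines
```

For a response without segments the function returns an empty list, and the full transcript remains available in the text field.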

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload a reference audio sample and the text to speak.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
reference_audio | file | Yes | Reference voice audio sample (10-30 seconds). Max 20MB.
text | string | Yes | Text to be spoken in the cloned voice.
model | string | No | Clone model: chatterbox (default), cosyvoice2, gpt-sovits
format | string | No | Output format: mp3 (default), wav, flac
language | string | No | Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, same as the TTS endpoint.
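Because the reference sample has a stated duration range and size limit, a WAV file can be checked locally before upload. The sketch below uses Python's standard wave module and therefore only handles uncompressed WAV input. It assumes that "20MB" means 20 MiB and that the 10-30 second range is enforced rather than advisory; neither is confirmed here. The name check_reference_wav is hypothetical.

```python
import io
import wave

MIN_SECONDS, MAX_SECONDS = 10.0, 30.0
MAX_BYTES = 20 * 1024 * 1024  # ASSUMPTION: "20MB" is read as 20 MiB


def check_reference_wav(data: bytes) -> float:
    """Validate a WAV reference sample and return its duration in seconds."""
    if len(data) > MAX_BYTES:
        raise ValueError("reference audio is larger than 20MB")
    with wave.open(io.BytesIO(data), "rb") as wav:
        # Duration is the frame count divided by frames per second.
        duration = wav.getnframes() / wav.getframerate()
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        raise ValueError(f"reference audio is {duration:.1f}s; expected 10-30s")
    return duration
```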

Voice Changer

POST /v1/voice-convert/

Convert audio to sound like a different voice. Upload source audio and select a target voice.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice | string | Yes | Target voice ID to convert to (use /v1/voices/ to list available voices)
model | string | No | Voice conversion model: openvoice (default), knn-vc
format | string | No | Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file in the original language. Max 100MB.
target_language | string | Yes | Target language code (e.g., es, fr, de, ja)
voice | string | No | Voice for the translated output. Auto-selected if omitted.
preserve_voice | boolean | No | Attempt to preserve the original speaker's voice

Response

Example response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transform speech style, emotion, or delivery while preserving the content. Useful for voice-over, dubbing, and narration.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source speech audio file. Max 50MB.
voice | string | Yes | Target voice ID for the output voice
model | string | No | Model: openvoice (default), chatterbox
emotion | string | No | Emotion: neutral, happy, sad, angry, excited
speed | float | No | Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Improve audio quality: denoise, enhance clarity, super resolution.

file | file | Audio file to enhance
denoise | boolean | Enable denoising (default: true)
enhance_clarity | boolean | Enhance speech clarity (default: true)
super_resolution | boolean | Upscale audio resolution (default: false)
strength | integer | 1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal remover) or split audio into stems.

file | file | Audio file to separate
model | string | demucs (default) or spleeter
stems | integer | Number of stems: 2, 4, 5, or 6 (default: 2)
format | string | Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio recordings.

file | file | Audio file to process
type | string | echo or reverb (default: both)
intensity | integer | 1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

file | file | Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}
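The compatible_keys in this sample match the neighbours of 8B on the Camelot wheel: one step up (9B, G major), one step down (7B, F major), and the relative minor (8A, A minor). How the API derives the list is not stated, so the sketch below shows only the standard wheel rule; camelot_neighbours is a hypothetical helper.

```python
def camelot_neighbours(code: str) -> list[str]:
    """Return the Camelot codes adjacent to `code` on the wheel.

    Numbers run 1-12 around a circle; "B" marks major keys and "A" minor keys.
    """
    number, letter = int(code[:-1]), code[-1].upper()
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"not a Camelot code: {code!r}")
    up = number % 12 + 1          # one step clockwise, wrapping 12 -> 1
    down = (number - 2) % 12 + 1  # one step anticlockwise, wrapping 1 -> 12
    relative = "A" if letter == "B" else "B"
    return [f"{up}{letter}", f"{down}{letter}", f"{number}{relative}"]
```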
POST /v1/audio/convert/ (Free)

Convert audio between formats.

file | file | Audio file to convert
format | string | Target format: mp3, wav, flac, ogg, m4a, aac
bitrate | integer | Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate | integer | Sample rate
channels | string | mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response as synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter | Type | Required | Description
audio | file | No* | Audio input (either audio or text is required)
text | string | No* | Text input (either audio or text is required)
voice | string | No | Voice for the AI response. Default: af_bella
tts_model | string | No | TTS model for the response. Default: kokoro
system_prompt | string | No | Custom system prompt for the AI
conversation_id | string | No | Continue an existing conversation

Response

Example response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
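To continue a conversation, the conversation_id from one response is sent back with the next request. A minimal sketch of building the fields for a text turn; voice_chat_fields is a hypothetical helper, and the defaults simply mirror the ones listed in the parameter table.

```python
from typing import Optional


def voice_chat_fields(
    text: str,
    previous: Optional[dict] = None,
    voice: str = "af_bella",
    tts_model: str = "kokoro",
) -> dict:
    """Build request fields for one text turn of /v1/voice-chat/.

    `previous` is the parsed JSON of the last response, or None for a new chat.
    """
    fields = {"text": text, "voice": voice, "tts_model": tts_model}
    if previous is not None:
        # Reusing the ID asks the API to continue the same conversation.
        fields["conversation_id"] = previous["conversation_id"]
    return fields
```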

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

Example response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
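The capability fields make it possible to pick a model programmatically. The sketch below selects the cheapest model that supports a given language and the features needed; it relies only on the fields shown in the sample above, and cheapest_model is a hypothetical helper.

```python
from typing import Optional


def cheapest_model(
    models: list[dict],
    *,
    language: str,
    need_cloning: bool = False,
    need_streaming: bool = False,
) -> Optional[dict]:
    """Return the lowest-cost model meeting the requirements, or None."""
    candidates = [
        m for m in models
        if language in m["languages"]
        and (m["supports_cloning"] or not need_cloning)
        and (m["supports_streaming"] or not need_streaming)
    ]
    return min(candidates, key=lambda m: m["credits_per_1k_chars"], default=None)
```

Pass the models array from the parsed response body as the first argument.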

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter | Type | Description
model | string | Filter by model ID (e.g., kokoro)
language | string | Filter by language code (e.g., en)
gender | string | Filter by gender: male, female, neutral

Response

Example response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
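The filters can be combined in a single request. The sketch below assumes they are sent as URL query-string parameters, which is the usual convention for a GET endpoint but is not spelled out here; voices_url is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

VOICES_URL = "https://api.tts.ai/v1/voices/"
GENDERS = {"male", "female", "neutral"}


def voices_url(
    model: Optional[str] = None,
    language: Optional[str] = None,
    gender: Optional[str] = None,
) -> str:
    """Build the /v1/voices/ URL, including only the filters that were given."""
    if gender is not None and gender not in GENDERS:
        raise ValueError(f"gender must be one of {sorted(GENDERS)}")
    filters = {"model": model, "language": language, "gender": gender}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{VOICES_URL}?{query}" if query else VOICES_URL
```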

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
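)
response.raise_for_status()  # stop on an error response instead of saving it as audio

# Save the binary audio returned in the response body
with open("output.mp3", "wb") as f:
    f.write(response.content)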

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
# Speech to Text (reuses `requests` and API_KEY from the Text to Speech example)
with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "faster-whisper", "timestamps": "true"}
    )

result = response.json()
print(result["text"])

Voice Cloning

Python - requests
# Voice Cloning (reuses `requests` and API_KEY from the Text to Speech example)
with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )

with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

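Speech to Text

JavaScript - fetch
// Speech to Text (reuses API_KEY from the Text to Speech example)
// audioFile is a File or Blob, for example from an <input type="file"> element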
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

All errors return a JSON response with an error field.

Error Response Format
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough credits for this request.",
    "credits_required": 4,
    "credits_available": 2
  }
}

HTTP Status | Error Code | Description
400 | bad_request | Invalid request parameters. Check the error message for details.
401 | unauthorized | Missing or invalid API key.
402 | insufficient_credits | Not enough credits. Purchase more at /pricing/.
403 | forbidden | API access is not available on your plan.
404 | not_found | Model or voice not found.
413 | file_too_large | Uploaded file exceeds the maximum size limit.
429 | rate_limited | Too many requests. Check the rate limit headers.
500 | internal_error | Server error. Retry later.
503 | model_loading | Model is loading. Retry in a few seconds.
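The table suggests that 429, 500, and 503 are worth retrying while the other codes call for a change to the request. A sketch of that decision plus a simple delay schedule follows. The exponential backoff values are an assumption for illustration, not a documented recommendation, and the helper names are hypothetical.

```python
import json

# Status codes the table above describes as temporary conditions.
RETRYABLE_STATUSES = {429, 500, 503}


def parse_error(body: str) -> tuple[str, str]:
    """Extract (code, message) from an error response body."""
    error = json.loads(body)["error"]
    return error["code"], error["message"]


def should_retry(status: int) -> bool:
    """True when repeating the same request later may succeed."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Delays in seconds before each retry: base, 2*base, 4*base, ... up to cap.

    ASSUMPTION: this schedule is illustrative; the API does not specify one.
    """
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

For 429 responses, the rate limit headers may give a more accurate wait time than a fixed schedule.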

Webhooks

For long-running tasks (stem splitting, batch TTS), you can provide a webhook_url parameter. When the task completes, we will send the result to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}

Webhook results remain available for 24 hours after completion. Download them promptly.
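A receiving service can parse the payload and work out the download deadline from completed_at. The sketch below covers only the payload shown above; other event names and status values are not documented here, so it rejects anything else. The names handle_webhook and parse_timestamp are hypothetical.

```python
import json
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)  # results are kept for 24 hours after completion


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2025-01-15T10:30:00Z."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def handle_webhook(body: bytes) -> dict:
    """Summarise a successful task.completed payload."""
    payload = json.loads(body)
    if payload.get("event") != "task.completed" or payload.get("status") != "success":
        raise ValueError("not a successful task.completed event")
    created = parse_timestamp(payload["created_at"])
    completed = parse_timestamp(payload["completed_at"])
    return {
        "task_id": payload["task_id"],
        "result_url": payload["result_url"],
        "processing_seconds": (completed - created).total_seconds(),
        "download_before": completed + RETENTION,
    }
```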

Ready to build?

Get your API key and start integrating TTS.ai into your application.