API Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API | OpenAI-compatible | JSON responses | Streaming support

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request/response bodies.

API Key

Get your API key from Account Settings. API keys are available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authorization

Bearer token via the Authorization header

Authentication

All API requests require authentication with a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not expose it in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
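One way to keep the key out of source code is to read it from the environment. The sketch below is illustrative only: the variable name TTS_API_KEY and the helper auth_headers are assumptions made for this example, not part of the TTS.ai API.

```python
import os


def auth_headers(api_key=None):
    """Build the Authorization header, reading the key from the environment by default."""
    # TTS_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("TTS_API_KEY")
    if not key:
        raise RuntimeError("Set TTS_API_KEY or pass api_key explicitly")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the headers argument of any HTTP client.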

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/
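Resolving endpoint paths against the base URL can be sketched as follows. The helper name endpoint is hypothetical; only the base URL comes from the documentation.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.tts.ai/v1/"


def endpoint(path):
    """Resolve an endpoint path against the base URL."""
    # A leading slash would make urljoin discard the /v1/ prefix, so strip it.
    return urljoin(BASE_URL, path.lstrip("/"))
```

Stripping the leading slash means both "tts/" and "/tts/" resolve under /v1/.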

Rate Limits

API rate limits vary by plan:

Plan | Requests/min | Concurrent | Max text length
Pro | 60 | 5 | 5,000 characters
Enterprise | 300 | 20 | 50,000 characters

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
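A client can read these three headers to decide when to pause. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds; the documentation does not state its format, so verify that against a real response before relying on it. The function names are illustrative.

```python
import time


def parse_rate_limit(headers):
    """Extract the three rate limit headers as integers (None when absent)."""
    def as_int(name):
        value = headers.get(name)
        return int(value) if value is not None else None

    return {
        "limit": as_int("X-RateLimit-Limit"),
        "remaining": as_int("X-RateLimit-Remaining"),
        "reset": as_int("X-RateLimit-Reset"),
    }


def seconds_to_wait(info, now=None):
    """Return how long to pause before the next request.

    Assumption: `reset` is a Unix timestamp in seconds.
    """
    if info["remaining"] is None or info["remaining"] > 0 or info["reset"] is None:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, info["reset"] - now)
```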

Credit Costs

Service | Cost | Unit
TTS (free models: Piper, VITS, MeloTTS) | 1 credit | per 1,000 characters
TTS (standard models: Kokoro, CosyVoice 2, etc.) | 2 credits | per 1,000 characters
TTS (premium models: Tortoise, Chatterbox, etc.) | 4 credits | per 1,000 characters
Speech to Text | 2 credits | per minute of audio
Voice Cloning | 4 credits | per 1,000 characters
Voice Changer | 3 credits | per minute of audio
Audio Enhancement | 2 credits | per minute of audio
Vocal Removal / Stem Separation | 3-4 credits | per minute of audio
Speech Translation | 5 credits | per minute of audio
Voice Chat | 3 credits | per request
Key & BPM Finder | Free | --
Audio Converter | Free | --
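As a rough illustration of the TTS rates in the table above, the sketch below estimates the cost of a request before sending it. It assumes usage is rounded up to whole 1,000-character blocks, which the documentation does not confirm, and the tier label "free" is chosen for this example.

```python
import math

# Credits per 1,000 characters for the three TTS tiers in the table above.
TTS_RATES = {"free": 1, "standard": 2, "premium": 4}


def estimate_tts_credits(text, tier):
    """Estimate credits for a TTS request.

    Assumption: billing rounds up to whole 1,000-character blocks.
    """
    blocks = max(1, math.ceil(len(text) / 1000))
    return blocks * TTS_RATES[tier]
```

Treat the result as an estimate only; the X-Credits-Used response header reports the actual charge.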

Text to Speech

POST /v1/tts/

Convert text to speech audio. Returns the audio file in the requested format.

Request Body

Parameter | Type | Required | Description
model | string | Yes | Model ID (e.g., kokoro, chatterbox, piper)
text | string | Yes | Text to convert to speech (max 5,000 characters for Pro, 50,000 for Enterprise)
voice | string | Yes | Voice ID (use /v1/voices/ to list available voices)
format | string | No | Output format: mp3 (default), wav, flac, ogg
speed | float | No | Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language | string | No | Language code (e.g., en, es). Auto-detected if omitted.
stream | boolean | No | Enable streaming response. Default: false

Example Request
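The parameter constraints above can be checked on the client before a request is sent, which avoids spending a round trip on a 400 response. This is a sketch: the helper build_tts_payload and its max_chars argument are illustrative, and the server remains the authority on what is valid.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}


def build_tts_payload(model, text, voice, audio_format="mp3", speed=1.0,
                      language=None, stream=False, max_chars=5000):
    """Validate TTS parameters locally and return the JSON request body."""
    if not text:
        raise ValueError("text must not be empty")
    if len(text) > max_chars:  # 5,000 for Pro, 50,000 for Enterprise
        raise ValueError(f"text exceeds {max_chars} characters")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {"model": model, "text": text, "voice": voice,
               "format": audio_format, "speed": speed, "stream": stream}
    if language is not None:  # omit so the API auto-detects the language
        payload["language"] = language
    return payload
```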

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with the appropriate Content-Type header (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with auto-detection.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model | string | No | STT model: whisper (default), faster-whisper, sensevoice
language | string | No | Language code. Use auto for automatic detection (default).
timestamps | boolean | No | Include word-level timestamps. Default: false
diarize | boolean | No | Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
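The segments array in the response above can be turned into a readable, speaker-labelled transcript. A minimal sketch, assuming the response shape shown in the example; the fallback label UNKNOWN for segments without a speaker field is an assumption.

```python
def format_transcript(result):
    """Render STT segments as '[start-end] SPEAKER: text' lines."""
    lines = []
    for seg in result.get("segments", []):
        # `speaker` may be absent when diarization is off (assumption).
        speaker = seg.get("speaker", "UNKNOWN")
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}: {seg['text']}")
    return "\n".join(lines)
```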

Voice Cloning

POST /v1/tts/clone/

Generate speech with a cloned voice. Upload a reference audio clip together with the text to speak.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
reference_audio | file | Yes | Reference audio clip of the voice to clone (10-30 seconds allowed). Max 20MB.
text | string | Yes | Text to speak in the cloned voice.
model | string | No | Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format | string | No | Output format: mp3 (default), wav, flac
language | string | No | Language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, the same as the TTS endpoint.
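Because the reference clip has a 10-30 second window, it can be worth checking the duration locally before uploading. The sketch below handles WAV files only, using the standard-library wave module; the helper names are illustrative.

```python
import wave


def wav_duration_seconds(source):
    """Return the duration of a WAV file (path or binary file object) in seconds."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_clip(source, min_s=10.0, max_s=30.0):
    """Raise ValueError if the clip falls outside the 10-30 second window."""
    duration = wav_duration_seconds(source)
    if not min_s <= duration <= max_s:
        raise ValueError(f"reference clip is {duration:.1f}s; expected {min_s}-{max_s}s")
    return duration
```

Other formats such as MP3 or FLAC would need a third-party decoder to measure.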

Voice Changer

POST /v1/voice-convert/

Convert audio to a different voice. Upload the source audio and choose the target voice.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice | string | Yes | ID of the target voice (use /v1/voices/ to list available voices)
model | string | No | Voice conversion model: openvoice (default), knn-vc
format | string | No | Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file in the original language. Max 100MB.
target_language | string | Yes | Target language code (e.g., es, fr, de, ja)
voice | string | No | Voice for the translated output. Selected automatically if omitted.
preserve_voice | boolean | No | Attempt to preserve the original speaker's voice

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transform speaking style, emotion, or delivery while preserving the content. Used for audio editing, pacing adjustments, and expressive delivery.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file. Max 50MB.
voice | string | Yes | Target voice ID for the output audio
model | string | No | Model: openvoice (default), chatterbox
emotion | string | No | Target emotion: neutral, happy, sad, angry, excited
speed | float | No | Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the converted audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem separation, and more.

POST /v1/audio/enhance/

Improve audio quality: remove noise, enhance clarity, and apply super resolution.

file | file | Audio file to enhance
denoise | boolean | Enable denoising (default: true)
enhance_clarity | boolean | Enhance speech clarity (default: true)
super_resolution | boolean | Upscale audio quality (default: false)
strength | integer | 1-3 (light, medium, strong). Default: 2
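Multipart form fields are sent as strings, so boolean and integer options need converting first. The sketch below assumes booleans are written as lowercase true/false, matching the cURL examples in this document (for instance timestamps=true); the helper name is illustrative.

```python
def to_form_fields(**params):
    """Convert Python values to multipart form strings, dropping unset options."""
    fields = {}
    for name, value in params.items():
        if value is None:
            continue  # let the API apply its default
        if isinstance(value, bool):
            fields[name] = "true" if value else "false"
        else:
            fields[name] = str(value)
    return fields
```

The result can be passed as the data argument alongside the uploaded file.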

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split a track into stems.

file | file | Audio file to separate
model | string | demucs (default) or spleeter
stems | integer | Number of stems: 2, 4, 5, or 6 (default: 2)
format | string | Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio recordings.

file | file | Audio file to process
type | string | echo or reverb (default: both)
intensity | integer | 1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect its key, BPM, and time signature.

file | file | Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}

POST /v1/audio/convert/ (Free)

Convert audio between formats.

file | file | Audio file to convert
format | string | Target format: mp3, wav, flac, ogg, m4a, aac
bitrate | integer | Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate | integer | Sample rate: 22050, 44100, 48000
channels | string | mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter | Type | Required | Description
audio | file | No* | Audio input (either audio or text is required)
text | string | No* | Text input (either audio or text is required)
voice | string | No | Voice for the AI response. Default: af_bella
tts_model | string | No | TTS model for the response. Default: kokoro
system_prompt | string | No | System prompt for the AI
conversation_id | string | No | Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
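To continue a conversation, the conversation_id from one response is passed back with the next request. A minimal sketch of that hand-off for text input; the helper name next_chat_payload is illustrative.

```python
def next_chat_payload(text, previous=None, voice=None):
    """Build the JSON body for the next chat turn, reusing the conversation ID."""
    payload = {"text": text}
    if previous is not None:
        # Carry the ID forward so the API treats this as the same conversation.
        payload["conversation_id"] = previous["conversation_id"]
    if voice is not None:
        payload["voice"] = voice
    return payload
```

Omitting `previous` starts a new conversation.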

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
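The capability fields in this response make it possible to pick a model programmatically. The sketch below selects the cheapest TTS model for a language, optionally requiring cloning support. It assumes the field names shown in the example response; the selection rule itself is only one possible policy.

```python
def cheapest_model(models, language, need_cloning=False):
    """Return the lowest-cost TTS model for a language, or None if none match."""
    candidates = [
        m for m in models
        if m["type"] == "tts"
        and language in m["languages"]
        and (m["supports_cloning"] or not need_cloning)
    ]
    return min(candidates, key=lambda m: m["credits_per_1k_chars"], default=None)
```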

List Voices

GET /v1/voices/

Returns a list of all available voices, which can be filtered by model, language, or gender.

Query Parameters

Parameter | Type | Description
model | string | Filter by model ID (e.g., kokoro)
language | string | Filter by language code (e.g., en)
gender | string | Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
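The query parameters can be assembled with the standard library so that unset filters are left out of the URL. The helper name voices_url is illustrative.

```python
from urllib.parse import urlencode

VOICES_URL = "https://api.tts.ai/v1/voices/"


def voices_url(model=None, language=None, gender=None):
    """Build the voices listing URL, including only the filters that are set."""
    filters = {"model": model, "language": language, "gender": gender}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{VOICES_URL}?{query}" if query else VOICES_URL
```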

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
)

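# Fail fast on an error status instead of saving a JSON error body as audio.
response.raise_for_status()

with open("output.mp3", "wb") as f:
    f.write(response.content)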

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Speech to Text
with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "faster-whisper", "timestamps": "true"}
    )

result = response.json()
print(result["text"])

Voice Cloning

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Voice Cloning
with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )

with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

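Speech to Text

JavaScript - fetch
// Speech to Text
// audioFile is a File or Blob (for example from an <input type="file">);
// API_KEY is defined as in the previous example.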
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

All errors return a JSON response with an error field.

Error Response Format
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough credits for this request.",
    "credits_required": 4,
    "credits_available": 2
  }
}

HTTP Status | Error Code | Description
400 | bad_request | Invalid request parameters. Check the error message for details.
401 | unauthorized | Missing or invalid API key.
402 | insufficient_credits | Not enough credits. Purchase more at /pricing/.
403 | forbidden | API access is not included in your plan.
404 | not_found | Model or voice not found.
413 | file_too_large | The uploaded file exceeds the size limit.
429 | rate_limited | Too many requests. Check the rate limit headers in the response.
500 | internal_error | Server error. Try again later.
503 | model_loading | The model is loading. Retry in a few minutes.
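Client code typically separates errors worth retrying from those that need a change to the request. The sketch below treats 429, 500, and 503 as retryable, following the descriptions in the table, and produces exponential backoff delays. The retry policy, delay values, and function names are assumptions for illustration, not documented API behavior.

```python
# Statuses the table describes as temporary (assumed retry policy).
RETRYABLE_STATUSES = {429, 500, 503}


def parse_error(body):
    """Return (code, message) from a JSON error response body."""
    err = body.get("error", {})
    return err.get("code"), err.get("message")


def should_retry(status):
    """True for statuses that may succeed on a later attempt."""
    return status in RETRYABLE_STATUSES


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Errors such as 402 or 413 are not retried because repeating the same request would fail in the same way.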

Webhooks

For long-running tasks (stem separation, batch TTS), you can provide a webhook_url parameter. When the task completes, we will POST the results to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}

Webhook results are available for download for 24 hours after completion. Make sure you download them promptly.
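A webhook receiver can derive the task duration and the download deadline from the payload timestamps. The sketch below assumes the payload fields shown in the example and the 24-hour retention window stated above; the function names are illustrative.

```python
from datetime import datetime, timedelta


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2025-01-15T10:30:00Z."""
    # Swap the trailing Z for an explicit offset so older Python versions accept it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_webhook(payload):
    """Extract status, elapsed time, and the download deadline from a payload."""
    created = parse_timestamp(payload["created_at"])
    completed = parse_timestamp(payload["completed_at"])
    return {
        "task_id": payload["task_id"],
        "succeeded": payload["event"] == "task.completed" and payload["status"] == "success",
        "elapsed_seconds": (completed - created).total_seconds(),
        # Results are kept for 24 hours after completion.
        "download_deadline": completed + timedelta(hours=24),
    }
```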

Ready to build?

Get your API key and start integrating TTS.ai into your applications.