API Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API | OpenAI-compatible | JSON requests | Streaming support

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request/response bodies.

API Key

Get your API key from Account Settings. API keys are available on the Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authentication

Bearer token via the Authorization header

Authentication

All API requests require authentication using a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API keys secret. Do not expose them in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
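
One way to keep the key out of source code is to load it from the environment at runtime. The sketch below assumes an environment variable named TTS_AI_API_KEY; that name and both helper functions are hypothetical and not part of the API.

```python
import os


def auth_headers(api_key=None):
    """Build the Authorization header for a TTS.ai request.

    The key is read from the TTS_AI_API_KEY environment variable
    (a hypothetical name chosen for this sketch) unless passed explicitly.
    """
    key = api_key or os.environ.get("TTS_AI_API_KEY")
    if not key:
        raise RuntimeError("Set TTS_AI_API_KEY or pass api_key explicitly")
    return {"Authorization": f"Bearer {key}"}


def redact(key):
    """Return a log-safe form of the key that shows only its last four characters."""
    return "..." + key[-4:]
```

Logging redact(key) instead of the key itself keeps the full secret out of log files.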

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan        Requests/min  Concurrent  Max text length
Pro         60            5           5,000 chars
Enterprise  300           20          50,000 chars

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
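
A client can read these headers after each call and pause before it runs out of requests. The sketch below assumes all three header values are integers. The unit of X-RateLimit-Reset (epoch timestamp or seconds remaining) is not stated here, so the value is stored as-is and not interpreted. The RateLimit class and both functions are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class RateLimit:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset: int       # X-RateLimit-Reset (unit not documented; kept as-is)


def parse_rate_limit(headers):
    """Extract the three rate limit headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset=int(lowered["x-ratelimit-reset"]),
    )


def should_pause(rate):
    """True when the current window has no requests left."""
    return rate.remaining <= 0
```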

Credit Costs

Service                                           Cost         Unit
TTS (Free models: Piper, VITS, MeloTTS)           1 credit     per 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.)  2 credits    per 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.)  4 credits    per 1,000 characters
Speech to Text                                    2 credits    per minute of audio
Voice Cloning                                     4 credits    per 1,000 characters
Voice Changer                                     3 credits    per minute of audio
Audio Enhancement                                 2 credits    per minute of audio
Vocal Removal / Stem Splitting                    3-4 credits  per minute of audio
Speech Translation                                5 credits    per minute of audio
Voice Chat                                        3 credits    per turn
Key & BPM Finder                                  Free         --
Audio Converter                                   Free         --
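
To budget a job before sending it, the per-1,000-character TTS prices can be turned into a rough estimate. The rounding rule is not stated here, so this sketch assumes usage is rounded up to whole 1,000-character blocks. The tier names and the function are illustrative.

```python
import math

# Credits per 1,000 characters for the three TTS tiers listed above.
TTS_CREDITS_PER_1K = {"free": 1, "standard": 2, "premium": 4}


def estimate_tts_credits(text, tier):
    """Estimate the credit cost of synthesizing `text` on a model tier.

    Assumes usage is rounded up to whole 1,000-character blocks.
    """
    blocks = math.ceil(len(text) / 1000)
    return blocks * TTS_CREDITS_PER_1K[tier]
```

Under that assumption, a 2,500-character script on a premium model counts as three blocks, or 12 credits.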

Text to Speech

POST /v1/tts/

Convert text to speech. Returns an audio file in the requested format.

Request Body

Parameter  Type     Required  Description
model      string   Yes       Model ID (e.g., kokoro, chatterbox, piper)
text       string   Yes       Text to convert to speech (max 5,000 characters for Pro, 50,000 for Enterprise)
voice      string   Yes       Voice ID (use /v1/voices/ to list available voices)
format     string   No        Output format: mp3 (default), wav, flac, ogg
speed      float    No        Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language   string   No        Language code (e.g., en, es). Auto-detected if not provided.
stream     boolean  No        Enable streaming response. Default: false
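
Checking these constraints on the client avoids spending a request on a body the server will reject. The following is a minimal sketch, not an official client: the function name, the plan argument, and the audio_format keyword are assumptions, while the limits come from the parameter list above.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}
MAX_CHARS = {"pro": 5000, "enterprise": 50000}


def build_tts_payload(model, text, voice, plan="pro", audio_format="mp3",
                      speed=1.0, language=None, stream=False):
    """Validate the documented constraints and return a JSON-ready dict."""
    if not (model and text and voice):
        raise ValueError("model, text and voice are required")
    if len(text) > MAX_CHARS[plan]:
        raise ValueError(f"text exceeds {MAX_CHARS[plan]} characters for the {plan} plan")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {audio_format}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    payload = {"model": model, "text": text, "voice": voice,
               "format": audio_format, "speed": speed, "stream": stream}
    if language is not None:
        # Leave the field out otherwise so the API can auto-detect the language.
        payload["language"] = language
    return payload
```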

Example

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with the appropriate Content-Type header (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498
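
These headers are enough to pick a file extension and track credit usage without parsing the audio itself. In the sketch below, only audio/mpeg and audio/wav come from the text above. The flac and ogg content types are assumptions, and the function name is illustrative.

```python
# Content-Type to file extension. Only audio/mpeg and audio/wav are named in
# the documentation; the flac and ogg entries are assumptions.
EXTENSIONS = {
    "audio/mpeg": ".mp3",
    "audio/wav": ".wav",
    "audio/flac": ".flac",  # assumed
    "audio/ogg": ".ogg",    # assumed
}


def describe_audio_response(headers):
    """Summarise the documented response headers of a TTS call."""
    # Drop any parameters such as "; charset=..." before the lookup.
    content_type = headers.get("Content-Type", "").split(";")[0].strip()
    return {
        "extension": EXTENSIONS.get(content_type, ".bin"),
        "credits_used": int(headers["X-Credits-Used"]),
        "credits_remaining": int(headers["X-Credits-Remaining"]),
    }
```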

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with auto-detection.

Request Body (multipart/form-data)

Parameter   Type     Required  Description
file        file     Yes       Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model       string   No        STT model: whisper (default), faster-whisper, sensevoice
language    string   No        Language code. Use auto for automatic detection (default).
timestamps  boolean  No        Include word-level timestamps. Default: false
diarize     boolean  No        Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
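
When diarization is on, consecutive segments from the same speaker are usually easier to read as a single turn. The helper below is a sketch that relies only on the fields shown in the sample response. It assumes the speaker field may be absent when diarization is off, and the function name is illustrative.

```python
def speaker_turns(result):
    """Merge consecutive segments from the same speaker into turns."""
    turns = []
    for seg in result.get("segments", []):
        speaker = seg.get("speaker")  # may be missing without diarization (assumption)
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker as the previous segment: extend that turn.
            turns[-1]["end"] = seg["end"]
            turns[-1]["text"] += " " + seg["text"]
        else:
            turns.append({"speaker": speaker, "start": seg["start"],
                          "end": seg["end"], "text": seg["text"]})
    return turns
```

Applied to the sample response above, this yields a single SPEAKER_00 turn running from 0.0 to 3.5 seconds.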

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload a reference audio sample and the text to synthesize.

Request Body (multipart/form-data)

Parameter        Type    Required  Description
reference_audio  file    Yes       Reference voice audio (10-30 seconds recommended). Max 20MB.
text             string  Yes       Text to speak in the cloned voice.
model            string  No        Clone model: chatterbox (default), cosyvoice2, gpt-sovits
format           string  No        Output format: mp3 (default), wav, flac
language         string  No        Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, the same as the TTS endpoint.
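
A reference sample can be checked locally against the 10-30 second recommendation and the 20MB limit before it is uploaded. This sketch covers WAV input only, using Python's standard wave module. It treats 20MB as 20 * 1024 * 1024 bytes, which is an assumption, and the function names are illustrative.

```python
import io
import wave

MAX_REFERENCE_BYTES = 20 * 1024 * 1024  # assumes "20MB" means 20 MiB


def wav_duration_seconds(wav_bytes):
    """Return the duration of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_reference_audio(wav_bytes):
    """Return a list of warnings for a WAV reference sample (empty if it looks fine)."""
    warnings = []
    if len(wav_bytes) > MAX_REFERENCE_BYTES:
        warnings.append("file is larger than 20MB")
    duration = wav_duration_seconds(wav_bytes)
    if not 10 <= duration <= 30:
        warnings.append(f"duration {duration:.1f}s is outside the recommended 10-30s")
    return warnings
```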

Voice Changer

POST /v1/voice-convert/

Convert audio so that it sounds like a different voice. Upload the source audio and select a target voice.

Request Body (multipart/form-data)

Parameter     Type    Required  Description
file          file    Yes       Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice  string  Yes       Target voice ID to convert to (use /v1/voices/ to list available voices)
model         string  No        Voice conversion model: openvoice (default), knn-vc
format        string  No        Output format: wav (default), mp3, flac

Example

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter        Type     Required  Description
file             file     Yes       Source audio file in the original language. Max 100MB.
target_language  string   Yes       Target language code (e.g., es, fr, de, ja)
voice            string   No        Voice for the translated output. Auto-selected if not specified.
preserve_voice   boolean  No        Attempt to preserve the original speaker's voice

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transform the voice style, emotion, or delivery of existing audio. Useful for adjusting tone, pacing, and expressiveness.

Request Body (multipart/form-data)

Parameter  Type    Required  Description
file       file    Yes       Source audio file. Max 50MB.
voice      string  Yes       Target voice ID for the output audio
model      string  No        Model: openvoice (default), chatterbox
emotion    string  No        Target emotion: neutral, happy, sad, angry, excited
speed      float   No        Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Improve audio quality: denoising, clarity enhancement, super resolution.

file              file     Audio file to enhance
denoise           boolean  Enable denoising (default: true)
enhance_clarity   boolean  Enhance speech clarity (default: true)
super_resolution  boolean  Upscale audio quality (default: false)
strength          integer  1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split a track into stems.

file    file     Audio file to separate
model   string   demucs (default) or spleeter
stems   integer  Number of stems: 2, 4, 5, or 6 (default: 2)
format  string   Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from recordings.

file       file     Audio file to process
type       string   echo or reverb (default: both)
intensity  integer  1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

file  file  Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}
POST /v1/audio/convert/ (Free)

Convert audio between formats.

file         file     Audio file to convert
format       string   Target format: mp3, wav, flac, ogg, m4a, aac
bitrate      integer  Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate  integer  Sample rate in Hz: 22050, 44100, 48000
channels     string   mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter        Type    Required  Description
audio            file    No*       Audio input (either audio or text is required)
text             string  No*       Text input (either audio or text is required)
voice            string  No        Voice for the AI response. Default: af_bella
tts_model        string  No        TTS model to use. Default: kokoro
system_prompt    string  No        Custom system prompt for the AI
conversation_id  string  No        Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
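
A multi-turn conversation reuses the conversation_id returned by the previous turn. The sketch below builds the JSON body for text turns only. The function name is illustrative, and the defaults mirror the parameter table above.

```python
def next_turn(text, previous_response=None, voice="af_bella", tts_model="kokoro"):
    """Build the JSON body for a text turn, continuing a conversation if one exists."""
    if not text:
        raise ValueError("text is required for a text turn")
    body = {"text": text, "voice": voice, "tts_model": tts_model}
    if previous_response is not None:
        # Reuse the ID returned by the previous turn so the AI keeps its context.
        body["conversation_id"] = previous_response["conversation_id"]
    return body
```

The first call omits conversation_id. Each later call passes the parsed JSON of the previous response.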

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
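
The capability fields make it possible to choose a model in code instead of hard-coding an ID. As a sketch, the helper below picks the cheapest TTS model that meets the given requirements. It uses only fields shown in the sample response, and the function name is illustrative.

```python
def cheapest_model(models, require_cloning=False, language=None):
    """Return the lowest-cost TTS model matching the filters, or None if none match."""
    candidates = [
        m for m in models
        if m["type"] == "tts"
        and (not require_cloning or m["supports_cloning"])
        and (language is None or language in m["languages"])
    ]
    return min(candidates, key=lambda m: m["credits_per_1k_chars"], default=None)
```

Pass it the models list from the parsed response, for example cheapest_model(data["models"], require_cloning=True).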

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter  Type    Description
model      string  Filter by model ID (e.g., kokoro)
language   string  Filter by language code (e.g., en)
gender     string  Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
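
The filters are passed as ordinary query-string parameters. A minimal sketch of building the request URL with the standard library follows. The function name is illustrative, and filters that are not set are left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tts.ai/v1/"


def voices_url(model=None, language=None, gender=None):
    """Build the /v1/voices/ URL with only the filters that were supplied."""
    filters = {"model": model, "language": language, "gender": gender}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = BASE_URL + "voices/"
    return f"{url}?{query}" if query else url
```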

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
)
response.raise_for_status()  # raise on 4xx/5xx instead of saving an error body as audio

with open("output.mp3", "wb") as f:
    f.write(response.content)

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
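# Speech to Text
import requests

API_KEY = "sk-tts-your-key"

with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},  # sent as multipart/form-data
        data={"model": "faster-whisper", "timestamps": "true"}
    )

response.raise_for_status()  # raise on 4xx/5xx before parsing the body
result = response.json()
print(result["text"])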

Voice Cloning

Python - requests
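# Voice Cloning
import requests

API_KEY = "sk-tts-your-key"

with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},  # sent as multipart/form-data
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )

response.raise_for_status()  # raise on 4xx/5xx instead of saving an error body
# No format was requested, so the output uses the default (mp3).
with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)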

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

Speech to Text

JavaScript - fetch
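// Speech to Text
// audioFile is a File or Blob, e.g. taken from an <input type="file"> element;
// API_KEY is defined as in the Text to Speech example.
const formData = new FormData();
formData.append('file', audioFile);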
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

All errors return a JSON response with an error field.

Error Response Format
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough credits for this request.",
    "credits_required": 4,
    "credits_available": 2
  }
}

HTTP Status  Error Code            Description
400          bad_request           Invalid request parameters. Check the error message for details.
401          unauthorized          Missing or invalid API key.
402          insufficient_credits  Not enough credits. Purchase more at /pricing/.
403          forbidden             API access is not available on your plan.
404          not_found             Model or voice not found.
413          file_too_large        Uploaded file exceeds the size limit.
429          rate_limited          Too many requests. Check the rate limit headers.
500          internal_error        Server error. Retry later.
503          model_loading         Model is loading. Retry in a few seconds.
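
Of these errors, 429, 500, and 503 are described as temporary, so a client can retry them automatically and surface the rest to the caller. The backoff schedule below is one common choice, not something the API prescribes. The function names and the default delays are assumptions.

```python
# Status codes whose descriptions above suggest retrying.
RETRYABLE = {429, 500, 503}


def is_retryable(status):
    """True for errors that are expected to clear up on their own."""
    return status in RETRYABLE


def backoff_delays(max_attempts=5, base=1.0, cap=30.0):
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_attempts)]


def error_summary(body):
    """Format the error object of a JSON error response as 'code: message'."""
    error = body.get("error", {})
    return f"{error.get('code', 'unknown')}: {error.get('message', '')}"
```

A retry loop would sleep for each delay in turn while is_retryable(status) holds, and stop as soon as a request succeeds or a non-retryable status such as 402 comes back.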

Webhooks

For long-running tasks (stem splitting, batch TTS), you can provide a webhook_url parameter. When the task completes, we will send the results to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}
Webhook results are available for download for 24 hours after completion. Make sure to download them promptly.
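
A webhook receiver can work out from the payload how long the task took and when the result will stop being available. The sketch below uses only fields from the sample payload and assumes the timestamps are always UTC with a trailing Z, as in that sample. The function names are illustrative.

```python
from datetime import datetime, timedelta


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp with a trailing Z into a timezone-aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_task(payload):
    """Return the processing time and download deadline for a completed task."""
    created = parse_timestamp(payload["created_at"])
    completed = parse_timestamp(payload["completed_at"])
    return {
        "task_id": payload["task_id"],
        "succeeded": payload["status"] == "success",
        "processing_seconds": (completed - created).total_seconds(),
        # Results are kept for 24 hours after completion.
        "download_before": completed + timedelta(hours=24),
    }
```

For the sample payload this gives a processing time of 45 seconds and a download deadline of 2025-01-16T10:30:45 UTC.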

Ready to integrate?

Get your API key and start integrating TTS.ai into your applications.