API Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API | OpenAI-compatible | JSON requests | Streaming support

Overview

The TTS.ai API gives programmatic access to every tool on the platform: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API follows standard REST conventions and uses JSON request and response bodies.

API Key

Get your API key from Account Settings. API access is available on the Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Authorization

Bearer token via the Authorization header

Authentication

All API requests must be authenticated with a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API keys secret. Do not expose them in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
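One way to keep the key out of source code is to read it from the environment at runtime. The sketch below is a minimal illustration, not part of any official SDK; the variable name TTS_AI_API_KEY and the helper auth_headers are assumptions made for this example.

```python
import os


def auth_headers(env_var: str = "TTS_AI_API_KEY") -> dict:
    """Return the Authorization header, reading the key from the environment.

    TTS_AI_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.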

SDKs

Official SDKs make it easy to integrate TTS.ai into your application. They are open source and available on GitHub.

Python

pip install ttsai
from tts_ai import TTSClient

client = TTSClient(api_key="sk-tts-...")
audio = client.generate(
    text="Hello world!",
    model="kokoro"
)
client.save(audio, "output.wav")

JavaScript / Node.js

npm install @ttsainpm/ttsai
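const { TTSClient } = require('@ttsainpm/ttsai');

const client = new TTSClient({
  apiKey: 'sk-tts-...'
});

// CommonJS modules do not allow top-level await, so run inside an async function.
async function main() {
  const audio = await client.generate({
    input: 'Hello world!',
    model: 'kokoro'
  });
  await client.saveToFile(audio, 'output.wav');
}

main().catch(console.error);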

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to the base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/
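As a small illustration of resolving endpoint paths against the base URL, the sketch below uses the standard library's urljoin. The helper name endpoint is hypothetical and exists only for this example.

```python
from urllib.parse import urljoin

BASE_URL = "https://api.tts.ai/v1/"


def endpoint(path: str) -> str:
    """Resolve an endpoint path against the base URL.

    A leading slash is stripped first: urljoin would otherwise treat the path
    as absolute and silently drop the /v1/ prefix.
    """
    return urljoin(BASE_URL, path.lstrip("/"))
```

For example, endpoint("tts/") and endpoint("/tts/") both resolve to the TTS endpoint shown above.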

Rate Limits

API rate limits vary by plan:

Plan        Requests/min  Concurrency  Max text length
Free        10            2            500 chars
Starter     30            3            100,000 chars
Pro         60            5            100,000 chars
Enterprise  300           20           50,000 chars

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
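A client can read these headers after each response to decide whether to pause. The sketch below is one possible approach; the helper names are hypothetical. The documentation does not say whether X-RateLimit-Reset is a Unix timestamp or a number of seconds, so the value is returned as-is rather than interpreted.

```python
def rate_limit_state(headers: dict) -> dict:
    """Extract the three rate limit headers as integers (None when absent)."""

    def to_int(name: str):
        try:
            return int(headers[name])
        except (KeyError, TypeError, ValueError):
            return None

    return {
        "limit": to_int("X-RateLimit-Limit"),
        "remaining": to_int("X-RateLimit-Remaining"),
        # Raw value only: the unit of the reset header is not documented.
        "reset": to_int("X-RateLimit-Reset"),
    }


def is_exhausted(headers: dict) -> bool:
    """True when the headers report that no requests remain in the window."""
    remaining = rate_limit_state(headers)["remaining"]
    return remaining is not None and remaining <= 0
```

A plain dict lookup is case-sensitive, so pass the case-insensitive header mapping your HTTP client provides, or normalize the names first.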

Credit Costs

Service                                           Cost                    Unit
TTS (Free models: Piper, VITS, MeloTTS)           1,000 characters        per 1,000 characters
TTS (Standard models: Kokoro, CosyVoice 2, etc.)  2,000 characters        per 1,000 characters
TTS (Premium models: Tortoise, Chatterbox, etc.)  4,000 characters        per 1,000 characters
Speech to Text                                    2,000 characters        per minute of audio
Voice Cloning                                     4,000 characters        per 1,000 characters
Voice Changer                                     3,000 characters        per minute of audio
Audio Enhancement                                 2,000 characters        per minute of audio
Vocal Removal / Stem Splitting                    3,000-4,000 characters  per minute of audio
Speech Translation                                5,000 characters        per minute of audio
Voice Chat                                        3,000 characters        per minute
Key & BPM Finder                                  Free                    --
Audio Converter                                   Free                    --
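To make the per-1,000-character pricing concrete, the sketch below estimates the cost of a TTS request from the three TTS rows of the table. It assumes cost scales linearly with text length and is not rounded; the actual rounding rule is not documented, so treat the result as an estimate. The tier keys and the function name are hypothetical.

```python
# Characters charged per 1,000 characters of input, from the TTS rows of the table.
TTS_COST_PER_1K = {"free": 1000, "standard": 2000, "premium": 4000}


def estimate_tts_cost(text: str, tier: str) -> float:
    """Estimate the characters charged for synthesizing `text` on a model tier.

    Assumes simple linear proration with no rounding or minimum charge.
    """
    if tier not in TTS_COST_PER_1K:
        raise ValueError(f"Unknown tier: {tier!r}")
    return len(text) / 1000 * TTS_COST_PER_1K[tier]
```

Under these assumptions, a 500-character text on a standard model such as Kokoro would be charged 500 / 1000 * 2000 = 1,000 characters.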

Text to Speech

POST /v1/tts/

Convert text to speech. Returns an audio file in the requested format.

Request Body

Parameter  Type     Required  Description
model      string   Yes       Model ID (e.g., kokoro, chatterbox, piper)
text       string   Yes       Text to convert to speech (max 100,000 characters per request)
voice      string   Yes       Voice ID (use /v1/voices/ to list available voices)
format     string   No        Output format: mp3 (default), wav, flac, ogg
speed      float    No        Speech rate. Default: 1.0. Range: 0.5 to 2.0
language   string   No        Language code (e.g., en, es). Auto-detected if not provided.
stream     boolean  No        Enable streaming response. Default: false
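The constraints in the parameter table can be checked on the client before a request is sent. The sketch below is a hypothetical helper (build_tts_payload is not part of the API or the SDK) that validates the documented limits and builds the JSON body.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}
MAX_TEXT_CHARS = 100_000


def build_tts_payload(model, text, voice, fmt="mp3", speed=1.0,
                      language=None, stream=False) -> dict:
    """Validate the documented limits and return a JSON-ready request body."""
    if not text or len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text must be 1 to {MAX_TEXT_CHARS} characters")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    payload = {
        "model": model,
        "text": text,
        "voice": voice,
        "format": fmt,
        "speed": speed,
        "stream": stream,
    }
    # language is optional; omit it so the server auto-detects.
    if language is not None:
        payload["language"] = language
    return payload
```

The server remains the authority on validation; a check like this only saves a round trip for obviously invalid input.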

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with the appropriate Content-Type (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with automatic language detection.

Request Body (multipart/form-data)

Parameter   Type     Required  Description
file        file     Yes       Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model       string   No        STT model: whisper (default), faster-whisper, sensevoice
language    string   No        Language code. Use auto to detect the language automatically (default).
timestamps  boolean  No        Include word-level timestamps. Default: false
diarize     boolean  No        Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
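The segments array can be turned into a readable, speaker-labelled transcript. The sketch below works on the response shape shown above; format_segments is a hypothetical helper, and the fallback label for a missing speaker field is an assumption, since the documentation does not say whether speaker is present when diarization is off.

```python
def format_segments(result: dict) -> list:
    """Render each segment as '[start-end] SPEAKER: text'."""
    lines = []
    for seg in result.get("segments", []):
        # Assumed fallback: label the speaker UNKNOWN when the field is absent.
        speaker = seg.get("speaker", "UNKNOWN")
        lines.append(
            f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}: {seg['text']}"
        )
    return lines
```

Calling it on the parsed JSON body yields one line per segment, which can be joined with newlines for display.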

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload a reference recording and the text to synthesize.

Request Body (multipart/form-data)

Parameter        Type    Required  Description
reference_audio  file    Yes       Reference audio file (10-30 seconds recommended). Max 20MB.
text             string  Yes       Text to speak in the cloned voice.
model            string  No        Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format           string  No        Output format: mp3 (default), wav, flac
language         string  No        Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, like the TTS endpoint.

Voice Changer

POST /v1/voice-convert/

Convert audio so that it sounds like a different voice. Upload the source audio and choose a target voice.

Request Body (multipart/form-data)

Parameter     Type    Required  Description
file          file    Yes       Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice  string  Yes       Target voice ID (use /v1/voices/ to list available voices)
model         string  No        Voice conversion model: openvoice (default), knn-vc
format        string  No        Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter        Type     Required  Description
file             file     Yes       Source audio file in the original language. Max 100MB.
target_language  string   Yes       Target language code (e.g., es, fr, de, ja)
voice            string   No        Voice for the translated output. Auto-selected if not specified.
preserve_voice   boolean  No        Try to preserve the original speaker's voice characteristics. Default: false

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Change the voice, emotion, or delivery of a recording while keeping its content. Useful for adjusting tone, pacing, and expression.

Request Body (multipart/form-data)

Parameter  Type    Required  Description
file       file    Yes       Source speech file. Max 50MB.
voice      string  Yes       Target voice ID for the output speech
model      string  No        Model: openvoice (default), chatterbox
emotion    string  No        Target emotion: neutral, happy, sad, angry, excited
speed      float   No        Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Tools

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Enhance audio quality: denoising, clarity enhancement, super resolution.

file              file     Audio file to enhance
denoise           boolean  Enable denoising (default: true)
enhance_clarity   boolean  Improve speech clarity (default: true)
super_resolution  boolean  Upscale audio quality (default: false)
strength          integer  1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split a track into stems.

file    file     Audio file to separate
model   string   demucs (default) or spleeter
stems   integer  Number of stems: 2, 4, 5, or 6 (default: 2)
format  string   Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from recordings.

file       file     Audio file to process
type       string   echo or reverb (default: both)
intensity  integer  1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

file  file  Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}
POST /v1/audio/convert/ (Free)

Convert audio between formats.

file         file     Audio file to convert
format       string   Target format: mp3, wav, flac, ogg, m4a, aac
bitrate      integer  Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate  integer  Sample rate in Hz: 22050, 44100, 48000
channels     string   mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter        Type    Required  Description
audio            file    No*       Audio input (either audio or text is required)
text             string  No*       Text input (either audio or text is required)
voice            string  No        Voice for the AI response. Default: af_bella
tts_model        string  No        TTS model to use. Default: kokoro
system_prompt    string  No        Custom system prompt for the AI
conversation_id  string  No        Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}

Batch TTS

POST /v1/tts/batch/

Submit multiple texts for TTS generation in parallel. Receive a webhook callback when all jobs are complete.

Parameters

Parameter    Type    Description
texts        array   Array of objects: {text, model, voice}. Max 50 items.
webhook_url  string  Optional URL to POST results to when the batch completes.
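A batch request body can be assembled and checked against the 50-item limit before submission. The sketch below is a hypothetical helper; it assumes that every item must carry all three keys (text, model, voice), which the table implies but does not state explicitly.

```python
MAX_BATCH_ITEMS = 50
REQUIRED_KEYS = {"text", "model", "voice"}


def build_batch_payload(items: list, webhook_url: str = None) -> dict:
    """Validate batch items and return a JSON-ready request body."""
    if not items:
        raise ValueError("at least one item is required")
    if len(items) > MAX_BATCH_ITEMS:
        raise ValueError(f"a batch may contain at most {MAX_BATCH_ITEMS} items")
    for index, item in enumerate(items):
        # Assumption: all three keys are mandatory for each item.
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"item {index} is missing {sorted(missing)}")
    payload = {"texts": [dict(item) for item in items]}
    if webhook_url:
        payload["webhook_url"] = webhook_url
    return payload
```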

Response

JSON Response
{
  "batch_id": "abc123",
  "total": 3,
  "completed": 0,
  "status": "processing"
}

Poll progress with GET /v1/tts/batch/result/?batch_id=abc123
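A polling loop for the result endpoint might look like the sketch below. The response shape of the result endpoint is not documented here, so the code assumes it carries the same status field as the submit response and that any value other than "processing" means the batch has finished. The HTTP call is injected as fetch_status so the loop itself stays independent of any client library; all names are hypothetical.

```python
import time


def wait_for_batch(fetch_status, batch_id: str, interval: float = 2.0,
                   max_attempts: int = 30, sleep=time.sleep) -> dict:
    """Poll until the batch leaves the 'processing' state or attempts run out.

    fetch_status(batch_id) must return the parsed JSON status as a dict.
    """
    for attempt in range(max_attempts):
        status = fetch_status(batch_id)
        # Assumption: any status other than "processing" is terminal.
        if status.get("status") != "processing":
            return status
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"batch {batch_id} still processing after {max_attempts} polls")
```

If you supplied a webhook_url, polling is unnecessary; the callback reports completion instead.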

Voice Embedding

POST /v1/voice-embed/

Pre-compute a voice embedding from reference audio. Use the returned embed_id in subsequent voice cloning requests for faster generation.

Parameters

Parameter  Type    Description
file       file    Reference audio file (WAV, MP3, FLAC).
model      string  Cloning model (default: chatterbox). Supported: chatterbox, cosyvoice2, openvoice, gpt-sovits, spark, indextts2, qwen3-tts.

Response

JSON Response
{
  "embed_id": "emb_abc123",
  "model": "chatterbox",
  "duration_ms": 450
}

Health Check

GET /v1/health/

Check GPU server status, loaded models, and queue size. No authentication required. Responses are cached for 30 seconds.

Response

JSON Response
{
  "status": "online",
  "latency_ms": 45,
  "queue_size": 3,
  "models_loaded": ["kokoro", "chatterbox", "cosyvoice2"]
}
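The health response can be used to decide whether a model is ready before sending work. The sketch below is a hypothetical readiness check; the max_queue threshold of 10 is an arbitrary value chosen for illustration, not a documented limit.

```python
def is_ready(health: dict, model: str, max_queue: int = 10) -> bool:
    """True when the server is online, the model is loaded, and the queue is short.

    max_queue is an illustrative threshold, not something the API defines.
    """
    return (
        health.get("status") == "online"
        and model in health.get("models_loaded", [])
        and health.get("queue_size", 0) <= max_queue
    )
```

Because the endpoint is cached for 30 seconds, a result may be slightly stale; a 503 model_loading error can still occur after a positive check.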

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
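The capability fields make it possible to pick a model programmatically. The sketch below filters the response shape shown above; both helper names are hypothetical.

```python
def cloning_models(catalog: dict) -> list:
    """Return the IDs of models that report supports_cloning."""
    return [m["id"] for m in catalog.get("models", []) if m.get("supports_cloning")]


def cheapest_model(catalog: dict, language: str):
    """Return the ID of the lowest-cost model for a language, or None."""
    candidates = [
        m for m in catalog.get("models", [])
        if language in m.get("languages", [])
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m["credits_per_1k_chars"])["id"]
```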

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter  Type    Description
model      string  Filter by model ID (e.g., kokoro)
language   string  Filter by language code (e.g., en)
gender     string  Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
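The three filters are ordinary query parameters, so a request URL can be built with the standard library. The sketch below is a minimal illustration; voices_url is a hypothetical helper.

```python
from urllib.parse import urlencode

VOICES_URL = "https://api.tts.ai/v1/voices/"


def voices_url(model: str = None, language: str = None, gender: str = None) -> str:
    """Build the List Voices URL, including only the filters that were given."""
    filters = {"model": model, "language": language, "gender": gender}
    params = {key: value for key, value in filters.items() if value is not None}
    return f"{VOICES_URL}?{urlencode(params)}" if params else VOICES_URL
```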

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
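)
# Fail early on HTTP errors so a JSON error body is not saved as audio.
response.raise_for_status()

with open("output.mp3", "wb") as f: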
    f.write(response.content)

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
# Speech to Text (reuses `requests` and API_KEY from the Text to Speech example)
with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "faster-whisper", "timestamps": "true"}
    )
response.raise_for_status()  # surface HTTP errors before parsing the body

result = response.json()
print(result["text"])

Voice Cloning

Python - requests
# Voice Cloning (reuses `requests` and API_KEY from the Text to Speech example)
with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )
response.raise_for_status()  # do not write an error body to the audio file

with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

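// Stop on HTTP errors so a JSON error body is not played as audio.
if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);
const audioBlob = await response.blob();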
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

Speech to Text

JavaScript - fetch
// Speech to Text
// audioFile is assumed to be a File or Blob, e.g. from an <input type="file"> element.
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

Failed requests return a JSON response with an error field.

Error Response
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough characters for this request.",
    "characters_required": 4000,
    "characters_available": 2000
  }
}

HTTP Status  Error Code            Description
400          bad_request           Invalid request parameters. Check the error message for details.
401          unauthorized          Missing or invalid API key.
402          insufficient_credits  Not enough characters. Purchase more at /pricing/.
403          forbidden             API access is not available on your plan.
404          not_found             Model or voice not found.
413          file_too_large        The uploaded file exceeds the size limit.
429          rate_limited          Too many requests. Check the rate limit headers.
500          internal_error        Server error.
503          model_loading         The model is loading. Retry in a few seconds.
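Client code can use the error code and HTTP status to decide whether a retry makes sense. The sketch below is one possible policy, not a documented recommendation: it treats 429 and 503 as retryable because the table suggests waiting in both cases, and everything else as final. The helper names are hypothetical.

```python
import json

# Assumption: only rate limiting and model loading are worth retrying.
RETRYABLE_STATUSES = {429, 503}


def parse_error(body: str) -> tuple:
    """Return (code, message) from an error body, or ('unknown', body)."""
    try:
        error = json.loads(body)["error"]
        return error["code"], error["message"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not in the documented error shape.
        return "unknown", body


def should_retry(status: int) -> bool:
    """True for statuses where waiting and retrying may succeed."""
    return status in RETRYABLE_STATUSES
```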

Webhooks

For long-running tasks (stem splitting, batch TTS), you can supply a webhook_url parameter. When the task completes, we will POST the results to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}
Webhook results remain available for download for 24 hours after completion.
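A webhook receiver usually needs the task outcome and the time by which the result must be downloaded. The sketch below derives both from the payload shape shown above; summarize_webhook is a hypothetical helper, and it assumes the 24-hour window starts at completed_at.

```python
from datetime import datetime, timedelta


def _parse_timestamp(value: str) -> datetime:
    # Python versions before 3.11 reject a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_webhook(payload: dict) -> dict:
    """Extract the outcome, processing time, and download deadline."""
    created = _parse_timestamp(payload["created_at"])
    completed = _parse_timestamp(payload["completed_at"])
    return {
        "task_id": payload["task_id"],
        "succeeded": payload.get("status") == "success",
        "duration_seconds": (completed - created).total_seconds(),
        # Assumption: the 24-hour availability window starts at completed_at.
        "download_deadline": completed + timedelta(hours=24),
    }
```

Downloading result_url as soon as the callback arrives avoids depending on the exact expiry time.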

Ready to Build?

Get your API key and start integrating TTS.ai into your applications.