API Documentation

Integrate TTS.ai into your applications with our REST API. The OpenAI-compatible format makes migration easy.

REST API, OpenAI-compatible, JSON responses, streaming support

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request and response bodies.

API Key

Get your API key from Account Settings. Available on Pro and Enterprise plans.

Base URL

https://api.tts.ai/v1/

Auth

Bearer token via Authorization header

Authentication

All API requests require authentication via a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not expose it in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.
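
One way to keep the key out of source code is to read it from the environment at runtime. The sketch below assumes an environment variable named TTS_API_KEY, which is a hypothetical name chosen for this example and not something the API defines.

```python
import os


def build_auth_headers(env_var: str = "TTS_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    TTS_API_KEY is a hypothetical variable name used only in this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dict can be passed as the headers argument of any HTTP client call.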

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan        Requests/min  Concurrent  Max text length
Pro         60            5           5,000 chars
Enterprise  300           20          50,000 chars

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
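
A client can read these headers to decide when to slow down. The following is a minimal sketch that assumes all three header values are integers; the unit of X-RateLimit-Reset is not stated here, so the code stores the raw value without interpreting it. The names RateLimitInfo, parse_rate_limit, and should_pause are illustrative.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset: int  # raw header value; its unit is not specified in this document


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract the three rate limit headers (assumed to hold integers)."""
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset=int(headers["X-RateLimit-Reset"]),
    )


def should_pause(info: RateLimitInfo, reserve: int = 1) -> bool:
    """Return True once the remaining budget drops to the chosen reserve."""
    return info.remaining <= reserve
```

With the requests library, response.headers can be passed directly because its header lookup is case-insensitive; a plain dict must use the exact header names.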

Credit Costs

Service                                            Cost         Unit
TTS (free models: Piper, VITS, MeloTTS)            1 credit     per 1,000 characters
TTS (standard models: Kokoro, CosyVoice 2, etc.)   2 credits    per 1,000 characters
TTS (premium models: Tortoise, Chatterbox, etc.)   4 credits    per 1,000 characters
Speech to Text                                     2 credits    per minute of audio
Voice Cloning                                      4 credits    per 1,000 characters
Voice Changer                                      3 credits    per minute of audio
Audio Enhancement                                  2 credits    per minute of audio
Vocal Removal / Stem Splitting                     3-4 credits  per minute of audio
Speech Translation                                 5 credits    per minute of audio
Voice Chat                                         3 credits    per turn
Key & BPM Finder                                   Free         --
Audio Converter                                    Free         --
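
To budget a TTS job before sending it, the per-1,000-character rates above can be turned into a quick estimate. This sketch assumes that a partial block of 1,000 characters is billed as a full block; the rounding rule is not stated in this document, so treat the result as an approximation. The tier keys are illustrative labels for the three TTS rows.

```python
import math

# Credits per 1,000 characters, taken from the TTS rows of the table above.
TTS_CREDITS_PER_1K = {"free": 1, "standard": 2, "premium": 4}


def estimate_tts_credits(text: str, tier: str) -> int:
    """Estimate credits, assuming partial 1,000-character blocks round up."""
    blocks = math.ceil(len(text) / 1000)
    return blocks * TTS_CREDITS_PER_1K[tier]


print(estimate_tts_credits("Hello from TTS.ai! This is a test.", "standard"))  # 2
print(estimate_tts_credits("a" * 2500, "premium"))  # 12
```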

Text to Speech

POST /v1/tts/

Convert text to speech audio. Returns audio file in the requested format.

Request Body

Parameter  Type     Required  Description
model      string   Yes       Model ID (e.g., kokoro, chatterbox, piper)
text       string   Yes       Text to convert to speech (max 5,000 chars for Pro, 50,000 for Enterprise)
voice      string   Yes       Voice ID (use /v1/voices/ to list available voices)
format     string   No        Output format: mp3 (default), wav, flac, ogg
speed      float    No        Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language   string   No        Language code (e.g., en, es). Auto-detected if omitted.
stream     boolean  No        Enable streaming response. Default: false
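
Checking these constraints on the client side avoids spending a request on a payload that would be rejected. The sketch below is one possible validator built only from the limits in the table; the function name build_tts_payload and the audio_format and plan arguments are hypothetical.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}
MAX_CHARS = {"pro": 5_000, "enterprise": 50_000}  # per-plan text limits from the table


def build_tts_payload(model, text, voice, *, audio_format="mp3", speed=1.0, plan="pro"):
    """Return a JSON-ready dict, raising ValueError for out-of-range values."""
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if len(text) > MAX_CHARS[plan]:
        raise ValueError(f"text exceeds {MAX_CHARS[plan]} characters for the {plan} plan")
    return {
        "model": model,
        "text": text,
        "voice": voice,
        "format": audio_format,
        "speed": speed,
    }
```

The resulting dict can be sent as the JSON body of POST /v1/tts/.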

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

Response

Returns the audio file as binary data with the matching Content-Type header (audio/mpeg, audio/wav, etc.).

Response Headers
Content-Type: audio/mpeg
Content-Length: 48256
X-Credits-Used: 2
X-Credits-Remaining: 498

Speech to Text

POST /v1/stt/

Transcribe audio to text. Supports 99 languages with automatic language detection.

Request Body (multipart/form-data)

Parameter   Type     Required  Description
file        file     Yes       Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model       string   No        STT model: whisper (default), faster-whisper, sensevoice
language    string   No        Language code. Use auto for automatic detection (default).
timestamps  boolean  No        Include word-level timestamps. Default: false
diarize     boolean  No        Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
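
The segments array above can be turned into a readable, speaker-labeled transcript. This sketch assumes the speaker field may be absent (for example when diarization is off) and falls back to a placeholder label; that behavior is an assumption, and format_transcript is an illustrative name.

```python
def format_transcript(result: dict) -> str:
    """Render each segment as '[start-end] SPEAKER: text', one per line."""
    lines = []
    for seg in result.get("segments", []):
        speaker = seg.get("speaker", "UNKNOWN")  # assumed fallback label
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}: {seg['text']}")
    return "\n".join(lines)


sample = {
    "text": "Hello, this is a transcription test.",
    "segments": [
        {"start": 0.0, "end": 1.8, "text": "Hello, this is", "speaker": "SPEAKER_00"},
        {"start": 1.8, "end": 3.5, "text": "a transcription test.", "speaker": "SPEAKER_00"},
    ],
}
print(format_transcript(sample))
# [0.0-1.8] SPEAKER_00: Hello, this is
# [1.8-3.5] SPEAKER_00: a transcription test.
```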

Voice Cloning

POST /v1/tts/clone/

Generate speech in a cloned voice. Upload a reference audio clip together with the text to speak.

Request Body (multipart/form-data)

Parameter        Type    Required  Description
reference_audio  file    Yes       Reference voice audio (10-30 seconds recommended). Max 20MB.
text             string  Yes       Text to speak in the cloned voice.
model            string  No        Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format           string  No        Output format: mp3 (default), wav, flac
language         string  No        Target language code. Must be supported by the chosen model.

Response

Returns the audio file as binary data, the same as the TTS endpoint.

Voice Changer

POST /v1/voice-convert/

Convert audio to sound like a different voice. Upload source audio and choose a target voice.

Request Body (multipart/form-data)

Parameter     Type    Required  Description
file          file    Yes       Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice  string  Yes       ID of the voice to convert to (use /v1/voices/ to list available voices)
model         string  No        Voice conversion model: openvoice (default), knn-vc
format        string  No        Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translate spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter        Type     Required  Description
file             file     Yes       Source audio file in the original language. Max 100MB.
target_language  string   Yes       Target language code (e.g., es, fr, de, ja)
voice            string   No        Voice for the translated output. Selected automatically if omitted.
preserve_voice   boolean  No        Attempt to preserve the original speaker's voice

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transform speaking style, emotion, or delivery while preserving the content. Useful for adjusting tone, pacing, and expression.

Request Body (multipart/form-data)

Parameter  Type    Required  Description
file       file    Yes       Source voice audio file. Max 50MB.
voice      string  Yes       ID of the voice for the output speech
model      string  No        Model: openvoice (default), chatterbox
emotion    string  No        Target emotion: neutral, happy, sad, angry, excited
speed      float   No        Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Processing

Audio processing endpoints for enhancement, vocal removal, stem splitting, and more.

POST /v1/audio/enhance/

Enhance audio quality: denoise, improve clarity, super resolution.

file              file     Audio file to enhance
denoise           boolean  Enable noise removal (default: true)
enhance_clarity   boolean  Enhance speech clarity (default: true)
super_resolution  boolean  Upscale audio quality (default: false)
strength          integer  1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separate vocals from instrumentals (vocal removal) or split audio into stems.

file    file     Audio file to separate
model   string   demucs (default) or spleeter
stems   integer  Number of stems: 2, 4, 5, or 6 (default: 2)
format  string   Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Remove echo and reverb from audio recordings.

file       file     Audio file to process
type       string   echo or reverb (default: both)
intensity  integer  1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyze audio to detect key, BPM, and time signature.

file  file  Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}
POST /v1/audio/convert/ (Free)

Convert audio between formats.

file         file     Audio file to convert
format       string   Target format: mp3, wav, flac, ogg, m4a, aac
bitrate      integer  Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate  integer  Sample rate: 22050, 44100, 48000
channels     string   mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter        Type    Required  Description
audio            file    No*       Audio input (either audio or text is required)
text             string  No*       Text input (either audio or text is required)
voice            string  No        Voice for the AI response. Default: af_bella
tts_model        string  No        TTS model for the response. Default: kokoro
system_prompt    string  No        Custom system prompt for the AI
conversation_id  string  No        Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
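
A multi-turn conversation reuses the conversation_id returned by the previous turn. The sketch below only builds the JSON request bodies and uses the sample response above as a stand-in for a real reply; build_chat_payload is an illustrative helper.

```python
from typing import Optional


def build_chat_payload(
    text: str,
    conversation_id: Optional[str] = None,
    voice: str = "af_bella",
    tts_model: str = "kokoro",
) -> dict:
    """Build a JSON body for POST /v1/voice-chat/ using text input."""
    payload = {"text": text, "voice": voice, "tts_model": tts_model}
    if conversation_id:
        # Only include the ID when continuing an existing conversation.
        payload["conversation_id"] = conversation_id
    return payload


first_turn = build_chat_payload("What is the capital of France?")

# Stand-in for the server reply, copied from the sample response above.
reply = {"conversation_id": "conv_abc123", "ai_text": "The capital of France is Paris."}

second_turn = build_chat_payload("And its population?", reply["conversation_id"])
```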

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
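
The capability flags in this response make it easy to pick a model programmatically. As a small sketch, the helper below (models_with is an illustrative name) returns the IDs of models whose fields match every given filter, using a trimmed copy of the sample response above.

```python
def models_with(models_response: dict, **capabilities) -> list:
    """Return IDs of models whose fields equal every given keyword filter."""
    return [
        m["id"]
        for m in models_response["models"]
        if all(m.get(key) == value for key, value in capabilities.items())
    ]


sample = {
    "models": [
        {"id": "kokoro", "tier": "standard", "supports_cloning": False, "supports_streaming": True},
        {"id": "chatterbox", "tier": "premium", "supports_cloning": True, "supports_streaming": True},
    ]
}
print(models_with(sample, supports_cloning=True))    # ['chatterbox']
print(models_with(sample, supports_streaming=True))  # ['kokoro', 'chatterbox']
```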

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter  Type    Description
model      string  Filter by model ID (e.g., kokoro)
language   string  Filter by language code (e.g., fr)
gender     string  Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
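
The filters are ordinary query string parameters, so the request URL can be assembled with the standard library. This is a minimal sketch; voices_url is an illustrative helper, and filters left as None are simply omitted.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tts.ai/v1/"


def voices_url(model=None, language=None, gender=None) -> str:
    """Build the GET /v1/voices/ URL, skipping filters that are not set."""
    filters = {"model": model, "language": language, "gender": gender}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}voices/" + (f"?{query}" if query else "")


print(voices_url(model="kokoro", language="en"))
# https://api.tts.ai/v1/voices/?model=kokoro&language=en
```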

Code Examples

Text to Speech

Python - requests
import requests

API_KEY = "sk-tts-your-key"

# Text to Speech
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
        "format": "mp3"
    }
)

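# Stop on HTTP errors so a JSON error body is not written to output.mp3
response.raise_for_status()

with open("output.mp3", "wb") as f:
    f.write(response.content)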

print(f"Credits used: {response.headers.get('X-Credits-Used')}")

Speech to Text

Python - requests
# Speech to Text
with open("recording.mp3", "rb") as f:
    response = requests.post(
        "https://api.tts.ai/v1/stt/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"model": "faster-whisper", "timestamps": "true"}
    )

result = response.json()
print(result["text"])

Voice Cloning

Python - requests
# Voice Cloning
with open("reference.wav", "rb") as ref:
    response = requests.post(
        "https://api.tts.ai/v1/tts/clone/",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"reference_audio": ref},
        data={
            "text": "This speech uses a cloned voice.",
            "model": "chatterbox"
        }
    )

with open("cloned_output.mp3", "wb") as f:
    f.write(response.content)

Text to Speech

JavaScript - fetch
const API_KEY = 'sk-tts-your-key';

// Text to Speech
const response = await fetch('https://api.tts.ai/v1/tts/', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'kokoro',
    text: 'Hello from TTS.ai!',
    voice: 'af_bella',
    format: 'mp3'
  })
});

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
audio.play();

Speech to Text

JavaScript - fetch
// Speech to Text
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', 'faster-whisper');

const response = await fetch('https://api.tts.ai/v1/stt/', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${API_KEY}` },
  body: formData
});

const result = await response.json();
console.log(result.text);

Text to Speech

cURL
# Text to Speech
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"kokoro","text":"Hello!","voice":"af_bella","format":"mp3"}' \
  -o output.mp3

Speech to Text

cURL
# Speech to Text
curl -X POST https://api.tts.ai/v1/stt/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@recording.mp3" \
  -F "model=faster-whisper" \
  -F "timestamps=true"

Voice Cloning

cURL
# Voice Cloning
curl -X POST https://api.tts.ai/v1/tts/clone/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "reference_audio=@reference.wav" \
  -F "text=This uses a cloned voice." \
  -F "model=chatterbox" \
  -o cloned.mp3

Audio Enhancement

cURL
# Audio Enhancement
curl -X POST https://api.tts.ai/v1/audio/enhance/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@noisy_audio.mp3" \
  -F "denoise=true" \
  -F "enhance_clarity=true" \
  -o enhanced.mp3

Error Codes

All errors return a JSON response with an error field.

Error Response Format
{
  "error": {
    "code": "insufficient_credits",
    "message": "You do not have enough credits for this request.",
    "credits_required": 4,
    "credits_available": 2
  }
}
Status  Error Code            Description
400     bad_request           Invalid request parameters. Check the error message for details.
401     unauthorized          Missing or invalid API key.
402     insufficient_credits  Not enough credits. Purchase more at /pricing/.
403     forbidden             API access is not available on your plan.
404     not_found             Model or voice not found.
413     file_too_large        Uploaded file exceeds the size limit.
429     rate_limited          Too many requests. Check the rate limit headers.
500     internal_error        Server error. Try again later.
503     model_loading         Model is loading. Retry in a few seconds.
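
The table suggests that 429, 500, and 503 are worth retrying, while the other codes indicate a problem with the request or account. The sketch below shows one possible retry loop with exponential backoff; the delay values, the TTSAPIError class, and the send callable (any function returning a status code and a parsed JSON body) are assumptions of this example and not part of the API.

```python
import time

RETRYABLE = {429, 500, 503}  # statuses the table describes as temporary


class TTSAPIError(Exception):
    """Carries the HTTP status and the error.code field from the response."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code


def parse_error(status: int, body: dict) -> TTSAPIError:
    err = body.get("error", {})
    return TTSAPIError(status, err.get("code", "unknown"), err.get("message", ""))


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying temporary errors with backoff."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts - 1:
            raise parse_error(status, body)
        sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing sleep as a parameter keeps the loop easy to test without real delays.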

Webhooks

For long-running tasks (stem separation, batch TTS), you can provide a webhook_url parameter. When the task completes, we will POST the result to your URL.

Webhook Payload
{
  "event": "task.completed",
  "task_id": "task_abc123",
  "status": "success",
  "result_url": "https://api.tts.ai/v1/results/task_abc123",
  "credits_used": 12,
  "created_at": "2025-01-15T10:30:00Z",
  "completed_at": "2025-01-15T10:30:45Z"
}
Webhook results are available for download for 24 hours after completion. Make sure you download them promptly.
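
On the receiving side, the payload can be parsed with the standard library before the result is downloaded. The sketch below handles only the task.completed event shown above; other event names and status values are not documented here, so the code treats anything else as unhandled. handle_webhook is an illustrative name, and wiring it into a web framework is left out.

```python
import json
from datetime import datetime


def _parse_time(value: str) -> datetime:
    # Replace the trailing Z so fromisoformat also works on Python < 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def handle_webhook(raw_body: bytes) -> dict:
    """Summarize a webhook POST body; unknown events are marked unhandled."""
    event = json.loads(raw_body)
    if event.get("event") != "task.completed":
        return {"handled": False}
    elapsed = _parse_time(event["completed_at"]) - _parse_time(event["created_at"])
    return {
        "handled": True,
        "task_id": event["task_id"],
        "succeeded": event["status"] == "success",
        "result_url": event.get("result_url"),
        "seconds": elapsed.total_seconds(),
    }
```

When succeeded is true, result_url is the location to download within the 24-hour window.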

Ready to build?

Get your API key and start integrating TTS.ai into your applications.