Documentation

Integrate TTS.ai into your application using our REST API. The format is OpenAI-compatible for easy migration.

REST API | OpenAI compatible | JSON responses

Overview

The TTS.ai API provides programmatic access to all platform features: text-to-speech synthesis, speech-to-text transcription, voice cloning, audio enhancement, and more. The API uses standard REST conventions with JSON request and response bodies.

API Key

Get your API key from Account Settings.

Base URL

https://api.tts.ai/v1/

Authentication

Bearer token via Authorization header

Authentication

No key is required to get started. Anonymous POSTs to /v1/tts/ work without auth, up to 5,000 characters per day per IP, using one of our free models (piper, vits, melotts, kokoro). Sign up for a free account to get 15,000 bonus characters and access to premium models.

All other API requests require authentication via a Bearer token in the Authorization header.

HTTP Header
Authorization: Bearer sk-tts-your-api-key-here
Keep your API key secret. Do not expose it in client-side code, public repositories, or logs. Rotate keys regularly from your account settings.

SDK

The official SDKs make it easy to integrate TTS.ai into your application. Both are open source and available on GitHub.

Python

pip install ttsai
from tts_ai import TTSClient

client = TTSClient(api_key="sk-tts-...")
audio = client.generate(
    text="Hello world!",
    model="kokoro"
)
client.save(audio, "output.wav")

JavaScript / Node.js

npm install @ttsainpm/ttsai
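const { TTSClient } = require('@ttsainpm/ttsai');

const client = new TTSClient({
  apiKey: 'sk-tts-...'
});

// CommonJS modules do not allow top-level await, so wrap the calls in an async function.
async function main() {
  const audio = await client.generate({
    input: 'Hello world!',
    model: 'kokoro'
  });
  await client.saveToFile(audio, 'output.wav');
}

main().catch(console.error);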

Base URL

Base URL: https://api.tts.ai/v1/

All endpoints are relative to this base URL. For example, the TTS endpoint is:

POST https://api.tts.ai/v1/tts/

Rate Limits

API rate limits vary by plan:

Plan | Requests/minute | Concurrent | Max text length
Free | 10 | 2 | 500 characters
Starter | 30 | 3 | 1,000,000 characters
Pro | 60 | 5 | 5,000 characters
Enterprise | 300 | 20 | 50,000 characters

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
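
A client can use these headers to pause before it runs out of requests. The following is a minimal sketch, not an official client. It assumes that X-RateLimit-Reset holds a Unix timestamp in seconds, which this page does not state; the helper name seconds_to_wait is hypothetical.

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to pause before the next request, based on rate-limit headers."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```

Call it with the response headers after each request and sleep for the returned number of seconds before sending the next one.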

Credit Usage

Service | Cost | Unit
TTS (free models: Piper, VITS, MeloTTS) | 1 credit | per 1,000 characters
TTS (standard models: Kokoro, CosyVoice 2, etc.) | 2 credits | per 1,000 characters
TTS (premium models: Tortoise, Chatterbox, etc.) | 4 credits | per 1,000 characters
Speech to Text | 2 credits | per minute of audio
Voice Cloning | 4 credits | per 1,000 characters
Voice Changer | 3 credits | per minute of audio
Audio Enhancement | 2 credits | per minute of audio
Vocal Removal / Stem Splitting | 3-4 credits | per minute of audio
Speech Translation | 5 credits | per minute of audio
Voice Chat | 3 credits | per turn
Key & BPM Finder | Free | --
Audio Converter | Free | --
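
For budgeting, a rough estimate of TTS cost can be derived from the per-1,000-character rates (1, 2, and 4 credits for free, standard, and premium models, consistent with credits_per_1k_chars in the model list). This sketch assumes fractional usage is rounded up to a whole credit; the actual billing rule is not documented here, so treat the result as an approximation. The function name and tier labels are hypothetical.

```python
import math

# Credits per 1,000 characters for the three TTS model tiers.
CREDITS_PER_1K = {"free": 1, "standard": 2, "premium": 4}


def estimate_tts_credits(text, tier):
    """Estimate credits for one TTS request.

    Assumption: partial credits are rounded up; real billing may differ.
    """
    rate = CREDITS_PER_1K[tier]
    return math.ceil(len(text) * rate / 1000)
```

The credits_used field in the API response remains the authoritative figure.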

Text to Speech

POST /v1/tts/

Converts text to speech and produces an audio file in the requested format.

Request Body

Parameter | Type | Required | Description
model | string | No | Model ID (e.g., kokoro, chatterbox, piper)
text | string | Yes | Text to convert to speech (maximum 5,000 characters for Pro, 50,000 for Enterprise)
voice | string | Yes | Voice ID (use /v1/voices/ to list available voices)
format | string | No | Output format: mp3 (default), wav, flac, ogg
speed | float | No | Speech speed multiplier. Default: 1.0. Range: 0.5 to 2.0
language | string | No | Language code (e.g., en, es). Auto-detected if omitted.
instructions | string | No | Acting / delivery cues (≤500 characters).
pronunciations | object or array | No | Per-request pronunciation overrides.
stream | boolean | No | Enable streaming response. Default: false
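
Validating these constraints on the client avoids spending a request on input the API would reject. The sketch below builds the JSON body for POST /v1/tts/ and checks the documented format list, speed range, and instructions length. The function name build_tts_payload is hypothetical, and the server may enforce further rules not shown here.

```python
ALLOWED_FORMATS = {"mp3", "wav", "flac", "ogg"}


def build_tts_payload(text, voice, model=None, fmt="mp3", speed=1.0, instructions=None):
    """Validate arguments client-side and return the JSON body for POST /v1/tts/."""
    if not text:
        raise ValueError("text is required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if instructions is not None and len(instructions) > 500:
        raise ValueError("instructions must be at most 500 characters")
    payload = {"text": text, "voice": voice, "format": fmt, "speed": speed}
    # Optional fields are only sent when the caller provides them.
    if model is not None:
        payload["model"] = model
    if instructions is not None:
        payload["instructions"] = instructions
    return payload
```

The returned dictionary can be passed as the json argument of an HTTP client call.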

Example Request

cURL
curl -X POST https://api.tts.ai/v1/tts/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro",
    "text": "Hello from TTS.ai! This is a test.",
    "voice": "af_bella",
    "format": "mp3"
  }' \
  --output output.mp3

SSML Tags

Expand numbers, dates, currencies, phone numbers, and abbreviations in your text with <say-as> tags.

interpret-as | Input | Spoken as
cardinal | 1234 | one thousand two hundred thirty-four
ordinal | 21 | twenty-first
date | 1999-12-31 | December thirty-first, nineteen ninety-nine
time | 14:30 | two thirty PM
telephone | +1-555-867-5309 | plus one five five five eight six seven…
currency | $1,234.56 | one thousand two hundred thirty-four dollars and fifty-six cents
spell-out | NASA | N A S A

The date format defaults to mdy for English and dmy elsewhere; override it with the format attribute.

Example
{
  "model": "kokoro",
  "voice": "af_bella",
  "text": "Your appointment is on <say-as interpret-as=\"date\">2026-04-26</say-as> at <say-as interpret-as=\"time\">14:30</say-as>. Please call <say-as interpret-as=\"telephone\">+1-555-867-5309</say-as> if you need to reschedule."
}
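
When the text is assembled in code, a small helper keeps the tags well formed and lets the JSON encoder handle the quote escaping seen in the example. This is a sketch; the helper name say_as is hypothetical, and escaping XML-special characters inside the tag is an assumption about what the server expects.

```python
import json
from xml.sax.saxutils import escape

INTERPRET_AS = {"cardinal", "ordinal", "date", "time", "telephone", "currency", "spell-out"}


def say_as(value, interpret_as):
    """Wrap a value in a <say-as> tag, escaping XML-special characters."""
    if interpret_as not in INTERPRET_AS:
        raise ValueError(f"unknown interpret-as value: {interpret_as}")
    return f'<say-as interpret-as="{interpret_as}">{escape(value)}</say-as>'


text = f"Your total is {say_as('$1,234.56', 'currency')}, due {say_as('2026-04-26', 'date')}."
# json.dumps escapes the double quotes inside the tags for us.
body = json.dumps({"model": "kokoro", "voice": "af_bella", "text": text})
```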

Response

The TTS endpoint queues your request and returns a JSON response with a job UUID. You then poll for the result.

Step 1: Submit request

Response (JSON)
{
  "uuid": "77b71db532874ce98e84a69a2d740d4c",
  "job_id": "f21316bb-aefa-480d-8523-701d1e3184ce",
  "status": "queued",
  "credits_used": 11,
  "credits_remaining": 15000
}

Step 2: Poll for result

GET /v1/speech/results/?uuid=<job_uuid>

Poll this endpoint every 1-2 seconds until status is completed or failed.

Polling response (completed)
{
  "status": "completed",
  "result_url": "https://api.tts.ai/static/downloads/77b71db5.../output.mp3"
}
Polling response (still processing)
{
  "status": "processing"
}

Step 3: Download audio

Fetch the result_url from the completed response to download the audio file.

Full example

Python
import requests, time

API_KEY = "sk-tts-your-key"
BASE = "https://api.tts.ai"

# 1. Submit TTS request
resp = requests.post(f"{BASE}/v1/tts/", json={
    "model": "kokoro",
    "text": "Hello from TTS.ai!",
    "voice": "af_bella"
}, headers={"Authorization": f"Bearer {API_KEY}"})
data = resp.json()
uuid = data["uuid"]

# 2. Poll for result
while True:
    result = requests.get(f"{BASE}/v1/speech/results/",
        params={"uuid": uuid}).json()
    if result["status"] == "completed":
        # 3. Download audio
        audio = requests.get(result["result_url"])
        with open("output.mp3", "wb") as f:
            f.write(audio.content)
        break
    elif result["status"] == "failed":
        raise Exception(result.get("error", "Generation failed"))
    time.sleep(1.5)

Streaming alternative: For supported models (Kokoro, MeloTTS), use POST /v1/tts/stream/ for real-time Server-Sent Events (SSE) streaming — no polling needed.
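
The event payload of /v1/tts/stream/ is not described on this page, so the sketch below only shows generic Server-Sent Events parsing: it collects the data: lines of each event and yields them when a blank line ends the event. How each payload encodes audio is left open; the function name iter_sse_data is hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ":") and other fields are ignored here.
    if buffer:
        # Lenient: flush a trailing event even if the stream ended without a blank line.
        yield "\n".join(buffer)
```

Feed it the decoded lines of a streaming HTTP response and process each yielded payload as it arrives.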

Speech to Text

POST /v1/stt/

Transcribes audio to text. Supports 99 languages with automatic detection.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Audio file (MP3, WAV, FLAC, OGG, M4A, MP4, WebM). Max 100MB.
model | string | No | STT model: whisper (default), faster-whisper, sensevoice
language | string | No | Language code. Use auto for automatic detection (default).
timestamps | boolean | No | Include word-level timestamps. Default: false
diarize | boolean | No | Enable speaker diarization. Default: false

Response

JSON Response
{
  "text": "Hello, this is a transcription test.",
  "language": "en",
  "duration": 3.5,
  "segments": [
    {
      "start": 0.0,
      "end": 1.8,
      "text": "Hello, this is",
      "speaker": "SPEAKER_00"
    },
    {
      "start": 1.8,
      "end": 3.5,
      "text": "a transcription test.",
      "speaker": "SPEAKER_00"
    }
  ]
}
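
With diarization enabled, the segments can be folded into a readable transcript. The sketch below merges consecutive segments from the same speaker, using only the fields shown in the sample response; the function name format_transcript is hypothetical.

```python
def format_transcript(response):
    """Merge consecutive segments from the same speaker into 'SPEAKER: text' lines."""
    turns = []
    for seg in response.get("segments", []):
        speaker = seg.get("speaker", "UNKNOWN")
        if turns and turns[-1][0] == speaker:
            # Same speaker as the previous segment: extend the current turn.
            turns[-1][1].append(seg["text"])
        else:
            turns.append((speaker, [seg["text"]]))
    return [f"{speaker}: {' '.join(parts)}" for speaker, parts in turns]
```

Applied to the sample response, this produces a single line for SPEAKER_00 containing the full sentence.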

Voice Cloning

POST /v1/tts/clone/

Generates speech in a cloned voice. Upload reference audio and the text to speak.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
reference_audio | file | Yes | Reference voice audio (10-30 seconds recommended). Max 20MB.
text | string | Yes | Text to speak in the cloned voice.
model | string | No | Cloning model: chatterbox (default), cosyvoice2, gpt-sovits
format | string | No | Output format: mp3 (default), wav, flac
language | string | No | Target language code. Must be supported by the selected model.

Response

Returns the audio file as binary data, same as the TTS endpoint.
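
A multipart upload for this endpoint can be assembled with the Python standard library alone. The sketch below builds the request without sending it; the helper names and the reference.wav file name are hypothetical, and the file part is labelled application/octet-stream as an assumption.

```python
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Encode text fields and files as a multipart/form-data body.

    fields: dict of name -> str; files: dict of name -> (filename, bytes).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    for name, (filename, data) in files.items():
        header = (
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_clone_request(api_key, reference_audio, text, model="chatterbox"):
    """Return a prepared (unsent) POST request for /v1/tts/clone/."""
    body, content_type = encode_multipart(
        {"text": text, "model": model},
        {"reference_audio": ("reference.wav", reference_audio)},
    )
    return urllib.request.Request(
        "https://api.tts.ai/v1/tts/clone/",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the upload.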

Voice Changer

POST /v1/voice-convert/

Converts audio to a different voice. Upload source audio and select a target voice.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file (MP3, WAV, FLAC). Max 50MB.
target_voice | string | Yes | Target voice ID to convert to (use /v1/voices/ to list available voices)
model | string | No | Voice conversion model: openvoice (default), knn-vc
format | string | No | Output format: wav (default), mp3, flac

Example Request

cURL
curl -X POST https://api.tts.ai/v1/voice-convert/ \
  -H "Authorization: Bearer sk-tts-your-key" \
  -F "file=@source_audio.mp3" \
  -F "target_voice=af_bella" \
  -F "model=openvoice" \
  -o converted.wav

Response

Returns the converted audio file as binary data.

Speech Translation

POST /v1/speech-translate/

Translates spoken audio from one language to another. Combines speech-to-text, translation, and text-to-speech in a single call.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source audio file in the original language. Max 100MB.
target_language | string | Yes | Target language code (e.g., es, fr, de, ja)
voice | string | No | Voice for the translated output. Selected automatically if omitted.
preserve_voice | boolean | No | Attempt to preserve the original speaker's voice

Response

JSON Response
{
  "original_text": "Hello, how are you?",
  "translated_text": "Hola, como estas?",
  "source_language": "en",
  "target_language": "es",
  "audio_url": "https://api.tts.ai/v1/results/translate_abc123.mp3",
  "credits_used": 5
}

Speech to Speech

POST /v1/speech-to-speech/

Transforms speaking style, emotion, or delivery while preserving the content. Useful for adjusting tone, pace, and expression.

Request Body (multipart/form-data)

Parameter | Type | Required | Description
file | file | Yes | Source speech audio file. Max 50MB.
voice | string | Yes | Target voice ID for the output speech
model | string | No | Model: openvoice (default), chatterbox
emotion | string | No | Target emotion: neutral, happy, sad, angry, excited
speed | float | No | Speed adjustment. Default: 1.0. Range: 0.5 to 2.0

Response

Returns the transformed audio file as binary data.

Audio Tools

Endpoints for audio processing: enhancement, vocal and stem separation, dereverberation, analysis, and format conversion.

POST /v1/audio/enhance/

Improves audio quality: denoising, clarity enhancement, and super-resolution.

Parameter | Type | Description
file | file | Audio file to enhance
denoise | boolean | Enable denoising (default: true)
enhance_clarity | boolean | Improve speech clarity (default: true)
super_resolution | boolean | Upscale audio quality (default: false)
strength | integer | 1-3 (light, medium, strong). Default: 2

POST /v1/audio/separate/

Separates vocals from instruments (vocal removal) or splits audio into stems.

Parameter | Type | Description
file | file | Audio file to separate
model | string | demucs (default) or spleeter
stems | integer | Number of stems: 2, 4, 5, or 6 (default: 2)
format | string | Output format: wav, mp3, flac

POST /v1/audio/dereverb/

Removes echo and reverberation from audio recordings.

Parameter | Type | Description
file | file | Audio file to process
type | string | echo or reverb (default: both)
intensity | integer | 1-5 (default: 3)

POST /v1/audio/analyze/ (Free)

Analyzes audio to detect key, BPM, and time signature.

Parameter | Type | Description
file | file | Audio file to analyze

Response
{
  "key": "C",
  "scale": "Major",
  "bpm": 120.0,
  "time_signature": "4/4",
  "camelot": "8B",
  "compatible_keys": ["C Major", "G Major", "F Major", "A Minor"]
}

POST /v1/audio/convert/ (Free)

Converts audio between formats.

Parameter | Type | Description
file | file | Audio file to convert
format | string | Target format: mp3, wav, flac, ogg, m4a, aac
bitrate | integer | Output bitrate in kbps: 64, 128, 192, 256, 320
sample_rate | integer | Sample rate: 22050, 44100, 48000
channels | string | mono or stereo

Voice Chat

POST /v1/voice-chat/

Send audio or text and receive an AI response with synthesized speech.

Request Body (multipart/form-data or JSON)

Parameter | Type | Required | Description
audio | file | No* | Audio input (either audio or text is required)
text | string | No* | Text input (either audio or text is required)
voice | string | No | Voice for the AI response. Default: af_bella
tts_model | string | No | TTS model for the response. Default: kokoro
system_prompt | string | No | System prompt for the AI
conversation_id | string | No | Continue an existing conversation

Response

JSON Response
{
  "conversation_id": "conv_abc123",
  "user_text": "What is the capital of France?",
  "ai_text": "The capital of France is Paris.",
  "audio_url": "https://api.tts.ai/v1/audio/tmp/resp_xyz.mp3",
  "credits_used": 3
}
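
Multi-turn conversations work by echoing the conversation_id from one response into the next request. The sketch below builds the JSON bodies for two text turns; the function name build_chat_turn is hypothetical, and the reply dictionary stands in for a real API response.

```python
def build_chat_turn(text, conversation_id=None, voice="af_bella", tts_model="kokoro"):
    """Return the JSON body for one text turn of POST /v1/voice-chat/."""
    body = {"text": text, "voice": voice, "tts_model": tts_model}
    # Only include conversation_id when continuing an existing conversation.
    if conversation_id:
        body["conversation_id"] = conversation_id
    return body


first = build_chat_turn("What is the capital of France?")
# Stand-in for the API response to the first turn.
reply = {"conversation_id": "conv_abc123", "ai_text": "The capital of France is Paris."}
follow_up = build_chat_turn("And its population?", conversation_id=reply["conversation_id"])
```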

Batch TTS

POST /v1/tts/batch/

Submit multiple texts for parallel TTS generation. Optionally receive a webhook callback when all jobs complete.

Parameters

Parameter | Type | Description
texts | array | Array of objects: {text, model, voice}. Max 50 items.
webhook_url | string | Optional URL to POST results when batch completes.

Response

JSON Response
{
  "batch_id": "abc123",
  "total": 3,
  "completed": 0,
  "status": "processing"
}

Poll progress with GET /v1/tts/batch/result/?batch_id=abc123
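
Because a batch accepts at most 50 items, longer job lists have to be split across several requests. A minimal sketch of that chunking follows; the function name build_batches is hypothetical.

```python
MAX_BATCH_ITEMS = 50


def build_batches(items, webhook_url=None):
    """Split {text, model, voice} items into request bodies of at most 50 items."""
    bodies = []
    for start in range(0, len(items), MAX_BATCH_ITEMS):
        body = {"texts": items[start:start + MAX_BATCH_ITEMS]}
        if webhook_url:
            body["webhook_url"] = webhook_url
        bodies.append(body)
    return bodies
```

Each returned body is submitted as its own POST /v1/tts/batch/ request and yields its own batch_id to poll.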

Voice Embedding

POST /v1/voice-embed/

Pre-computes a voice embedding from reference audio. Use the returned embed_id in subsequent voice cloning requests for instant generation.

Parameters

Parameter | Type | Description
file | file | Reference audio file (WAV, MP3, FLAC).
model | string | Cloning model (default: chatterbox). Supported: chatterbox, cosyvoice2, openvoice, gpt-sovits, spark, indextts2, qwen3-tts.

Response

JSON Response
{
  "embed_id": "emb_abc123",
  "model": "chatterbox",
  "duration_ms": 450
}

Health Check

GET /v1/health/

Checks GPU server status, loaded models, and queue size. No authentication required. Cached for 30 seconds.

Response

JSON Response
{
  "status": "online",
  "latency_ms": 45,
  "queue_size": 3,
  "models_loaded": ["kokoro", "chatterbox", "cosyvoice2"]
}
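
A client can consult this response before submitting work, for example to avoid waiting on a model that is not loaded. The sketch below is one possible readiness check; the function name is_ready and the max_queue threshold are hypothetical choices, not API features.

```python
def is_ready(health, model, max_queue=10):
    """Return True when the service is online, the model is loaded, and the queue is short."""
    return (
        health.get("status") == "online"
        and model in health.get("models_loaded", [])
        and health.get("queue_size", 0) <= max_queue
    )
```

Since the endpoint is cached for 30 seconds, polling it more often than that gives no fresher information.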

List Models

GET /v1/models/

Returns a list of all available models and their capabilities.

Response

JSON Response
{
  "models": [
    {
      "id": "kokoro",
      "name": "Kokoro",
      "type": "tts",
      "tier": "standard",
      "languages": ["en", "ja", "ko", "zh", "fr"],
      "supports_cloning": false,
      "supports_streaming": true,
      "credits_per_1k_chars": 2
    },
    {
      "id": "chatterbox",
      "name": "Chatterbox",
      "type": "tts",
      "tier": "premium",
      "languages": ["en"],
      "supports_cloning": true,
      "supports_streaming": true,
      "credits_per_1k_chars": 4
    }
  ]
}
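
The capability flags make it straightforward to pick a model programmatically. The sketch below filters the response by streaming support, cloning support, and language, using only fields shown in the sample; the function name pick_models is hypothetical.

```python
def pick_models(catalog, streaming=None, cloning=None, language=None):
    """Return IDs of models matching every filter that is not None."""
    matches = []
    for model in catalog.get("models", []):
        if streaming is not None and model.get("supports_streaming") != streaming:
            continue
        if cloning is not None and model.get("supports_cloning") != cloning:
            continue
        if language is not None and language not in model.get("languages", []):
            continue
        matches.append(model["id"])
    return matches


# Trimmed copy of the sample response.
catalog = {"models": [
    {"id": "kokoro", "languages": ["en", "ja", "ko", "zh", "fr"],
     "supports_cloning": False, "supports_streaming": True},
    {"id": "chatterbox", "languages": ["en"],
     "supports_cloning": True, "supports_streaming": True},
]}
cloning_models = pick_models(catalog, cloning=True)
```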

List Voices

GET /v1/voices/

Returns a list of all available voices, optionally filtered by model or language.

Query Parameters

Parameter | Type | Description
model | string | Filter by model ID (e.g., kokoro)
language | string | Filter by language code (e.g., en)
gender | string | Filter by gender: male, female, neutral

Response

JSON Response
{
  "voices": [
    {
      "id": "af_bella",
      "name": "Bella",
      "model": "kokoro",
      "language": "en",
      "gender": "female",
      "preview_url": "https://api.tts.ai/v1/voices/preview/af_bella.mp3"
    }
  ],
  "total": 142
}
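
The query filters can be combined in one request. The sketch below builds the request URL with the standard library and skips any filter that is not set; the function name voices_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.tts.ai/v1/"


def voices_url(model=None, language=None, gender=None):
    """Build the GET /v1/voices/ URL, including only the filters that are set."""
    candidates = (("model", model), ("language", language), ("gender", gender))
    params = {name: value for name, value in candidates if value}
    query = urlencode(params)
    return f"{BASE_URL}voices/" + (f"?{query}" if query else "")
```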

Subtitles (new)

GET /v1/speech/subtitles/?uuid=<job_uuid>&format=srt|vtt&download=1

Generates synchronized subtitles for any completed TTS job. Runs Whisper on the audio and returns SRT or WebVTT. Results are cached on disk, so a second call for the same uuid is a disk read.

Query Parameters

Parameter | Required | Description
uuid | Yes | Job UUID returned by /v1/tts/ or /v1/voice-clone/.
format | No | srt (default) or vtt.
download | No | 1 to send Content-Disposition: attachment so the browser saves the file instead of displaying it.
language | No | Hint to the alignment model (auto-detected if omitted).
cURL
curl "https://api.tts.ai/v1/speech/subtitles/?uuid=$UUID&format=srt&download=1" -o subtitles.srt

Pronunciation Dictionary (new)

GET POST DELETE /api/v1/pronunciations/

Tells the TTS engine how to pronounce specific words. Saved entries are applied automatically to every TTS job you run. Limit: 200 entries per account.

Request Body (POST)

Parameter | Type | Description
word | string | Word to replace (e.g. GIF, Anthropic). Matched on word boundaries.
replacement | string | How to spell it for the model (e.g. jiff, ann THROP ick).
language | string | Optional ISO code. Empty = applies to all languages.
case_sensitive | boolean | Default false. Matches upper/lower case exactly when true.
cURL
# Save an entry
curl -X POST https://tts.ai/api/v1/pronunciations/ \
  -H "Authorization: Bearer sk-tts-..." \
  -H "Content-Type: application/json" \
  -d '{"word": "GIF", "replacement": "jiff"}'

# List your entries
curl https://tts.ai/api/v1/pronunciations/ -H "Authorization: Bearer sk-tts-..."

# Delete entry by id
curl -X DELETE "https://tts.ai/api/v1/pronunciations/?id=42" -H "Authorization: Bearer sk-tts-..."

You can also send per-request overrides without saving them: include pronunciations in any /v1/tts/ call as an object or array (see the TTS endpoint parameters).
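
To preview which words an entry would affect before saving it, the word-boundary and case rules can be approximated locally. The sketch below is an illustration only: the server's actual matching may differ, and the function name apply_pronunciations is hypothetical.

```python
import re


def apply_pronunciations(text, entries):
    """Locally preview word-boundary replacements (the server's matching may differ)."""
    for entry in entries:
        flags = 0 if entry.get("case_sensitive", False) else re.IGNORECASE
        pattern = r"\b" + re.escape(entry["word"]) + r"\b"
        replacement = entry["replacement"]
        # A function replacement avoids backslash interpretation in the replacement text.
        text = re.sub(pattern, lambda match: replacement, text, flags=flags)
    return text
```

With the entry {"word": "GIF", "replacement": "jiff"}, the standalone word GIF is replaced while gifs is left alone, because the trailing s breaks the word boundary.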

Article Narration (new)

Enter a