Text to Speech API for Developers
Build voice-enabled applications with our REST API. Add natural text-to-speech, voice cloning, speech-to-text, and audio processing to your apps, chatbots, voice assistants, and SaaS products. OpenAI-compatible format, 24+ models, simple integration.
Try It Now
API Features for Developers
Everything you need to build voice-enabled applications
Simple REST API
One POST request to generate speech. JSON request, audio response. Works with any programming language that supports HTTP.
OpenAI-Compatible
Drop-in replacement for OpenAI TTS API. Switch your base_url and API key — existing code works immediately.
24+ Models Available
Access every model through a single API. Switch models by changing one parameter. Compare quality, speed, and cost.
Sub-Second Latency
Kokoro generates audio in under 1 second. Perfect for real-time chatbots, voice assistants, and interactive applications.
Voice Cloning API
Clone any voice from a short audio sample via the API. Use cloned voices for all subsequent generations.
Multiple Formats
Output as WAV, MP3, OGG, or FLAC. Choose sample rate and bit depth. Streaming audio support for real-time apps.
Best Models for Developer Integration
Choose the right model for your application's speed, quality, and cost requirements
Kokoro
Free
Lightweight 82M parameter model delivering studio-quality speech with blazing-fast inference.
Best for: Fastest model with sub-second latency, ideal for real-time apps and chatbots
Try Kokoro
CosyVoice 2
Standard
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Best for: Streaming TTS with voice cloning for voice assistant applications
Try CosyVoice 2
Sesame CSM
Premium
Conversational speech model generating natural dialogue with appropriate timing and emotion.
Best for: Conversational AI with natural timing for chatbot and assistant voice
Try Sesame CSM
Piper
Free
A fast, local neural text to speech system optimized for Raspberry Pi and embedded devices.
Best for: Free, CPU-only model for high-volume applications with zero credit cost
Try Piper
Bark
Standard
Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.
Best for: Audio generation with sound effects for creative and entertainment apps
Try Bark
How to Integrate the TTS API
From signup to first API call in under 5 minutes
Get Your API Key
Sign up for free and generate an API key from your account dashboard. 50 credits included.
Make Your First Call
POST to /v1/tts with text, model, and voice. Get audio bytes back. Under 5 lines of code.
Choose Your Model
Test different models for your use case. Compare speed, quality, and cost per generation.
Ship to Production
Scale with pay-as-you-go credits. No rate limits on paid plans. Monitor usage in your dashboard.
Quick Start Code Examples
Integrate TTS.ai in any language with our REST API
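# Python, using the requests library
import requests

response = requests.post(
    "https://api.tts.ai/v1/tts",
    json={
        "text": "Hello from my app!",
        "model": "kokoro",
        "voice": "af_heart",
        "format": "mp3",
    },
    headers={
        "Authorization": "Bearer sk-tts-xxx"  # replace with your own API key
    },
    timeout=30,
)
response.raise_for_status()  # stop here on an HTTP error instead of saving an error body

# The response body is the raw audio
with open("output.mp3", "wb") as f:
    f.write(response.content)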
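// JavaScript, using fetch (run inside an async function or an ES module)
const response = await fetch(
  "https://api.tts.ai/v1/tts",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Authorization": "Bearer sk-tts-xxx"
    },
    body: JSON.stringify({
      text: "Hello from my app!",
      model: "kokoro",
      voice: "af_heart",
      format: "mp3"
    })
  }
);
if (!response.ok) {
  throw new Error(`TTS request failed with status ${response.status}`);
}
// The response body is the raw audio
const audio = await response.blob();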
# cURL
curl -X POST https://api.tts.ai/v1/tts \
  -H "Authorization: Bearer sk-tts-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello from my app!",
    "model": "kokoro",
    "voice": "af_heart",
    "format": "mp3"
  }' \
  --output output.mp3
# Works with the OpenAI client library (Python)
from openai import OpenAI

client = OpenAI(
    api_key="sk-tts-xxx",
    base_url="https://api.tts.ai/v1"
)

response = client.audio.speech.create(
    model="kokoro",
    voice="af_heart",
    input="Hello from my app!"
)
response.stream_to_file("output.mp3")
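The "Choose Your Model" step can be approximated with a small timing harness. The sketch below reuses the request body from the quick start and changes only the model and voice per run. The helper names (`build_payload`, `send_with_requests`, `compare_models`) are made up for illustration, and `kokoro` with `af_heart` is the only model and voice pair confirmed on this page, so look up any other identifiers in the API documentation.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple

API_URL = "https://api.tts.ai/v1/tts"  # endpoint used in the quick start examples


def build_payload(text: str, model: str, voice: str, fmt: str = "mp3") -> Dict[str, str]:
    """Build the JSON body shown in the quick start; only model and voice vary per run."""
    return {"text": text, "model": model, "voice": voice, "format": fmt}


def send_with_requests(payload: Dict[str, str], api_key: str = "sk-tts-xxx") -> bytes:
    """POST one payload and return the audio bytes (needs the requests package)."""
    import requests  # imported here so the other helpers work without it installed

    response = requests.post(
        API_URL,
        json=payload,
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=60,
    )
    response.raise_for_status()
    return response.content


def compare_models(
    text: str,
    candidates: Iterable[Tuple[str, str]],
    send: Callable[[Dict[str, str]], bytes] = send_with_requests,
) -> List[Dict[str, object]]:
    """Time one generation per (model, voice) pair and record the audio size."""
    results = []
    for model, voice in candidates:
        start = time.perf_counter()
        audio = send(build_payload(text, model, voice))
        results.append(
            {"model": model, "seconds": time.perf_counter() - start, "bytes": len(audio)}
        )
    return results
```

Calling `compare_models("Hello from my app!", [("kokoro", "af_heart")])` returns one result per candidate. The timing is wall-clock and includes network round trips, so treat it as a rough comparison rather than a measure of model inference speed.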
What Developers Build with TTS.ai
Common integration patterns and applications
AI Chatbots & Assistants
Add voice output to your chatbot or AI assistant. Pipe LLM responses through TTS for voice-enabled interfaces. Kokoro delivers sub-second latency for real-time conversations. Sesame CSM generates conversational speech with natural timing.
- LLM response to speech pipeline
- Sub-second latency with Kokoro
- Conversational speech with Sesame CSM
- Streaming audio output
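One way to wire up the LLM-to-speech pipeline is to split the model's reply into sentences and synthesize them one at a time, so playback can start before the whole reply has been converted. This is client-side chunking, not the API's own streaming mode, whose details are not covered on this page. In the sketch, `split_sentences` and `speak_reply` are hypothetical helper names, and `send` stands for any function that posts a payload to `/v1/tts` and returns the audio bytes.

```python
import re
from typing import Callable, Dict, Iterator, List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (a deliberately simple heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def speak_reply(
    reply: str,
    send: Callable[[Dict[str, str]], bytes],
    model: str = "kokoro",
    voice: str = "af_heart",
) -> Iterator[bytes]:
    """Yield one audio clip per sentence, in order, as each one is generated."""
    for sentence in split_sentences(reply):
        yield send({"text": sentence, "model": model, "voice": voice, "format": "mp3"})
```

Because `speak_reply` is a generator, the caller can hand the first clip to an audio player while later sentences are still being requested. The sentence splitter will also break on abbreviations such as "Dr.", so a production pipeline would likely want a more careful tokenizer.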
Mobile & Voice Apps
Build voice-enabled mobile apps, accessibility tools, reading apps, and language learning platforms. Our REST API works with any mobile framework. Download audio files or stream directly to the client.
- React Native, Flutter, Swift, Kotlin
- Accessibility and reading apps
- Language learning platforms
- Audio content generation
SaaS Products
White-label voice capabilities in your SaaS product. Add TTS, STT, voice cloning, and audio processing as features in your platform. Use our API as your voice backend without managing GPU infrastructure.
- White-label voice features
- No GPU infrastructure needed
- Pay-per-use pricing
- 24+ models to offer your users
Automation Pipelines
Integrate voice generation into CI/CD pipelines, content automation, and batch processing workflows. Generate thousands of audio files from spreadsheet data, automate podcast production, or build content localization pipelines.
- Batch processing via API
- Content localization pipelines
- CI/CD integration
- Spreadsheet to audio automation
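A spreadsheet-to-audio job can be sketched as two steps: turn each row of an exported CSV into a request payload, then write the returned audio to disk. The column names `id` and `text`, the helper names `jobs_from_csv` and `run_batch`, and the `send` callable (any function that posts a payload to `/v1/tts` and returns audio bytes) are assumptions for illustration, not part of the API.

```python
import csv
from pathlib import Path
from typing import Callable, Dict, Iterable, List, TextIO, Tuple

Job = Tuple[str, Dict[str, str]]  # (output filename, request payload)


def jobs_from_csv(
    source: TextIO, model: str = "kokoro", voice: str = "af_heart", fmt: str = "mp3"
) -> List[Job]:
    """Turn each row with non-empty text into a filename and a request payload."""
    jobs = []
    for row in csv.DictReader(source):
        text = (row.get("text") or "").strip()
        if not text:
            continue  # skip blank rows so no empty requests are sent
        payload = {"text": text, "model": model, "voice": voice, "format": fmt}
        jobs.append((f"{row['id']}.{fmt}", payload))
    return jobs


def run_batch(
    jobs: Iterable[Job], send: Callable[[Dict[str, str]], bytes], out_dir: str
) -> List[Path]:
    """Generate every job in order and write the returned audio bytes into out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, payload in jobs:
        path = target / filename
        path.write_bytes(send(payload))
        written.append(path)
    return written
```

Keeping payload construction separate from the network call makes it possible to dry-run a large sheet and inspect the jobs before any credits are spent.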
API Specifications
Built for production applications
24+ TTS Models
100+ Voices
30+ Languages
<1s Latency (Kokoro)
Frequently Asked Questions
Common questions about the TTS.ai developer API
Ready to Build with Voice AI?
Get your free API key and start building. 50 credits on signup, free models available, comprehensive documentation.