Private Cloud
Your own dedicated AI voice infrastructure. Full data isolation, all open-source models, no per-character fees. Deploy on your cloud or ours.
Get Started · API Docs
Why Private Cloud?
Full Data Isolation
Your text, audio, and voice data never touch shared infrastructure. No data leaves your network. Ideal for healthcare, legal, finance, and government use cases where data residency matters.
Dedicated GPU Resources
No shared queues, no noisy neighbors. Your GPU servers are reserved exclusively for your workloads. Predictable latency and throughput for production voice applications.
No Per-Character Fees
Generate unlimited speech, clone unlimited voices, transcribe unlimited audio. You pay for infrastructure, not usage. Dramatically lower costs at scale versus per-character pricing.
What's Included
Text to Speech
- All 20+ open-source TTS models
- Kokoro, Chatterbox, CosyVoice 2, Bark, Orpheus, and more
- Streaming and batch generation
- 100+ pre-built voices across 30+ languages
Voice Cloning
- 9 cloning models (Chatterbox, GPT-SoVITS, OpenVoice, etc.)
- Clone from 5-second reference audio
- Unlimited voice clones
- Voice embeddings stored on your servers only
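Since the service exposes plain REST endpoints, a cloning call can be sketched with the standard library alone. The `/voices/clone` route, field names, and `chatterbox` model id below are assumptions for illustration; check the API docs for the real schema. The request is built but not sent:

```python
import base64
import json
import urllib.request

BASE_URL = "https://voice.internal.example.com/v1"  # hypothetical: your deployment URL
API_KEY = "YOUR_API_KEY"

def build_clone_request(reference_audio: bytes, voice_name: str,
                        model: str = "chatterbox") -> urllib.request.Request:
    """Assemble a voice-clone request; route and field names are assumed."""
    payload = json.dumps({
        "name": voice_name,
        "model": model,
        "reference_audio": base64.b64encode(reference_audio).decode(),
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/voices/clone",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# A roughly 5-second WAV clip is enough reference audio for the cloning models.
req = build_clone_request(b"<5s of wav bytes>", "support-agent")
```

Because embeddings stay on your servers, the clip never leaves your network.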
Speech to Text
- Faster Whisper (up to 4x faster than OpenAI Whisper), SenseVoice
- 99 languages with timestamps and speaker detection
- Unlimited transcription hours
- Real-time streaming transcription
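Transcription follows the OpenAI-style multipart upload pattern. The sketch below assembles (without sending) a `/audio/transcriptions` request with the standard library; the base URL and `faster-whisper` model id are placeholder assumptions:

```python
import urllib.request
import uuid

BASE_URL = "https://voice.internal.example.com/v1"  # hypothetical: your deployment URL
API_KEY = "YOUR_API_KEY"

def build_transcription_request(audio: bytes, filename: str = "call.wav",
                                model: str = "faster-whisper") -> urllib.request.Request:
    """Build an OpenAI-style multipart transcription request (unsent)."""
    boundary = uuid.uuid4().hex
    body = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="model"\r\n\r\n{model}\r\n'
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'
    ).encode() + audio + f'\r\n--{boundary}--\r\n'.encode()
    return urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

req = build_transcription_request(b"RIFF....WAVE")  # placeholder bytes
# text = urllib.request.urlopen(req).read()  # JSON body with the transcript
```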
Audio Processing
- Audio enhancement (noise removal, clarity)
- Vocal separation and stem splitting (Demucs)
- Echo and reverb removal
- Format conversion, speech translation
Deployment Architecture
Your Application
        |
        v
[Private API Server] ---- REST API (OpenAI-compatible)
        |
        v
[GPU Inference Workers] -- NVIDIA A100/H100/L40S
  |-- TTS Models (Kokoro, Chatterbox, Bark, etc.)
  |-- Voice Cloning (GPT-SoVITS, OpenVoice, etc.)
  |-- STT (Faster Whisper, SenseVoice)
  |-- Audio Processing (Demucs, Enhancement)
        |
[Your Cloud / On-Premises]
AWS | GCP | Azure | OCI | Bare Metal
- Same REST API as api.tts.ai
- OpenAI-compatible endpoints
- Python and JavaScript SDKs work unchanged
- Dynamic GPU allocation across models
- Priority queue system for optimal throughput
- Models pre-loaded in VRAM for instant inference
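Because the endpoints are OpenAI-compatible, plain HTTP is enough; no vendor SDK is required. The sketch below builds (but does not send) a speech request against a hypothetical private deployment URL; the `kokoro` model and `af_bella` voice are illustrative choices:

```python
import json
import urllib.request

BASE_URL = "https://voice.internal.example.com/v1"  # hypothetical: your deployment URL
API_KEY = "YOUR_API_KEY"

def build_speech_request(text: str, model: str = "kokoro",
                         voice: str = "af_bella") -> urllib.request.Request:
    """Build an OpenAI-compatible /audio/speech request (unsent)."""
    payload = json.dumps({"model": model, "voice": voice, "input": text}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_speech_request("Hello from your private cloud.")
# audio_bytes = urllib.request.urlopen(req).read()  # raw audio response
```

Pointing an existing OpenAI client at `BASE_URL` works the same way, which is why the Python and JavaScript SDKs run unchanged.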
Built For
Healthcare
Patient-facing voice interfaces, medical dictation, clinical documentation. Keep PHI within your compliant infrastructure.
Financial Services
Voice-enabled banking, compliance call transcription, automated customer service. Data residency in your chosen region.
Government
Accessible public services, multilingual citizen communications, classified document processing on air-gapped networks.
Contact Centers
High-volume IVR systems, real-time agent assist, call transcription and analytics. Predictable cost at any scale.
Shared Cloud vs Private Cloud
| | Shared Cloud | Private Cloud |
|---|---|---|
| Data isolation | Shared infrastructure, auto-deleted in 24h | Full isolation, your servers only |
| Pricing model | Per-character | Flat monthly, unlimited usage |
| AI models | All models | All models + custom |
| Latency | Shared queue | Dedicated, predictable |
| Data residency | Our data center | Your choice of region |
| SLA | Best effort | Custom SLA available |
| Support | Standard support | Dedicated account manager |
Open-Source Models, No Vendor Lock-In
Every model in TTS.ai Private Cloud is open-source (MIT or Apache 2.0). If you ever stop using our service, you keep full access to the underlying models. No proprietary dependencies, no licensing traps.
Private Cloud Plans
From self-hosted to fully managed. All plans include every open-source model.
Self-Hosted
Run on your own GPU hardware. We provide the Docker image and license.
- Docker image with all models
- Your GPU, your servers
- License key validation
- Email support
- Unlimited usage
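On the self-hosted plan, deployment reduces to running the container on a GPU host. The image name, port, volume path, and environment variable below are placeholders for illustration; the real values ship with your license:

```shell
# Placeholder image name and env var -- use the values shipped with your license.
# Requires Docker with the NVIDIA Container Toolkit for --gpus support.
docker run -d \
  --gpus all \
  -p 8080:8080 \
  -e TTS_LICENSE_KEY="YOUR_LICENSE_KEY" \
  -v /data/voices:/app/voices \
  ttsai/private-cloud:latest
```

The volume mount keeps voice embeddings on the host, consistent with the on-your-servers-only storage model.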
Starter
Dedicated single-GPU instance managed by TTS.ai.
- 1x A100 GPU
- 5 concurrent generations
- All models included
- Auto-scaling
- Email support
Pro
High-throughput instance with priority queue and 20 concurrent slots.
- 1x A100 GPU
- 20 concurrent generations
- Priority queue
- Auto-scaling
- Priority support
Enterprise
Multi-GPU cluster with an SLA, unlimited concurrent generations, and a dedicated account manager.
- Multi-GPU (H100)
- Unlimited concurrent generations
- 99.9% SLA
- Dedicated account manager
- Custom deployment region
Ready to Deploy?
Choose a plan above or contact us for custom enterprise requirements.
Get Started · Contact Sales