Enterprise

Private Cloud

Your own dedicated AI voice infrastructure. Full data isolation, all open-source models, no per-character fees. Deploy on your cloud or ours.

Get Started API Docs

Why Private Cloud?

Full Data Isolation

Your text, audio, and voice data never touch shared infrastructure. No data leaves your network. Ideal for healthcare, legal, finance, and government use cases where data residency matters.

Dedicated GPU Resources

No shared queues, no noisy neighbors. Your GPU servers are reserved exclusively for your workloads. Predictable latency and throughput for production voice applications.

No Per-Character Fees

Generate unlimited speech, clone unlimited voices, transcribe unlimited audio. You pay for infrastructure, not usage. Dramatically lower costs at scale versus per-character pricing.

What's Included

Text to Speech

  • All 20+ open-source TTS models
  • Kokoro, Chatterbox, CosyVoice 2, Bark, Orpheus, and more
  • Streaming and batch generation
  • 100+ pre-built voices across 30+ languages

Voice Cloning

  • 9 cloning models (Chatterbox, GPT-SoVITS, OpenVoice, etc.)
  • Clone from 5-second reference audio
  • Unlimited voice clones
  • Voice embeddings stored on your servers only

Speech to Text

  • Faster Whisper (up to 4x faster than the original Whisper), SenseVoice
  • 99 languages with timestamps and speaker detection
  • Unlimited transcription hours
  • Real-time streaming transcription
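
The timestamps and speaker labels above arrive as structured segments in the transcription response. A minimal sketch of consuming them is below; the JSON shape is an assumption modeled on common verbose transcription formats, so check your deployment's API docs for the exact schema.

```python
# Sketch: consuming a transcription response with per-segment
# timestamps and speaker labels. The response shape below is an
# assumption for illustration, not the exact wire format.
import json

response_body = """
{
  "language": "en",
  "segments": [
    {"start": 0.0, "end": 2.4, "speaker": "SPEAKER_00",
     "text": "Welcome to the clinic."},
    {"start": 2.4, "end": 5.1, "speaker": "SPEAKER_01",
     "text": "Thanks, I have a question about my results."}
  ]
}
"""

def to_labeled_lines(body: str) -> list[str]:
    """Format each segment as '[start-end] speaker: text'."""
    data = json.loads(body)
    return [
        f"[{seg['start']:.1f}-{seg['end']:.1f}] {seg['speaker']}: {seg['text']}"
        for seg in data["segments"]
    ]

for line in to_labeled_lines(response_body):
    print(line)
```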

Audio Processing

  • Audio enhancement (noise removal, clarity)
  • Vocal separation and stem splitting (Demucs)
  • Echo and reverb removal
  • Format conversion, speech translation

Deployment Architecture

Your Application
    |
    v
[Private API Server] ---- REST API (OpenAI-compatible)
    |
    v
[GPU Inference Workers] -- NVIDIA A100/H100/L40S
    |-- TTS Models (Kokoro, Chatterbox, Bark, etc.)
    |-- Voice Cloning (GPT-SoVITS, OpenVoice, etc.)
    |-- STT (Faster Whisper, SenseVoice)
    |-- Audio Processing (Demucs, Enhancement)
    |
[Your Cloud / On-Premises]
    AWS | GCP | Azure | OCI | Bare Metal
  • Same REST API as api.tts.ai
  • OpenAI-compatible endpoints
  • Python and JavaScript SDKs work unchanged
  • Dynamic GPU allocation across models
  • Priority queue system for optimal throughput
  • Models pre-loaded in VRAM for instant inference
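
Because the deployment exposes the same OpenAI-compatible REST surface, only the base URL changes in client code. A minimal sketch, assuming a placeholder private hostname and the Kokoro model (the URL, voice name, and license key below are illustrative, not fixed values):

```python
# Sketch: building a request against a private deployment's
# OpenAI-style /v1/audio/speech endpoint. The base URL, voice,
# and auth token are placeholders for illustration.
import json
import urllib.request

PRIVATE_BASE = "https://tts.internal.example.com"  # your private API server

def build_speech_request(text: str, model: str = "kokoro",
                         voice: str = "af_bella") -> urllib.request.Request:
    """Build a POST to the OpenAI-compatible speech endpoint."""
    payload = json.dumps({"model": model, "voice": voice, "input": text})
    return urllib.request.Request(
        url=f"{PRIVATE_BASE}/v1/audio/speech",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer YOUR_LICENSE_KEY"},
        method="POST",
    )

req = build_speech_request("Hello from the private cloud.")
# audio = urllib.request.urlopen(req).read()  # response body is raw audio
```

The same pattern applies to existing SDK integrations: repoint the client's base URL at the private server and leave the rest of the code unchanged.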

Built For

Healthcare

Patient-facing voice interfaces, medical dictation, clinical documentation. Keep PHI within your compliant infrastructure.

Financial Services

Voice-enabled banking, compliance call transcription, automated customer service. Data residency in your chosen region.

Government

Accessible public services, multilingual citizen communications, classified document processing on air-gapped networks.

Contact Centers

High-volume IVR systems, real-time agent assist, call transcription and analytics. Predictable cost at any scale.

Shared Cloud vs Private Cloud

                Shared Cloud                                 Private Cloud
Data isolation  Shared infrastructure, auto-deleted in 24h   Full isolation, your servers only
Pricing model   Per-character                                Flat monthly, unlimited usage
AI models       All models                                   All models + custom
Latency         Shared queue                                 Dedicated, predictable
Data residency  Our data center                              Your choice of region
SLA             Best effort                                  Custom SLA available
Support         Email                                        Dedicated account manager

Open-Source Models, No Vendor Lock-In

Every model in TTS.ai Private Cloud is open-source (MIT or Apache 2.0). If you ever stop using our service, you keep full access to the underlying models. No proprietary dependencies, no licensing traps.

Kokoro
Chatterbox
CosyVoice 2
Bark
Orpheus
GPT-SoVITS
StyleTTS2
Tortoise
OpenVoice
Piper
VITS
MeloTTS
Faster Whisper
Demucs

Private Cloud Plans

From self-hosted to fully managed. All plans include every open-source model.

Self-Hosted

Run on your own GPU hardware. We provide the Docker image and license.

$99 /month
  • Docker image with all models
  • Your GPU, your servers
  • License key validation
  • Email support
  • Unlimited usage
Get Started

Starter

Dedicated single-GPU instance managed by TTS.ai.

$499 /month
  • 1x A100 GPU
  • 5 concurrent generations
  • All models included
  • Auto-scaling
  • Email support
Get Started
Most Popular

Pro

High-throughput instance with priority queue and 20 concurrent slots.

$999 /month
  • 1x A100 GPU
  • 20 concurrent generations
  • Priority queue
  • Auto-scaling
  • Priority support
Get Started

Enterprise

Multi-GPU cluster with SLA, unlimited concurrent generations, and a dedicated account manager.

$2,499 /month
  • Multi-GPU (H100)
  • Unlimited concurrent generations
  • 99.9% SLA
  • Dedicated account manager
  • Custom deployment region
Get Started

Private Cloud FAQ

What is TTS.ai Private Cloud?

TTS.ai Private Cloud is a dedicated AI voice infrastructure deployment. You get your own GPU servers running the full TTS.ai stack — text-to-speech, voice cloning, speech-to-text, and audio processing — completely isolated from our shared platform. No data leaves your infrastructure.

Which models are included?

All open-source models available on TTS.ai are included: Kokoro, Chatterbox, CosyVoice 2, Bark, Orpheus, Dia, GLM-TTS, Spark, GPT-SoVITS, StyleTTS2, Tortoise, OpenVoice, Piper, VITS, MeloTTS, and more. Voice cloning models are also included. All models are MIT or Apache 2.0 licensed.

What GPU hardware do you deploy on?

We deploy on NVIDIA GPU servers — typically A100, H100, or L40S depending on your throughput needs. A minimum configuration for small teams is 1x A100 (80GB) supporting 5-10 concurrent requests. For high-volume production, we configure multi-GPU clusters with load balancing.

Where can Private Cloud be deployed?

We support deployment on major cloud providers (AWS, GCP, Azure, OCI) in any region, or on your own on-premises hardware. You choose the hosting location based on your compliance and latency requirements.

Is the API compatible with my existing integration?

Yes. The private cloud deployment exposes the same REST API as api.tts.ai, including the OpenAI-compatible endpoint. Your existing code, SDKs, and integrations work without changes — just point to your private API URL.

How are updates handled?

We provide regular updates including new models, security patches, and performance improvements. Updates are tested on our shared platform first, then packaged for private cloud deployment. You control when updates are applied.

How is this different from self-hosting open-source models myself?

Self-hosting individual models requires significant ML engineering: managing VRAM allocation, building inference pipelines, handling queuing, audio post-processing, and maintaining multiple model environments. TTS.ai Private Cloud provides a production-ready stack with all of this built in, plus ongoing support and updates.

How much does Private Cloud cost?

Pricing depends on the GPU configuration, number of models deployed, and support level. Contact us for a quote. There are no per-character or per-request fees — you pay only for infrastructure and support.

Ready to Deploy?

Choose a plan above or contact us for custom enterprise requirements.

Get Started Contact Sales