
Voice Library

Browse, preview, and compare 100+ AI voices across 20+ models. Find the perfect voice for your project.

282+ Voices

Featured Voices

English Female: CosyVoice3

Search Voices

Filters: Model, Language, Gender, Quality Tier

200 voices found

Default: Chatterbox Turbo, Standard, English, Neutral
Default: Indic Parler TTS, Standard, English, Neutral
Default: MOSS-TTS Nano, Standard, English, Neutral
Portuguese Speaker: Bark Small, Standard, Portuguese, Neutral
Aditi (Bengali): Indic Parler TTS, Standard, Bengali, Female
Amit (Assamese): Indic Parler TTS, Standard, Assamese, Male
Anjali (Malayalam): Indic Parler TTS, Standard, Malayalam, Female
Anu (Kannada): Indic Parler TTS, Standard, Kannada, Female
Arjun (Bengali): Indic Parler TTS, Standard, Bengali, Male
Divjot (Punjabi): Indic Parler TTS, Standard, Punjabi, Male
Gurpreet (Punjabi): Indic Parler TTS, Standard, Punjabi, Female
Harish (Malayalam): Indic Parler TTS, Standard, Malayalam, Male
Jaya (Tamil): Indic Parler TTS, Standard, Tamil, Female
Kavya (Tamil): Indic Parler TTS, Standard, Tamil, Female
Lalitha (Telugu): Indic Parler TTS, Standard, Telugu, Female
Neha (Gujarati): Indic Parler TTS, Standard, Gujarati, Female
Sanjay (Marathi): Indic Parler TTS, Standard, Marathi, Male
Sita (Assamese): Indic Parler TTS, Standard, Assamese, Female
Sunita (Marathi): Indic Parler TTS, Standard, Marathi, Female
Suresh (Kannada): Indic Parler TTS, Standard, Kannada, Male
Yash (Gujarati): Indic Parler TTS, Standard, Gujarati, Male
Chinese: MOSS-TTS Nano, Standard, Chinese, Neutral
Italian: MOSS-TTS Nano, Standard, Italian, Neutral
Japanese: MOSS-TTS Nano, Standard, Japanese, Neutral

Voices by AI Model

Each TTS model has its own set of voices with unique characteristics. Some models support voice cloning, allowing you to use any voice as a reference.

Bark (28 voices, Standard)

Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.

English Female 1: English
English Female 2: English
English Female 3: English
English Female 4: English
English Male 1: English
English Male 2: English
English Male 3: English
English Male 4: English

View all 28 Bark Voices

Bark Small (15 voices, Standard)

Lighter version of Bark with faster inference and lower memory usage.

English Female 1: English
English Female 2: English
English Male 1: English
Chinese Speaker: Chinese
French Speaker: French
German Speaker: German
Hindi Speaker: Hindi
Italian Speaker: Italian

View all 15 Bark Small Voices

Chatterbox (1 voice, Premium)

State-of-the-art zero-shot voice cloning with emotion control from Resemble AI.

Default: English

Chatterbox Turbo (1 voice, Standard)

Faster Chatterbox with sub-200ms latency and paralinguistic tags for laughs, coughs, and more.

Default: English

CosyVoice 2 (10 voices, Standard)

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

English Female: English
English Male: English
Chinese Female: Chinese
Chinese Male: Chinese
French Female: French
German Female: German
Italian Female: Italian
Japanese Female: Japanese

View all 10 CosyVoice 2 Voices

CosyVoice3 (11 voices, Standard)

Next-generation multilingual TTS with bi-streaming, emotion control, and zero-shot voice cloning.

English Female: English
English Male: English
Chinese Female: Chinese
Chinese Male: Chinese
French Female: French
German Female: German
Italian Female: Italian
Japanese Female: Japanese

View all 11 CosyVoice3 Voices

Darwin TTS (4 voices, Standard)

Cross-modal Qwen3-TTS variant with FFN weights blended from the Qwen3-1.7B language model for sharper multilingual cloning.

Default: English
Default (Chinese): Chinese
Default (Japanese): Japanese
Default (Korean): Korean

Dia TTS (2 voices, Standard)

Multi-speaker dialog generation model that creates natural conversations between speakers.

Speaker 1: English
Speaker 2: English

GPT-SoVITS (4 voices, Standard)

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

English Default: English
Default: Chinese
Japanese Default: Japanese
Korean Default: Korean

IndexTTS-2 (2 voices, Standard)

Zero-shot TTS with fine-grained emotion control and high expressiveness.

Default: English
Chinese Default: Chinese

Indic Parler TTS (21 voices, Standard)

High-quality speech for 8+ Indian languages with natural-language voice control.

Default: English
Aditi (Bengali): Bengali
Amit (Assamese): Assamese
Anjali (Malayalam): Malayalam
Anu (Kannada): Kannada
Arjun (Bengali): Bengali
Debjani (Odia): Odia
Divjot (Punjabi): Punjabi

View all 21 Indic Parler TTS Voices

Kani TTS 2 (1 voice, Standard)

Ultra-lightweight 400M English TTS model running in just 3GB VRAM.

Default: English

Kitten TTS (8 voices, Free)

Ultra-lightweight TTS under 80MB. Runs on CPU without GPU.

Bella: English
Bruno: English
Hugo: English
Jasper: English
Kiki: English
Leo: English
Luna: English
Rosie: English

Kokoro (26 voices, Free)

Lightweight 82M parameter model delivering studio-quality speech with blazing-fast inference.

Adam: English
Bella: English
Emma (British): English
George (British): English
Heart: English
Isabella (British): English
Lewis (British): English
Michael: English

View all 26 Kokoro Voices

MeloTTS (7 voices, Free)

High-quality multilingual text-to-speech that runs on CPU with minimal latency.

English British: English
English US: English
Chinese
French
Japanese
Korean
Spanish

Ming-Omni TTS (2 voices, Free)

Compact 0.5B omni-modal speech model from inclusionAI with high-fidelity 44.1kHz output and zero-shot voice cloning.

Default: English
Default (Chinese): Chinese

MOSS-TTS Nano (8 voices, Standard)

Tiny 100M MOSS-TTS variant: same architecture, 80x smaller, free-tier latency.

Default: English
Arabic
Chinese
French
German
Italian
Japanese
Korean

MOSS-TTSD (1 voice, Standard)

Multi-speaker dialogue continuation model that generates podcast-style conversations with up to 5 speakers and 60 minutes of coherent audio.

Default Speaker: English

OpenVoice (1 voice, Premium)

Instant voice cloning with granular control over style, emotion, and accent.

Default: English

Orpheus (8 voices, Standard)

Human-level emotional TTS model trained on 100K hours of speech data.

Dan: English
Jess: English
Leah: English
Leo: English
Mia: English
Tara: English
Zac: English
Zoe: English

OuteTTS (1 voice, Free)

LLM-based TTS that runs on CPU, GPU, or browser via llama.cpp and Transformers.js.

Female 1 (Neutral): English

Parler TTS (1 voice, Standard)

Describe the voice you want in natural language and Parler generates matching speech.

Default: English

Piper (7 voices, Free)

A fast, local neural text-to-speech system optimized for Raspberry Pi and embedded devices.

Alan (UK): English
Alba (UK): English
Amy (US): English
Jenny (UK): English
Joe (US): English
Lessac (US): English
Ryan (US): English

Pocket TTS (8 voices, Free)

Lightweight 100M parameter model by Kyutai with voice cloning from a single sample.

Alba: English
Azelma: English
Cosette: English
Eponine: English
Fantine: English
Javert: English
Jean: English
Marius: English

Qwen3 TTS (6 voices, Standard)

Alibaba's multilingual TTS with preset voices and voice design from text.

Aiden: English
Dylan: English
Eric: English
Ryan: English
Serena: English
Vivian: English

Sesame CSM (2 voices, Premium)

Conversational speech model generating natural dialogue with appropriate timing and emotion.

Speaker 0: English
Speaker 1: English

Spark TTS (1 voice, Standard)

Voice cloning TTS with controllable emotion and speaking style via prompts.

Default: English

StyleTTS 2 (1 voice, Premium)

Human-level text-to-speech through style diffusion and adversarial training.

Default: English

Tortoise TTS (1 voice, Premium)

Multi-voice text-to-speech focused on quality, with an autoregressive architecture.

Random: English

VibeVoice (4 voices, Standard)

Microsoft's multi-speaker long-form TTS generating up to 90 minutes with 4 distinct speakers.

Speaker 1: English
Speaker 2: English
Speaker 3: English
Speaker 4: English

VITS (1 voice, Free)

Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech.

Default: English

VoxCPM (1 voice, Standard)

Tokenizer-free TTS producing 44.1kHz audio with context-aware paragraph consistency.

Default: English

KhanomTan TTS (5 voices, Standard)

Thai-first text-to-speech with a choice of speaker voices.

Bernard: Thai
Default: Thai
Kerstin: Thai
Linda: Thai
Thorsten: Thai

Understanding AI Voices

Voice Quality Tiers

TTS.ai offers voices across three quality tiers. Free-tier voices from Piper, VITS, MeloTTS, and Kokoro deliver fast, good-quality synthesis at no cost. Standard-tier voices from models like Bark and CosyVoice 2 offer more natural prosody and emotion. Premium-tier voices from OpenVoice, Chatterbox, and StyleTTS 2 provide the most realistic, human-like speech available in open-source TTS.

Multilingual Voices

Many voices support multiple languages. Some models like CosyVoice 2 and GPT-SoVITS support cross-lingual synthesis, where a voice trained in one language can speak naturally in another. The language filter above lets you find voices that natively support your target language, ensuring the best pronunciation and intonation.

Voice Cloning

Some models support voice cloning, which means you can use any voice as a reference to create speech that sounds like that person. Upload a short audio sample (10-30 seconds) and the model will adapt to match the voice characteristics. Models that support cloning include GPT-SoVITS, CosyVoice 2, and Chatterbox.

Choosing the Right Voice

The best voice depends on your use case. For audiobooks and podcasts, use premium voices with natural prosody. For game characters, explore diverse voices across models. For accessibility and screen readers, choose clear, well-paced voices. For quick prototyping, free-tier voices offer instant results with no character cost. Preview each voice with the play button before making your choice.

Text to Speech by Language

Arabic, Assamese, Bengali, Bulgarian, Catalan, Chinese (Mandarin), Czech, Danish, Dutch, English, Finnish, French, Georgian, German, Greek, Gujarati, Hindi, Hungarian, Icelandic, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Luxembourgish, Malayalam, Marathi, Nepali, Norwegian, Odia, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tamil, Telugu, Thai, Turkish, Ukrainian, Vietnamese, Welsh

Voices by AI Model

Bark, Bark Small, Chatterbox, Chatterbox Turbo, CosyVoice 2, CosyVoice3, Darwin TTS, Dia TTS, GPT-SoVITS, IndexTTS-2, Indic Parler TTS, Kani TTS 2, KhanomTan TTS, Kitten TTS, Kokoro, MOSS-TTS Nano, MOSS-TTSD, MeloTTS, Ming-Omni TTS, NAMAA Saudi TTS, OpenVoice, Orpheus, OuteTTS, Parler TTS, Piper, Pocket TTS, Qwen3 TTS, Sesame CSM, Spark TTS, StyleTTS 2, Tortoise TTS, VITS, VibeVoice, VieNeu-TTS-v2, VoxCPM

Frequently Asked Questions

How many voices does TTS.ai offer?

TTS.ai offers 100+ AI voices across 24 text-to-speech models. Voices span multiple languages, genders, accents, and speaking styles. New voices are added regularly as we expand our model library.

Can I preview a voice before using it?

Yes, many voices have audio previews you can listen to directly on this page. Click the play button next to any voice with a preview to hear a sample. You can also test any voice on the Text to Speech page with your own text.

How do I find a specific voice?

Use the filter controls at the top of the page to narrow voices by model, language, or gender. You can combine filters to find exactly the voice you need, for example female English voices from the Kokoro model.
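Combining filters amounts to keeping only the voices that match every selected criterion. The following is a minimal sketch of that logic, not the site's implementation: the `Voice` record and the `filter_voices` helper are hypothetical names, and the sample data is a handful of entries copied from the listing on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    name: str
    model: str
    tier: str
    language: str
    gender: str


# Sample entries taken from the voice listing on this page.
VOICES = [
    Voice("Aditi (Bengali)", "Indic Parler TTS", "Standard", "Bengali", "Female"),
    Voice("Arjun (Bengali)", "Indic Parler TTS", "Standard", "Bengali", "Male"),
    Voice("Jaya (Tamil)", "Indic Parler TTS", "Standard", "Tamil", "Female"),
    Voice("Kavya (Tamil)", "Indic Parler TTS", "Standard", "Tamil", "Female"),
    Voice("Portuguese Speaker", "Bark Small", "Standard", "Portuguese", "Neutral"),
    Voice("Chinese", "MOSS-TTS Nano", "Standard", "Chinese", "Neutral"),
]


def filter_voices(voices, model=None, language=None, gender=None):
    """Return the voices that match every filter that is not None."""
    return [
        v
        for v in voices
        if (model is None or v.model == model)
        and (language is None or v.language == language)
        and (gender is None or v.gender == gender)
    ]


# Combine two filters: Tamil voices that are also female.
tamil_female = filter_voices(VOICES, language="Tamil", gender="Female")
print([v.name for v in tamil_female])  # ['Jaya (Tamil)', 'Kavya (Tamil)']
```

Leaving a filter as `None` means "any value", which mirrors leaving a dropdown unset.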

What is the difference between Free, Standard, and Premium voices?

Free voices (Kokoro, Piper, VITS, MeloTTS) do not consume any characters. Standard voices (Bark, CosyVoice 2, Dia) consume 2x characters. Premium voices (Chatterbox, Tortoise) consume 4x characters and offer the highest quality.
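As a worked example of the tier multipliers, the sketch below estimates how many characters a piece of text would consume at each tier. It assumes that usage is the text length multiplied by the tier multiplier and that length is counted with Python's `len`; the exact counting rule used by the service is not stated here, and `characters_charged` is a hypothetical helper name.

```python
# Multipliers as stated above: Free = 0x, Standard = 2x, Premium = 4x.
TIER_MULTIPLIER = {"Free": 0, "Standard": 2, "Premium": 4}


def characters_charged(text: str, tier: str) -> int:
    """Estimate characters consumed, assuming len(text) * tier multiplier."""
    return len(text) * TIER_MULTIPLIER[tier]


sample = "Hello, world!"  # 13 characters
for tier in ("Free", "Standard", "Premium"):
    print(tier, characters_charged(sample, tier))
# Free 0
# Standard 26
# Premium 52
```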

Which voice sounds the most natural?

Kokoro (free tier) is rated 5/5 for quality and is the most natural-sounding free option. For premium quality, Chatterbox and Tortoise offer exceptional naturalness with voice cloning support. Listen to the previews to judge which voice suits your needs best.

Can I use the voices commercially?

Yes, all voices can be used commercially. Our models use open-source licenses (MIT, Apache 2.0). The generated audio is yours to use in videos, podcasts, apps, games, and any other commercial application.

Which languages are supported?

Our voice library covers 30+ languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Russian, Hindi, Dutch, Polish, Turkish, and many more. Language availability varies by model.

Can I create my own custom voice?

Yes, use our Voice Cloning tool to create a custom voice from just 5-30 seconds of reference audio. Cloned voices appear in your account under "My Voices" and can be reused for future text-to-speech generations.

How do I choose the right voice for my project?

Consider your use case: for audiobooks, choose expressive voices like those from Bark or Chatterbox. For apps and IVR, choose clear voices from Kokoro or MeloTTS. For multilingual content, use CosyVoice 2 or GPT-SoVITS. Preview several options to find the best fit.

Are different accents available?

Yes, several models offer accent varieties. MeloTTS provides American, British, Indian, and Australian English accents. Other models have regional voice variants for Spanish, French, Portuguese, and Chinese. Filter by language to explore accent options.

Can I use these voices through the API?

Yes, all voices are accessible through our REST API. Specify the model and voice ID in your API request to generate speech with any voice programmatically. See our API Documentation page for code examples and voice ID references.
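To illustrate what "specify the model and voice ID" could look like in practice, here is a rough sketch that builds (but does not send) a JSON POST request with Python's standard library. Everything specific in it is an assumption for illustration only: the URL `https://api.example.com/v1/tts` is a placeholder, the field names `model`, `voice`, and `text`, the bearer-token header, and the IDs `kokoro` and `bella` are hypothetical. Consult the API Documentation page for the real endpoint, parameters, and voice IDs.

```python
import json
import urllib.request

# Placeholder URL, NOT the real endpoint; see the API Documentation page.
API_URL = "https://api.example.com/v1/tts"


def build_tts_request(text, model, voice_id, api_key):
    """Build a POST request carrying the model and voice ID.

    The JSON field names and the Authorization scheme are assumptions.
    """
    payload = {"model": model, "voice": voice_id, "text": text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical model and voice identifiers, used only to show the shape.
req = build_tts_request("Hello from the voice library.", "kokoro", "bella", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Once the real endpoint and credentials are in place, the request could be sent with `urllib.request.urlopen(req)`; the sketch stops short of that so it runs without network access.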

How often are new voices added?

We regularly add new voices as we integrate additional TTS models and expand existing ones. Follow our updates to stay informed about new voice additions, model improvements, and language expansions.

Record, Enhance, and Transform Your Voice

Use the Voice Recorder with our full suite of AI audio tools. Clone your voice, transcribe speech, enhance quality, and more.