English Voices

Browse English AI voices. Preview, compare, and generate speech.

223+ Voices

98 voices found

Voices by AI Model

Each TTS model has its own set of voices with unique characteristics. Some models support voice cloning, allowing you to use any voice as a reference.

Bark (10 Voices, Standard)

Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.

English voices (8 of 10 shown): English Female 1, English Female 2, English Female 3, English Female 4, English Male 1, English Male 2, English Male 3, English Male 4

Bark Small (3 Voices, Standard)

Lighter version of Bark with faster inference and lower memory usage.

English voices: English Female 1, English Female 2, English Male 1

Chatterbox (1 Voice, Premium)

State-of-the-art zero-shot voice cloning with emotion control from Resemble AI.

English voice: Default

Chatterbox Turbo (1 Voice, Standard)

Faster Chatterbox with sub-200ms latency and paralinguistic tags for laughs, coughs, and more.

English voice: Default

CosyVoice 2 (2 Voices, Standard)

Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.

English voices: English Female, English Male

CosyVoice3 (2 Voices, Standard)

Next-generation multilingual TTS with bi-streaming, emotion control, and zero-shot voice cloning.

English voices: English Female, English Male

Dia TTS (2 Voices, Standard)

Multi-speaker dialog generation model that creates natural conversations between speakers.

English voices: Speaker 1, Speaker 2

Dia 2 (1 Voice, Standard)

Streaming-first conversational TTS with multi-speaker dialogue and paralinguistic cues.

English voice: Default

GLM-TTS (2 Voices, Standard)

Achieves the lowest character error rate among open-source TTS models.

English voices: Jiayan (English), Jiayan Alt (English)

GPT-SoVITS (1 Voice, Standard)

Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.

English voice: English Default

IndexTTS-2 (1 Voice, Standard)

Zero-shot TTS with fine-grained emotion control and high expressiveness.

English voice: Default

Kitten TTS (8 Voices, Free)

Ultra-lightweight TTS under 80 MB. Runs on CPU without a GPU.

English voices: Bella, Bruno, Hugo, Jasper, Kiki, Leo, Luna, Rosie

Kokoro (11 Voices, Free)

Lightweight 82M-parameter model delivering studio-quality speech with fast inference.

English voices (8 of 11 shown): Adam, Bella, Emma (British), George (British), Heart, Isabella (British), Lewis (British), Michael

MegaTTS3 (1 Voice, Premium)

ByteDance's sparse alignment TTS with adjustable intelligibility vs. speaker similarity.

English voice: Default

MeloTTS (2 Voices, Free)

High-quality multilingual text-to-speech that runs on CPU with minimal latency.

English voices: English British, English US

MOSS-TTS (1 Voice, Premium)

Ultra-long 20-language TTS supporting up to 1 hour of continuous generation with phoneme-level control.

English voice: Default

OpenVoice (1 Voice, Premium)

Instant voice cloning with granular control over style, emotion, and accent.

English voice: Default

Orpheus (8 Voices, Standard)

Human-level emotional TTS model trained on 100K hours of speech data.

English voices: Dan, Jess, Leah, Leo, Mia, Tara, Zac, Zoe

OuteTTS (1 Voice, Free)

LLM-based TTS that runs on CPU, GPU, or browser via llama.cpp and Transformers.js.

English voice: Female 1 (Neutral)

Parler TTS (1 Voice, Standard)

Describe the voice you want in natural language and Parler generates matching speech.

English voice: Default

Piper (7 Voices, Free)

A fast, local neural text-to-speech system optimized for Raspberry Pi and embedded devices.

English voices: Alan (UK), Alba (UK), Amy (US), Jenny (UK), Joe (US), Lessac (US), Ryan (US)

Pocket TTS (8 Voices, Free)

Lightweight 100M-parameter model by Kyutai with voice cloning from a single sample.

English voices: Alba, Azelma, Cosette, Eponine, Fantine, Javert, Jean, Marius

Qwen3 TTS (6 Voices, Standard)

Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.

English voices: Aiden, Dylan, Eric, Ryan, Serena, Vivian

Sesame CSM (2 Voices, Premium)

Conversational speech model generating natural dialogue with appropriate timing and emotion.

English voices: Speaker 0, Speaker 1

Spark TTS (1 Voice, Standard)

Voice cloning TTS with controllable emotion and speaking style via prompts.

English voice: Default

StyleTTS 2 (1 Voice, Premium)

Human-level text-to-speech through style diffusion and adversarial training.

English voice: Default

TADA (6 Voices, Standard)

Zero-hallucination TTS with text-acoustic dual alignment, 5x faster than comparable LLM TTS.

English voices: Aria, Bella, James, Marcus, Oliver, Sophia

Tortoise TTS (1 Voice, Premium)

Multi-voice text-to-speech focused on quality with autoregressive architecture.

English voice: Random

VibeVoice (4 Voices, Standard)

Microsoft's multi-speaker long-form TTS generating up to 90 minutes with 4 distinct speakers.

English voices: Speaker 1, Speaker 2, Speaker 3, Speaker 4

VITS (1 Voice, Free)

Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech.

English voice: Default

VoxCPM (1 Voice, Standard)

Tokenizer-free TTS producing 44.1 kHz audio with context-aware paragraph consistency.

English voice: Default

Understanding AI Voices

Voice Quality Tiers

TTS.ai offers voices across three quality tiers. Free-tier voices from Piper, VITS, and MeloTTS deliver fast, good-quality synthesis at no cost. Standard-tier voices from models like Bark and CosyVoice 2 offer more natural prosody and emotion. Premium-tier voices from OpenVoice, Chatterbox, and StyleTTS 2 provide the most realistic, human-like speech available in open-source TTS.

Multilingual Voices

Many voices support multiple languages. Some models like CosyVoice 2 and GPT-SoVITS support cross-lingual synthesis, where a voice trained in one language can speak naturally in another. The language filter above lets you find voices that natively support your target language, ensuring the best pronunciation and intonation.

Voice Cloning

Some models support voice cloning, which means you can use any voice as a reference to create speech that sounds like that person. Upload a short audio sample (10-30 seconds) and the model will adapt to match the voice characteristics. Models that support cloning include GPT-SoVITS, CosyVoice 2, and Chatterbox.

Choosing the Right Voice

The best voice depends on your use case. For audiobooks and podcasts, use premium voices with natural prosody. For game characters, explore diverse voices across models. For accessibility and screen readers, choose clear, well-paced voices. For quick prototyping, free-tier voices offer instant results with no character cost. Preview each voice with the play button before making your choice.

Frequently Asked Questions

TTS.ai offers 100+ AI voices across 24 text-to-speech models. Voices span multiple languages, genders, accents, and speaking styles. New voices are added regularly as we expand our model library.

Many voices have audio previews you can listen to directly on this page. Click the play button next to any voice with a preview to hear a sample. You can also test any voice on the Text to Speech page with your own text.

Use the filter controls at the top of the page to narrow voices by model, language, or gender. You can combine filters to find exactly the voice you need, for example female English voices from the Kokoro model.
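Combining filters amounts to keeping only the voices that satisfy every active criterion. The following is a minimal Python sketch of that logic, not TTS.ai code: the `Voice` record, its field names, and `filter_voices` are hypothetical. The sample entries come from the catalog on this page, with gender read off the voice names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Voice:
    # Hypothetical record; field names are illustrative only.
    name: str
    model: str
    language: str
    gender: str


# A few entries from the catalog on this page.
CATALOG = [
    Voice("English Female 1", "Bark", "English", "female"),
    Voice("English Female 2", "Bark", "English", "female"),
    Voice("English Male 1", "Bark", "English", "male"),
    Voice("English Female", "CosyVoice 2", "English", "female"),
    Voice("English Male", "CosyVoice 2", "English", "male"),
]


def filter_voices(
    voices: List[Voice],
    model: Optional[str] = None,
    language: Optional[str] = None,
    gender: Optional[str] = None,
) -> List[Voice]:
    """Keep voices that match every filter that is set (None means 'any')."""
    wanted = {"model": model, "language": language, "gender": gender}
    return [
        v
        for v in voices
        if all(
            value is None or getattr(v, field).lower() == value.lower()
            for field, value in wanted.items()
        )
    ]


matches = filter_voices(CATALOG, model="Bark", language="English", gender="female")
print([v.name for v in matches])  # ['English Female 1', 'English Female 2']
```

Leaving a filter unset matches every value for that field, so calling `filter_voices(CATALOG)` with no filters returns the whole list.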

Free voices (Kokoro, Piper, VITS, MeloTTS) consume no characters. Standard voices (Bark, CosyVoice 2, Dia) cost 2x characters. Premium voices (Chatterbox, Tortoise) cost 4x characters and offer the highest quality.
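As a worked example of these multipliers, the sketch below estimates what a request would cost in characters. It assumes the cost is the text length times the tier multiplier and that every character counts once, whitespace and punctuation included. The page does not say how characters are counted, and `characters_charged` is an illustrative name.

```python
# Tier multipliers as stated on this page: free 0x, standard 2x, premium 4x.
TIER_MULTIPLIER = {"free": 0, "standard": 2, "premium": 4}


def characters_charged(text: str, tier: str) -> int:
    """Estimate characters deducted for `text` on the given tier.

    Assumption: cost = len(text) * multiplier. Actual billing may count
    characters differently.
    """
    key = tier.lower()
    if key not in TIER_MULTIPLIER:
        raise ValueError(f"unknown tier: {tier!r}")
    return len(text) * TIER_MULTIPLIER[key]


text = "Hello, world!"  # 13 characters
for tier in ("free", "standard", "premium"):
    print(tier, characters_charged(text, tier))
# free 0
# standard 26
# premium 52
```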

Kokoro (free tier) is rated 5/5 for quality and is the most natural-sounding free option. For premium quality, Chatterbox and Tortoise offer exceptional naturalness with voice cloning support. Listen to the previews to judge which voice suits your needs best.

All voices can be used commercially. Our models use open-source licenses (MIT, Apache 2.0). The generated audio is yours to use in videos, podcasts, apps, games, and any other commercial application.

Our voice library covers 30+ languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, Russian, Hindi, Dutch, Polish, Turkish, and many more. Language availability varies by model.

Use our Voice Cloning tool to create a custom voice from just 5-30 seconds of reference audio. Cloned voices appear in your account under "My Voices" and can be reused for future text-to-speech generations.

Consider your use case: for audiobooks, choose expressive voices like those from Bark or Chatterbox. For apps and IVR, choose clear voices from Kokoro or MeloTTS. For multilingual content, use CosyVoice 2 or GPT-SoVITS. Preview several options to find the best fit.

Several models offer accent varieties. MeloTTS provides American, British, Indian, and Australian English accents. Other models have regional voice variants for Spanish, French, Portuguese, and Chinese. Filter by language to explore accent options.

All voices are accessible through our REST API. Specify the model and voice ID in your API request to generate speech with any voice programmatically. See our API Documentation page for code examples and voice ID references.
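To illustrate the shape of such a request, here is a hedged Python sketch that builds a JSON POST carrying a model and a voice ID, without sending it. Everything specific in it is an assumption: the endpoint URL is a placeholder, and the field names (`model`, `voice`, `text`), the bearer-token header, and the identifiers `kokoro` and `bella` are hypothetical. The real values are in the API Documentation page.

```python
import json
import urllib.request

# Placeholder endpoint, NOT the real TTS.ai URL.
API_URL = "https://api.example.com/v1/tts"


def build_tts_request(
    api_key: str, model: str, voice_id: str, text: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST naming a model and a voice.

    The JSON field names and the auth scheme are assumptions for illustration.
    """
    payload = {"model": model, "voice": voice_id, "text": text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_tts_request("YOUR_API_KEY", "kokoro", "bella", "Hello from the API.")
print(request.get_method(), request.full_url)
print(json.loads(request.data))
# To actually send it, pass `request` to urllib.request.urlopen once the
# real endpoint and field names are filled in.
```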

We regularly add new voices as we integrate additional TTS models and expand existing ones. Follow our updates to stay informed about new voice additions, model improvements, and language expansions.
