Bark TTS

Suno's transformer-based text-to-audio model that generates speech plus laughter, sighs, music, and sound effects.

Bark comes from Suno and takes a different approach from most TTS systems: it is a GPT-style transformer trained as a text-to-audio model rather than a pure text-to-speech one. Because it generates audio tokens directly, it can produce nonverbal sounds (laughing, sighing, crying) as well as background music and sound effects alongside the spoken words. It ships with more than 100 speaker presets and handles 13 languages, including English, Chinese, French, German, Hindi, Japanese, and Korean. The trade-off is speed and length: at 350M parameters it runs slowly (roughly 15 seconds per clip) and accepts at most 200 characters per request, so it shines for short, emotive, creative audio rather than long narration.
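For readers who want to try this outside the hosted page, the following is a minimal sketch using the open-source `bark` Python package. It assumes `bark` and `scipy` are installed. `build_prompt`, `synthesize`, and the output file name are hypothetical helpers written for illustration. The preset ID `v2/en_speaker_6` comes from the upstream Bark preset library and may not correspond to any voice listed on this page. The 200-character check mirrors this page's cap and is not a limit enforced by the package itself. The cue tags are the ones documented in the upstream README.

```python
from typing import Optional

MAX_CHARS = 200  # per-request cap stated on this page (assumption: not enforced by bark)

# Nonverbal cue tags documented in the upstream Bark README.
KNOWN_CUES = {"[laughter]", "[laughs]", "[sighs]", "[music]", "[gasps]", "[clears throat]"}


def build_prompt(text: str, cue: Optional[str] = None) -> str:
    """Append an optional nonverbal cue tag and enforce the length cap."""
    prompt = text.strip()
    if cue is not None:
        if cue not in KNOWN_CUES:
            raise ValueError(f"unknown cue tag: {cue!r}")
        prompt = f"{prompt} {cue}"
    if len(prompt) > MAX_CHARS:
        raise ValueError(f"prompt is {len(prompt)} chars; the cap is {MAX_CHARS}")
    return prompt


def synthesize(prompt: str,
               preset: str = "v2/en_speaker_6",   # upstream preset ID, illustrative only
               out_path: str = "bark_clip.wav") -> str:
    """Generate one clip with the bark package and write it as a WAV file."""
    # Imported lazily so build_prompt can be used without bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()  # downloads and caches model weights on first use
    audio = generate_audio(prompt, history_prompt=preset)
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path
```

Calling `synthesize(build_prompt("Well, that did not go as planned.", cue="[sighs]"))` would, under these assumptions, write a short clip in which the speaker sighs after the sentence. Output varies between runs because generation is sampled.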

At a glance

Developer: Suno
License: MIT
Tier: standard
Speed: slow
Voice cloning: No
Languages: English, Chinese, French, German, Hindi, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, Turkish
Max characters: 200

Bark AI Voices

Chinese Speaker 1: Chinese, Neutral (preset)
Chinese Speaker 2: Chinese, Neutral (preset)
English Female 1: English, Female (preset)
English Female 2: English, Female (preset)
English Female 3: English, Female (preset)
English Female 4: English, Female (preset)
English Male 1: English, Male (preset)
English Male 2: English, Male (preset)
English Male 3: English, Male (preset)
English Male 4: English, Male (preset)
English Male 5: English, Male (preset)
English Male 6: English, Male (preset)
French Speaker 1: French, Neutral (preset)
French Speaker 2: French, Neutral (preset)
German Speaker 1: German, Neutral (preset)
German Speaker 2: German, Neutral (preset)
Hindi Speaker 1: Hindi, Neutral (preset)
Italian Speaker 1: Italian, Neutral (preset)
Japanese Speaker 1: Japanese, Neutral (preset)
Japanese Speaker 2: Japanese, Neutral (preset)
Korean Speaker 1: Korean, Neutral (preset)
Korean Speaker 2: Korean, Neutral (preset)
Polish Speaker 1: Polish, Neutral (preset)
Portuguese Speaker 1: Portuguese, Neutral (preset)
Russian Speaker 1: Russian, Neutral (preset)
Spanish Speaker 1: Spanish, Neutral (preset)
Spanish Speaker 2: Spanish, Neutral (preset)
Turkish Speaker 1: Turkish, Neutral (preset)

Best for

Creative audio content, audiobooks with emotion, sound effects

Bark TTS: FAQ

Can Bark generate nonverbal sounds, music, and sound effects?
Yes. Bark is a text-to-audio model, so beyond speech it can generate nonverbal cues such as laughing, sighing, and crying, plus music and background sound effects. This is one of its defining capabilities.

Can Bark be used commercially?
Yes. Bark is MIT-licensed, which permits commercial use.

What are Bark's limitations?
Bark caps at 200 characters per request and is on the slower side (around 15 seconds per clip), so it is best suited to short, expressive snippets rather than long-form audio. It does not support voice cloning.
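
One common way to work within the 200-character cap is to split longer text into several requests. The sketch below is one possible approach, not something this page or Bark provides: `split_for_bark` and `_split_words` are hypothetical helpers that break text at sentence boundaries where possible and fall back to word boundaries for overlong sentences. Joining the resulting audio clips is left out.

```python
import re
from typing import List

MAX_CHARS = 200  # per-request cap stated on this page


def _split_words(sentence: str, limit: int) -> List[str]:
    """Break one overlong sentence into pieces of at most `limit` characters."""
    pieces: List[str] = []
    current = ""
    for word in sentence.split():
        # A single word longer than the limit is hard-cut.
        while len(word) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = word
    if current:
        pieces.append(current)
    return pieces


def split_for_bark(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        pieces = [sentence] if len(sentence) <= limit else _split_words(sentence, limit)
        for piece in pieces:
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate  # keep packing sentences into this chunk
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own request. Because every request takes around 15 seconds, the total generation time grows with the number of chunks.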