Dia TTS
A 1.6B-parameter model purpose-built for generating natural multi-speaker dialogue, not just single-voice narration.
Dia by Nari Labs is a 1.6-billion-parameter text-to-speech model designed from the ground up for dialogue rather than monologue. It generates conversations between two speakers with realistic turn-taking, prosody, and emotional expression, producing audio that sounds like a real exchange instead of two voices read separately. Architecturally it pairs an autoregressive transformer with the Descript Audio Codec (DAC) for waveform generation. Dia is a strong fit for podcast-style content, scripted audiobook dialogue, and conversational scenes, and is released under Apache 2.0. Generation is more compute-intensive than with single-voice models, so Dia favors quality over raw speed.
At a glance
- Developer: Nari Labs
- License: Apache 2.0
- Tier: standard
- Speed: medium
- Voice cloning: No
- Languages: English
- Max characters: 800
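Because input is capped at 800 characters, a longer script has to be split before it is sent for generation. The sketch below shows one way to do that while keeping each speaker turn intact. It assumes the limit applies to the full text of a single request and that each turn is one line of text; the `[S1]`/`[S2]` speaker tags in the sample and the `chunk_turns` helper are illustrative assumptions, not something this page specifies.

```python
MAX_CHARS = 800  # character limit listed above; assumed to apply per request


def chunk_turns(turns, max_chars=MAX_CHARS):
    """Group dialogue turns into chunks of at most max_chars characters.

    Turns are joined with newlines and are never split mid-turn.
    """
    chunks = []
    current = ""
    for turn in turns:
        if len(turn) > max_chars:
            # A single oversized turn cannot fit in any chunk.
            raise ValueError(f"turn exceeds {max_chars} characters: {turn[:30]!r}")
        candidate = turn if not current else current + "\n" + turn
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Close the current chunk and start a new one with this turn.
            chunks.append(current)
            current = turn
    if current:
        chunks.append(current)
    return chunks


# Hypothetical two-speaker script; the tag format is an assumption.
script = [
    "[S1] Welcome back to the show.",
    "[S2] Thanks, it's good to be here.",
    "[S1] Let's get started.",
]
for chunk in chunk_turns(script):
    print(len(chunk), "characters")
```

Splitting on turn boundaries rather than at a fixed character offset keeps each request a complete exchange, which matters for a model built around turn-taking.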
Dia TTS voices
Best for
Podcasts, audiobook dialogues, conversational content