Kokoro TTS
An 82M-parameter open model from Hexgrad that delivers studio-quality speech at nearly 100x real-time.
Kokoro, built by Hexgrad, is a deliberately small 82-million-parameter model that performs well above its size class. It uses a StyleTTS 2 + ISTFTNet architecture and was trained on roughly 1,200 hours of speech, yet on a GPU it generates audio close to 100x faster than real time while staying natural and expressive. It covers English, Japanese, Chinese, French, Italian, Portuguese, Spanish, and Hindi with a varied set of expressive voicepacks, and it supports streaming. The combination of small footprint, low latency, and a permissive license has made Kokoro one of the most popular free models for high-volume and streaming use; it carries the largest share of traffic on TTS.ai.
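To make the "100x real-time" figure concrete, here is a rough back-of-the-envelope sketch. The 100x speedup is the approximate GPU figure quoted above, not a guarantee; actual throughput depends on hardware, batch size, and text. The function name `synthesis_seconds` is a hypothetical helper for illustration.

```python
def synthesis_seconds(audio_seconds: float, speedup: float = 100.0) -> float:
    """Estimate wall-clock time to generate `audio_seconds` of speech.

    `speedup` is the real-time multiple (assumed ~100x here, per the page).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# One minute of speech at ~100x real-time: roughly 0.6 s of compute.
print(synthesis_seconds(60))    # 0.6
# One hour of speech: roughly 36 s.
print(synthesis_seconds(3600))  # 36.0
```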
At a glance
- Developer: Hexgrad
- License: Apache 2.0
- Tier: free
- Speed: fast
- Voice cloning: No
- Languages: English, Japanese, Chinese, French, Italian, Portuguese, Spanish, Hindi
- Max characters: 500
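Because requests are capped at 500 characters, longer text has to be split before synthesis. The following is a minimal sketch of one way to do that, preferring sentence boundaries and falling back to word boundaries. It assumes the limit counts plain characters and that each chunk is submitted as a separate request; `split_for_tts` and `_split_words` are hypothetical helpers, not part of any Kokoro or TTS.ai API.

```python
import re

MAX_CHARS = 500  # per-request limit listed on this page


def _split_words(sentence: str, max_chars: int) -> list[str]:
    """Fallback: pack words greedily; hard-cut any single word over the limit."""
    parts: list[str] = []
    current = ""
    for word in sentence.split():
        while len(word) > max_chars:
            if current:
                parts.append(current)
                current = ""
            parts.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            parts.append(current)
            current = word
    if current:
        parts.append(current)
    return parts


def split_for_tts(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `max_chars`, preferring sentence ends."""
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A sentence longer than the limit is broken on word boundaries.
        if len(sentence) <= max_chars:
            pieces = [sentence]
        else:
            pieces = _split_words(sentence, max_chars)
        for piece in pieces:
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be synthesized in order and the audio segments concatenated. Splitting on sentence boundaries tends to keep prosody natural at the joins, which is why the sketch only cuts mid-sentence when a single sentence exceeds the limit.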
Best for
High-quality TTS with minimal latency, streaming applications