Free AI Text-to-Speech
22+ open-source models, 100+ voices, 32+. No account required.
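Everything You Need for Voice AI
26 tools powered by 24+ open-source AI models
22+ AI Voice Models
The most comprehensive collection of open-source TTS models on one platform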
Kokoro Free
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
Best for: High-quality TTS with minimal latency, streaming applications
Try for free
Piper Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and Larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Best for: Quick previews, accessibility, and embedded applications
Try for free
VITS Free
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Best for: General-purpose text-to-speech with natural prosody
Try for free
MeloTTS Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
Best for: Production applications that need fast, multilingual TTS
Try for free
Bark Standard
Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.
Developer: Suno · License: MIT
Try it
Bark Small Standard
Lighter version of Bark with faster inference and lower memory usage.
Developer: Suno · License: MIT
Try it
CosyVoice 2 Standard
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Developer: Alibaba (Tongyi Lab) · License: Apache 2.0
Try it
Parler TTS Standard
Describe the voice you want in natural language and Parler generates matching speech.
Developer: Hugging Face · License: Apache 2.0
Try it
IndexTTS-2 Standard
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Developer: Index Team · License: Apache 2.0
Try it
Spark TTS Standard
Voice cloning TTS with controllable emotion and speaking style via prompts.
Developer: SparkAudio · License: Apache 2.0
Try it
GPT-SoVITS Standard
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Developer: RVC-Boss · License: MIT
Try it
Orpheus Standard
Human-level emotional TTS model trained on 100K hours of speech data.
Developer: Canopy Labs · License: Llama 3.2 Community
Try it
Qwen3 TTS Standard
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Developer: Alibaba (Qwen) · License: Apache 2.0
Try it
CosyVoice 2
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Languages: en, zh, ja, ko, fr, de, it, es
Clone voice
Spark TTS
Voice cloning TTS with controllable emotion and speaking style via prompts.
Languages: en, zh
Clone voice
GPT-SoVITS
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Languages: en, zh, ja, ko
Clone voice
OpenVoice
Instant voice cloning with granular control over style, emotion, and accent.
Languages: en, zh, ja, ko, fr, de, es, it
Clone voice
Qwen3 TTS
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Clone voice
Developer-First API
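OpenAI-compatible REST API. One endpoint, 22+ models, and streaming support for real-time applications.
- OpenAI-compatible format
- Streaming TTS for real-time applications
- Batch processing for large jobs
- Webhook notifications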
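import requests

# Request speech from the "kokoro" model using the "af_bella" voice
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": "Bearer sk-tts-xxx"},
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
)
response.raise_for_status()  # fail on an HTTP error instead of saving an error body

# Save the returned audio bytes to disk
with open("output.mp3", "wb") as f:
    f.write(response.content)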
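For the streaming use case listed above, one way a client might consume audio incrementally is sketched here. It reuses the endpoint and JSON fields from the example; the helper names (`build_payload`, `save_chunks`, `stream_tts`) are hypothetical, and whether the server delivers audio in chunks for this request is an assumption, not documented behavior.

```python
from typing import Iterable


def build_payload(text: str, model: str = "kokoro", voice: str = "af_bella") -> dict:
    """Assemble the JSON body with the same fields as the example request."""
    return {"model": model, "text": text, "voice": voice}


def save_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` as they arrive; return total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


def stream_tts(text: str, api_key: str, path: str = "output.mp3") -> int:
    """Hypothetical helper: download the response body incrementally."""
    # Imported lazily so the helpers above work without the requests package.
    import requests

    with requests.post(
        "https://api.tts.ai/v1/tts/",
        headers={"Authorization": f"Bearer {api_key}"},
        json=build_payload(text),
        stream=True,  # read the body in pieces instead of buffering it all
        timeout=60,
    ) as response:
        response.raise_for_status()
        # Assumption: the server sends the audio as a chunked response body.
        return save_chunks(response.iter_content(chunk_size=8192), path)
```

A call such as `stream_tts("Hello from TTS.ai!", "sk-tts-xxx")` would write the audio to `output.mp3` and return the number of bytes saved. Keeping `save_chunks` separate from the network call means the same function could feed an audio player rather than a file.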
Simple, Transparent Pricing
Start for free. Scale as you grow.