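Free AI Text to Speech
22+ open-source models · 100+ voices · 32+ · No account required
Everything you need for voice AI
26 tools powered by 24+ open-source AI models
22+ AI voice models
The most comprehensive collection of open-source TTS models on a single platform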
Kokoro Free
Kokoro is an 82 million parameter text-to-speech model that punches well above its weight class. Despite its tiny size, it produces remarkably natural and expressive speech. Kokoro supports multiple languages including English, Japanese, Chinese, and Korean with a variety of expressive voices. It runs incredibly fast — generating audio nearly 100x faster than real-time on a GPU.
Best for: High-quality TTS with minimal latency, streaming applications
Try for free
Piper Free
Piper is a lightweight text-to-speech engine developed by Rhasspy that uses VITS and larynx architectures. It runs entirely on CPU, making it ideal for edge devices, home automation, and applications requiring offline TTS. With over 100 voices across 30+ languages, Piper delivers natural-sounding speech at real-time speeds even on a Raspberry Pi 4.
Best for: Quick previews, accessibility, and embedded applications
Try for free
VITS Free
VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech) is a parallel end-to-end TTS method that generates more natural sounding audio than current two-stage models. It adopts variational inference augmented with normalizing flows and an adversarial training process, achieving a significant improvement in naturalness.
Best for: General-purpose text-to-speech with natural prosody
Try for free
MeloTTS Free
MeloTTS by MyShell.ai is a multilingual TTS library supporting English (American, British, Indian, Australian), Spanish, French, Chinese, Japanese, and Korean. It is extremely fast, processing text at near real-time speed on CPU alone. MeloTTS is designed for production use and supports both CPU and GPU inference.
Best for: Production applications that need fast, multilingual TTS
Try for free
Bark Standard
Transformer-based text-to-audio model that generates realistic speech, music, and sound effects.
Developer: Suno · License: MIT
Try it
Bark Small Standard
Lighter version of Bark with faster inference and lower memory usage.
Developer: Suno · License: MIT
Try it
CosyVoice 2 Standard
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Developer: Alibaba (Tongyi Lab) · License: Apache 2.0
Try it
Parler TTS Standard
Describe the voice you want in natural language and Parler generates matching speech.
Developer: Hugging Face · License: Apache 2.0
Try it
IndexTTS-2 Standard
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Developer: Index Team · License: Apache 2.0
Try it
Spark TTS Standard
Voice cloning TTS with controllable emotion and speaking style via prompts.
Developer: SparkAudio · License: Apache 2.0
Try it
GPT-SoVITS Standard
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Developer: RVC-Boss · License: MIT
Try it
Orpheus Standard
Human-level emotional TTS model trained on 100K hours of speech data.
Developer: Canopy Labs · License: Llama 3.2 Community
Try it
Qwen3 TTS Standard
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Developer: Alibaba (Qwen) · License: Apache 2.0
Try it
CosyVoice 2
Alibaba's scalable streaming TTS with human-parity naturalness and near-zero latency.
Languages: en, zh, ja, ko, fr, de, it, es
Clone voice
IndexTTS-2
Zero-shot TTS with fine-grained emotion control and high expressiveness.
Languages: en, zh
Clone voice
Spark TTS
Voice cloning TTS with controllable emotion and speaking style via prompts.
Languages: en, zh
Clone voice
GPT-SoVITS
Few-shot voice cloning TTS that replicates any voice from just 5 seconds of audio.
Languages: en, zh, ja, ko
Clone voice
OpenVoice
Instant voice cloning with granular control over style, emotion, and accent.
Languages: en, zh, ja, ko, fr, de, es, it
Clone voice
Qwen3 TTS
Alibaba's multilingual TTS with voice cloning, preset voices, and voice design from text.
Languages: en, zh, ja, ko, de, fr, ru, pt, es, it
Clone voice
Developer-first API
OpenAI-compatible REST API. One endpoint, 22+ models. Streaming support for real-time applications.
- OpenAI-compatible format
- Streaming TTS for real-time applications
- Batch processing for large jobs
- Webhook notifications
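import requests

# Send one synthesis request; the response body is the audio file.
response = requests.post(
    "https://api.tts.ai/v1/tts/",
    headers={"Authorization": "Bearer sk-tts-xxx"},  # replace with your API key
    json={
        "model": "kokoro",
        "text": "Hello from TTS.ai!",
        "voice": "af_bella",
    },
)
response.raise_for_status()  # fail on an HTTP error instead of saving it as audio

# Save the returned audio bytes.
with open("output.mp3", "wb") as f:
    f.write(response.content)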
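For longer texts it can help to write the audio to disk as it arrives rather than holding the whole response in memory. The following is a minimal sketch using only the Python standard library. It reuses the endpoint and JSON fields from the example above; the helper names (`build_request`, `copy_stream`, `synthesize_to_file`), the chunk size, and the timeout are illustrative assumptions, not part of the TTS.ai API. Whether the server sends audio incrementally is also an assumption.

```python
import json
import urllib.request

# Endpoint taken from the example above.
API_URL = "https://api.tts.ai/v1/tts/"


def build_request(text, api_key, model="kokoro", voice="af_bella"):
    """Build the POST request; field names mirror the example above."""
    if not text.strip():
        raise ValueError("text must not be empty")
    body = json.dumps({"model": model, "text": text, "voice": voice})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def copy_stream(source, path, chunk_size=8192):
    """Copy a readable binary stream to path in chunks; return bytes written."""
    total = 0
    with open(path, "wb") as f:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty read means end of stream
                break
            f.write(chunk)
            total += len(chunk)
    return total


def synthesize_to_file(text, path, api_key, **options):
    """Send the request and write the audio to disk as it is received.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    so an error page is never saved as audio.
    """
    request = build_request(text, api_key, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return copy_stream(response, path)
```

A call such as `synthesize_to_file("Hello from TTS.ai!", "output.mp3", "sk-tts-xxx")` would then return the number of bytes written, assuming the key is valid and the endpoint responds as in the example above.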
Simple, transparent pricing
Start for free and scale as you grow