Spark TTS

Voice cloning from five seconds of audio combined with prompt-based control over emotion, speed, and speaking style.

SSML Mode (Speech Synthesis Markup Language for fine-grained control)

Wrap your text in SSML tags for easier control:

<speak><prosody rate="slow">Slow speech</prosody></speak>
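The same markup can be generated programmatically. The following is a minimal sketch using only the Python standard library; the helper name `wrap_prosody` is hypothetical and nothing here calls a TTS.ai API. It only builds the string and escapes XML special characters so that user text cannot break the markup.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_prosody(text: str, rate: str = "slow") -> str:
    """Wrap plain text in <speak><prosody rate=...> markup.

    Hypothetical helper: escapes &, < and > in the text and quotes the
    attribute value so the result stays well-formed XML.
    """
    return f"<speak><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(wrap_prosody("Slow speech"))
```

Escaping matters mostly for text that contains `&` or `<`, which would otherwise make the SSML invalid.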

Emotion / Style Tags

Tags the selected model understands. Click one to insert it into your text at the cursor position:

Pronunciation

Define custom pronunciations (word = pronunciation):
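As an illustration of the `word = pronunciation` format, here is a minimal sketch that parses such lines and applies them as plain text substitutions before synthesis. Both function names are hypothetical, and how the service itself applies these entries is not documented here; whole-word, case-insensitive replacement is an assumption.

```python
import re


def parse_pronunciations(spec: str) -> dict[str, str]:
    """Parse 'word = pronunciation' lines, skipping blank or malformed lines."""
    entries = {}
    for line in spec.splitlines():
        word, sep, pron = line.partition("=")
        if sep and word.strip() and pron.strip():
            entries[word.strip()] = pron.strip()
    return entries


def apply_pronunciations(text: str, entries: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their respelling (assumed behavior)."""
    if not entries:
        return text
    # Longest words first so longer entries win over shorter prefixes.
    words = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {w.lower(): p for w, p in entries.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


if __name__ == "__main__":
    rules = parse_pronunciations("SQL = sequel\nnginx = engine x")
    print(apply_pronunciations("Run sql on Nginx.", rules))
```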

Dia Dialog Format: use [S1] and [S2]:

[S1] Hello there! [S2] Hi, how are you?
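A two-speaker script in this format can be assembled from a list of turns. This is a small sketch under the assumption that speakers simply alternate, starting with `[S1]`; the helper name `format_dialog` is hypothetical and not part of any documented API.

```python
def format_dialog(turns: list[str]) -> str:
    """Prefix alternating turns with [S1] and [S2], starting with [S1]."""
    tags = ("[S1]", "[S2]")
    return " ".join(f"{tags[i % 2]} {turn.strip()}" for i, turn in enumerate(turns))


if __name__ == "__main__":
    print(format_dialog(["Hello there!", "Hi, how are you?"]))
```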



                

                
                
                    
                    
        
What is Spark TTS?

Spark TTS by SparkAudio merges voice cloning with controllable delivery in a single prompt-driven system. Using just five seconds of reference audio, it clones a voice, then lets you steer emotion, speed, and speaking style while keeping that cloned identity intact. Under the hood it combines a BiCodec audio tokenizer, an LLM, and flow matching, and it supports English and Chinese. It is aimed at content creation where a single cloned voice needs to express a range of moods and pacing. Note the licensing split: Spark's code is Apache 2.0, but the model weights are released under CC BY-NC-SA 4.0, which restricts commercial use.

Best for: Content creation with cloned voices and emotional control
                
Model details

Developer: SparkAudio
License: CC BY-NC-SA 4.0
Tier: standard
Speed: medium
Voice cloning: Yes
Languages: English, Chinese
Max characters: 1000
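
Text longer than the listed maximum has to be split before synthesis. The sketch below assumes the 1000-character figure applies per request and packs whole sentences greedily into chunks; the function name `split_for_tts` is hypothetical, and the sentence rule only handles `.`, `!`, and `?` followed by whitespace, so Chinese punctuation would need its own rule.

```python
import re

MAX_CHARS = 1000  # assumed per-request limit, taken from "Max characters" above


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in filter(None, sentences):
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    parts = split_for_tts("First sentence. " * 100)
    print(len(parts), [len(p) for p in parts])
```

Each chunk can then be synthesized separately and the resulting audio concatenated. Splitting on sentence boundaries keeps pauses natural at the joins.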
                    
                
            
        
    

    
    
Spark TTS voices

Chinese Default: Chinese, Neutral style
Default: English, Neutral style
    
Spark TTS FAQ

How does Spark TTS control emotion and style?
It uses a prompt-based control system layered on top of voice cloning, so you can adjust emotion, speed, and speaking style while preserving the identity of the cloned voice.

How much audio does Spark TTS need to clone a voice?
About five seconds of reference audio is enough to clone a voice in English or Chinese.

Can I use Spark TTS commercially?
Its model weights are licensed CC BY-NC-SA 4.0, which prohibits commercial use, even though the project code is Apache 2.0. Choose a permissively licensed model for commercial work.