AI Lip Sync Video Generator

Upload a face image and an audio clip, and you get back a talking-head video with lip sync, head pose, and blinks. Powered by SadTalker (MIT).

Upload Face + Audio

1,000 characters per second

Drag & drop your file here, or browse

JPG, PNG, or short MP4/WebM. Max 10MB. One clear, well-lit face works best.

Drag & drop your file here, or browse

MP3, WAV, M4A, or FLAC. Max 10MB. Free: up to 30 sec. Pro: up to 5 min.

Processing...

Rendering your video. This usually takes 30 seconds to 2 minutes.

Your talking-head video

Download

About SadTalker

SadTalker (CVPR 2023, Tencent ARC) is an open-source talking-head model that animates a single face image to speak any audio. Unlike Wav2Lip-style models, SadTalker also animates head motion, blinks, and expression for much more natural results.

Both the code and the weights are MIT-licensed end to end (no Llama, Gemma, or non-commercial backbone), so the videos you generate are safe for commercial use.

Tips for Best Results

  • Pick a sharp, well-lit portrait: eyes visible, mouth unobstructed
  • A front-facing portrait in square or 4:5 aspect ratio works best
  • Clean speech audio (not music) gives the tightest lip sync
  • Enable GFPGAN for hero shots: it increases render time but sharpens detail
  • Use the Still preset when you want a steady avatar shot

Lip Sync Video Plans

Start free, upgrade when you need more

Free
  • 30-second audio limit
  • 256 px output
  • "Still" preset only
  • No face enhancer
Most popular
Free Account
  • 30-second audio limit
  • Both "Full" and "Still" presets
  • 256 / 512 px output
  • GFPGAN face enhancer
Sign up free
Pro
  • 5-minute audio limit
  • Priority GPU
  • API access (multipart upload)
  • Webhook completion callbacks
  • Commercial use (MIT license)
Upgrade

Frequently Asked Questions

Upload a face image and an audio clip, and the AI generates a video of that face speaking the audio with matching lip movement, head pose, and blinks. It is built on SadTalker (CVPR 2023), an MIT-licensed talking-head model that animates expression in addition to mouth shapes.

The face input can be a JPG or PNG (up to 10 MB) or a short MP4 / WebM driving video (we use the first frame). The driving audio can be MP3, WAV, M4A, or FLAC, up to 10 MB.
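As an illustration of these limits, a client could check a file before uploading it. This is a minimal sketch, not the service's own validation: the function name `check_upload` is hypothetical, reading "10 MB" as 10 MiB is an assumption, and accepting `.jpeg` alongside `.jpg` is also an assumption.

```python
from pathlib import Path

# Assumption: "10 MB" is read as 10 MiB; the service may count bytes differently.
MAX_BYTES = 10 * 1024 * 1024

# Formats listed on this page (".jpeg" is an assumed alias for ".jpg").
ALLOWED_EXTENSIONS = {
    "face": {".jpg", ".jpeg", ".png", ".mp4", ".webm"},
    "audio": {".mp3", ".wav", ".m4a", ".flac"},
}


def check_upload(filename: str, size_bytes: int, kind: str) -> list[str]:
    """Return a list of problems for one upload; an empty list means it looks fine."""
    if kind not in ALLOWED_EXTENSIONS:
        raise ValueError(f"kind must be 'face' or 'audio', got {kind!r}")

    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS[kind]:
        problems.append(f"unsupported {kind} format: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For example, `check_upload("portrait.png", 2_000_000, "face")` returns an empty list, while an 11 MiB `.ogg` file checked as audio reports both a format and a size problem.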

Free accounts: up to 30 seconds per clip. Paid users: up to 5 minutes per request. Longer videos mean longer render times and a proportionally higher character cost.

Lip sync video costs 1,000 characters per second of generated video. A 30-second clip = 30,000 characters. The cost is deducted up front from your character balance and refunded automatically if generation fails.
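The pricing arithmetic can be sketched in a few lines. The rate of 1,000 characters per second comes from this page; the function names are hypothetical, and rounding a partial second up to the next whole second is an assumption, since the page does not say how fractions are billed.

```python
import math

CHARACTERS_PER_SECOND = 1_000  # rate stated on this page


def lipsync_cost(duration_seconds: float) -> int:
    """Characters charged for a clip of the given length.

    Assumption: partial seconds are rounded up to a whole second.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return math.ceil(duration_seconds) * CHARACTERS_PER_SECOND


def seconds_affordable(balance_characters: int) -> int:
    """Whole seconds of video that a character balance can pay for."""
    return balance_characters // CHARACTERS_PER_SECOND
```

Under these assumptions, `lipsync_cost(30)` gives 30,000 characters, matching the example in the text, and a balance of 15,000 characters covers 15 seconds of video.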

Yes. SadTalker is MIT-licensed software (no Llama, Gemma, or non-commercial backbone). The videos you generate are yours to use in your commercial work. You must have the rights to use the likenesses in the images and videos you upload.

About 30 seconds for a 5-second clip on our A100 server, scaling roughly linearly with audio length. Enabling the GFPGAN face enhancer roughly doubles render time but produces sharper, higher-quality output.
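A rough planning estimate can be derived from those two figures. The sketch below assumes strictly linear scaling (30 seconds of rendering per 5 seconds of audio, so about 6x) and a flat 2x factor for GFPGAN. The page only says "roughly", so treat the result as a ballpark; queueing time is not included, and the function name is hypothetical.

```python
# Derived from the figures above: about 30 s of rendering for a 5 s clip.
RENDER_SECONDS_PER_AUDIO_SECOND = 30 / 5
GFPGAN_SLOWDOWN = 2.0  # "roughly doubles render time"


def estimate_render_seconds(audio_seconds: float, gfpgan: bool = False) -> float:
    """Ballpark render time, assuming linear scaling with audio length."""
    estimate = audio_seconds * RENDER_SECONDS_PER_AUDIO_SECOND
    if gfpgan:
        estimate *= GFPGAN_SLOWDOWN
    return estimate
```

With these assumptions, a 5-second clip is estimated at 30 seconds, or 60 seconds with the face enhancer enabled.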

The Full preset (default) animates head pose, eye blinks, and facial expression, producing the most lifelike talking video. The Still preset locks the head in place and animates only the mouth, which is useful when you want a steady avatar shot.

GFPGAN is a face-restoration model that cleans up facial artifacts after lip-sync rendering. It removes rendering inconsistencies and makes 256-pixel output look comparable to 512. It increases render time, but it is worth it for professional footage.

SadTalker renders frames at 256 px by default. Switch to 512 px for sharper frames (slower, more VRAM), or enable the GFPGAN enhancer to sharpen facial detail. For best results, upload sharp, well-lit images.

Yes. Upload an MP4 or WebM as the face input and we will use the first frame as the driving reference. For full video re-dubbing (changing the lips across every frame), see the Video Dubbing Studio.

Yes. POST a multipart request to /api/v1/lipsync/ with face and audio fields, then poll /api/v1/lipsync/result/?uuid= until the status is "completed". The response payload contains the URL of the generated MP4. API access requires a paid plan.
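The submit-then-poll flow might look like the sketch below, written with the Python standard library only. Only the two endpoint paths, the `face` and `audio` field names, and the `uuid` query parameter come from this page. Everything else is an assumption made for illustration: the `BASE_URL` host, the Bearer-token header, the `uuid` and `status` keys in the JSON responses, the exact "completed" string, and the "failed" status. Check the real API documentation before relying on any of them.

```python
import json
import time
import uuid
from urllib import parse, request

BASE_URL = "https://example.com"  # placeholder host (assumption)


def encode_multipart(files):
    """Encode {field: (filename, data_bytes, content_type)} as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for field, (filename, data, content_type) in files.items():
        head = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode()
        chunks.append(head + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode())
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def http_json(req):
    """Send a urllib Request and decode the JSON response body."""
    with request.urlopen(req) as resp:
        return json.load(resp)


def submit(face, audio, token, send=http_json):
    """POST the face and audio parts; returns the job uuid (assumed response key)."""
    body, content_type = encode_multipart({"face": face, "audio": audio})
    req = request.Request(
        BASE_URL + "/api/v1/lipsync/",
        data=body,
        method="POST",
        # Bearer auth is an assumption; the page does not describe authentication.
        headers={"Content-Type": content_type, "Authorization": f"Bearer {token}"},
    )
    return send(req)["uuid"]


def wait_for_result(job_uuid, token, send=http_json,
                    interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll the result endpoint until the job reports completion."""
    url = BASE_URL + "/api/v1/lipsync/result/?" + parse.urlencode({"uuid": job_uuid})
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = request.Request(url, headers={"Authorization": f"Bearer {token}"})
        payload = send(req)
        status = payload.get("status")  # assumed key
        if status == "completed":
            return payload
        if status == "failed":  # assumed failure status
            raise RuntimeError(f"lip sync job {job_uuid} failed")
        sleep(interval)
    raise TimeoutError(f"lip sync job {job_uuid} did not finish in {timeout} s")
```

A caller would pass each file as a `(filename, bytes, content_type)` tuple, for example `submit(("me.png", png_bytes, "image/png"), ("hi.mp3", mp3_bytes, "audio/mpeg"), token)`, and then hand the returned uuid to `wait_for_result`. The `send` and `sleep` parameters exist so the flow can be exercised without a network connection.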

SadTalker uses face-alignment to detect and crop the most prominent face. For best results, upload a portrait with one front-facing person, clearly visible eyes, and minimal occlusion. Group photos may produce unexpected results.

Ready to get started?

Sign up free and receive 15,000 characters. No credit card required.