AI Lip Sync Video Generator

Upload a face photo and an audio clip to get a talking-head video with realistic lip sync, head pose, and blinks. Powered by SadTalker (MIT). Commercial use is allowed.


Upload image + audio

1,000 credits per second

Drag and drop your file here, or browse

JPG, PNG, or short MP4/WebM. Max 10MB. One clear, well-lit face works best.


MP3, WAV, M4A, or FLAC. Max 10MB. Free: up to 30 sec. Pro: up to 5 min.


Processing...

Rendering your video. This usually takes 30 seconds to 2 minutes.

Your Talking-Head Video


About SadTalker

SadTalker (CVPR 2023, Tencent ARC) is an open-source model that animates a single face image so that it speaks a given audio track. Unlike Wav2Lip-style approaches, SadTalker also animates head pose, blinks, and expression for a more natural result.

The code and weights are MIT-licensed end to end, with no Llama, Gemma, or non-commercial components, so the videos you generate are safe for commercial use.

Tips for best results

  • Use a sharp, well-lit, front-facing photo with the eyes visible and the mouth closed.
  • A centered face in a square or 4:5 aspect ratio works best.
  • Clean speech audio (no music) gives the most accurate lip sync.
  • Enable GFPGAN for hero shots: it doubles render time but sharpens detail.
  • Use the Still preset when you want a steady avatar shot.

Video Plans

Start free, upgrade when you need more

Free
  • 30-second audio limit
  • 256 px output
  • Still preset only
  • No face enhancement

Credits (most popular)
  • 30-second audio limit
  • Default and Still presets
  • 256 / 512 px output
  • GFPGAN face enhancement

Pro
  • 5-minute audio limit
  • Priority GPU
  • API access (multipart upload)
  • Webhook completion callbacks
  • Commercial use (MIT license)

Frequently asked questions

Upload a face photo and an audio recording, and the AI generates a video of that face speaking the audio, with lip movements, head pose, and blinks. It is built on SadTalker (CVPR 2023), an MIT-licensed talking-head model that animates expression in addition to mouth shapes.

The face input can be a JPG or PNG image (up to 10 MB) or a short MP4/WebM clip (we use the first frame). The audio can be MP3, WAV, M4A, or FLAC, up to 10 MB. We resample the audio to 16 kHz internally.
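The limits above can be checked on the client before uploading. The following is a minimal sketch, not the service's own validation: the function name `check_upload`, the inclusion of `.jpeg` alongside `.jpg`, and the reading of "10 MB" as 10 × 1024 × 1024 bytes are all assumptions.

```python
from pathlib import Path

# Assumption: "10 MB" is read as 10 MiB; the service may count differently.
MAX_BYTES = 10 * 1024 * 1024
# Formats listed in the text; ".jpeg" is added as an assumed alias of ".jpg".
FACE_EXTS = {".jpg", ".jpeg", ".png", ".mp4", ".webm"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac"}


def check_upload(filename: str, size_bytes: int, kind: str) -> list[str]:
    """Return a list of problems with an upload; an empty list means it looks OK.

    `kind` is "face" or "audio" (hypothetical labels used only in this sketch).
    """
    if kind not in ("face", "audio"):
        raise ValueError("kind must be 'face' or 'audio'")
    allowed = FACE_EXTS if kind == "face" else AUDIO_EXTS
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in allowed:
        problems.append(f"unsupported {kind} format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")
    return problems
```

A check like this only looks at the file name and size. It does not confirm that the file decodes or that the image contains a face.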

Free accounts: up to 30 seconds per clip. Paid subscribers: up to 5 minutes per request. Longer audio means a longer render time and a higher credit cost.

Lip sync video costs 1,000 credits per second of output. A 30-second clip = 30,000 credits. The cost is reserved up front from your balance and refunded if the render fails.
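The pricing rule and the reserve-then-refund behaviour can be sketched as follows. This is an illustration under stated assumptions, not the billing code: rounding partial seconds up to the next whole second is a guess, and `credits_for` and `render_with_reservation` are hypothetical names.

```python
import math

CREDITS_PER_SECOND = 1_000  # rate quoted in the text


def credits_for(duration_seconds: float) -> int:
    """Credit cost of a clip. Rounding up to whole seconds is an assumption."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return math.ceil(duration_seconds) * CREDITS_PER_SECOND


def render_with_reservation(balance: int, duration_seconds: float, render) -> int:
    """Reserve the cost up front, run `render`, and refund if it raises.

    Returns the balance after the attempt. `render` is any zero-argument
    callable standing in for the real render job.
    """
    cost = credits_for(duration_seconds)
    if cost > balance:
        raise ValueError("insufficient credits")
    balance -= cost  # reserved before rendering starts
    try:
        render()
    except Exception:
        balance += cost  # failed render: the reservation is returned
    return balance
```

For example, `credits_for(30)` gives 30,000, which matches the 30-second figure quoted above.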

Commercial use is allowed. SadTalker's code and weights are MIT-licensed end to end (no Llama, Gemma, or non-commercial components), so the videos you generate are yours to use commercially. You are responsible for holding the rights to the source face image and to the audio you animate.

About 30 seconds for a 5-second clip on our A100 system, scaling with the length of the audio. Enabling GFPGAN face enhancement doubles render time but produces sharper, more detailed output.
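As a rough back-of-the-envelope aid, the figures above can be turned into an estimate. The sketch assumes render time grows linearly with audio length (the text only says it scales) and that GFPGAN exactly doubles it; `estimate_render_seconds` is a hypothetical helper, and real times depend on queue load and resolution.

```python
# Derived from "about 30 seconds for a 5-second clip"; linear scaling is an assumption.
RENDER_SECONDS_PER_AUDIO_SECOND = 30 / 5


def estimate_render_seconds(audio_seconds: float, gfpgan: bool = False) -> float:
    """Rough render-time estimate in seconds for a clip of the given length."""
    if audio_seconds <= 0:
        raise ValueError("audio length must be positive")
    estimate = audio_seconds * RENDER_SECONDS_PER_AUDIO_SECOND
    # The text says GFPGAN enhancement doubles render time.
    return estimate * 2 if gfpgan else estimate
```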

The Default (full) preset animates head pose, blinks, and expression as well as the lips, producing a more natural video. The Still preset keeps the head locked in place and animates only the mouth, which is useful when you want a steady avatar shot.

GFPGAN is a face restoration model that sharpens facial detail after rendering. It improves 256-pixel output so that it looks closer to 512. It doubles render time, but is worth it for hero shots.

SadTalker renders at 256 px by default. Switch to the 512 px size for sharper output (slower, higher VRAM), or enable GFPGAN enhancement to restore facial detail. For best results, upload a high-resolution source photo.

Video input is supported. Upload an MP4 or WebM as the face input and we use the first frame as the source identity. For full video-to-video lip sync (re-syncing the mouth in existing footage), see the upcoming video-to-video pipeline in Dubbing Studio.

API access is available. POST a multipart request to /api/v1/lipsync/ with the face and audio fields, then poll /api/v1/lipsync/result/?uuid= until the status shows that the job has finished. The response contains a URL to the rendered MP4. API access requires a paid plan.
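The polling half of that flow might look like the sketch below. Only the result path comes from the text. The host `https://example.com`, the JSON keys `status` and `url`, and the finished-status value (passed in as `done_status`) are assumptions to be replaced with the values in the real API documentation. Authentication and failure statuses are left out.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://example.com"  # hypothetical host; substitute the real one
RESULT_PATH = "/api/v1/lipsync/result/"  # path quoted in the text


def result_url(uuid: str) -> str:
    """Build the polling URL for a job UUID."""
    return BASE_URL + RESULT_PATH + "?" + urllib.parse.urlencode({"uuid": uuid})


def fetch_json(url: str) -> dict:
    """Default fetcher: GET the URL and decode a JSON body (assumed format)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_for_video(uuid, done_status, fetch=fetch_json,
                   interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job status equals `done_status`, then return the MP4 URL.

    The "status" and "url" keys are assumptions about the response shape.
    Raises TimeoutError if the job is not finished within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(result_url(uuid))
        if body.get("status") == done_status:
            return body["url"]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {uuid} did not finish within {timeout} s")
        sleep(interval)
```

Passing `fetch` and `sleep` as parameters keeps the loop testable without network access. A production client would also stop early on an error status rather than waiting for the timeout.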

SadTalker uses face detection to find and crop the most prominent face. For best results, upload a photo with one person centered, eyes visible, and minimal occlusion. Group photos may produce unpredictable results.

Ready to get started?

Sign up free and get 50 credits. No credit card required.