[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-71067":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":31,"readmeContent":32,"aiSummary":33,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":34,"discoverSource":35},71067,"video-retalking","OpenTalker\u002Fvideo-retalking","OpenTalker","[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild","https:\u002F\u002Fopentalker.github.io\u002Fvideo-retalking\u002F",null,"Python",7252,1061,72,201,0,2,4,9,6,40.08,"Apache License 2.0",false,"main",true,[27,28,29,30],"lip-synchronization","siggraph-asia-2022","talking-head-videos","video-editing","2026-06-12 02:02:47","\u003Cdiv align=\"center\">\n\n\u003Ch2>VideoReTalking \u003Cbr\u002F> \u003Cspan style=\"font-size:12px\">Audio-based Lip Synchronization for Talking Head Video Editing in the Wild\u003C\u002Fspan> \u003C\u002Fh2> \n\n  \u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2211.14758'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArXiv-2211.14758-red'>\u003C\u002Fa> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\u003Ca href='https:\u002F\u002Fvinthony.github.io\u002Fvideo-retalking\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green'>\u003C\u002Fa> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[![Open In 
Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Fvinthony\u002Fvideo-retalking\u002Fblob\u002Fmain\u002Fquick_demo.ipynb)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\n[![Replicate](https:\u002F\u002Freplicate.com\u002Fcjwbw\u002Fvideo-retalking\u002Fbadge)](https:\u002F\u002Freplicate.com\u002Fcjwbw\u002Fvideo-retalking)\n\n\u003Cdiv>\n    \u003Ca target='_blank'>Kun Cheng \u003Csup>*,1,2\u003C\u002Fsup> \u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fvinthony.github.io\u002F' target='_blank'>Xiaodong Cun \u003Csup>*,2\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fyzhang2016.github.io\u002Fyongnorriszhang.github.io\u002F' target='_blank'>Yong Zhang \u003Csup>2\u003C\u002Fsup>\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fmenghanxia.github.io\u002F' target='_blank'>Menghan Xia \u003Csup>2\u003C\u002Fsup>\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Ffeiiyin.github.io\u002F' target='_blank'>Fei Yin \u003Csup>2,3\u003C\u002Fsup>\u003C\u002Fa>&emsp;\u003Cbr\u002F>\n    \u003Ca href='https:\u002F\u002Fweb.xidian.edu.cn\u002Fmrzhu\u002Fen\u002Findex.html' target='_blank'>Mingrui Zhu \u003Csup>1\u003C\u002Fsup>\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fxuanwangvc.github.io\u002F' target='_blank'>Xuan Wang \u003Csup>2\u003C\u002Fsup>\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fjuewang725.github.io\u002F' target='_blank'>Jue Wang \u003Csup>2\u003C\u002Fsup>\u003C\u002Fa>&emsp;\n    \u003Ca href='https:\u002F\u002Fweb.xidian.edu.cn\u002Fnnwang\u002Fen\u002Findex.html' target='_blank'>Nannan Wang \u003Csup>1\u003C\u002Fsup>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\u003Cbr>\n\u003Cdiv>\n    \u003Csup>1\u003C\u002Fsup> Xidian University &emsp; \u003Csup>2\u003C\u002Fsup> Tencent AI Lab &emsp; \u003Csup>3\u003C\u002Fsup> Tsinghua University\n\u003C\u002Fdiv>\n\u003Cbr>\n\u003Ci>\u003Cstrong>\u003Ca 
href='https:\u002F\u002Fsa2022.siggraph.org\u002F' target='_blank'>SIGGRAPH Asia 2022 Conference Track\u003C\u002Fa>\u003C\u002Fstrong>\u003C\u002Fi>\n\u003Cbr>\n\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fopentalker.github.io\u002Fvideo-retalking\u002Fstatic\u002Fimages\u002Fteaser.png\" width=\"768px\">\n\n\n\u003Cdiv align=\"justify\">  \u003CBR> We present VideoReTalking, a new system to edit the faces of a real-world talking head video according to input audio, producing a high-quality and lip-syncing output video even with a different emotion. Our system disentangles this objective into three sequential tasks:\n  \n \u003CBR> (1) face video generation with a canonical expression\n\u003CBR> (2) audio-driven lip-sync and \n  \u003CBR> (3) face enhancement for improving photo-realism. \n  \n \u003CBR>  Given a talking-head video, we first modify the expression of each frame according to the same expression template using the expression editing network, resulting in a video with the canonical expression. This video, together with the given audio, is then fed into the lip-sync network to generate a lip-syncing video. Finally, we improve the photo-realism of the synthesized faces through an identity-aware face enhancement network and post-processing. 
We use learning-based approaches for all three steps and all our modules can be tackled in a sequential pipeline without any user intervention.\u003C\u002Fdiv>\n\u003CBR>\n\n\u003Cp>\n\u003Cimg alt='pipeline' src=\".\u002Fdocs\u002Fstatic\u002Fimages\u002Fpipeline.png?raw=true\" width=\"768px\">\u003Cbr>\n\u003Cem align='center'>Pipeline\u003C\u002Fem>\n\u003C\u002Fp>\n\n\u003C\u002Fdiv>\n\n## Results in the Wild （contains audio）\nhttps:\u002F\u002Fuser-images.githubusercontent.com\u002F4397546\u002F224310754-665eb2dd-aadc-47dc-b1f9-2029a937b20a.mp4\n\n\n\n\n## Environment\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fvinthony\u002Fvideo-retalking.git\ncd video-retalking\nconda create -n video_retalking python=3.8\nconda activate video_retalking\n\nconda install ffmpeg\n\n# Please follow the instructions from https:\u002F\u002Fpytorch.org\u002Fget-started\u002Fprevious-versions\u002F\n# This installation command only works on CUDA 11.1\npip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Ftorch_stable.html\n\npip install -r requirements.txt\n```\n\n## Quick Inference\n\n#### Pretrained Models\nPlease download our [pre-trained models](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F18rhjMpxK8LVVxf7PI6XwOidt8Vouv_H0?usp=share_link) and put them in `.\u002Fcheckpoints`.\n\n\u003C!-- We also provide some [example videos and audio](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F14OwbNGDCAMPPdY-l_xO1axpUjkPxI9Dv?usp=share_link). Please put them in `.\u002Fexamples`. -->\n\n#### Inference\n\n```\npython3 inference.py \\\n  --face examples\u002Fface\u002F1.mp4 \\\n  --audio examples\u002Faudio\u002F1.wav \\\n  --outfile results\u002F1_1.mp4\n```\nThis script includes data preprocessing steps. You can test any talking face videos without manual alignment. 
But it is worth noting that DNet cannot handle extreme poses.\n\nYou can also control the expression by adding the following parameters:\n\n```--exp_img```: Pre-defined expression template. The default is \"neutral\". You can choose \"smile\" or an image path.\n\n```--up_face```: You can choose \"surprise\" or \"angry\" to modify the expression of upper face with [GANimation](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fganimation_replicate).\n\n\n\n## Citation\n\nIf you find our work useful in your research, please consider citing:\n\n```\n@misc{cheng2022videoretalking,\n        title={VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild}, \n        author={Kun Cheng and Xiaodong Cun and Yong Zhang and Menghan Xia and Fei Yin and Mingrui Zhu and Xuan Wang and Jue Wang and Nannan Wang},\n        year={2022},\n        eprint={2211.14758},\n        archivePrefix={arXiv},\n        primaryClass={cs.CV}\n  }\n```\n\n## Acknowledgement\nThanks to\n[Wav2Lip](https:\u002F\u002Fgithub.com\u002FRudrabha\u002FWav2Lip),\n[PIRenderer](https:\u002F\u002Fgithub.com\u002FRenYurui\u002FPIRender), \n[GFP-GAN](https:\u002F\u002Fgithub.com\u002FTencentARC\u002FGFPGAN), \n[GPEN](https:\u002F\u002Fgithub.com\u002Fyangxy\u002FGPEN),\n[ganimation_replicate](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fganimation_replicate),\n[STIT](https:\u002F\u002Fgithub.com\u002Frotemtzaban\u002FSTIT)\nfor sharing their code.\n\n\n## Related Work\n- [StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN (ECCV 2022)](https:\u002F\u002Fgithub.com\u002FFeiiYin\u002FStyleHEAT)\n- [CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior (CVPR 2023)](https:\u002F\u002Fgithub.com\u002FDoubiiu\u002FCodeTalker)\n- [SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation (CVPR 2023)](https:\u002F\u002Fgithub.com\u002FWinfredy\u002FSadTalker)\n- [DPE: 
Disentanglement of Pose and Expression for General Video Portrait Editing (CVPR 2023)](https:\u002F\u002Fgithub.com\u002FCarlyx\u002FDPE)\n- [3D GAN Inversion with Facial Symmetry Prior (CVPR 2023)](https:\u002F\u002Fgithub.com\u002FFeiiYin\u002FSPI\u002F)\n- [T2M-GPT: Generating Human Motion from Textual Descriptions with Discrete Representations (CVPR 2023)](https:\u002F\u002Fgithub.com\u002FMael-zys\u002FT2M-GPT)\n\n##  Disclaimer\n\nThis is not an official product of Tencent. \n\n```\n1. Please carefully read and comply with the open-source license applicable to this code before using it. \n2. Please carefully read and comply with the intellectual property declaration applicable to this code before using it.\n3. This open-source code runs completely offline and does not collect any personal information or other data. If you use this code to provide services to end-users and collect related data, please take necessary compliance measures according to applicable laws and regulations (such as publishing privacy policies, adopting necessary data security strategies, etc.). If the collected data involves personal information, user consent must be obtained (if applicable). Any legal liabilities arising from this are unrelated to Tencent.\n4. Without Tencent's written permission, you are not authorized to use the names or logos legally owned by Tencent, such as \"Tencent.\" Otherwise, you may be liable for your legal responsibilities.\n5. This open-source code does not have the ability to directly provide services to end-users. If you need to use this code for further model training or demos, as part of your product to provide services to end-users, or for similar use, please comply with applicable laws and regulations for your product or service. Any legal liabilities arising from this are unrelated to Tencent.\n6. 
It is prohibited to use this open-source code for activities that harm the legitimate rights and interests of others (including but not limited to fraud, deception, infringement of others' portrait rights, reputation rights, etc.), or other behaviors that violate applicable laws and regulations or go against social ethics and good customs (including providing incorrect or false information, spreading pornographic, terrorist, and violent information, etc.). Otherwise, you may be liable for your legal responsibilities.\n\n```\n## All Thanks To Our Contributors \n\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FOpenTalker\u002Fvideo-retalking\u002Fgraphs\u002Fcontributors\">\n  \u003Cimg src=\"https:\u002F\u002Fcontrib.rocks\u002Fimage?repo=OpenTalker\u002Fvideo-retalking\" \u002F>\n\u003C\u002Fa>\n","VideoReTalking is an audio-based lip-synchronization system for editing real-world talking-head videos. It achieves high-quality lip sync through three sequential tasks: first generating a face video with a canonical expression, then driving lip sync from the input audio, and finally improving the realism of the synthesized faces with an identity-aware face enhancement network. The project is written in Python and can be applied in scenarios such as film and television post-production and virtual hosts, where video of a speaking person needs to be edited and synchronized. The open-source community is active, and the project is released under the Apache License 2.0.","2026-06-11 03:35:43","high_star"]