[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72230":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":29,"readmeContent":30,"aiSummary":31,"trendingCount":16,"starSnapshotCount":16,"syncStatus":32,"lastSyncTime":33,"discoverSource":34},72230,"VACE","ali-vilab\u002FVACE","ali-vilab","[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing","https:\u002F\u002Fali-vilab.github.io\u002FVACE-Page\u002F",null,"Python",3809,266,83,54,0,11,17,48,33,29.28,"Apache License 2.0",false,"main",true,[27,28],"video-editing","video-generation","2026-06-12 02:03:00","\u003Cp align=\"center\">\n\n\u003Ch1 align=\"center\">VACE: All-in-One Video Creation and Editing\u003C\u002Fh1>\n\u003Ch3 align=\"center\">(ICCV 2025)\u003C\u002Fh3>\n\u003Cp align=\"center\">\n    \u003Cstrong>Zeyinzi Jiang\u003Csup>*\u003C\u002Fsup>\u003C\u002Fstrong>\n    ·\n    \u003Cstrong>Zhen Han\u003Csup>*\u003C\u002Fsup>\u003C\u002Fstrong>\n    ·\n    \u003Cstrong>Chaojie Mao\u003Csup>*&dagger;\u003C\u002Fsup>\u003C\u002Fstrong>\n    ·\n    \u003Cstrong>Jingfeng Zhang\u003C\u002Fstrong>\n    ·\n    \u003Cstrong>Yulin Pan\u003C\u002Fstrong>\n    ·\n    \u003Cstrong>Yu Liu\u003C\u002Fstrong>\n    \u003Cbr>\n    \u003Cb>Tongyi Lab - \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1\">\u003Cimg src='https:\u002F\u002Fali-vilab.github.io\u002FVACE-Page\u002Fassets\u002Flogos\u002Fwan_logo.png' alt='wan_logo' style='margin-bottom: -4px; height: 20px;'>\u003C\u002Fa> 
\u003C\u002Fb>\n    \u003Cbr>\n    \u003Cbr>\n        \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.07598\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVACE-arXiv-red' alt='Paper PDF'>\u003C\u002Fa>\n        \u003Ca href=\"https:\u002F\u002Fali-vilab.github.io\u002FVACE-Page\u002F\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVACE-Project_Page-green' alt='Project Page'>\u003C\u002Fa>\n        \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fali-vilab\u002Fvace-67eca186ff3e3564726aff38\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVACE-HuggingFace_Model-yellow'>\u003C\u002Fa>\n        \u003Ca href=\"https:\u002F\u002Fmodelscope.cn\u002Fcollections\u002FVACE-8fa5fcfd386e43\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVACE-ModelScope_Model-purple'>\u003C\u002Fa>\n    \u003Cbr>\n\u003C\u002Fp>\n\n\n## Introduction\n\u003Cstrong>VACE\u003C\u002Fstrong> is an all-in-one model designed for video creation and editing. It encompasses various tasks, including reference-to-video generation (\u003Cstrong>R2V\u003C\u002Fstrong>), video-to-video editing (\u003Cstrong>V2V\u003C\u002Fstrong>), and masked video-to-video editing (\u003Cstrong>MV2V\u003C\u002Fstrong>), allowing users to compose these tasks freely. This functionality enables users to explore diverse possibilities and streamlines their workflows effectively, offering a range of capabilities, such as Move-Anything, Swap-Anything, Reference-Anything, Expand-Anything, Animate-Anything, and more.\n\n\u003Cimg src='.\u002Fassets\u002Fmaterials\u002Fteaser.jpg'>\n\n\n## 🎉 News\n- [x] Oct 17, 2025: [VACE-Benchmark](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fali-vilab\u002FVACE-Benchmark) has been updated to incorporate the evaluation data. 
[VACE-Page](https:\u002F\u002Fali-vilab.github.io\u002FVACE-Page\u002F) also features creative community cases, offering researchers and community members better project insight and tracking.\n- [x] Jun 26, 2025: [VACE](https:\u002F\u002Fopenaccess.thecvf.com\u002Fcontent\u002FICCV2025\u002Fhtml\u002FJiang_VACE_All-in-One_Video_Creation_and_Editing_ICCV_2025_paper.html) is accepted by ICCV 2025.\n- [x] May 14, 2025: 🔥Wan2.1-VACE-1.3B and Wan2.1-VACE-14B models are now available at [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FWan-AI\u002Fwan21-68ac4ba85372ae5a8e282a1b) and [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fcollections\u002Ftongyiwanxiang-Wan21-shipinshengcheng-67ec9b23fd8d4f)!\n- [x] Mar 31, 2025: 🔥VACE-Wan2.1-1.3B-Preview and VACE-LTX-Video-0.9 models are now available at [HuggingFace](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fali-vilab\u002Fvace-67eca186ff3e3564726aff38) and [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fcollections\u002FVACE-8fa5fcfd386e43)!\n- [x] Mar 31, 2025: 🔥Release code of model inference, preprocessing, and gradio demos. 
\n- [x] Mar 11, 2025: We propose [VACE](https:\u002F\u002Fali-vilab.github.io\u002FVACE-Page\u002F), an all-in-one model for video creation and editing.\n\n\n## 🪄 Models\n| Models                   | Download Link                                                                                                                                           | Video Size        | License                                                                                       |\n|--------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------|\n| VACE-Wan2.1-1.3B-Preview | [Huggingface](https:\u002F\u002Fhuggingface.co\u002Fali-vilab\u002FVACE-Wan2.1-1.3B-Preview) 🤗  [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fiic\u002FVACE-Wan2.1-1.3B-Preview) 🤖 | ~ 81 x 480 x 832  | [Apache-2.0](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-T2V-1.3B\u002Fblob\u002Fmain\u002FLICENSE.txt)             |\n| VACE-LTX-Video-0.9       | [Huggingface](https:\u002F\u002Fhuggingface.co\u002Fali-vilab\u002FVACE-LTX-Video-0.9) 🤗     [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fiic\u002FVACE-LTX-Video-0.9) 🤖          | ~ 97 x 512 x 768  | [RAIL-M](https:\u002F\u002Fhuggingface.co\u002FLightricks\u002FLTX-Video\u002Fblob\u002Fmain\u002Fltx-video-2b-v0.9.license.txt) |\n| Wan2.1-VACE-1.3B         | [Huggingface](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-VACE-1.3B) 🤗     [ModelScope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002FWan-AI\u002FWan2.1-VACE-1.3B) 🤖          | ~ 81 x 480 x 832  | [Apache-2.0](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-T2V-1.3B\u002Fblob\u002Fmain\u002FLICENSE.txt)             |\n| Wan2.1-VACE-14B          | 
[Huggingface](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-VACE-14B) 🤗     [ModelScope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002FWan-AI\u002FWan2.1-VACE-14B) 🤖            | ~ 81 x 720 x 1280 | [Apache-2.0](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-T2V-14B\u002Fblob\u002Fmain\u002FLICENSE.txt)             |\n\n- The input supports any resolution, but for best results the video size should stay close to the Video Size listed for each model.\n- Each model inherits the license of its base model.\n\n\n## ⚙️ Installation\nThe codebase was tested with Python 3.10.13, CUDA version 12.4, and PyTorch >= 2.5.1.\n\n### Setup for Model Inference\nTo set up VACE model inference, run:\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fali-vilab\u002FVACE.git && cd VACE\npip install torch==2.5.1 torchvision==0.20.1 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu124  # If PyTorch is not installed.\npip install -r requirements.txt\npip install wan@git+https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1  # If you want to use Wan2.1-based VACE.\npip install ltx-video@git+https:\u002F\u002Fgithub.com\u002FLightricks\u002FLTX-Video@ltx-video-0.9.1 sentencepiece --no-deps # If you want to use LTX-Video-0.9-based VACE. It may conflict with Wan.\n```\nPlease download your preferred base model to `\u003Crepo-root>\u002Fmodels\u002F`. 
\n\n### Setup for Preprocess Tools\nIf you need preprocessing tools, please install:\n```bash\npip install -r requirements\u002Fannotator.txt\n```\nPlease download [VACE-Annotators](https:\u002F\u002Fhuggingface.co\u002Fali-vilab\u002FVACE-Annotators) to `\u003Crepo-root>\u002Fmodels\u002F`.\n\n### Local Directories Setup\nWe recommend downloading [VACE-Benchmark](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fali-vilab\u002FVACE-Benchmark) to `\u003Crepo-root>\u002Fbenchmarks\u002F`, since the examples in `run_vace_xxx.sh` use it.\n\nWe recommend organizing local directories as follows:\n```angular2html\nVACE\n├── ...\n├── benchmarks\n│   └── VACE-Benchmark\n│       └── assets\n│           └── examples\n│               ├── animate_anything\n│               │   └── ...\n│               └── ...\n├── models\n│   ├── VACE-Annotators\n│   │   └── ...\n│   ├── VACE-LTX-Video-0.9\n│   │   └── ...\n│   └── VACE-Wan2.1-1.3B-Preview\n│       └── ...\n└── ...\n```\n\n## 🚀 Usage\nIn VACE, users provide a **text prompt** and, optionally, a **video**, **mask**, and **image** for video generation or editing.\nDetailed instructions for using VACE can be found in the [User Guide](.\u002FUserGuide.md).\n\n### Inference CLI\n#### 1) End-to-End Running\nTo run VACE without diving into any implementation details, we suggest the end-to-end pipeline. For example:\n```bash\n# run V2V depth\npython vace\u002Fvace_pipeline.py --base wan --task depth --video assets\u002Fvideos\u002Ftest.mp4 --prompt 'xxx'\n\n# run MV2V inpainting by providing bbox\npython vace\u002Fvace_pipeline.py --base wan --task inpainting --mode bbox --bbox 50,50,550,700 --video assets\u002Fvideos\u002Ftest.mp4 --prompt 'xxx'\n```\nThis script runs video preprocessing and model inference sequentially, \nso you need to specify all the required args for preprocessing (`--task`, `--mode`, `--bbox`, `--video`, etc.) and for inference (`--prompt`, etc.). 
\nThe output video, together with the intermediate video, mask, and images, will be saved into `.\u002Fresults\u002F` by default.\n\n> 💡**Note**:\n> Please refer to [run_vace_pipeline.sh](.\u002Frun_vace_pipeline.sh) for usage examples of different task pipelines.\n\n\n#### 2) Preprocessing\nFor more flexible control over the input, user inputs need to be preprocessed into `src_video`, `src_mask`, and `src_ref_images` before VACE model inference.\nWe assign each [preprocessor](.\u002Fvace\u002Fconfigs\u002F__init__.py) a task name, so simply call [`vace_preproccess.py`](.\u002Fvace\u002Fvace_preproccess.py) and specify the task name and task params. For example:\n```bash\n# process video depth\npython vace\u002Fvace_preproccess.py --task depth --video assets\u002Fvideos\u002Ftest.mp4\n\n# process video inpainting by providing bbox\npython vace\u002Fvace_preproccess.py --task inpainting --mode bbox --bbox 50,50,550,700 --video assets\u002Fvideos\u002Ftest.mp4\n```\nThe outputs will be saved to `.\u002Fprocessed\u002F` by default.\n\n> 💡**Note**:\n> Please refer to [run_vace_pipeline.sh](.\u002Frun_vace_pipeline.sh) for the preprocessing methods of different tasks.\nMoreover, refer to [vace\u002Fconfigs\u002F](.\u002Fvace\u002Fconfigs\u002F) for all the pre-defined tasks and required params.\nYou can also add custom preprocessors by implementing them in [`annotators`](.\u002Fvace\u002Fannotators\u002F__init__.py) and registering them in [`configs`](.\u002Fvace\u002Fconfigs).\n\n\n#### 3) Model inference\nUsing the input data obtained from **Preprocessing**, the model inference process can be performed as follows:\n```bash\n# For Wan2.1 single GPU inference (1.3B-480P)\npython vace\u002Fvace_wan_inference.py --ckpt_dir \u003Cpath-to-model> --src_video \u003Cpath-to-src-video> --src_mask \u003Cpath-to-src-mask> --src_ref_images \u003Cpaths-to-src-ref-images> --prompt \"xxx\"\n\n# For Wan2.1 Multi GPU Acceleration inference (1.3B-480P)\npip install 
\"xfuser>=0.4.1\"\ntorchrun --nproc_per_node=8 vace\u002Fvace_wan_inference.py --dit_fsdp --t5_fsdp --ulysses_size 1 --ring_size 8 --ckpt_dir \u003Cpath-to-model> --src_video \u003Cpath-to-src-video> --src_mask \u003Cpath-to-src-mask> --src_ref_images \u003Cpaths-to-src-ref-images> --prompt \"xxx\"\n\n# For Wan2.1 Multi GPU Acceleration inference (14B-720P)\ntorchrun --nproc_per_node=8 vace\u002Fvace_wan_inference.py --dit_fsdp --t5_fsdp --ulysses_size 8 --ring_size 1 --size 720p --model_name 'vace-14B' --ckpt_dir \u003Cpath-to-model> --src_video \u003Cpath-to-src-video> --src_mask \u003Cpath-to-src-mask> --src_ref_images \u003Cpaths-to-src-ref-images> --prompt \"xxx\"\n\n# For LTX inference, run\npython vace\u002Fvace_ltx_inference.py --ckpt_path \u003Cpath-to-model> --text_encoder_path \u003Cpath-to-model> --src_video \u003Cpath-to-src-video> --src_mask \u003Cpath-to-src-mask> --src_ref_images \u003Cpaths-to-src-ref-images> --prompt \"xxx\"\n```\nThe output video together with intermediate video, mask and images will be saved into `.\u002Fresults\u002F` by default.\n\n> 💡**Note**: \n> (1) Please refer to [vace\u002Fvace_wan_inference.py](.\u002Fvace\u002Fvace_wan_inference.py) and [vace\u002Fvace_ltx_inference.py](.\u002Fvace\u002Fvace_ltx_inference.py) for the inference args.\n> (2) For LTX-Video and English language Wan2.1 users, you need prompt extension to unlock the full model performance. \nPlease follow the [instruction of Wan2.1](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1?tab=readme-ov-file#2-using-prompt-extension) and set `--use_prompt_extend` while running inference.\n> (3) When performing prompt extension in editing tasks, it's important to pay attention to the results of expanding plain text. 
Since the visual information being input is unknown, this may lead to the extended output not matching the video being edited, which can affect the final outcome.\n\n### Inference Gradio\nFor preprocessors, run \n```bash\npython vace\u002Fgradios\u002Fvace_preprocess_demo.py\n```\nFor model inference, run\n```bash\n# For Wan2.1 gradio inference\npython vace\u002Fgradios\u002Fvace_wan_demo.py\n\n# For LTX gradio inference\npython vace\u002Fgradios\u002Fvace_ltx_demo.py\n```\n\n## Acknowledgement\n\nWe are grateful for the following awesome projects, including [Scepter](https:\u002F\u002Fgithub.com\u002Fmodelscope\u002Fscepter), [Wan](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1), and [LTX-Video](https:\u002F\u002Fgithub.com\u002FLightricks\u002FLTX-Video). Additionally, we extend our deepest gratitude to all community creators. It is their proactive exploration, experimentation, and boundless creativity that have brought immense inspiration to the project, fostering the emergence of even more refined workflows and stunning video generation content based on it. 
This includes, but is not limited to: [Kijai's Workflow](https:\u002F\u002Fgithub.com\u002Fkijai\u002FComfyUI-WanVideoWrapper), native code support for [ComfyUI](https:\u002F\u002Fgithub.com\u002Fcomfyanonymous\u002FComfyUI) and [Diffusers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fdiffusers), crucial model quantization support, a diverse ecosystem of LoRA adapters, and the ever-evolving innovative workflows from our community members.\n\n\n## BibTeX\n\n```bibtex\n@inproceedings{vace,\n    title = {VACE: All-in-One Video Creation and Editing},\n    author = {Jiang, Zeyinzi and Han, Zhen and Mao, Chaojie and Zhang, Jingfeng and Pan, Yulin and Liu, Yu},\n    booktitle = {Proceedings of the IEEE\u002FCVF International Conference on Computer Vision},\n    pages = {17191-17202},\n    year = {2025}\n}\n","VACE is an all-in-one model for video creation and editing that supports tasks including reference-to-video generation, video-to-video editing, and masked video-to-video editing. Its core capabilities include Move-Anything, Swap-Anything, and Reference-Anything, which can be combined flexibly to meet diverse video processing needs. The model is implemented in Python, and the accompanying paper was published at ICCV 2025. It suits scenarios that require efficiently creating videos from scratch or performing complex edits on existing videos, such as creative content production and film and television post-production.",2,"2026-06-11 03:40:57","high_star"]