[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72118":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72118,"IDM-VTON","yisol\u002FIDM-VTON","yisol","[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild","https:\u002F\u002Fidm-vton.github.io\u002F",null,"Python",5055,819,64,145,0,6,27,65,18,39.74,"Other",false,"main",true,[],"2026-06-12 02:02:58","\n\u003Cdiv align=\"center\">\n\u003Ch1>IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild\u003C\u002Fh1>\n\n\u003Ca href='https:\u002F\u002Fidm-vton.github.io'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-green'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.05139'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-Arxiv-red'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fyisol\u002FIDM-VTON'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Demo-yellow'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fyisol\u002FIDM-VTON'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Model-blue'>\u003C\u002Fa>\n\n\n\u003C\u002Fdiv>\n\nThis is the official implementation of the paper [\"Improving Diffusion Models for Authentic 
Virtual Try-on in the Wild\"](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.05139).\n\nStar ⭐ us if you like it!\n\n---\n\n\n![teaser2](assets\u002Fteaser2.png)&nbsp;\n![teaser](assets\u002Fteaser.png)&nbsp;\n\n\n\n## Requirements\n\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fyisol\u002FIDM-VTON.git\ncd IDM-VTON\n\nconda env create -f environment.yaml\nconda activate idm\n```\n\n## Data preparation\n\n### VITON-HD\nYou can download the VITON-HD dataset from [VITON-HD](https:\u002F\u002Fgithub.com\u002Fshadow2496\u002FVITON-HD).\n\nAfter downloading the VITON-HD dataset, move vitonhd_test_tagged.json into the test folder, and move vitonhd_train_tagged.json into the train folder.\n\nThe dataset directory should be structured as follows.\n\n```\n\ntrain\n|-- image\n|-- image-densepose\n|-- agnostic-mask\n|-- cloth\n|-- vitonhd_train_tagged.json\n\ntest\n|-- image\n|-- image-densepose\n|-- agnostic-mask\n|-- cloth\n|-- vitonhd_test_tagged.json\n\n```\n\n### DressCode\nYou can download the DressCode dataset from [DressCode](https:\u002F\u002Fgithub.com\u002Faimagelab\u002Fdress-code).\n\nWe provide pre-computed densepose images and captions for garments [here](https:\u002F\u002Fkaistackr-my.sharepoint.com\u002F:u:\u002Fg\u002Fpersonal\u002Fcpis7_kaist_ac_kr\u002FEaIPRG-aiRRIopz9i002FOwBDa-0-BHUKVZ7Ia5yAVVG3A?e=YxkAip).\n\nWe used [detectron2](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fdetectron2) to obtain the densepose images; refer to [this issue](https:\u002F\u002Fgithub.com\u002Fsangyun884\u002FHR-VITON\u002Fissues\u002F45) for more details.\n\nAfter downloading the DressCode dataset, place the image-densepose directories and caption text files as follows.\n\n```\nDressCode\n|-- dresses\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n|-- lower_body\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n|-- upper_body\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n```\n\n\n## 
Training\n\n\n### Preparation\n\nDownload the pre-trained IP-Adapter for SDXL (IP-Adapter\u002Fsdxl_models\u002Fip-adapter-plus_sdxl_vit-h.bin) and the image encoder (IP-Adapter\u002Fmodels\u002Fimage_encoder) from [here](https:\u002F\u002Fgithub.com\u002Ftencent-ailab\u002FIP-Adapter).\n\n```\ngit clone https:\u002F\u002Fhuggingface.co\u002Fh94\u002FIP-Adapter\n```\n\nMove the IP-Adapter weights to ckpt\u002Fip_adapter, and the image encoder to ckpt\u002Fimage_encoder.\n\nStart training by running the Python file with arguments:\n\n```\naccelerate launch train_xl.py \\\n    --gradient_checkpointing --use_8bit_adam \\\n    --output_dir=result --train_batch_size=6 \\\n    --data_dir=DATA_DIR\n```\n\nOr simply run the script file.\n\n```\nsh train_xl.sh\n```\n\n\n## Inference\n\n\n### VITON-HD\n\nRun inference using the Python file with arguments:\n\n```\naccelerate launch inference.py \\\n    --width 768 --height 1024 --num_inference_steps 30 \\\n    --output_dir \"result\" \\\n    --unpaired \\\n    --data_dir \"DATA_DIR\" \\\n    --seed 42 \\\n    --test_batch_size 2 \\\n    --guidance_scale 2.0\n```\n\nOr simply run the script file.\n\n```\nsh inference.sh\n```\n\n### DressCode\n\nFor the DressCode dataset, specify the category you want to generate images for via the `--category` argument:\n```\naccelerate launch inference_dc.py \\\n    --width 768 --height 1024 --num_inference_steps 30 \\\n    --output_dir \"result\" \\\n    --unpaired \\\n    --data_dir \"DATA_DIR\" \\\n    --seed 42 \\\n    --test_batch_size 2 \\\n    --guidance_scale 2.0 \\\n    --category \"upper_body\"\n```\n\nOr simply run the script file.\n```\nsh inference.sh\n```\n\n## Start a local gradio demo \u003Ca href='https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Fgradio'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fgradio-app\u002Fgradio'>\u003C\u002Fa>\n\nDownload checkpoints for human parsing 
[here](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fyisol\u002FIDM-VTON\u002Ftree\u002Fmain\u002Fckpt).\n\nPlace the checkpoints under the ckpt folder.\n```\nckpt\n|-- densepose\n    |-- model_final_162be9.pkl\n|-- humanparsing\n    |-- parsing_atr.onnx\n    |-- parsing_lip.onnx\n\n|-- openpose\n    |-- ckpts\n        |-- body_pose_model.pth\n    \n```\n\n\n\n\nRun the following command:\n\n```\npython gradio_demo\u002Fapp.py\n```\n\n\n\n\n\n\n## Acknowledgements\n\n\nThanks to [ZeroGPU](https:\u002F\u002Fhuggingface.co\u002Fzero-gpu-explorers) for providing free GPUs.\n\nThanks to [IP-Adapter](https:\u002F\u002Fgithub.com\u002Ftencent-ailab\u002FIP-Adapter) for the base code.\n\nThanks to [OOTDiffusion](https:\u002F\u002Fgithub.com\u002Flevihsu\u002FOOTDiffusion) and [DCI-VTON](https:\u002F\u002Fgithub.com\u002Fbcmi\u002FDCI-VTON-Virtual-Try-On) for mask generation.\n\nThanks to [SCHP](https:\u002F\u002Fgithub.com\u002FGoGoDuck912\u002FSelf-Correction-Human-Parsing) for human segmentation.\n\nThanks to [Densepose](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FDensePose) for human densepose.\n\n\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=yisol\u002FIDM-VTON&type=Date)](https:\u002F\u002Fstar-history.com\u002F#yisol\u002FIDM-VTON&Date)\n\n\n\n## Citation\n```\n@article{choi2024improving,\n  title={Improving Diffusion Models for Authentic Virtual Try-on in the Wild},\n  author={Choi, Yisol and Kwak, Sangkyung and Lee, Kyungmin and Choi, Hyungwon and Shin, Jinwoo},\n  journal={arXiv preprint arXiv:2403.05139},\n  year={2024}\n}\n```\n\n\n\n## License\nThe code and checkpoints in this repository are released under the [CC BY-NC-SA 4.0 license](https:\u002F\u002Fcreativecommons.org\u002Flicenses\u002Fby-nc-sa\u002F4.0\u002Flegalcode).\n\n\n","The IDM-VTON project aims to deliver virtual try-on in real-world scenes by improving diffusion models. Its core functionality uses deep learning, specifically diffusion models, to generate more natural and realistic garment try-on results. The project is written in Python and provides a demo and pre-trained models through the Hugging Face platform. It supports two datasets, VITON-HD and DressCode, so users can choose suitable training data for their needs. IDM-VTON is suited to e-commerce, fashion design, and any scenario that requires a high-quality virtual try-on solution, and can help improve user experience and boost sales conversion.",2,"2026-06-11 03:40:27","high_star"]