[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72416":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":23,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":39,"readmeContent":40,"aiSummary":41,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":42,"discoverSource":43},72416,"InfiniteYou","bytedance\u002FInfiniteYou","bytedance","🔥 [ICCV 2025 Highlight] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity","https:\u002F\u002Fbytedance.github.io\u002FInfiniteYou\u002F",null,"Python",2682,290,28,25,0,2,4,60.79,"Apache License 2.0",false,"main",true,[25,26,27,28,29,30,31,32,33,34,35,36,37,38],"diffusers","diffusion","diffusion-transformer","dit","face","flux","iccv2025","identity-preserving","image-editing","image-generation","personalization","pytorch","research","text-to-image","2026-06-12 04:01:05","\u003Cdiv align=\"center\">\n\n## InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity\n\n[**Liming Jiang**](https:\u002F\u002Fliming-jiang.com\u002F)&nbsp;&nbsp;&nbsp;&nbsp;\n[**Qing Yan**](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=0TIYjPAAAAAJ)&nbsp;&nbsp;&nbsp;&nbsp;\n[**Yumin Jia**](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fyuminjia\u002F)&nbsp;&nbsp;&nbsp;&nbsp;\n[**Zichuan Liu**](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=-H18WY8AAAAJ)&nbsp;&nbsp;&nbsp;&nbsp;\n[**Hao Kang**](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=VeTCSyEAAAAJ)&nbsp;&nbsp;&nbsp;&nbsp;\n[**Xin 
Lu**](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=mFC0wp8AAAAJ)\u003Cbr \u002F>\nByteDance Intelligent Creation\u003Cbr \u002F>\n**ICCV 2025 (\u003Cspan style=\"color:#F44336\">Highlight\u003C\u002Fspan>)**\n\n\u003Ca href=\"https:\u002F\u002Fbytedance.github.io\u002FInfiniteYou\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=Project&message=Page&color=blue&logo=github-pages\">\u003C\u002Fa> &ensp;\n\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16418\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=ArXiv&message=Paper&color=darkred&logo=arxiv\">\u003C\u002Fa> &ensp;\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=%F0%9F%A4%96%20Released&message=Models&color=green\">\u003C\u002Fa> &ensp;\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fbytedance\u002FComfyUI_InfiniteYou\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=%E2%9A%99%EF%B8%8F%20ComfyUI&message=Node&color=purple\">\u003C\u002Fa> &ensp;\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FByteDance\u002FInfiniteYou-FLUX\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=%F0%9F%A4%97%20Hugging%20Face&message=Demo&color=orange\">\u003C\u002Fa> &ensp;\n\n\u003C\u002Fdiv>\n\n![teaser](.\u002Fassets\u002Fteaser.jpg)\n\n> **Abstract:** *Achieving flexible and high-fidelity identity-preserved image generation remains formidable, particularly with advanced Diffusion Transformers (DiTs) like FLUX. We introduce **InfiniteYou (InfU)**, one of the earliest robust frameworks leveraging DiTs for this task. InfU addresses significant issues of existing methods, such as insufficient identity similarity, poor text-image alignment, and low generation quality and aesthetics. 
Central to InfU is InfuseNet, a component that injects identity features into the DiT base model via residual connections, enhancing identity similarity while maintaining generation capabilities. A multi-stage training strategy, including pretraining and supervised fine-tuning (SFT) with synthetic single-person-multiple-sample (SPMS) data, further improves text-image alignment, ameliorates image quality, and alleviates face copy-pasting. Extensive experiments demonstrate that InfU achieves state-of-the-art performance, surpassing existing baselines. In addition, the plug-and-play design of InfU ensures compatibility with various existing methods, offering a valuable contribution to the broader community.*\n\n\n## 🔥 News\n\n- [07\u002F2025] 🔥 The [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16418) of InfiniteYou is selected as ICCV 2025 (\u003Cspan style=\"color:#F44336\">**Highlight**\u003C\u002Fspan>).\n\n- [06\u002F2025] 🔥 The [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16418) of InfiniteYou is accepted to ICCV 2025.\n\n- [04\u002F2025] 🔥 The official [ComfyUI node](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FComfyUI_InfiniteYou) is released. 
Unofficial [ComfyUI contributions](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FInfiniteYou#comfyui-nodes) are appreciated.\n\n- [04\u002F2025] 🔥 Quantization and offloading [options](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FInfiniteYou#memory-requirements) are provided to reduce the memory requirements for InfiniteYou-FLUX v1.0.\n\n- [03\u002F2025] 🔥 The [code](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FInfiniteYou), [model](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou), and [demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FByteDance\u002FInfiniteYou-FLUX) of InfiniteYou-FLUX v1.0 are released.\n\n- [03\u002F2025] 🔥 The [project page](https:\u002F\u002Fbytedance.github.io\u002FInfiniteYou) of InfiniteYou is created.\n\n- [03\u002F2025] 🔥 The [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16418) of InfiniteYou is released on arXiv.\n\n\n## 💡 Important Usage Tips\n\n- We released two model variants of InfiniteYou-FLUX v1.0: [aes_stage2](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\u002Ftree\u002Fmain\u002Finfu_flux_v1.0\u002Faes_stage2) and [sim_stage1](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\u002Ftree\u002Fmain\u002Finfu_flux_v1.0\u002Fsim_stage1). The `aes_stage2` is our model after SFT, which is used by default for better text-image alignment and aesthetics. For higher ID similarity, please try `sim_stage1` (using `--model_version` to switch). More details can be found in our [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16418).\n\n- To better fit specific personal needs, we find that two arguments are highly useful to adjust: \u003Cbr \u002F>`--infusenet_conditioning_scale` (default: `1.0`) and `--infusenet_guidance_start` (default: `0.0`). Usually, you may NOT need to adjust them. If necessary, start by trying a slightly larger `--infusenet_guidance_start` (*e.g.*, `0.1`) only (especially helpful for `sim_stage1`). 
If the results are still not satisfactory, then try a slightly smaller `--infusenet_conditioning_scale` (*e.g.*, `0.9`).\n\n- We also provided two LoRAs ([Realism](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F631986?modelVersionId=706528) and [Anti-blur](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F675581\u002Fanti-blur-flux-lora)) to enable additional usage flexibility. If needed, try `Realism` alone first. Both are *entirely optional*: they are examples to try and are NOT used in our paper.\n\n- If the generated gender does not align with your preferences, try adding specific words in the text prompt, such as 'a man', 'a woman', *etc*. We encourage users to use inclusive and respectful language.\n\n\n## :european_castle: Model Zoo\n\n| InfiniteYou Version | Model Version | Base Model Trained with | Description |\n| :---: | :---: | :---: | :---: |\n| [InfiniteYou-FLUX v1.0](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou) | [aes_stage2](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\u002Ftree\u002Fmain\u002Finfu_flux_v1.0\u002Faes_stage2) | [FLUX.1-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002FFLUX.1-dev) | Stage-2 model after SFT. Better text-image alignment and aesthetics. |\n| [InfiniteYou-FLUX v1.0](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou) | [sim_stage1](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\u002Ftree\u002Fmain\u002Finfu_flux_v1.0\u002Fsim_stage1) | [FLUX.1-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002FFLUX.1-dev) | Stage-1 model before SFT. Higher identity similarity. 
|\n\n\n## 🔧 Requirements and Installation\n\n### Dependencies\n\nSimply run this one-line command to install (feel free to create a `python3` virtual environment before you run):\n\n```bash\npip install -r requirements.txt\n```\n\n### Memory Requirements \n\n- **Full-performance**: The original `bf16` model inference requires a **peak VRAM** of around **43GB**.\n\n- **Fast CPU offloading**: By specifying only `--cpu_offload` in [test.py](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FInfiniteYou\u002Fblob\u002Fmain\u002Ftest.py#L44), the **peak VRAM** is reduced to around **30GB** with **NO** performance degradation.\n\n- **8-bit quantization**: By specifying only `--quantize_8bit` in [test.py](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FInfiniteYou\u002Fblob\u002Fmain\u002Ftest.py#L44), the **peak VRAM** is reduced to around **24GB** with performance remaining very similar.\n\n- **Combining fast CPU offloading and 8-bit quantization**: By specifying both `--cpu_offload` and \u003Cbr \u002F>`--quantize_8bit`, the **peak VRAM** is further reduced to around **16GB** with performance remaining very similar.\n\nIf you want to use our models but only have a GPU with even less VRAM, please further refer to [Diffusers memory reduction tips](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fdiffusers\u002Fen\u002Foptimization\u002Fmemory), where some more aggressive strategies may be helpful. Community contributions are also welcome.\n\n\n## ⚡️ Quick Inference\n\n### Local Inference Script\n\n```bash\npython test.py --id_image .\u002Fassets\u002Fexamples\u002Fman.jpg --prompt \"A man, portrait, cinematic\" --out_results_dir .\u002Fresults\n```\n\n\u003Cdetails>\n\u003Csummary style='font-size:20px'>\u003Cb>\u003Ci>Explanation of all the arguments (click to expand!)\u003C\u002Fi>\u003C\u002Fb>\u003C\u002Fsummary>\n\n- Input and output:\n  - `--id_image (str)`: The path to the input identity (ID) image. 
Default: `.\u002Fassets\u002Fexamples\u002Fman.jpg`.\n  - `--prompt (str)`: The text prompt for image generation. Default: `A man, portrait, cinematic`.\n  - `--out_results_dir (str)`: The path to the output directory to save the generated results. Default: `.\u002Fresults`.\n  - `--control_image (str or None)`: The path to the control image \\[*optional*\\] from which five facial keypoints are extracted to control the generation. Default: `None`.\n  - `--base_model_path (str)`: The Hugging Face or local path to the base model. Default: `black-forest-labs\u002FFLUX.1-dev`.\n  - `--model_dir (str)`: The path to the InfiniteYou model directory. Default: `ByteDance\u002FInfiniteYou`.\n- Version control:\n  - `--infu_flux_version (str)`: InfiniteYou-FLUX version: currently only `v1.0` is supported. Default: `v1.0`.\n  - `--model_version (str)`: The model variant to use: `aes_stage2` | `sim_stage1`. Default: `aes_stage2`.\n- General inference arguments:\n  - `--cuda_device (int)`: The CUDA device ID to use. Default: `0`.\n  - `--seed (int)`: The seed for reproducibility (0 for random). Default: `0`.\n  - `--guidance_scale (float)`: The guidance scale for the diffusion process. Default: `3.5`.\n  - `--num_steps (int)`: The number of inference steps. Default: `30`.\n- InfiniteYou-specific arguments:\n  - `--infusenet_conditioning_scale (float)`: The scale for the InfuseNet conditioning. Default: `1.0`.\n  - `--infusenet_guidance_start (float)`: The start point for the InfuseNet guidance injection. Default: `0.0`.\n  - `--infusenet_guidance_end (float)`: The end point for the InfuseNet guidance injection. Default: `1.0`.\n- Optional LoRAs:\n  - `--enable_realism_lora (store_true)`: Whether to enable the Realism LoRA. Default: `False`.\n  - `--enable_anti_blur_lora (store_true)`: Whether to enable the Anti-blur LoRA. Default: `False`.\n- Memory reduction options:\n  - `--quantize_8bit (store_true)`: Whether to quantize the model to the 8-bit format. 
Default: `False`.\n  - `--cpu_offload (store_true)`: Whether to use fast CPU offloading. Default: `False`.\n\n\u003C\u002Fdetails>\n\n\n### Local Gradio Demo\n\n```bash\npython app.py\n```\n\n### Online Hugging Face Demo\n\nWe appreciate the GPU grant from the Hugging Face team. \nYou can also try our [InfiniteYou-FLUX Hugging Face demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FByteDance\u002FInfiniteYou-FLUX) online.\n\n### ComfyUI Nodes\n\n- **Official ComfyUI native node implementation**\n  - [bytedance\u002FComfyUI_InfiniteYou](https:\u002F\u002Fgithub.com\u002Fbytedance\u002FComfyUI_InfiniteYou)\n\n- **Unofficial contributions**\n  - [ZenAI-Vietnam\u002FComfyUI_InfiniteYou](https:\u002F\u002Fgithub.com\u002FZenAI-Vietnam\u002FComfyUI_InfiniteYou)\n  - [katalist-ai\u002FComfyUI-InfiniteYou](https:\u002F\u002Fgithub.com\u002Fkatalist-ai\u002FComfyUI-InfiniteYou)\n  - [niknah\u002FComfyUI-InfiniteYou](https:\u002F\u002Fgithub.com\u002Fniknah\u002FComfyUI-InfiniteYou)\n  - [game4d\u002FComfyUI-BDsInfiniteYou](https:\u002F\u002Fgithub.com\u002Fgame4d\u002FComfyUI-BDsInfiniteYou)\n  - [GGUF version](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F1424364?modelVersionId=1617144) (16GB VRAM) and [Christmas Toy LoRA](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F1466015?modelVersionId=1658038) by [@MegaCocos](https:\u002F\u002Fgithub.com\u002FMegaCocos)\n\n\n## 🆚 Comparison with State-of-the-Art Relevant Methods\n\n![comparative_results](.\u002Fassets\u002Fcomparative_results.jpg)\n\nQualitative comparison results of InfU with the state-of-the-art baselines, FLUX.1-dev IP-Adapter and PuLID-FLUX. The identity similarity and text-image alignment of the results generated by FLUX.1-dev IP-Adapter (IPA) are inadequate. PuLID-FLUX generates images with decent identity similarity. However, it suffers from poor text-image alignment (Columns 1, 2, 4), and the image quality (e.g., bad hands in Column 5) and aesthetic appeal are degraded. 
In addition, the face copy-paste issue of PuLID-FLUX is evident (Column 5). In comparison, the proposed InfU outperforms the baselines across all dimensions.\n\n\n## ⚙️ Plug-and-Play Property with Off-the-Shelf Popular Approaches\n\n![plug_and_play](.\u002Fassets\u002Fplug_and_play.jpg)\n\nInfU features a desirable plug-and-play design, compatible with many existing methods. It naturally supports base model replacement with any variant of FLUX.1-dev, such as FLUX.1-schnell for more efficient generation (e.g., in 4 steps). The compatibility with ControlNets and LoRAs provides more controllability and flexibility for customized tasks. Notably, the compatibility with OminiControl extends our potential for multi-concept personalization, such as personalized generation of an identity (ID) interacting with an object. InfU is also compatible with IP-Adapter (IPA) for stylization of personalized images, producing decent results when injecting style references via IPA. Our plug-and-play feature may extend to even more approaches, providing valuable contributions to the broader community.\n\n\n## 📜 Disclaimer and Licenses\n\nThe images used in this repository and related demos are sourced from consenting subjects or generated by the models. These pictures are intended solely to showcase the capabilities of our research. If you have any concerns, please feel free to contact us, and we will promptly remove any inappropriate content.\n\nThe use of the released code, model, and demo must strictly adhere to the respective licenses. Our code is released under the [Apache License 2.0](.\u002FLICENSE), and our model is released under the [Creative Commons Attribution-NonCommercial 4.0 International Public License](https:\u002F\u002Fhuggingface.co\u002FByteDance\u002FInfiniteYou\u002Fblob\u002Fmain\u002FLICENSE) for academic research purposes only. 
Any manual or automatic downloading of the face models from [InsightFace](https:\u002F\u002Fgithub.com\u002Fdeepinsight\u002Finsightface), the [FLUX.1-dev](https:\u002F\u002Fhuggingface.co\u002Fblack-forest-labs\u002FFLUX.1-dev) base model, LoRAs ([Realism](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F631986?modelVersionId=706528) and [Anti-blur](https:\u002F\u002Fcivitai.com\u002Fmodels\u002F675581\u002Fanti-blur-flux-lora)), *etc.*, must follow their original licenses and be used only for academic research purposes.\n\nThis research aims to positively impact the field of Generative AI. Any usage of this method must be responsible and comply with local laws. The developers do not assume any responsibility for any potential misuse.\n\n\n## 🤗 Acknowledgments\n\nWe sincerely acknowledge the insightful discussions from Stathi Fotiadis, Min Jin Chong, Xiao Yang, Tiancheng Zhi, Jing Liu, and Xiaohui Shen. We genuinely appreciate the help from Jincheng Liang and Lu Guo with our user study and qualitative evaluation.\n\n\n## 📖 Citation\n\nIf you find InfiniteYou useful for your research or applications, please cite our paper:\n\n```bibtex\n@inproceedings{jiang2025infiniteyou,\n  title={{InfiniteYou}: Flexible Photo Recrafting While Preserving Your Identity},\n  author={Jiang, Liming and Yan, Qing and Jia, Yumin and Liu, Zichuan and Kang, Hao and Lu, Xin},\n  booktitle={ICCV},\n  year={2025}\n}\n```\n\nWe also appreciate it if you could give a star :star: to this repository. Thanks a lot!\n","InfiniteYou is a project for flexibly recrafting photos while preserving a person's identity. It builds on advanced Diffusion Transformers (DiTs) such as FLUX and uses its core component, InfuseNet, to inject identity features into the base model, addressing the insufficient identity similarity, poor text-image alignment, and low generation quality of existing methods. The project adopts a multi-stage training strategy, including pretraining and supervised fine-tuning, which further improves image quality and text-image alignment. InfiniteYou suits scenarios that require high-quality personalized image generation, such as managing a personal image on social media and producing creative content. In addition, its plug-and-play design makes it compatible with a variety of existing techniques, opening up possibilities for broader applications.","2026-06-11 03:41:58","high_star"]