[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72472":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":31,"readmeContent":32,"aiSummary":33,"trendingCount":16,"starSnapshotCount":16,"syncStatus":34,"lastSyncTime":35,"discoverSource":36},72472,"img2img-turbo","GaParmar\u002Fimg2img-turbo","GaParmar","One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more","",null,"Python",2451,289,22,112,0,4,15,12,70.89,"MIT License",false,"main",true,[26,27,28,29,30],"computer-vision","deep-learning","generative-adversarial-network","generative-art","stable-diffusion","2026-06-12 04:01:05","# img2img-turbo\n\n[**Paper**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.12036) | [**Sketch2Image Demo**](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fgparmar\u002Fimg2img-turbo-sketch) \n#### **Quick start:** [**Running Locally**](#getting-started) | [**Gradio (locally hosted)**](#gradio-demo) | [**Training**](#training-with-your-own-data)\n\n### Cat Sketching\n\u003Cp align=\"left\" >\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FGaParmar\u002Fimg2img-turbo\u002Fmain\u002Fassets\u002Fcat_2x.gif\" width=\"800\" \u002F>\n\u003C\u002Fp>\n\n### Fish Sketching\n\u003Cp align=\"left\">\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FGaParmar\u002Fimg2img-turbo\u002Fmain\u002Fassets\u002Ffish_2x.gif\"  width=\"800\" \u002F>\n\u003C\u002Fp>\n\n\nWe propose a general method for adapting a single-step diffusion model, such as SD-Turbo, 
to new tasks and domains through adversarial learning. This enables us to leverage the internal knowledge of pre-trained diffusion models while achieving efficient inference (e.g., for 512x512 images, 0.29 seconds on A6000 and 0.11 seconds on A100). \n\nOur one-step conditional models **CycleGAN-Turbo** and **pix2pix-turbo** can perform various image-to-image translation tasks for both unpaired and paired settings. CycleGAN-Turbo outperforms existing GAN-based and diffusion-based methods, while pix2pix-turbo is on par with recent works such as ControlNet for Sketch2Photo and Edge2Image, but with one-step inference. \n\n[One-Step Image Translation with Text-to-Image Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.12036)\u003Cbr>\n[Gaurav Parmar](https:\u002F\u002Fgauravparmar.com\u002F), [Taesung Park](https:\u002F\u002Ftaesung.me\u002F), [Srinivasa Narasimhan](https:\u002F\u002Fwww.cs.cmu.edu\u002F~srinivas\u002F), [Jun-Yan Zhu](https:\u002F\u002Fgithub.com\u002Fjunyanz\u002F)\u003Cbr>\nCMU and Adobe, arXiv 2403.12036\n\n\u003Cbr>\n\u003Cdiv>\n\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fteaser_results.jpg' align=\"center\" width=1000px>\n\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\n\n\n## Results\n\n### Paired Translation with pix2pix-turbo\n**Edge to Image**\n\u003Cdiv>\n\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fedge_to_image_results.jpg' align=\"center\" width=800px>\n\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\u003C!-- **Sketch to Image**\nTODO -->\n### Generating Diverse Outputs\nBy varying the input noise map, our method can generate diverse outputs from the same input conditioning.\nThe output style can be controlled by changing the text prompt.\n\u003Cdiv> \u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fgen_variations.jpg' align=\"center\" width=800px>\n\u003C\u002Fp> \u003C\u002Fdiv>\n\n### Unpaired Translation with CycleGAN-Turbo\n\n**Day to Night**\n\u003Cdiv> \u003Cp align=\"center\">\n\u003Cimg 
src='assets\u002Fday2night_results.jpg' align=\"center\" width=800px>\n\u003C\u002Fp> \u003C\u002Fdiv>\n\n**Night to Day**\n\u003Cdiv>\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fnight2day_results.jpg' align=\"center\" width=800px>\n\u003C\u002Fp> \u003C\u002Fdiv>\n\n**Clear to Rainy**\n\u003Cdiv>\n\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fclear2rainy_results.jpg' align=\"center\" width=800px>\n\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n**Rainy to Clear**\n\u003Cdiv>\n\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Frainy2clear.jpg' align=\"center\" width=800px>\n\u003C\u002Fp>\n\u003C\u002Fdiv>\n\u003Chr>\n\n\n## Method\n**Our Generator Architecture:**\nWe tightly integrate three separate modules in the original latent diffusion models into a single end-to-end network with small trainable weights. This architecture allows us to translate the input image x to the output y, while retaining the input scene structure. We use LoRA adapters in each module, introduce skip connections and Zero-Convs between input and output, and retrain the first layer of the U-Net. Blue boxes indicate trainable layers. Semi-transparent layers are frozen. The same generator can be used for various GAN objectives.\n\u003Cdiv>\n\u003Cp align=\"center\">\n\u003Cimg src='assets\u002Fmethod.jpg' align=\"center\" width=900px>\n\u003C\u002Fp>\n\u003C\u002Fdiv>\n\n\n## Getting Started\n**Environment Setup**\n- We provide a [conda env file](environment.yaml) that contains all the required dependencies.\n    ```\n    conda env create -f environment.yaml\n    ```\n- Following this, you can activate the conda environment with the command below. 
\n  ```\n  conda activate img2img-turbo\n  ```\n- Or use virtual environment:\n  ```\n  python3 -m venv venv\n  source venv\u002Fbin\u002Factivate\n  pip install -r requirements.txt\n  ```\n**Paired Image Translation (pix2pix-turbo)**\n- The following command takes an image file and a prompt as inputs, extracts the canny edges, and saves the results in the directory specified.\n    ```bash\n    python src\u002Finference_paired.py --model_name \"edge_to_image\" \\\n        --input_image \"assets\u002Fexamples\u002Fbird.png\" \\\n        --prompt \"a blue bird\" \\\n        --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input Image\u003C\u002Fth>\n    \u003Cth>Canny Edges\u003C\u002Fth>\n    \u003Cth>Model Output\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fbird.png' width=\"200px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fbird_canny.png' width=\"200px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fbird_canny_blue.png' width=\"200px\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003C\u002Ftable>\n    \u003Cbr>\n\n- The following command takes a sketch and a prompt as inputs, and saves the results in the directory specified.\n    ```bash\n    python src\u002Finference_paired.py --model_name \"sketch_to_image_stochastic\" \\\n    --input_image \"assets\u002Fexamples\u002Fsketch_input.png\" --gamma 0.4 \\\n    --prompt \"ethereal fantasy concept art of an asteroid. 
magnificent, celestial, ethereal, painterly, epic, majestic, magical, fantasy art, cover art, dreamy\" \\\n    --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input\u003C\u002Fth>\n    \u003Cth>Model Output\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fsketch_input.png' width=\"400px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fsketch_output.png' width=\"400px\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003C\u002Ftable>\n    \u003Cbr>\n\n**Unpaired Image Translation (CycleGAN-Turbo)**\n- The following command takes a **day** image file as input, and saves the output **night** in the directory specified.\n    ```\n    python src\u002Finference_unpaired.py --model_name \"day_to_night\" \\\n        --input_image \"assets\u002Fexamples\u002Fday2night_input.png\" --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input (day)\u003C\u002Fth>\n    \u003Cth>Model Output (night)\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fday2night_input.png' width=\"400px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fday2night_output.png' width=\"400px\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003C\u002Ftable>\n\n- The following command takes a **night** image file as input, and saves the output **day** in the directory specified.\n    ```\n    python src\u002Finference_unpaired.py --model_name \"night_to_day\" \\\n        --input_image \"assets\u002Fexamples\u002Fnight2day_input.png\" --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input (night)\u003C\u002Fth>\n    \u003Cth>Model Output (day)\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fnight2day_input.png' width=\"400px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fnight2day_output.png' width=\"400px\">\u003C\u002Ftd>\n    
\u003C\u002Ftr>\n    \u003C\u002Ftable>\n\n- The following command takes a **clear** image file as input, and saves the output **rainy** in the directory specified.\n    ```\n    python src\u002Finference_unpaired.py --model_name \"clear_to_rainy\" \\\n        --input_image \"assets\u002Fexamples\u002Fclear2rainy_input.png\" --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input (clear)\u003C\u002Fth>\n    \u003Cth>Model Output (rainy)\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fclear2rainy_input.png' width=\"400px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Fclear2rainy_output.png' width=\"400px\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003C\u002Ftable>\n\n- The following command takes a **rainy** image file as input, and saves the output **clear** in the directory specified.\n    ```\n    python src\u002Finference_unpaired.py --model_name \"rainy_to_clear\" \\\n        --input_image \"assets\u002Fexamples\u002Frainy2clear_input.png\" --output_dir \"outputs\"\n    ```\n    \u003Ctable>\n    \u003Cth>Input (rainy)\u003C\u002Fth>\n    \u003Cth>Model Output (clear)\u003C\u002Fth>\n    \u003C\u002Ftr>\n    \u003Ctr>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Frainy2clear_input.png' width=\"400px\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src='assets\u002Fexamples\u002Frainy2clear_output.png' width=\"400px\">\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003C\u002Ftable>\n\n\n\n## Gradio Demo\n- We provide a Gradio demo for the paired image translation tasks.\n- The following command will launch the sketch to image locally using gradio.\n    ```\n    gradio gradio_sketch2image.py\n    ```\n- The following command will launch the canny edge to image gradio demo locally.\n   ```\n    gradio gradio_canny2image.py\n   ```\n\n\n## Training with your own data\n- See the steps [here](docs\u002Ftraining_pix2pix_turbo.md) for training a pix2pix-turbo model on 
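your paired data.\n- See the steps [here](docs\u002Ftraining_cyclegan_turbo.md) for training a CycleGAN-Turbo model on your unpaired data.\n\n\n## Acknowledgment\nOur work uses Stable Diffusion-Turbo as the base model, which is released under the following [LICENSE](https:\u002F\u002Fhuggingface.co\u002Fstabilityai\u002Fsd-turbo\u002Fblob\u002Fmain\u002FLICENSE).\n","img2img-turbo is a Stable Diffusion-based image-to-image translation project that supports tasks such as sketch-to-image and day-to-night. Its core functionality consists of one-step diffusion models (such as CycleGAN-Turbo and pix2pix-turbo) adapted to new tasks and domains through adversarial learning, which leverage the internal knowledge of pre-trained models while keeping inference efficient. The project is especially suited to applications that need fast, high-quality image translation, such as creative design, image editing, and style transfer. Technically, it processes 512x512 images in 0.29 seconds on an A6000 GPU and 0.11 seconds on an A100 GPU, and it supports diverse output control as well as paired and unpaired translation tasks.",2,"2026-06-11 03:42:11","high_star"]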