[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-74135":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},74135,"StoryDiffusion","HVision-NKU\u002FStoryDiffusion","HVision-NKU","Accepted as [NeurIPS 2024] Spotlight Presentation Paper","",null,"Jupyter Notebook",6423,644,87,117,0,1,9,3,39.43,"Apache License 2.0",false,"main",[],"2026-06-12 02:03:22","\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Ff79da6b7-0b3b-4dd7-8dd0-ba0b15306fe6\" height=100>\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n  \n## StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation  [![Paper page](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhuggingface\u002Fbadges\u002Fresolve\u002Fmain\u002Fpaper-page-md-dark.svg)]()\n\n[[Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.01434)] &emsp; [[Project Page](https:\u002F\u002Fstorydiffusion.github.io\u002F)] &emsp;  [[Jittor Version](https:\u002F\u002Fgithub.com\u002FJittorCV\u002Fjittordiffusion\u002Ftree\u002Fmaster)]&emsp; [[🤗 Comic Generation Demo ](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FYupengZhou\u002FStoryDiffusion)] [![Replicate](https:\u002F\u002Freplicate.com\u002Fcjwbw\u002FStoryDiffusion\u002Fbadge)](https:\u002F\u002Freplicate.com\u002Fcjwbw\u002FStoryDiffusion) [![Run Comics 
Demo in Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FHVision-NKU\u002FStoryDiffusion\u002Fblob\u002Fmain\u002FComic_Generation.ipynb) \u003Cbr>\n\u003C\u002Fdiv>\n\n\n---\n\nOfficial implementation of **[StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation]()**.\n\n### **Demo Video**\n\nhttps:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Fd5b80f8f-09b0-48cd-8b10-daff46d422af\n\n\n### Update History\n\n***See [update.md](update.md) for the update history.***\n\n### 🌠  **Key Features:**\nStoryDiffusion creates stories by generating consistent images and videos. Our work has two main parts: \n1. Consistent self-attention for character-consistent image generation over long-range sequences. It is hot-pluggable and compatible with all SD1.5 and SDXL-based image diffusion models. In the current implementation, the user needs to provide at least 3 text prompts for the consistent self-attention module. We recommend at least 5 to 6 text prompts for better layout arrangement.\n2. Motion predictor for long-range video generation, which predicts motion between condition images in a compressed image semantic space, allowing it to predict larger motions. \n\n\n\n## 🔥 **Examples**\n\n\n### Comics generation \n\n\n![1](https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Fb3771cbc-b6ca-4e26-bdc5-d944daf9f266)\n\n\n\n### Image-to-Video generation (results are highly compressed for speed)\nLeveraging the images produced through our Consistent Self-Attention mechanism, we can extend the process to create videos by seamlessly transitioning between these images. 
This can be considered a two-stage long video generation approach.\n\nNote: results are **highly compressed** for speed. Visit [our website](https:\u002F\u002Fstorydiffusion.github.io\u002F) for the high-quality version.\n#### Two-stage Long Video Generation (New Update)\nCombining the two parts, we can generate very long and high-quality AIGC videos.\n| Video1 | Video2  | Video3  |\n| --- | --- | --- |\n| \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F4e7e0f24-5f90-419b-9a1e-cdf36d361b26\" width=224>  | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Ff509343d-d691-4e2a-b615-7d96381ef7c1\" width=224> | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F4f0f7abb-4ae4-47a6-b692-5bdd8d9c8006\" width=224>  |\n\n\n#### Long Video Results using Condition Images\nOur Image-to-Video model can generate a video from a sequence of user-provided condition images.\n| Video1 | Video2  | Video3  |\n| --- | --- | --- |\n| \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Faf6f5c50-c773-4ef2-a757-6d7a46393f39\" width=224>  | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Fd58e4037-d8df-4f90-8c81-ce4b6d2d868e\" width=224> |  \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F40da15ba-f5c1-48d8-84d6-8d327207d696\" width=224>  |\n\n| Video4 | Video5  | Video6  |\n| --- | --- | --- |\n| \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F8f04c9fc-3031-49e3-9de8-83d582b80a1f\" width=224>  | \u003Cimg 
src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F604107fb-8afe-4052-bda4-362c646a756e\" width=224> |  \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Fb05fa6a0-12e6-4111-abf8-18b8cd84f3ff\" width=224>  |\n\n\n\n\n#### Short Videos \n\n| Video1 | Video2  | Video3  |\n| --- | --- | --- |\n| \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F5e7f717f-daad-46f6-b3ba-c087bd843158\" width=224>  | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F79aa52b2-bf37-4c9c-8555-c7050aec0cdf\" width=224> | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F9fdfd091-10e6-434e-9ce7-6d6e6d8f4b22\" width=224>  |\n\n\n\n| Video4 | Video5  | Video6  |\n| --- | --- | --- |\n| \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F0b219b60-a998-4820-9657-6abe1747cb6b\" width=224>  | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002Fd387aef0-ffc8-41b0-914f-4b0392d9f8c5\" width=224> | \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FHVision-NKU\u002FStoryDiffusion\u002Fassets\u002F49511209\u002F3c64958a-1079-4ca0-a9cf-e0486adbc57f\" width=224>  |\n\n\n\n\n## 🚩 **TODO\u002FUpdates**\n- [x] Comic Results of StoryDiffusion.\n- [x] Video Results of StoryDiffusion.\n- [x] Source code of Comic Generation\n- [x] Source code of gradio demo\n- [ ] Source code of Video Generation Model\n- [ ] Pretrained weight of Video Generation Model\n---\n\n# 🔧 Dependencies and Installation\n\n- Python >= 3.8 (Recommend to use [Anaconda](https:\u002F\u002Fwww.anaconda.com\u002Fdownload\u002F#linux) or [Miniconda](https:\u002F\u002Fdocs.conda.io\u002Fen\u002Flatest\u002Fminiconda.html))\n- 
[PyTorch >= 2.0.0](https:\u002F\u002Fpytorch.org\u002F)\n```bash\nconda create --name storydiffusion python=3.10\nconda activate storydiffusion\npip install -U pip\n\n# Install requirements\npip install -r requirements.txt\n```\n# How to use\n\nCurrently, we provide two ways for you to generate comics.\n\n## Use the jupyter notebook\n\nYou can open `Comic_Generation.ipynb` and run the code.\n\n## Start a local gradio demo\nRun the following command:\n\n\n**(Recommended)** We provide a low-GPU-memory version. It was tested on a machine with 24 GB of GPU memory (Tesla A10) and 30 GB of RAM, and is expected to work well with more than 20 GB of GPU memory.\n\n```bash\npython gradio_app_sdxl_specific_id_low_vram.py\n```\n\n\n## Contact\nIf you have any questions, you are very welcome to email ypzhousdu@gmail.com and zhoudaquan21@gmail.com\n\n   \n\n\n# Disclaimer\nThis project strives to impact the domain of AI-driven image and video generation positively. Users are granted the freedom to create images and videos using this tool, but they are expected to comply with local laws and utilize it responsibly. 
The developers do not assume any responsibility for potential misuse by users.\n\n# Related Resources\nThe following are third-party implementations of StoryDiffusion.\n\n\n## API\n\n- [runpod.io serverless worker](https:\u002F\u002Fgithub.com\u002Fbes-dev\u002Fstory-diffusion-runpod-serverless-worker) provided by [BeS](https:\u002F\u002Fgithub.com\u002Fbes-dev).\n- [Replicate worker](https:\u002F\u002Fgithub.com\u002Fcamenduru\u002FStoryDiffusion-replicate) provided by [camenduru](https:\u002F\u002Fgithub.com\u002Fcamenduru).\n\n\n\n\n# BibTeX\nIf you find StoryDiffusion useful for your research and applications, please cite using this BibTeX:\n\n```BibTeX\n@article{zhou2024storydiffusion,\n  title={StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation},\n  author={Zhou, Yupeng and Zhou, Daquan and Cheng, Ming-Ming and Feng, Jiashi and Hou, Qibin},\n  journal={NeurIPS 2024},\n  year={2024}\n}\n```\n","StoryDiffusion is a project for long-range image and video generation that uses a consistent self-attention mechanism to keep generated content consistent and coherent. Its core features are text-prompt-driven generation of long image sequences and long video generation through a motion predictor. It is compatible with SD1.5 and SDXL-based diffusion models, and users need to provide at least 3 text prompts. StoryDiffusion suits scenarios that call for a coherent storyline, such as comic generation, image-to-video conversion, and long video creation, and is particularly well suited to artistic creation and entertainment content production.",2,"2026-06-11 03:48:58","high_star"]