[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72615":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72615,"DreamOmni2","JIA-Lab-research\u002FDreamOmni2","JIA-Lab-research","This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation (CVPR2026 Highlight)'","",null,"Python",2027,175,79,23,0,3,7,60.94,"Apache License 2.0",false,"main",[24,25,26],"image-editing","image-generation","unified-generation-editing-model","2026-06-12 04:01:06","# DreamOmni2: Multimodal Instruction-based Editing and Generation (CVPR2026 Highlight)\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fhtml\u002F2510.06679v1\">\n            \u003Cimg alt=\"arXiv Paper\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv%20paper-2510.06679v1-b31b1b.svg\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fpbihao.github.io\u002Fprojects\u002FDreamOmni2\u002Findex.html\">\n        \u003Cimg alt=\"Project Page\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-blue\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=8xpoiRK57uU\">\n        \u003Cimg alt=\"Video Demo\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVideo-Demo-red\">\n    \u003C\u002Fa>\n    \u003Ca 
href=\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fxiabs\u002FDreamOmni2Bench\">\n        \u003Cimg alt=\"Build\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDreamOmni2-Benchmark-green\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fxiabs\u002FDreamOmni2\">\n        \u003Cimg alt=\"Build\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗-HF%20Model-yellow\">\n    \u003C\u002Fa>    \n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fwcy1122\u002FDreamOmni2-Edit\">\n        \u003Cimg alt=\"Build\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗-HF%20Editing%20Demo-yellow\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fwcy1122\u002FDreamOmni2-Gen\">\n        \u003Cimg alt=\"Build\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗-HF%20Generation%20Demo-yellow\">\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fwww.runninghub.ai\u002Fworkflow\u002F1980131298238959618\">\n        \u003Cimg alt=\"Build\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FComfyUI-Runninghub-blue\">\n    \u003C\u002Fa>\n\u003C\u002Fp>\n\n## 🔥 News\n- 🔥**2025.10.10**: Release DreamOmni2 [editing demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fwcy1122\u002FDreamOmni2-Edit) and [generation demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fwcy1122\u002FDreamOmni2-Gen)\n- 🔥**2025.10.10**: Release DreamOmni2 [Benchmark](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fxiabs\u002FDreamOmni2Bench).\n- 🔥**2025.10.10**: Release DreamOmni2's [codes](https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FDreamOmni2) and [models](https:\u002F\u002Fhuggingface.co\u002Fxiabs\u002FDreamOmni2).\n- 🔥**2025.10.09**: Release DreamOmni2 [tech report](https:\u002F\u002Farxiv.org\u002Fhtml\u002F2510.06679v1).\n\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"600\" src=\"imgs\u002Fgallery.png\">\n\u003C\u002Fp>\n\n\n\u003Cdiv 
align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fcloud.video.taobao.com\u002Fvod\u002FHxWB8i8sYkh0DdfvfByoMHqRtezNMCpWJdjzWTOCqdY.mp4\">\n    \u003Cimg src=\"imgs\u002Fcover.png\" alt=\"Watch the video\" style=\"width: 600px;\">\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\n## Introduction\n\n**(1) Multimodal Instruction-based Generation**\n\nFor traditional subject-driven generation based on concrete objects, DreamOmni2 achieves the best results among open-source models, showing superior identity and pose consistency. Additionally, DreamOmni2 can reference abstract attributes (such as material, texture, makeup, hairstyle, posture, design style, artistic style, etc.), even surpassing commercial models in this area.\n\n**(2) Multimodal Instruction-based Editing**\n\nBeyond traditional instruction-based editing models, DreamOmni2 supports multimodal instruction editing. In everyday editing tasks, there are often elements that are difficult to describe purely with language and require reference images. Our model addresses this need, supporting references to any concrete objects and abstract attributes, with performance comparable to commercial models.\n\n**(3) Unified Generation and Editing Model**\n\nBuilding upon these two new tasks, we introduce DreamOmni2, which is capable of multimodal instruction-based editing and generation under any concrete or abstract concept guidance. Overall, DreamOmni2 is a more intelligent and powerful open-source unified generation and editing model, offering enhanced capabilities across a wide range of tasks.\n\n## Editing and Generation Model?\nEditing and generation are distinct tasks. Editing requires strict consistency in preserving the non-edited areas of the source image, while generation only needs to retain the ID, IP, or attributes from the reference image as per the instructions, allowing the entire image to be regenerated with a focus on aesthetics. 
We’ve found that the instructions for generation and editing are often similar, so we’ve separated these two tasks to make it easier for users to choose the appropriate task type.\n\n## Quick Start\n\n### Requirements and Installation\n\nFirst, install the necessary dependencies:\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FDreamOmni2\ncd .\u002FDreamOmni2\npip install -r requirements.txt\n```\n\nNext, download the DreamOmni2 weights into the models folder.\n\n```bash\nhuggingface-cli download --resume-download --local-dir-use-symlinks False xiabs\u002FDreamOmni2 --local-dir .\u002Fmodels\n```\n\n### Inference\n\nMultimodal Instruction-based Editing\n\n**Note: for editing tasks, because of how the training data is formatted, the image to be edited must be placed in the first position.**\n\n```bash\npython3 inference_edit.py \\\n    --input_img_path \"example_input\u002Fedit_tests\u002Fsrc.jpg\" \"example_input\u002Fedit_tests\u002Fref.jpg\" \\\n    --input_instruction \"Make the woman from the second image stand on the road in the first image.\" \\\n    --output_path \"example_input\u002Fedit_tests\u002Fedit_res.png\"\n```\n\nMultimodal Instruction-based Generation\n```bash\npython3 inference_gen.py \\\n    --input_img_path \"example_input\u002Fgen_tests\u002Fimg1.jpg\" \"example_input\u002Fgen_tests\u002Fimg2.jpg\" \\\n    --input_instruction \"In the scene, the character from the first image stands on the left, and the character from the second image stands on the right. 
They are shaking hands against the backdrop of a spaceship interior.\" \\\n    --output_path \"example_input\u002Fgen_tests\u002Fgen_res.png\" \\\n    --height 1024 \\\n    --width 1024\n```\n\n\n### Web Demo\n```bash\nCUDA_VISIBLE_DEVICES=0 python web_edit.py \\\n    --vlm_path PATH_TO_VLM \\\n    --edit_lora_path PATH_TO_EDIT_LORA \\\n    --server_name \"0.0.0.0\" \\\n    --server_port 7860\n\n\nCUDA_VISIBLE_DEVICES=1 python web_generate.py \\\n    --vlm_path PATH_TO_VLM \\\n    --gen_lora_path PATH_TO_GENERATION_LORA \\\n    --server_name \"0.0.0.0\" \\\n    --server_port 7861\n```\n\n\n## Disclaimer\n\nThis project strives to impact the domain of AI-driven image generation positively. Users are granted the freedom to\ncreate images using this tool, but they are expected to comply with local laws and utilize it responsibly.\nThe developers do not assume any responsibility for potential misuse by users.\n\n\n## Citation\n\nIf DreamOmni2 is helpful, please help to ⭐ the repo.\n\nIf you find this project useful for your research, please consider citing our [paper](https:\u002F\u002Farxiv.org\u002Fhtml\u002F2510.06679v1).\n\n## Contact\nIf you have any comments or questions, please [open a new issue](https:\u002F\u002Fgithub.com\u002Fxxx\u002Fxxx\u002Fissues\u002Fnew\u002Fchoose) or contact [Bin Xia](mailto:zjbinxia@gmail.com).\n","DreamOmni2 is a multimodal instruction-based image editing and generation project that implements the model from a CVPR 2026 Highlight paper. Its core features include high-quality image generation and editing; it can operate on concrete objects or abstract attributes (such as material, texture, and makeup) and performs well at preserving identity and pose consistency. Technically, DreamOmni2 is written in Python, supports guiding the editing process with text and reference images, and surpasses existing open-source and even some commercial solutions. It suits scenarios that need fine-grained control over image content creation, such as art design and virtual avatar creation.",2,"2026-06-11 03:42:49","high_star"]