[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2743":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":14,"lastSyncTime":29,"discoverSource":30},2743,"verl-omni","verl-project\u002Fverl-omni","verl-project","RL training framework for diffusion and omni-modality models","https:\u002F\u002Fverl-omni.readthedocs.io\u002Fen\u002Flatest\u002Findex.html",null,"Python",338,50,2,17,0,33,87,229,99,5.12,"Apache License 2.0",false,"main",[],"2026-06-12 02:00:43","\u003Cdiv align=\"center\">\n\n# VeRL-Omni\n\n### Easy, fast, and stable RL training for diffusion and omni-modality models\n\n[![Docs](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocs-Read%20the%20Docs-8A2BE2)](https:\u002F\u002Fverl-omni.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-Apache%202.0-blue)](.\u002FLICENSE) \u003Ca href=\"docs\u002Fassets\u002FWeChat.jpg\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F微信-green?logo=wechat&amp\">\u003C\u002Fa>\n\n\u003C\u002Fdiv>\n\n`VeRL-Omni` is a general RL training framework focused on multimodal generative models, built on top of [`verl`](https:\u002F\u002Fgithub.com\u002Fverl-project\u002Fverl).\n\nIt originated from the multi-modal generation RL effort in `verl`, and now has a dedicated home so it can evolve in a more focused way.\n\n## Why `VeRL-Omni`\n\nMultimodal generative RL training 
differs from text-only LLM RL not only in model structure, but also in I\u002FO patterns, compute characteristics, and runtime bottlenecks. As this space grows, it deserves a dedicated training repository that can evolve quickly around its own constraints.\n\n### Scope\n\n`VeRL-Omni` targets RL post-training for three families of generative models:\n\n1. **Diffusion generative models** for image, video, and audio — e.g., Qwen-Image, Wan2.2.\n2. **Unified multimodal understanding + generation models** — e.g., BAGEL, HunyuanImage-3.0.\n3. **Omni-modality models** that jointly handle text, image, audio, and video — e.g., Qwen3-Omni.\n\n### What we focus on\n\n- **Specialized rollout** via [`vLLM-Omni`](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-omni) for high-throughput diffusion and multimodal generation.\n- **Flexible reward pipelines** spanning rule-based rewards, model-based rewards, and multimodal reward computation.\n- **Modular training backends** that plug into existing parallelism (FSDP, USP) and other optimizations rather than rebuilding the stack from scratch.\n- **End-to-end examples and benchmarks** validating co-located sync and fully-async RL on the model families above.\n- **High training throughput** — on our reference Qwen-Image FlowGRPO setup, `VeRL-Omni` achieves **~25% higher end-to-end throughput** than the diffusers-based [`flow_grpo`](https:\u002F\u002Fgithub.com\u002Fyifan123\u002Fflow_grpo) implementation, driven by `vLLM-Omni` rollout, FSDP training, and overlapped reward computation (asynchronous).\n\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Farch.png\" alt=\"verl-omni architecture diagram\" width=\"70%\">\n\u003C\u002Fdiv>\n\n\n## Getting Started  🚀\n\nVisit our documentation to learn more.\n\n- [Installation](https:\u002F\u002Fverl-omni.readthedocs.io\u002Fen\u002Flatest\u002Fstart\u002Finstall.html)\n- 
[Quickstart](https:\u002F\u002Fverl-omni.readthedocs.io\u002Fen\u002Flatest\u002Fstart\u002Fflowgrpo_quickstart.html)\n\n## Model and Algorithm Support 🎨\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Cth>Model\u003C\u002Fth>\n    \u003Cth>Category\u003C\u002Fth>\n    \u003Cth>Modality\u003C\u002Fth>\n    \u003Cth>Algorithm\u003C\u002Fth>\n    \u003Cth>Status\u003C\u002Fth>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd rowspan=\"3\">Qwen-Image\u003C\u002Ftd>\n    \u003Ctd rowspan=\"3\">Diffusion generator\u003C\u002Ftd>\n    \u003Ctd rowspan=\"3\">Text → Image\u003C\u002Ftd>\n    \u003Ctd>FlowGRPO\u003C\u002Ftd>\n    \u003Ctd>✅\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>MixGRPO\u003C\u002Ftd>\n    \u003Ctd>✅\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>GRPO-Guard\u003C\u002Ftd>\n    \u003Ctd>✅\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>Wan2.2\u003C\u002Ftd>\n    \u003Ctd>Diffusion generator\u003C\u002Ftd>\n    \u003Ctd>Text → Video\u003C\u002Ftd>\n    \u003Ctd>DanceGRPO\u003C\u002Ftd>\n    \u003Ctd>WIP\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>BAGEL\u003C\u002Ftd>\n    \u003Ctd>Unified understand + gen\u003C\u002Ftd>\n    \u003Ctd>Text + Image\u003C\u002Ftd>\n    \u003Ctd>FlowGRPO\u003C\u002Ftd>\n    \u003Ctd>WIP\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd rowspan=\"2\">HunyuanImage-3.0\u003C\u002Ftd>\n    \u003Ctd rowspan=\"2\">Unified understand + gen\u003C\u002Ftd>\n    \u003Ctd rowspan=\"2\">Text + Image\u003C\u002Ftd>\n    \u003Ctd>MixGRPO\u003C\u002Ftd>\n    \u003Ctd>Planned\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>SRPO\u003C\u002Ftd>\n    \u003Ctd>Planned\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>Qwen3-Omni-Thinker\u003C\u002Ftd>\n    \u003Ctd>Omni-modality\u003C\u002Ftd>\n    \u003Ctd>Text \u002F Image \u002F Video \u002F Audio\u003C\u002Ftd>\n    \u003Ctd>GSPO\u003C\u002Ftd>\n    \u003Ctd>WIP\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  
\u003Ctr>\n    \u003Ctd>SD3.5\u003C\u002Ftd>\n    \u003Ctd>Diffusion generator\u003C\u002Ftd>\n    \u003Ctd>Text → Image\u003C\u002Ftd>\n    \u003Ctd>DPO\u003C\u002Ftd>\n    \u003Ctd>WIP\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\n## Ascend NPU Support 💠\n\n`VeRL-Omni` now supports Ascend NPU. For instructions on how to install and get started with FlowGRPO training on Ascend NPU, please refer to our [Ascend NPU Quickstart Guide](https:\u002F\u002Fverl-omni.readthedocs.io\u002Fen\u002Flatest\u002Fstart\u002Fflowgrpo_quickstart_npu.html).\n\n\n## Roadmap 🗺\n\nFuture work is tracked here:\n\n- [RFC: Multi-modal Generation RL 2026Q2 Roadmap](https:\u002F\u002Fgithub.com\u002Fverl-project\u002Fverl\u002Fissues\u002F5755)\n\n## Contributing 🤝\n\nContributions are welcome.\n\nSee the [contribution guide](CONTRIBUTING.md).\n\n## Acknowledgement 🌟\n\n`VeRL-Omni` builds on the engineering foundations developed in [`verl`](https:\u002F\u002Fgithub.com\u002Fverl-project\u002Fverl) and is closely aligned with multimodal inference systems such as [`vLLM-Omni`](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-omni).\n\n## Citation 📚\n\nIf you find the project helpful, please cite:\n\n```bibtex\n@misc{verlomni_github,\n  title        = {{VeRL-Omni: Easy, Fast, and Stable RL Training for Diffusion and Omni-Modality Models}},\n  author       = {Yongxiang Huang and Cheung Kawai and Jingan Zhou and Yingshu Chen and {openYuanrong Team} and Xibin Wu},\n  year         = {2026},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002Fverl-project\u002Fverl-omni}},\n  urldate      = {2026-04-28}\n}\n```\n","VeRL-Omni is a reinforcement learning (RL) training framework focused on multimodal generative models. Built on `verl`, it supports post-training of diffusion generative models, unified multimodal understanding and generation models, and omni-modality models. Its core features include high-throughput rollout via `vLLM-Omni`, flexible reward computation pipelines, modular training backends that plug into existing parallelism techniques such as FSDP and USP, and end-to-end examples and benchmarks. It suits researchers and developers who need to efficiently handle joint generation tasks across text, image, audio, and video, particularly when improving training efficiency is a priority.","2026-06-11 02:51:03","CREATED_QUERY"]