[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72514":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72514,"DiffRhythm","ASLP-lab\u002FDiffRhythm","ASLP-lab","Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion","",null,"Python",2308,269,43,50,0,7,10,15,21,79.79,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:06","\u003Cp align=\"center\">\n    \u003Cimg src=\"src\u002FDiffRhythm_logo.jpg\" width=\"400\"\u002F>\n\u003Cp>\n\n\u003Cp align=\"center\">\n   \u003Ch1>Di♪♪Rhythm: Blazingly Fast and Embarrassingly Simple\u003C\u002Fbr>End-to-End Full-Length Song Generation with Latent Diffusion\u003C\u002Fh1>\n\u003C\u002Fp>\n\nZiqian Ning, Huakang Chen, Yuepeng Jiang, Chunbo Hao, Guobin Ma, Shuai Wang, Jixun Yao, Lei Xie†\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FASLP-lab\u002FDiffRhythm\">Huggingface Space Demo\u003C\u002Fa>\n  \u003Cbr>\n  \u003Cb>DiffRhythm 2\u003C\u002Fb> &nbsp;&nbsp;|&nbsp;&nbsp;\n  📑 \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fpdf\u002F2510.22950\">Paper\u003C\u002Fa> &nbsp;&nbsp;|&nbsp;&nbsp;\n  🎵 \u003Ca href=\"https:\u002F\u002Faslp-lab.github.io\u002FDiffRhythm2.github.io\u002F\">Demo\u003C\u002Fa>\n  \u003Cbr>\n  \u003Cb>DiffRhythm+\u003C\u002Fb> &nbsp;&nbsp;|&nbsp;&nbsp;\n  📑 \u003Ca 
href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.12890\">Paper\u003C\u002Fa> &nbsp;&nbsp;|&nbsp;&nbsp;\n  🎵 \u003Ca href=\"https:\u002F\u002Flongwaytog0.github.io\u002FDiffRhythmPlus\u002F\">Demo\u003C\u002Fa>\n  \u003Cbr>\n  \u003Cb>DiffRhythm\u003C\u002Fb> &nbsp;&nbsp;|&nbsp;&nbsp;\n  📑 \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.01183\">Paper\u003C\u002Fa> &nbsp;&nbsp;|&nbsp;&nbsp;\n  🎵 \u003Ca href=\"https:\u002F\u002Faslp-lab.github.io\u002FDiffRhythm.github.io\u002F\">Demo\u003C\u002Fa>\n  &nbsp;&nbsp;|&nbsp;&nbsp;\n  💬 \u003Ca href=\"src\u002Fcontact.md\">WeChat (微信)\u003C\u002Fa>\n\u003C\u002Fp>\n\nDiffRhythm (Chinese: 谛韵, Dì Yùn) is the ***first*** open-sourced diffusion-based music generation model that is capable of creating full-length songs. The name combines \"Diff\" (referencing its diffusion architecture) with \"Rhythm\" (highlighting its focus on music and song creation). The Chinese name 谛韵 (Dì Yùn) phonetically mirrors \"DiffRhythm\", where \"谛\" (attentive listening) symbolizes auditory perception, and \"韵\" (melodic charm) represents musicality.\n\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"src\u002Fdiffrhythm.jpg\" width=\"90%\"\u002F>\n\u003Cp>\n\n## News and Updates\n\n* 📌 Join Us on Discord! [![Discord](https:\u002F\u002Fdcbadge.limes.pink\u002Fapi\u002Fserver\u002Fhttps:\u002F\u002Fdiscord.gg\u002FvUD4zgTpJa)](https:\u002F\u002Fdiscord.gg\u002FvUD4zgTpJa)\n\n* **2025.5.9 🔥** **DiffRhythm-v1.2 Official Launch!**\n\n   Version 1.2 largely resolves repetition and omission issues, significantly improves audio quality and arrangement with richer instrumentation, and enables song editing and continuation with advanced understanding of music structure and style.\n\n* **2025.3.15 🔥** **DiffRhythm-full Official Release: Complete Music Generation!**  \n\n    The wait is over - **285s full-length music generation** is now live!  \n\n    *The symphony evolves. 
What impossible music will you compose next?*\n\n* **2025.3.11 💻** DiffRhythm can now run on MacOS! \n\n* **2025.3.9 🔥** **DiffRhythm Update: Text-to-Music and Pure Music Generation!**  \n\n    We're excited to announce two groundbreaking features now live in our open-source music model:  \n\n    🎯 **Text-Based Style Prompts**  \n    Describe styles\u002Fscenes in words (e.g., `Jazzy Nightclub Vibe`, `Pop Emotional Piano` or `Indie folk ballad, coming-of-age themes, acoustic guitar picking with harmonica interludes`) — *no audio reference needed!*  \n\n    🎧 **Instrumental Mode**  \n    Generate pure music with wild prompts like:  \n    ```bash  \n    \"Arctic research station, theremin auroras dancing with geomagnetic storms\"  \n    ```\n\n    ✨ Special Thanks to community contributor @Jourdelune for implementing these features via #PR29!\n\n    **Full Release Notes**: See [src\u002Fupdate_alert.md](src\u002Fupdate_alert.md) for  details, demos, and roadmap.\n\n    Break the rules. Make music that shouldn't exist.\n\n* **2025.3.7 🔥** **DiffRhythm** is now officially licensed under the **Apache 2.0 License**! 🎉 As the first diffusion-based music generation model, DiffRhythm opens up exciting new possibilities for AI-driven creativity in music. Whether you're a researcher, developer, or music enthusiast, we invite you to explore, innovate, and build upon this foundation. 
\n\n* **2025.3.6 🔥** The local deployment guide is now available.\n\n* **2025.3.4 🔥** We released the [DiffRhythm paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.01183) and [Huggingface Space demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FASLP-lab\u002FDiffRhythm).\n\n## TODOs\n- [ ] Support Colab.\n- [ ] Gradio support.\n- [x] Dynamic length control\n- [x] Vocals only\n- [x] Song extension\n- [x] Support Docker.\n- [x] Release DiffRhythm-full.\n- [x] Release training code.\n- [x] Support local deployment.\n- [x] Release paper to arXiv.\n- [x] Online serving on Hugging Face Space.\n\n## Model Versions\n\n|  Model   | HuggingFace |\n|  ----  | ----  |\n| DiffRhythm-v1.2-base (1m35s)  | https:\u002F\u002Fhuggingface.co\u002FASLP-lab\u002FDiffRhythm-1_2 |\n| DiffRhythm-v1.2-full (4m45s)  | https:\u002F\u002Fhuggingface.co\u002FASLP-lab\u002FDiffRhythm-1_2-full |\n| DiffRhythm-base (1m35s)  | https:\u002F\u002Fhuggingface.co\u002FASLP-lab\u002FDiffRhythm-base |\n| DiffRhythm-full (4m45s)  | https:\u002F\u002Fhuggingface.co\u002FASLP-lab\u002FDiffRhythm-full |\n| DiffRhythm-vae  | https:\u002F\u002Fhuggingface.co\u002FASLP-lab\u002FDiffRhythm-vae |\n\n## Docker installation\nYou only need the three files in the `docker` folder. Proceed as follows:\u003Cbr\u002F>\n- Clone the project or copy the files\u003Cbr\u002F>\n- `cd` into the folder\u003Cbr\u002F>\n- Edit the folder bindings in your Docker Compose file\u003Cbr\u002F>\n- Run `docker compose up -d` (or `docker-compose up -d`, depending on your version)\u003Cbr\u002F>\n- Run `docker exec -it DiffRhythm bash`\u003Cbr\u002F>\n\nYou will then be in a terminal inside the container, ready for use. Go to `\u002Fhome\u002Fapp\u002Fscripts` and run `infer_prompt_ref.sh`.\n\n## Inference\n\nFollow the steps below to clone the repository and install the environment.\n\n```bash \n# clone and enter the repository\ngit clone https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm.git\ncd DiffRhythm\n\n# install the environment\n\n## espeak-ng\n# For Debian-like distribution (e.g. 
Ubuntu, Mint, etc.)\nsudo apt-get install espeak-ng\n# For RedHat-like distribution (e.g. CentOS, Fedora, etc.) \nsudo yum install espeak-ng\n# For macOS\nbrew install espeak-ng\n# For Windows\n# Please visit https:\u002F\u002Fgithub.com\u002Fespeak-ng\u002Fespeak-ng\u002Freleases to download .msi installer\n\n## create python environment\nconda create -n diffrhythm python=3.10\nconda activate diffrhythm\n\n## OR you can use a classic Python virtual environment instead of conda\npython -m venv venv\n# activate venv on Linux\nsource venv\u002Fbin\u002Factivate\n# activate venv on Windows\nvenv\\Scripts\\activate\n\n## install requirements\npip install -r requirements.txt\n```\n\nOn Linux you can now simply use the inference script:\n```bash\n# For inference using a reference WAV file\nbash scripts\u002Finfer_wav_ref.sh\n# For inference using a text prompt reference\nbash scripts\u002Finfer_prompt_ref.sh\n```\n\nOn Windows, set the following user environment variables before running inference:\\\n`PHONEMIZER_ESPEAK_LIBRARY` -> `C:\\Program Files\\eSpeak NG\\libespeak-ng.dll`\\\n`PHONEMIZER_ESPEAK_PATH` -> `C:\\Program Files\\eSpeak NG`\\\nChange `C:\\Program Files\\eSpeak NG` to your eSpeak installation directory and reboot your PC to apply changes.\n\n*Installing Japanese voices, mbrola binaries and unpacking an mbrola_ph folder (as described [here](https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm\u002Fissues\u002F15) and [here](https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm\u002Fissues\u002F22)) are **no longer required** when running on Windows. 
See https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm\u002Fissues\u002F17#issuecomment-2705058729, [this](https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm\u002Fcommit\u002F2ea9424274df10670ddc613b5d61cc16d13e2b88) and [this commit](https:\u002F\u002Fgithub.com\u002FASLP-lab\u002FDiffRhythm\u002Fcommit\u002F1ad7229e1a774c9a2a0c4888103dd4ea7176aebb).*\n\nAfter this, you will also be able to run inference scripts on Windows (please note that English lyrics will be used here):\n```batch\nrem : For inference using a reference WAV file\ncall scripts\\infer_wav_ref.bat\nrem : For inference using a text prompt reference\ncall scripts\\infer_prompt_ref.bat\n```\n\nExample files of lrc and reference audio can be found in `infer\u002Fexample`.\n\nYou can use [the tools](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FASLP-lab\u002FDiffRhythm) we provide on huggingface to generate the lrc.\n\n**Note that DiffRhythm-base requires a minimum of 8G of VRAM. To meet the 8G VRAM requirement, use the `--chunked` argument when running the inference. Higher VRAM may be required if chunked decoding is disabled.**\n\n## Training\n\nComing soon...\n\n## License & Disclaimer\n\nDiffRhythm (code and DiT weights) is released under the [Apache License 2.0](https:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0). This open-source license allows you to freely use, modify, and distribute the model, as long as you include the appropriate copyright notice and disclaimer.\n\nWe do not make any profit from this model. Our goal is to provide a high-quality base model for music generation, fostering innovation in AI music and contributing to the advancement of human creativity. We hope that DiffRhythm will serve as a foundation for further research and development in the field of AI-generated music.\n\nDiffRhythm enables the creation of original music across diverse genres, supporting applications in artistic creation, education, and entertainment. 
While designed for positive use cases, potential risks include unintentional copyright infringement through stylistic similarities, inappropriate blending of cultural musical elements, and misuse for generating harmful content. To ensure responsible deployment, users must implement verification mechanisms to confirm musical originality, disclose AI involvement in generated works, and obtain permissions when adapting protected styles.\n\n## Citation\n```\n@article{ning2025diffrhythm,\n  title={{DiffRhythm}: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion},\n  author={Ziqian, Ning and Huakang, Chen and Yuepeng, Jiang and Chunbo, Hao and Guobin, Ma and Shuai, Wang and Jixun, Yao and Lei, Xie},\n  journal={arXiv preprint arXiv:2503.01183},\n  year={2025}\n}\n```\n## Contact Us\n\nIf you would like to leave a message for our research team, feel free to email `nzqiann@gmail.com`.\n\u003Cp align=\"center\">\n    \u003Ca href=\"http:\u002F\u002Fwww.nwpu-aslp.org\u002F\">\n        \u003Cimg src=\"src\u002FASLP.jpg\" width=\"400\"\u002F>\n    \u003C\u002Fa>\n\u003C\u002Fp>\n","DiffRhythm is a full-length music generation project based on a latent diffusion model. It generates complete songs quickly and simply. Its core features include text-to-music conversion and pure (instrumental) music generation: users can describe a style or scene to produce music with a specific mood and atmosphere, with no audio reference required. DiffRhythm also runs on macOS, and the latest version reduces repetition and omission issues, improves audio quality and arrangement richness, and adds a deeper understanding of musical structure and style, which makes song editing and continuation possible. The project suits music creation, background music production, and any scenario that needs high-quality original music.",2,"2026-06-11 03:42:24","high_star"]