[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11458":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":9,"totalLinesOfCode":9,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":9,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},11458,"vllm-ascend","vllm-project\u002Fvllm-ascend","vllm-project","Community maintained hardware plugin for vLLM on Ascend",null,"https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend","C++",2223,1374,30,1423,0,21,60,163,63,31.41,false,"main",[25,26,27,28,29,30,31,32,33],"ascend","inference","llm","llm-serving","llmops","mlops","model-serving","transformer","vllm","2026-06-12 02:02:31","\u003Cp align=\"center\">\n  \u003Cpicture>\n    \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fvllm-project\u002Fvllm-ascend\u002Fmain\u002Fdocs\u002Fsource\u002Flogos\u002Fvllm-ascend-logo-text-dark.png\">\n    \u003Cimg alt=\"vllm-ascend\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fvllm-project\u002Fvllm-ascend\u002Fmain\u002Fdocs\u002Fsource\u002Flogos\u002Fvllm-ascend-logo-text-light.png\" width=55%>\n  \u003C\u002Fpicture>\n\u003C\u002Fp>\n\n\u003Ch3 align=\"center\">\nvLLM Ascend Plugin\n\u003C\u002Fh3>\n\n\u003Cdiv 
align=\"center\">\n\n[![DeepWiki](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDeepWiki-Ask_AI-_.svg?style=flat&color=0052D9&labelColor=000000&logo=data:image\u002Fpng;base64,iVBORw0KGgoAAAANSUhEUgAAACwAAAAyCAYAAAAnWDnqAAAAAXNSR0IArs4c6QAAA05JREFUaEPtmUtyEzEQhtWTQyQLHNak2AB7ZnyXZMEjXMGeK\u002FAIi+QuHrMnbChYY7MIh8g01fJoopFb0uhhEqqcbWTp06\u002Fuv1saEDv4O3n3dV60RfP947Mm9\u002FSQc0ICFQgzfc4CYZoTPAswgSJCCUJUnAAoRHOAUOcATwbmVLWdGoH\u002F\u002FPB8mnKqScAhsD0kYP3j\u002FYt5LPQe2KvcXmGvRHcDnpxfL2zOYJ1mFwrryWTz0advv1Ut4CJgf5uhDuDj5eUcAUoahrdY\u002F56ebRWeraTjMt\u002F00Sh3UDtjgHtQNHwcRGOC98BJEAEymycmYcWwOprTgcB6VZ5JK5TAJ+fXGLBm3FDAmn6oPPjR4rKCAoJCal2eAiQp2x0vxTPB3ALO2CRkwmDy5WohzBDwSEFKRwPbknEggCPB\u002FimwrycgxX2NzoMCHhPkDwqYMr9tRcP5qNrMZHkVnOjRMWwLCcr8ohBVb1OMjxLwGCvjTikrsBOiA6fNyCrm8V1rP93iVPpwaE+gO0SsWmPiXB+jikdf6SizrT5qKasx5j8ABbHpFTx+vFXp9EnYQmLx02h1QTTrl6eDqxLnGjporxl3NL3agEvXdT0WmEost648sQOYAeJS9Q7bfUVoMGnjo4AZdUMQku50McDcMWcBPvr0SzbTAFDfvJqwLzgxwATnCgnp4wDl6Aa+Ax283gghmj+vj7feE2KBBRMW3FzOpLOADl0Isb5587h\u002FU4gGvkt5v60Z1VLG8BhYjbzRwyQZemwAd6cCR5\u002FXFWLYZRIMpX39AR0tjaGGiGzLVyhse5C9RKC6ai42ppWPKiBagOvaYk8lO7DajerabOZP46Lby5wKjw1HCRx7p9sVMOWGzb\u002FvA1hwiWc6jm3MvQDTogQkiqIhJV0nBQBTU+3okKCFDy9WwferkHjtxib7t3xIUQtHxnIwtx4mpg26\u002FHfwVNVDb4oI9RHmx5WGelRVlrtiw43zboCLaxv46AZeB3IlTkwouebTr1y2NjSpHz68WNFjHvupy3q8TFn3Hos2IAk4Ju5dCo8B3wP7VPr\u002FFGaKiG+T+v+TQqIrOqMTL1VdWV1DdmcbO8KXBz6esmYWYKPwDL5b5FA1a0hwapHiom0r\u002FcKaoqr+27\u002FXcrS5UwSMbQAAAABJRU5ErkJggg==)](https:\u002F\u002Fdeepwiki.com\u002Fvllm-project\u002Fvllm-ascend)\n\n\u003C\u002Fdiv>\n\n\u003Cp align=\"center\">\n| \u003Ca href=\"https:\u002F\u002Fwww.hiascend.com\u002Fen\u002F\">\u003Cb>About Ascend\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002F\">\u003Cb>Documentation\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fslack.vllm.ai\">\u003Cb>#SIG-Ascend\u003C\u002Fb>\u003C\u002Fa> | \u003Ca 
href=\"https:\u002F\u002Fdiscuss.vllm.ai\u002Fc\u002Fhardware-support\u002Fvllm-ascend-support\">\u003Cb>Users Forum\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Ftinyurl.com\u002Fvllm-ascend-meeting\">\u003Cb>Weekly Meeting\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n\u003Ca >\u003Cb>English\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"README.zh.md\">\u003Cb>中文\u003C\u002Fb>\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n*Latest News* 🔥\n\n- [2026\u002F05] We released the new official version [v0.18.0](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Freleases\u002Ftag\u002Fv0.18.0)! Please follow the [official guide](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.18.0\u002F) to start using vLLM Ascend Plugin on Ascend.\n- [2026\u002F02] We released the new official version [v0.13.0](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Freleases\u002Ftag\u002Fv0.13.0)! Please follow the [official guide](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.13.0\u002F) to start using vLLM Ascend Plugin on Ascend.\n- [2025\u002F12] We released the new official version [v0.11.0](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Freleases\u002Ftag\u002Fv0.11.0)! Please follow the [official guide](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.11.0\u002F) to start using vLLM Ascend Plugin on Ascend.\n- [2025\u002F09] We released the new official version [v0.9.1](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Freleases\u002Ftag\u002Fv0.9.1)! 
Please follow the [official guide](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.9.1\u002Ftutorials\u002Flarge_scale_ep.html) to start deploying large-scale Expert Parallelism (EP) on Ascend.\n- [2025\u002F08] We hosted the [vLLM Beijing Meetup](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002F7n8OYNrCC_I9SJaybHA_-Q) with vLLM and Tencent! Please find the meetup slides [here](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1Pid6NSFLU43DZRi0EaTcPgXsAzDvbBqF).\n- [2025\u002F06] [User stories](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fcommunity\u002Fuser_stories\u002Findex.html) page is now live! It kicks off with LLaMA-Factory\u002Fverl\u002FTRL\u002FGPUStack to demonstrate how vLLM Ascend assists Ascend users in enhancing their experience across fine-tuning, evaluation, reinforcement learning (RL), and deployment scenarios.\n- [2025\u002F06] [Contributors](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fcommunity\u002Fcontributors.html) page is now live! All contributions deserve to be recorded, thanks for all contributors.\n- [2025\u002F05] We've released the first official version [v0.7.3](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Freleases\u002Ftag\u002Fv0.7.3)! We collaborated with the vLLM community to publish a blog post sharing our practice: [Introducing vLLM Hardware Plugin, Best Practice from Ascend NPU](https:\u002F\u002Fblog.vllm.ai\u002F2025\u002F05\u002F12\u002Fhardware-plugin.html).\n- [2025\u002F03] We hosted the [vLLM Beijing Meetup](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FVtxO9WXa5fC-mKqlxNUJUQ) with vLLM team! 
Please find the meetup slides [here](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1Pid6NSFLU43DZRi0EaTcPgXsAzDvbBqF).\n- [2025\u002F02] vLLM community officially created [vllm-project\u002Fvllm-ascend](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend) repo for running vLLM seamlessly on the Ascend NPU.\n- [2024\u002F12] We are working with the vLLM community to support [[RFC]: Hardware pluggable](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm\u002Fissues\u002F11162).\n\n---\n\n## Overview\n\nvLLM Ascend (`vllm-ascend`) is a community-maintained hardware plugin for running vLLM seamlessly on the Ascend NPU.\n\nIt is the recommended approach for supporting the Ascend backend within the vLLM community. It adheres to the principles outlined in the [[RFC]: Hardware pluggable](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm\u002Fissues\u002F11162), providing a hardware-pluggable interface that decouples the integration of the Ascend NPU from vLLM.\n\nWith the vLLM Ascend plugin, popular open-source models, including Transformer-like, Mixture-of-Experts (MoE), embedding, and multi-modal LLMs, can run seamlessly on the Ascend NPU.\n\n## Prerequisites\n\n- Hardware: Atlas 800I A2 Inference series, Atlas A2 Training series, Atlas 800I A3 Inference series, Atlas A3 Training series, Atlas 300I Duo (Experimental)\n- OS: Linux\n- Software:\n    - Python >= 3.10, \u003C 3.12\n    - CANN == 9.0.0 (for the matching Ascend HDK version, see [here](https:\u002F\u002Fwww.hiascend.com\u002Fdocument\u002Fdetail\u002Fzh\u002Fcanncommercial\u002F83RC2\u002Freleasenote\u002Freleasenote_0000.html))\n    - PyTorch == 2.10.0, torch-npu == 2.10.0\n    - vLLM (the same version as vllm-ascend)\n\n## Getting Started\n\nPlease use the following recommended versions to get started quickly:\n\n| Version    | Release type | Doc                                  |\n|------------|--------------|--------------------------------------|\n| v0.19.1rc1 | Latest release 
candidate | See [QuickStart](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fquick_start.html) and [Installation](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Finstallation.html) for more details |\n| v0.18.0 | Latest stable version | See [QuickStart](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.18.0\u002Fquick_start.html) and [Installation](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Fv0.18.0\u002Finstallation.html) for more details |\n\n## Contributing\n\nSee [CONTRIBUTING](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fdeveloper_guide\u002Fcontribution\u002Findex.html) for more details, which is a step-by-step guide to help you set up the development environment, build and test.\n\nWe welcome and value any contributions and collaborations:\n\n- Please let us know if you encounter a bug by [filing an issue](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm-ascend\u002Fissues)\n- Please use [User forum](https:\u002F\u002Fdiscuss.vllm.ai\u002Fc\u002Fhardware-support\u002Fvllm-ascend-support) for usage questions and help.\n\n## Branch\n\nvllm-ascend has a main branch and a dev branch.\n\n- **main**: main branch, corresponds to the vLLM main branch, and is continuously monitored for quality through Ascend CI.\n- **releases\u002FvX.Y.Z**: development branch, created alongside new releases of vLLM. For example, `releases\u002Fv0.13.0` is the dev branch for vLLM `v0.13.0` version.\n\nBelow are the maintained branches:\n\n| Branch           | Status       | Note                                 |\n|------------------|--------------|--------------------------------------|\n| main             | Maintained   | CI commitment for vLLM main branch and vLLM v0.18.0 tag |\n| v0.7.1-dev       | Unmaintained | Outdated, no longer maintained. 
|\n| v0.7.3-dev       | Unmaintained | Only bug fixes are allowed, and no new release tags anymore. |\n| v0.9.1-dev       | Unmaintained | Only bug fixes are allowed, and no new release tags anymore. |\n| v0.11.0-dev      | Unmaintained | Only bug fixes are allowed, and no new release tags anymore. |\n| releases\u002Fv0.13.0 | Maintained   | CI commitment for vLLM 0.13.0 version |\n| releases\u002Fv0.18.0 | Maintained   | CI commitment for vLLM 0.18.0 version |\n| rfc\u002Ffeature-name | Maintained   | [Feature branches](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fcommunity\u002Fversioning_policy.html#feature-branches) for collaboration |\n  \nPlease refer to [Versioning policy](https:\u002F\u002Fdocs.vllm.ai\u002Fprojects\u002Fascend\u002Fen\u002Flatest\u002Fcommunity\u002Fversioning_policy.html) for more details.\n\n## Weekly Meeting\n\n- vLLM Ascend Weekly Meeting: \u003Chttps:\u002F\u002Ftinyurl.com\u002Fvllm-ascend-meeting>\n- Wednesday, 15:00 - 16:00 (UTC+8, [Convert to your timezone](https:\u002F\u002Fdateful.com\u002Fconvert\u002Fgmt8?t=15))\n\n## License\n\nApache License 2.0, as found in the [LICENSE](.\u002FLICENSE) file.\n","vLLM Ascend Plugin is a community-maintained hardware plugin that supports running vLLM on the Ascend platform. Implemented in Python, it focuses on efficient inference and serving of large language models (LLMs), offers low latency and high throughput, and optimizes the runtime efficiency of Transformer-based models. The plugin is well suited to applications that need large-scale LLM inference and deployment on Ascend accelerators, such as natural language processing services and online chatbots, and can significantly improve compute utilization and model response time.",2,"2026-06-11 03:31:51","trending"]