[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-73218":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},73218,"aibrix","vllm-project\u002Faibrix","vllm-project","Cost-efficient and pluggable Infrastructure components for GenAI inference","",null,"Go",4858,597,48,298,0,11,20,62,33,30.33,"Apache License 2.0",false,"main",[],"2026-06-12 02:03:10","# AIBrix\n\nWelcome to AIBrix, an open-source initiative designed to provide essential building blocks to construct scalable GenAI inference infrastructure. 
AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs.\n\n\n\u003Cp align=\"center\">\n| \u003Ca href=\"https:\u002F\u002Faibrix.readthedocs.io\u002Flatest\u002F\">\u003Cb>Documentation\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Faibrix.github.io\u002F\">\u003Cb>Blog\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.03648\">\u003Cb>White Paper\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fx.com\u002Fvllm_project\">\u003Cb>Twitter\u002FX\u003C\u002Fb>\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fvllm-dev.slack.com\u002Farchives\u002FC08EQ883CSV\">\u003Cb>Developer Slack\u003C\u002Fb>\u003C\u002Fa> |\n\u003C\u002Fp>\n\n## Latest News\n\n### Releases\n- **[2026-03-05]** AIBrix v0.6.0 is released. Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.6.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2026-03-03-v0.6.0-release\u002F) for more details.\n- **[2025-11-10]** AIBrix v0.5.0 is released. Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.5.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2025-11-10-v0.5.0-release\u002F) for more details.\n- **[2025-08-05]** AIBrix v0.4.0 is released. Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.4.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2025-08-04-v0.4.0-release\u002F) for more details.\n- **[2025-05-21]** AIBrix v0.3.0 is released. 
Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.3.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2025-05-21-v0.3.0-release\u002F) for more details.\n- **[2025-03-09]** AIBrix v0.2.1 is released. DeepSeek-R1 full weights deployment is supported and gateway stability has been improved! Check [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2025-03-10-deepseek-r1\u002F) for more details.\n- **[2025-02-19]** AIBrix v0.2.0 is released. Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.2.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2025-02-05-v0.2.0-release\u002F) for more details.\n- **[2024-11-13]** AIBrix v0.1.0 is released. Check out the [release notes](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Ftag\u002Fv0.1.0) and [Blog Post](https:\u002F\u002Faibrix.github.io\u002Fposts\u002F2024-11-12-v0.1.0-release\u002F) for more details.\n\n### Talks and Presentations\n\n- **[2025-11-12]** AIBrix team co-delivered a keynote at KubeCon North America 2025 [AIBrix: Kubernetes-native GenAI Inference Infrastructure](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=7KHenRXNGAw&t=875s), providing AIBrix overview.\n- **[2025-06-10]** AIBrix team delivered a talk at KubeCon China 2025 titled [AIBrix: Cost-Effective and Scalable Kubernetes Control Plane for vLLM](https:\u002F\u002Fkccncchn2025.sched.com\u002Fevent\u002F1x5im\u002Fintroducing-aibrix-cost-effective-and-scalable-kubernetes-control-plane-for-vllm-jiaxin-shan-liguang-xie-bytedance), discussing how the framework optimizes vLLM deployment via Kubernetes for cost efficiency and scalability.\n- **[2025-04-04]** AIBrix team co-delivered a keynote at KubeCon EU 2025 with Google on [LLM-Aware Load Balancing in Kubernetes: A New Era of 
Efficiency](https:\u002F\u002Fkccnceu2025.sched.com\u002Fevent\u002F1txC7\u002Fkeynote-llm-aware-load-balancing-in-kubernetes-a-new-era-of-efficiency-clayton-coleman-distinguished-engineer-google-jiaxin-shan-software-engineer-bytedance), focusing on LLM specific routing solutions.\n- **[2025-03-30]** AIBrix was featured at the [ASPLOS'25](http:\u002F\u002Fasplos-conference.org\u002Fasplos2025\u002F) workshop with the presentation [AIBrix: An Open-Source, Large-Scale LLM Inference Infrastructure for System Research](https:\u002F\u002Fdocs.google.com\u002Fpresentation\u002Fd\u002F1YDVsPFTIgGXnROGaJ1VKuDDAB4T5fzpE\u002Fedit), showcasing its architecture for efficient LLM inference in system research scenarios.\n\n## Key Features\n\nThe initial release includes the following key features:\n\n- **High-Density LoRA Management**: Streamlined support for lightweight, low-rank adaptations of models.\n- **LLM Gateway and Routing**: Efficiently manage and direct traffic across multiple models and replicas.\n- **LLM App-Tailored Autoscaler**: Dynamically scale inference resources based on real-time demand.\n- **Unified AI Runtime**: A versatile sidecar enabling metric standardization, model downloading, and management.\n- **Distributed Inference**: Scalable architecture to handle large workloads across multiple nodes.\n- **Distributed KV Cache**: Enables high-capacity, cross-engine KV reuse.\n- **Cost-efficient Heterogeneous Serving**: Enables mixed GPU inference to reduce costs with SLO guarantees.\n- **GPU Hardware Failure Detection**: Proactive detection of GPU hardware issues.\n\n## Architecture\n\n![aibrix-architecture-v1](docs\u002Fsource\u002Fassets\u002Fimages\u002Faibrix-architecture-v1.jpeg)\n\n\n## Quick Start\n\nTo get started with AIBrix, clone this repository and follow the setup instructions in the documentation. 
Our comprehensive guide will help you configure and deploy your first LLM infrastructure seamlessly.\n\n```shell\n# Local Testing\ngit clone https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix.git\ncd aibrix\n\n# Install nightly aibrix dependencies\nkubectl apply -k config\u002Fdependency --server-side\n\n# Install nightly aibrix components\nkubectl apply -k config\u002Fdefault\n```\n\nInstall stable distribution\n```shell\n# Install component dependencies\nkubectl apply -f \"https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Fdownload\u002Fv0.5.0\u002Faibrix-dependency-v0.5.0.yaml\" --server-side\n\n# Install aibrix components\nkubectl apply -f \"https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Freleases\u002Fdownload\u002Fv0.5.0\u002Faibrix-core-v0.5.0.yaml\"\n```\n\n## Documentation\n\nFor detailed documentation on installation, configuration, and usage, please visit our [documentation page](https:\u002F\u002Faibrix.readthedocs.io\u002Flatest\u002F).\n\n## Contributing\n\nWe welcome contributions from the community! Check out our [contributing guidelines](.\u002FCONTRIBUTING.md) to see how you can make a difference.\n\nSlack Channel: [#aibrix](https:\u002F\u002Fvllm-dev.slack.com\u002Farchives\u002FC08EQ883CSV)\n\n## License\n\nAIBrix is licensed under the [Apache 2.0 License](LICENSE).\n\n## Support\n\nIf you have any questions or encounter any issues, please submit an issue on our [GitHub issues page](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Faibrix\u002Fissues).\n\nThank you for choosing AIBrix for your GenAI infrastructure needs!\n","AIBrix is an open-source project that provides cost-efficient, pluggable infrastructure components for generative AI inference. Written in Go, it offers a cloud-native solution that optimizes the deployment, management, and scaling of large language models (LLMs) for enterprise needs. Its core capabilities include cost efficiency and scalability through a Kubernetes control plane, support for deploying the weights of a variety of large models, and ongoing improvements to gateway stability. It suits enterprise scenarios that need to build or scale generative AI services, such as intelligent customer service and content generation.",2,"2026-06-11 03:44:33","high_star"]