[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9617":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":38,"readmeContent":39,"aiSummary":40,"trendingCount":16,"starSnapshotCount":16,"syncStatus":41,"lastSyncTime":42,"discoverSource":43},9617,"MNN","alibaba\u002FMNN","alibaba","MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.","http:\u002F\u002Fwww.mnn.zone\u002F",null,"C++",15470,2350,235,27,0,4,66,332,37,45,"Apache License 2.0",false,"master",true,[27,28,29,30,31,32,33,34,35,36,37],"arm","convolution","deep-learning","embedded-devices","llm","machine-learning","ml","mnn","transformer","vulkan","winograd-algorithm","2026-06-12 02:02:10","![MNN](doc\u002Fbanner.png)\n---\n[![License](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Falibaba\u002FMNN)](LICENSE.txt)\n[![Documentation](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDocumentation-Read-green)](https:\u002F\u002Fmnn-docs.readthedocs.io\u002Fen\u002Flatest\u002F)\n[![簡體中文版本](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLanguage-%E7%AE%80%E4%BD%93%E4%B8%AD%E6%96%87-green)](README_CN.md)\n[![繁體中文版本](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLanguage-%E7%B9%81%E9%AB%94%E4%B8%AD%E6%96%87-green)](README_TW.md)\n[![日本語バージョン](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLanguage-%E6%97%A5%E6%9C%AC%E8%AA%9E-green)](README_JP.md)\n[![MNN 
Homepage](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHomepage-Visit-green)](http:\u002F\u002Fwww.mnn.zone)\n[![zread](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FZread-ask-green)](https:\u002F\u002Fzread.ai\u002Falibaba\u002FMNN)\n\n[![MNN Chat App](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FApps-MNN_Chat-blue)](.\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002FREADME.md)\n[![TaoAvatar](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FApps-MNN_TaoAvatar-blue)](.\u002Fapps\u002FAndroid\u002FMnn3dAvatar\u002FREADME.md)\n[![Sana](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FApps-Sana_Image_Edit-blue)](.\u002Fapps\u002Fsana\u002FREADME.md)\n\n## News 🔥\n- [2026\u002F03\u002F05] Support Qwen3.5 Series.\n\u003Cp align=\"center\">\n  \u003Cimg width=\"15%\" alt=\"Icon\"  src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Fassets\u002Fqwen35_1.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"15%\" alt=\"Icon\" src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Fassets\u002Fqwen35_2.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"15%\" alt=\"Icon\" src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Fassets\u002Fqwen35_3.jpg\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2026\u002F02\u002F13] MNN-Sana-Edit-V2 is now available at [apps](.\u002Fapps\u002Fsana\u002FREADME.md), offering cartoon-style photo editing based on Sana.\n\u003Cp align=\"center\">\n  \u003Cimg width=\"80%\" alt=\"Icon\"  src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Fassets\u002Fsana_show_case.jpg\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n\u003Cdetails>\n\u003Csummary> History News \u003C\u002Fsummary>\n\n- [2025\u002F10\u002F16] Support Qwen3-VL Series.\n- [2025\u002F06\u002F11] New App MNN TaoAvatar released, you can talk with 3DAvatar offline with LLM, ASR, TTS, A2BS and NNR models all run local on your device!! 
[MNN TaoAvatar](.\u002Fapps\u002FAndroid\u002FMnn3dAvatar\u002FREADME.md)\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Favatar\u002Favatar_demo.gif\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2025\u002F05\u002F12] android app support qwen2.5 omni 3b and 7b [MNN Chat App](.\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002FREADME.md#releases).\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_home_new.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"20%\" alt=\"Icon\" src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_sound_new.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"20%\" alt=\"Icon\" src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_image_new.jpg\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2025\u002F04\u002F30] android app support qwen3 and dark mode [MNN Chat App](.\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002FREADME.md#releases).\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\"https:\u002F\u002Fmeta.alicdn.com\u002Fdata\u002Fmnn\u002Fqwen_3.gif\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2025\u002F02\u002F18] iOS multimodal LLM App is released [MNN LLM iOS](.\u002Fapps\u002FiOS\u002FMNNLLMChat\u002FREADME.md).\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\".\u002Fapps\u002FiOS\u002FMNNLLMChat\u002Fassets\u002Fintroduction.gif\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2025\u002F02\u002F11] android app support for [deepseek r1 1.5b](.\u002Fproject\u002Fandroid\u002Fapps\u002FMnnLlmApp\u002FREADME.md#version-021).\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fdeepseek_support.gif\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\n- [2025\u002F01\u002F23] We released our full 
multimodal LLM Android App:[MNN-LLM-Android](.\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002FREADME.md). including text-to-text, image-to-text, audio-to-text, and text-to-image generation.\n\u003Cp align=\"center\">\n  \u003Cimg width=\"20%\" alt=\"Icon\"  src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_home_new.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"20%\" alt=\"Icon\" src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_diffusion_new.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"20%\" alt=\"Icon\" src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_sound_new.jpg\" style=\"margin: 0 10px;\">\n  \u003Cimg width=\"20%\" alt=\"Icon\" src=\".\u002Fapps\u002FAndroid\u002FMnnLlmChat\u002Fassets\u002Fimage_image_new.jpg\" style=\"margin: 0 10px;\">\n\u003C\u002Fp>\n\u003C\u002Fdetails>\n\n## Intro\nMNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. At present, MNN has been integrated into more than 30 apps of Alibaba Inc, such as Taobao, Tmall, Youku, DingTalk, Xianyu, etc., covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity distribution, security risk control. In addition, MNN is also used on embedded devices, such as IoT.\n\n[MNN-LLM](.\u002Ftransformers\u002FREADME.md) is a large language model runtime solution developed based on the MNN engine. The mission of this project is to deploy LLM models locally on everyone's platforms(Mobile Phone\u002FPC\u002FIOT). It supports popular large language models such as Qianwen, Baichuan, Zhipu, LLAMA, and others. 
[MNN-LLM User guide](https:\u002F\u002Fmnn-docs.readthedocs.io\u002Fen\u002Flatest\u002Ftransformers\u002Fllm.html)\n\n[MNN-Diffusion](https:\u002F\u002Fgithub.com\u002Falibaba\u002FMNN\u002Ftree\u002Fmaster\u002Ftransformers\u002Fdiffusion) is a stable diffusion model runtime solution developed based on the MNN engine. The mission of this project is to deploy stable diffusion models locally on everyone's platforms. [MNN-Diffusion User guide](https:\u002F\u002Fmnn-docs.readthedocs.io\u002Fen\u002Flatest\u002Ftransformers\u002Fdiffusion.html)\n\n![architecture](doc\u002Farchitecture.png)\n\nInside Alibaba, [MNN](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002F5I1ISpx8lQqvCS8tGd6EJw) works as the basic module of the compute container in the [Walle](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FqpeCETty0BqqNJV9CMJafA) System, the first end-to-end, general-purpose, and large-scale production system for device-cloud collaborative machine learning, which has been published in the top system conference OSDI’22. The key design principles of MNN and the extensive benchmark testing results (vs. TensorFlow, TensorFlow Lite, PyTorch, PyTorch Mobile, TVM) can be found in the OSDI paper. The scripts and instructions for benchmark testing are put in the path “\u002Fbenchmark”. 
If MNN or the design of Walle helps your research or production use, please cite our OSDI paper as follows:\n\n    @inproceedings {proc:osdi22:walle,\n        author = {Chengfei Lv and Chaoyue Niu and Renjie Gu and Xiaotang Jiang and Zhaode Wang and Bin Liu and Ziqi Wu and Qiulin Yao and Congyu Huang and Panos Huang and Tao Huang and Hui Shu and Jinde Song and Bin Zou and Peng Lan and Guohuan Xu and Fei Wu and Shaojie Tang and Fan Wu and Guihai Chen},\n        title = {Walle: An {End-to-End}, {General-Purpose}, and {Large-Scale} Production System for {Device-Cloud} Collaborative Machine Learning},\n        booktitle = {16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22)},\n        year = {2022},\n        isbn = {978-1-939133-28-1},\n        address = {Carlsbad, CA},\n        pages = {249--265},\n        url = {https:\u002F\u002Fwww.usenix.org\u002Fconference\u002Fosdi22\u002Fpresentation\u002Flv},\n        publisher = {USENIX Association},\n        month = jul,\n    }\n\n\n## Documentation and Workbench\nMNN's documentation is hosted on [Read the Docs](https:\u002F\u002Fmnn-docs.readthedocs.io\u002Fen\u002Flatest).\n\nYou can also follow docs\u002FREADME to build the HTML documentation yourself.\n\nMNN Workbench, which provides pretrained models, visual training tools, and one-click deployment of models to devices, can be downloaded from [MNN's homepage](http:\u002F\u002Fwww.mnn.zone).\n\n## Key Features\n### Lightweight\n- Optimized for devices with no dependencies; easily deployed to mobile devices and a variety of embedded devices.\n- iOS platform: the static library with full options for armv7+arm64 is about 12MB, and linked executables grow by about 2M.\n- Android platform: the core so library is about 800KB (armv7a - c++_shared).\n- Building with MNN_BUILD_MINI reduces package size by about 25%, at the cost of requiring a fixed model input size.\n- Supports FP16 \u002F Int8 quantization, which can reduce model size by 50%-70%.\n\n### Versatility\n- Supports `Tensorflow`, 
`Caffe`, `ONNX`, and `Torchscripts` models, and common neural networks such as `CNN`, `RNN`, `GAN`, and `Transformer`.\n- Supports AI models with multiple inputs or outputs, any dimension format, dynamic inputs, and control flow.\n- MNN supports nearly all OPs used in AI models. The converter supports 178 `Tensorflow` OPs, 52 `Caffe` OPs, 163 `Torchscripts` OPs, and 158 `ONNX` OPs.\n- Supports iOS 8.0+, Android 4.3+, and embedded devices with a POSIX interface.\n- Supports hybrid computing on multiple devices. Currently supports CPU and GPU.\n\n\n### High performance\n- Implements core computation with heavily optimized assembly code to make full use of ARM \u002F x64 CPUs.\n- Uses Metal \u002F OpenCL \u002F Vulkan to support GPU inference on mobile.\n- Uses CUDA and Tensor Cores to support NVIDIA GPUs for better performance.\n- Convolution and transposed convolution algorithms are efficient and stable. The Winograd convolution algorithm is widely used to accelerate symmetric convolutions such as 3x3, 4x4, 5x5, 6x6, and 7x7.\n- Twice as fast on the newer ARM v8.2 architecture, which supports FP16 half-precision calculation. 
2.5x faster when using sdot on ARM v8.2 and VNNI.\n\n### Ease of use\n- Supports numerical computation with MNN's OPs, similar to numpy.\n- Provides a lightweight, OpenCV-like image processing module that is only 100k.\n- Supports building and training models on PC \u002F mobile.\n- The MNN Python API lets ML engineers use MNN for inference, training, and image processing without writing any C++ code.\n\nThe architectures and precisions that MNN supports are shown below:\n\n- S: Supported and works well, deeply optimized, recommended\n- A: Supported and works well, usable\n- B: Supported but buggy or not optimized, not recommended\n- C: Not supported\n\n| Architecture \u002F Precision |  | Normal | FP16 | BF16 | Int8 |\n| --- | --- | --- | --- | --- | --- |\n| CPU | Native | B | C | B | B |\n|  | x86\u002Fx64-SSE4.1 | A | C | C | A |\n|  | x86\u002Fx64-AVX2 | S | C | C | A |\n|  | x86\u002Fx64-AVX512 | S | C | C | S |\n|  | ARMv7a | S | S (ARMv8.2) | S | S |\n|  | ARMv8 | S | S (ARMv8.2) | S (ARMv8.6) | S |\n| GPU | OpenCL | A | S | C | S |\n|  | Vulkan | A | A | C | A |\n|  | Metal | A | S | C | S |\n|  | CUDA | A | S | C | A |\n| NPU | CoreML | A | C | C | C |\n|  | HIAI | A | C | C | C |\n|  | NNAPI | B | B | C | B |\n|  | QNN | C | B | C | C |\n\n\n## Tools\n\nBased on MNN (a tensor compute engine), we provide a series of tools for inference, training, and general computation.\n\n- MNN-Converter: Converts models from other frameworks, such as Tensorflow(lite), Caffe, ONNX, and Torchscripts, into MNN models for inference, and performs graph optimization to reduce computation.\n- MNN-Compress: Compresses models to reduce size and improve performance \u002F speed.\n- MNN-Express: Supports models with control flow and uses MNN's OPs for general-purpose computing.\n- MNN-CV: An OpenCV-like library built on MNN, and therefore much more lightweight.\n- MNN-Train: Supports training MNN models.\n\n## How to Discuss and Get Help From the MNN Community\n\nThe group discussions are predominantly in Chinese. 
But we welcome and will help English speakers.\n\nDingtalk discussion groups:\n\nGroup #4 (Available): 160170007549\n\nGroup #3 (Full)\n\nGroup #2 (Full): 23350225\n\nGroup #1 (Full): 23329087\n\n## Historical Paper\n\nThe preliminary version of MNN, as mobile inference engine and with the focus on manual optimization, has also been published in MLSys 2020. Please cite the paper, if MNN previously helped your research:\n\n\n    @inproceedings{alibaba2020mnn,\n      author = {Jiang, Xiaotang and Wang, Huan and Chen, Yiliu and Wu, Ziqi and Wang, Lichuan and Zou, Bin and Yang, Yafeng and Cui, Zongyang and Cai, Yu and Yu, Tianhang and Lv, Chengfei and Wu, Zhihua},\n      title = {MNN: A Universal and Efficient Inference Engine},\n      booktitle = {MLSys},\n      year = {2020}\n    }\n\n\n## License\nApache 2.0\n\n## Acknowledgement\nMNN participants: Taobao Technology Department, Search Engineering Team, DAMO Team, Youku and other Alibaba Group employees.\n\nMNN refers to the following projects:\n- [Caffe](https:\u002F\u002Fgithub.com\u002FBVLC\u002Fcaffe)\n- [flatbuffer](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fflatbuffers)\n- [gemmlowp](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fgemmlowp)\n- [Google Vulkan demo](http:\u002F\u002Fwww.github.com\u002Fgooglesamples\u002Fandroid-vulkan-tutorials)\n- [Halide](https:\u002F\u002Fgithub.com\u002Fhalide\u002FHalide)\n- [Mace](https:\u002F\u002Fgithub.com\u002FXiaoMi\u002Fmace)\n- [ONNX](https:\u002F\u002Fgithub.com\u002Fonnx\u002Fonnx)\n- [protobuffer](https:\u002F\u002Fgithub.com\u002Fprotocolbuffers\u002Fprotobuf)\n- [skia](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Fskia)\n- [Tensorflow](https:\u002F\u002Fgithub.com\u002Ftensorflow\u002Ftensorflow)\n- [ncnn](https:\u002F\u002Fgithub.com\u002FTencent\u002Fncnn)\n- [paddle-mobile](https:\u002F\u002Fgithub.com\u002FPaddlePaddle\u002Fpaddle-mobile)\n- [stb](https:\u002F\u002Fgithub.com\u002Fnothings\u002Fstb)\n- 
[rapidjson](https:\u002F\u002Fgithub.com\u002FTencent\u002Frapidjson)\n- [pybind11](https:\u002F\u002Fgithub.com\u002Fpybind\u002Fpybind11)\n- [pytorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fpytorch)\n- [bolt](https:\u002F\u002Fgithub.com\u002Fhuawei-noah\u002Fbolt)\n- [libyuv](https:\u002F\u002Fchromium.googlesource.com\u002Flibyuv\u002Flibyuv)\n- [libjpeg](https:\u002F\u002Fgithub.com\u002Flibjpeg-turbo\u002Flibjpeg-turbo)\n- [opencv](https:\u002F\u002Fgithub.com\u002Fopencv\u002Fopencv)\n- [onnxruntime](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime)\n","MNN is a high-performance, lightweight inference engine developed by Alibaba that supports on-device large language models and edge AI applications. Its core capability is efficient inference of deep learning models, with fast computation achieved through ARM architecture optimization, convolution algorithms (such as Winograd), and the Vulkan graphics API. MNN suits scenarios that run complex AI tasks on mobile devices or embedded systems, such as speech recognition, image processing, and natural language processing, delivering low latency and high efficiency while keeping a small memory footprint.",2,"2026-06-11 03:23:46","top_topic"]