[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9720":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":43,"readmeContent":44,"aiSummary":45,"trendingCount":16,"starSnapshotCount":16,"syncStatus":46,"lastSyncTime":47,"discoverSource":48},9720,"computervision-recipes","microsoft\u002Fcomputervision-recipes","microsoft","Best Practices, code samples, and documentation for Computer Vision.","",null,"Jupyter Notebook",9863,1207,274,105,0,1,3,14,4,70.15,"MIT License",false,"staging",true,[27,28,29,30,31,32,33,34,35,36,37,7,38,39,40,41,42],"artificial-intelligence","azure","computer-vision","convolutional-neural-networks","data-science","deep-learning","image-classification","image-processing","jupyter-notebook","kubernetes","machine-learning","object-detection","operationalization","python","similarity","tutorial","2026-06-12 04:00:46","\u003Cimg src=\"scenarios\u002Fmedia\u002Flogo_cvbp.png\" align=\"right\" alt=\"\" width=\"300\"\u002F>\n\n```diff\n+ Update July: Added support for action recognition and tracking\n+              in the new release v1.2.\n```\n\n# Computer Vision\n\nIn recent years, we've seen extraordinary growth in Computer Vision, with applications in face recognition, image understanding, search, drones, mapping, and semi-autonomous and autonomous vehicles. 
Visual recognition tasks such as image classification, object detection, and image similarity are a key part of many of these applications.\n\nThis repository provides examples and best practice guidelines for building computer vision systems. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in Computer Vision algorithms, neural architectures, and the operationalization of such systems. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around loading image data, optimizing and evaluating models, and scaling up to the cloud. In addition, having worked in this space for many years, we aim to answer common questions, point out frequently observed pitfalls, and show how to use the cloud for training and deployment.\n\nWe hope that these examples and utilities can reduce the “time to market” by orders of magnitude by simplifying the experience from defining the business problem to developing the solution. In addition, the example notebooks serve as guidelines and showcase best practices and usage of the tools in a wide variety of languages.\n\nThese examples are provided as [Jupyter notebooks](scenarios) and common [utility functions](utils_cv). 
All examples use PyTorch as the underlying deep learning library.\n\n## Examples\n\nThis repository supports various Computer Vision scenarios which either operate on a single image:\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fscenarios\u002Fmedia\u002Fcv_overview.jpg\" height=\"350\" alt=\"Some supported CV scenarios\"\u002F>\n\u003C\u002Fp>\n\nAs well as scenarios such as action recognition which take a video sequence as input:\n\u003Cp align=\"center\">\n  \u003Cimg src=\u002Fscenarios\u002Faction_recognition\u002Fmedia\u002Faction_recognition2.gif \"Example of action recognition\"\u002F>\n\u003C\u002Fp>\n\n\n## Target Audience\n\nOur target audience for this repository includes data scientists and machine learning engineers with varying levels of Computer Vision knowledge, as our content is source-only and targets custom machine learning modelling. The utilities and examples provided are intended to be solution accelerators for real-world vision problems.\n\n## Getting Started\n\nTo get started, navigate to the [Setup Guide](SETUP.md), which lists\ninstructions on how to set up the compute environment and dependencies needed to run the\nnotebooks in this repo. Once your environment is set up, navigate to the\n[Scenarios](scenarios) folder and start exploring the notebooks. We recommend starting with the *image classification* notebooks, since these introduce concepts which are also used by the other scenarios (e.g. pre-training on ImageNet).\n\nAlternatively, we support Binder\n[![Binder](https:\u002F\u002Fmybinder.org\u002Fbadge_logo.svg)](https:\u002F\u002Fmybinder.org\u002Fv2\u002Fgh\u002FPatrickBue\u002Fcomputervision-recipes\u002Fmaster?filepath=scenarios%2Fclassification%2F01_training_introduction_BINDER.ipynb)\nwhich makes it easy to try one of our notebooks in a web browser simply by following this link. However, Binder is free, and as a result only comes with limited CPU compute power and without GPU support. 
Expect the notebook to run very slowly (this is somewhat improved by reducing image resolution to e.g. 60 pixels, but at the cost of lower accuracy).\n\n## Scenarios\n\nThe following is a summary of commonly used Computer Vision scenarios that are covered in this repository. For each of the main scenarios (\"base\"), we provide the tools to effectively build your own model. This ranges from simple tasks such as fine-tuning your own model on your own data to more complex tasks such as hard-negative mining and even model deployment.\n\n| Scenario | Support     | Description |\n| -------- | ----------- | ----------- |\n| [Classification](scenarios\u002Fclassification) | Base | Image Classification is a supervised machine learning technique to learn and predict the category of a given image. |\n| [Similarity](scenarios\u002Fsimilarity)  | Base | Image Similarity is a way to compute a similarity score given a pair of images. Given an image, it allows you to identify the most similar image in a given dataset.  |\n| [Detection](scenarios\u002Fdetection) | Base | Object Detection is a technique that allows you to detect the bounding box of an object within an image. |\n| [Keypoints](scenarios\u002Fkeypoints) | Base | Keypoint detection can be used to detect specific points on an object. A pre-trained model is provided to detect body joints for human pose estimation. |\n| [Segmentation](scenarios\u002Fsegmentation) | Base | Image Segmentation assigns a category to each pixel in an image. |\n| [Action recognition](scenarios\u002Faction_recognition) | Base | Action recognition identifies which actions are performed in video\u002Fwebcam footage (e.g. \"running\", \"opening a bottle\") and their respective start\u002Fend times. An i3d implementation of action recognition can also be found under [contrib](contrib). |\n| [Tracking](scenarios\u002Ftracking) | Base | Tracking allows you to detect and track multiple objects in a video sequence over time. 
|\n| [Crowd counting](contrib\u002Fcrowd_counting) | Contrib | Counting the number of people in low-crowd-density (e.g. less than 10 people) and high-crowd-density (e.g. thousands of people) scenarios.|\n\nWe separate the supported CV scenarios into two locations: (i) **base**: code and notebooks within the \"utils_cv\" and \"scenarios\" folders which follow strict coding guidelines, are well tested and maintained; (ii) **contrib**: code and other assets within the \"contrib\" folder, mainly covering less common CV scenarios using bleeding edge state-of-the-art approaches. Code in \"contrib\" is not regularly tested or maintained.\n\n## Computer Vision on Azure\n\nNote that for certain computer vision problems, you may not need to build your own models. Instead, pre-built or easily customizable solutions exist on Azure which do not require any custom coding or machine learning expertise. We strongly recommend evaluating if these can sufficiently solve your problem. If these solutions are not applicable, or the accuracy of these solutions is not sufficient, then resorting to more complex and time-consuming custom approaches may be necessary.\n\nThe following Microsoft services offer simple solutions to address common computer vision tasks:\n\n- [Vision Services](https:\u002F\u002Fdocs.microsoft.com\u002Fen-us\u002Fazure\u002Fcognitive-services\u002Fcomputer-vision\u002F)\nare a set of pre-trained REST APIs which can be called for image tagging, face recognition, OCR, video analytics, and more. These APIs work out of the box and require minimal expertise in machine learning, but have limited customization capabilities. See the various demos available to get a feel for the functionality (e.g. [Computer Vision](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fservices\u002Fcognitive-services\u002Fcomputer-vision\u002F#analyze)). 
The service can be used through API calls or through SDKs (available in .NET, Python, Java, Node and Go languages).\n\n- [Custom Vision](https:\u002F\u002Fdocs.microsoft.com\u002Fen-us\u002Fazure\u002Fcognitive-services\u002Fcustom-vision-service\u002Fhome)\nis a SaaS service to train and deploy a model as a REST API given a user-provided training set. All steps including image upload, annotation, and model deployment can be performed using an intuitive UI or through SDKs (available in .NET, Python, Java, Node and Go languages). Training image classification or object detection models can be achieved with minimal machine learning expertise. Custom Vision offers more flexibility than using the pre-trained cognitive services APIs, but requires the user to bring and annotate their own data.\n\nIf you need to train your own model, the following services and links provide additional information that is likely useful.\n\n- [Azure Machine Learning service (AzureML)](https:\u002F\u002Fdocs.microsoft.com\u002Fazure\u002Fmachine-learning\u002F?WT.mc_id=computervision-github-azureai)\nis a service that helps users accelerate the training and deployment of machine learning models. While not specific to computer vision workloads, the AzureML Python SDK can be used for scalable and reliable training and deployment of machine learning solutions to the cloud. We leverage Azure Machine Learning in several of the notebooks within this repository (e.g. 
[deployment to Azure Kubernetes Service](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fcomputervision-recipes\u002Fblob\u002Fmaster\u002Fscenarios\u002Fclassification\u002F22_deployment_on_azure_kubernetes_service.ipynb))\n\n- [Azure AI Reference architectures](https:\u002F\u002Fdocs.microsoft.com\u002Fen-us\u002Fazure\u002Farchitecture\u002Freference-architectures\u002Fai\u002Ftraining-python-models\u002F?WT.mc_id=computervision-github-azureai)\nprovide a set of examples (backed by code) of how to build common AI-oriented workloads that leverage multiple cloud components. While not computer vision specific, these reference architectures cover several machine learning workloads such as model deployment or batch scoring.\n\n## Build Status\n\n### AzureML Testing\n\n| Build Type | Branch | Status |  | Branch | Status |\n| --- | --- | --- | --- | --- | --- |\n| **Linux GPU** | master | [![Build Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-gpu?branchName=master)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=41&branchName=master) | | staging | [![Build Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-gpu?branchName=staging)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=41&branchName=staging) |\n| **Linux CPU** | master | [![Build Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-cpu?branchName=master)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=37&branchName=master) | | staging | [![Build 
Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-cpu?branchName=staging)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=37&branchName=staging) |\n| **Notebook unit GPU** | master | [![Build Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-nb-gpu?branchName=master)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=42&branchName=master) | | staging | [![Build Status](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_apis\u002Fbuild\u002Fstatus\u002FAzureML\u002FAML-unit-test-linux-nb-gpu?branchName=staging)](https:\u002F\u002Fdev.azure.com\u002Fbest-practices\u002Fcomputervision\u002F_build\u002Flatest?definitionId=42&branchName=staging) |\n\n\n## Contributing\nThis project welcomes contributions and suggestions. Please see our [contribution guidelines](CONTRIBUTING.md).\n","This project provides best practices, code samples, and documentation for computer vision. Its core functionality covers visual recognition tasks such as image classification, object detection, and image similarity, and it also supports action recognition and tracking. Technically, the project is based on Jupyter notebooks and uses PyTorch as its deep learning library, and it provides a set of utilities that simplify data loading, model optimization and evaluation, and scaling to the cloud. It is suited to data scientists and machine learning engineers who want to develop computer vision solutions quickly, and in practice it can significantly shorten the time from defining a problem to deploying a solution.",2,"2026-06-11 03:24:24","top_topic"]