[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72344":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":29,"discoverSource":30},72344,"VMamba","MzeroMiko\u002FVMamba","MzeroMiko","VMamba: Visual State Space Models，code is based on mamba","",null,"Python",3181,232,15,268,0,2,12,40,6,29.1,"MIT License",false,"main",[],"2026-06-12 02:03:02","\n\u003Cdiv align=\"center\">\n\u003Ch1>VMamba \u003C\u002Fh1>\n\u003Ch3>VMamba: Visual State Space Model\u003C\u002Fh3>\n\n[Yue Liu](https:\u002F\u002Fgithub.com\u002FMzeroMiko)\u003Csup>1\u003C\u002Fsup>,[Yunjie Tian](https:\u002F\u002Fsunsmarterjie.github.io\u002F)\u003Csup>1\u003C\u002Fsup>,[Yuzhong Zhao](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=tStQNm4AAAAJ&hl=zh-CN&oi=ao)\u003Csup>1\u003C\u002Fsup>, [Hongtian Yu](https:\u002F\u002Fgithub.com\u002Fyuhongtian17)\u003Csup>1\u003C\u002Fsup>, [Lingxi Xie](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=EEMm7hwAAAAJ&hl=zh-CN&oi=ao)\u003Csup>2\u003C\u002Fsup>, [Yaowei Wang](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=o_DllmIAAAAJ&hl=zh-CN&oi=ao)\u003Csup>3\u003C\u002Fsup>, [Qixiang Ye](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=tjEfgsEAAAAJ&hl=zh-CN&oi=ao)\u003Csup>1\u003C\u002Fsup>, [Yunfan 
Liu](https:\u002F\u002Fscholar.google.com.hk\u002Fcitations?user=YPL33G0AAAAJ&hl=zh-CN&oi=ao)\u003Csup>1\u003C\u002Fsup>\n\n\u003Csup>1\u003C\u002Fsup>  University of Chinese Academy of Sciences, \u003Csup>2\u003C\u002Fsup>  HUAWEI Inc.,  \u003Csup>3\u003C\u002Fsup> PengCheng Lab.\n\nPaper: ([arXiv 2401.10166](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.10166))\n\n\u003C\u002Fdiv>\n\n\n## 🔥 Use VMamba with only ***one file*** and in the ***fewest steps***!\n```bash\nconda create -n vmamba python=3.10\n# activate the new environment so the packages below are installed into it\nconda activate vmamba\npip install torch==2.2 torchvision torchaudio triton pytest chardet yacs termcolor fvcore seaborn packaging ninja einops numpy==1.24.4 timm==0.4.12\npip install https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba\u002Freleases\u002Fdownload\u002Fv2.2.4\u002Fmamba_ssm-2.2.4+cu12torch2.2cxx11abiTRUE-cp310-cp310-linux_x86_64.whl\npython vmamba.py\n```\n\n* [**updates**](#white_check_mark-updates)\n* [**abstract**](#abstract)\n* [**overview**](#overview)\n* [**main results**](#main-results)\n* [**getting started**](#getting-started)\n* [**star history**](#star-history)\n* [**citation**](#citation)\n* [**acknowledgment**](#acknowledgment)\n\n\n## :white_check_mark: Updates\n* **`Sep. 25th, 2024`**: Update: **VMamba has been accepted by NeurIPS 2024 (spotlight)!**\n* **`June 14th, 2024`**: Update: we cleaned up the code to make it easier to read, and added support for `mamba2`.\n* **`May 26th, 2024`**: Update: we released the updated weights of VMambav2, together with the new arXiv paper.\n* **`May 7th, 2024`**: Update: **Important!** Using `torch.backends.cudnn.enabled=True` in downstream tasks may be quite slow. If VMamba runs slowly on your machine, disable it in `vmamba.py`; otherwise, ignore this.\n* **...**\n\n***For details, see [detailed_updates.md](assets\u002Fdetailed_updates.md).***\n\n## Abstract\n\nDesigning computationally efficient network architectures persists as an ongoing necessity in computer vision. 
In this paper, we transplant Mamba, a state-space language model, into VMamba, a vision backbone that works in linear time complexity. At the core of VMamba lies a stack of Visual State-Space (VSS) blocks with the 2D Selective Scan (SS2D) module. By traversing along four scanning routes, SS2D helps bridge the gap between the ordered nature of 1D selective scan and the non-sequential structure of 2D vision data, which facilitates the gathering of contextual information from various sources and perspectives. Based on the VSS blocks, we develop a family of VMamba architectures and accelerate them through a succession of architectural and implementation enhancements. Extensive experiments showcase VMamba’s promising performance across diverse visual perception tasks, highlighting its advantages in input scaling efficiency compared to existing benchmark models.\n\n## Overview\n\n* [**VMamba**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.10166) serves as a general-purpose backbone for computer vision.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Farchitecture.png\" alt=\"architecture\" width=\"80%\">\n\u003C\u002Fp>\n\n* **2D-Selective-Scan of VMamba**\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fss2d.png\" alt=\"arch\" width=\"80%\">\n\u003C\u002Fp>\n\n* **VMamba has global effective receptive field**\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ferf.png\" alt=\"erf\" width=\"80%\">\n\u003C\u002Fp>\n\n* **VMamba resembles Transformer-Based Methods in Activation Map**\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fattn.png\" alt=\"attn\" width=\"80%\">\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Factivation_map.png\" alt=\"activation\" width=\"80%\">\n\u003C\u002Fp>\n\n## Main Results\n\u003C!-- copied from assets\u002Fperformance.md  -->\n\n\u003C!-- :book: -->\n\u003C!-- ***The checkpoints of some of the models listed below will be released in weeks!*** -->\n\n:book:\n***For details 
see [performance.md](.\u002Fassets\u002Fperformance.md).***\n\n### **Classification on ImageNet-1K**\n| name | pretrain | resolution | acc@1 | #params | FLOPs | TP. | Train TP. | configs\u002Flogs\u002Fckpts |\n| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| Swin-T | ImageNet-1K | 224x224 | 81.2 | 28M | 4.5G | 1244 | 987 | -- |\n| Swin-S | ImageNet-1K | 224x224 | 83.2 | 50M | 8.7G | 718 | 642 | -- |\n| Swin-B | ImageNet-1K | 224x224 | 83.5 | 88M | 15.4G | 458 | 496 | -- |\n| VMamba-S[`s2l15`] | ImageNet-1K | 224x224 | 83.6 | 50M | 8.7G | 877 | 314 | [config](classification\u002Fconfigs\u002Fvssm\u002Fvmambav2_small_224.yaml)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm_small_0229.txt)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm_small_0229_ckpt_epoch_222.pth) |\n| VMamba-B[`s2l15`] | ImageNet-1K | 224x224 | 83.9 | 89M | 15.4G | 646 | 247 | [config](classification\u002Fconfigs\u002Fvssm\u002Fvmambav2_base_224.yaml)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm_base_0229.txt)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm_base_0229_ckpt_epoch_237.pth) |\n| VMamba-T[`s1l8`] | ImageNet-1K | 224x224 | 82.6 | 30M | 4.9G | 1686 | 571 | [config](classification\u002Fconfigs\u002Fvssm\u002Fvmambav2v_tiny_224.yaml)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm1_tiny_0230s.txt)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2cls\u002Fvssm1_tiny_0230s_ckpt_epoch_264.pth) |\n\n\n* *Models in this subsection are trained from scratch with random or manual initialization. 
The hyper-parameters are inherited from Swin, except for `drop_path_rate` and `EMA`. All models are trained with EMA except for `Vanilla-VMamba-T`.*\n* *`TP. (Throughput)` and `Train TP. (Train Throughput)` are measured on an A100 GPU paired with an AMD EPYC 7542 CPU, with batch size 128. `Train TP.` is tested with mix-resolution and excludes the time consumed by optimizers.*\n* *`FLOPs` and `parameters` are now counted with the `head` included (previous versions counted them without the head, so the numbers are now slightly higher).*\n* *We calculate `FLOPs` with the algorithm @albertgu [provides](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba\u002Fissues\u002F110), which gives larger values than the previous calculation (which was based on the `selective_scan_ref` function and ignored the hardware-aware algorithm).*\n\n\n### **Object Detection on COCO**\n\n| Backbone | #params | FLOPs | Detector | bboxAP | bboxAP50 | bboxAP75 | segmAP | segmAP50 | segmAP75 | configs\u002Flogs\u002Fckpts |\n| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| Swin-T | 48M | 267G | MaskRCNN@1x | 42.7 | 65.2 | 46.8 | 39.3 | 62.2 | 42.2 | -- |\n| Swin-S | 69M | 354G | MaskRCNN@1x | 44.8 | 66.6 | 48.9 | 40.9 | 63.4 | 44.2 | -- |\n| Swin-B | 107M | 496G | MaskRCNN@1x | 46.9 | -- | -- | 42.3 | -- | -- | -- |\n| VMamba-S[`s2l15`] | 70M | 384G | MaskRCNN@1x | 48.7 | 70.0 | 53.4 | 43.7 | 67.3 | 47.0 | [config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_small.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_small.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_small_epoch_11.pth) |\n| VMamba-B[`s2l15`] | 108M | 485G | MaskRCNN@1x | 49.2 | 71.4 | 54.0 | 44.1 | 68.3 | 47.7 | 
[config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_base.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_base.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_base_epoch_11.pth) |\n| VMamba-B[`s2l15`] | 108M | 485G | MaskRCNN@1x[`bs8`] | 49.2 | 70.9 | 53.9 | 43.9 | 67.7 | 47.6 | [config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_base.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_base_bs8.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_base_epoch_12_bs8.pth) |\n| VMamba-T[`s1l8`] | 50M | 271G | MaskRCNN@1x | 47.3 | 69.3 | 52.0 | 42.7 | 66.4 | 45.9 | [config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_tiny.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_tiny_s.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_tiny_s_epoch_12.pth) |\n| Swin-T | 48M | 267G | MaskRCNN@3x | 46.0 | 68.1 | 50.3 | 41.6 | 65.1 | 44.9 | -- |\n| Swin-S | 69M | 354G | MaskRCNN@3x | 48.2 | 69.8 | 52.8 | 43.2 | 67.0 | 46.1 | -- |\n| VMamba-S[`s2l15`] | 70M | 384G | MaskRCNN@3x | 49.9 | 70.9 | 54.7 | 44.20 | 68.2 | 47.7 | 
[config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_small_ms_3x.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_small_ms_3x.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_small_ms_3x_epoch_32.pth) |\n| VMamba-T[`s1l8`] | 50M | 271G | MaskRCNN@3x | 48.8 | 70.4 | 53.50 | 43.7 | 67.4 | 47.0 | [config](detection\u002Fconfigs\u002Fvssm1\u002Fmask_rcnn_vssm_fpn_coco_tiny_ms_3x.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_tiny_ms_3x_s.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2det\u002Fmask_rcnn_vssm_fpn_coco_tiny_ms_3x_s_epoch_31.pth) |\n\n\n* *Models in this subsection are initialized from the models trained in `classification`.*\n* *We now calculate FLOPs with the algorithm @albertgu [provides](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba\u002Fissues\u002F110), which gives larger values than the previous calculation (which was based on the `selective_scan_ref` function and ignored the hardware-aware algorithm).*\n\n### **Semantic Segmentation on ADE20K**\n\n| Backbone | Input | #params | FLOPs | Segmentor | mIoU(SS) | mIoU(MS) | configs\u002Flogs\u002Flogs(ms)\u002Fckpts |\n| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n| Swin-T | 512x512 | 60M | 945G | UperNet@160k | 44.4 | 45.8 | -- |\n| Swin-S | 512x512 | 81M | 1039G | UperNet@160k | 47.6 | 49.5 | -- |\n| Swin-B | 512x512 | 121M | 1188G | UperNet@160k | 48.1 | 49.7 | -- |\n| VMamba-S[`s2l15`] | 512x512 | 82M | 1028G | UperNet@160k | 50.6| 
51.2|[config](segmentation\u002Fconfigs\u002Fvssm1\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_small.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_small.log)\u002F[log(ms)](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_small_tta.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_small_iter_144000.pth) |\n| VMamba-B[`s2l15`] | 512x512 | 122M | 1170G | UperNet@160k | 51.0| 51.6|[config](segmentation\u002Fconfigs\u002Fvssm1\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_base.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_base.log)\u002F[log(ms)](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_base_tta.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_base_iter_160000.pth) |\n| VMamba-T[`s1l8`] | 512x512 | 62M | 949G | UperNet@160k | 47.9| 48.8| [config](segmentation\u002Fconfigs\u002Fvssm1\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_tiny.py)\u002F[log](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_tiny_s.log)\u002F[log(ms)](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_tiny_s_tta.log)\u002F[ckpt](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba\u002Freleases\u002Fdownload\u002F%23v2seg\u002Fupernet_vssm_4xb4-160k_ade20k-512x512_tiny_s_iter_160000.pth) |\n\n\n* *Models in 
this subsection are initialized from the models trained in `classification`.*\n* *We now calculate FLOPs with the algorithm @albertgu [provides](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba\u002Fissues\u002F110), which gives larger values than the previous calculation (which was based on the `selective_scan_ref` function and ignored the hardware-aware algorithm).*\n\n## Getting Started\n\n### Installation\n\n**Step 1: Clone the VMamba repository:**\n\nTo get started, first clone the VMamba repository and navigate to the project directory:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002FVMamba.git\ncd VMamba\n```\n\n**Step 2: Environment Setup:**\n\nVMamba recommends setting up a conda environment and installing dependencies via pip. We also recommend PyTorch >= 2.0 and CUDA >= 11.8, although lower versions of PyTorch and CUDA are also supported. Use the following commands to set up your environment:\n\n***Create and activate a new conda environment***\n\n```bash\nconda create -n vmamba\nconda activate vmamba\n```\n\n***Install Dependencies***\n\n```bash\npip install -r requirements.txt\ncd kernels\u002Fselective_scan && pip install .\n```\n\u003C!-- cd kernels\u002Fcross_scan && pip install . -->\n\n***Check Selective Scan (optional)***\n\n* If you want to check the modules against `mamba_ssm`, install [`mamba_ssm`](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba) first!\n\n* To check whether our implementation of `selective scan` matches `mamba_ssm`, use `selective_scan\u002Ftest_selective_scan.py`. 
Set `MODE = \"mamba_ssm_sscore\"` in `selective_scan\u002Ftest_selective_scan.py`, then run `pytest selective_scan\u002Ftest_selective_scan.py`.\n\n* To check whether our implementation of `selective scan` matches the reference code (`selective_scan_ref`), set `MODE = \"sscore\"` in `selective_scan\u002Ftest_selective_scan.py`, then run `pytest selective_scan\u002Ftest_selective_scan.py`.\n\n* `MODE = \"mamba_ssm\"` checks whether the results of `mamba_ssm` are close to `selective_scan_ref`, and `\"sstest\"` is reserved for development.\n\n* If `mamba_ssm` (`selective_scan_cuda`) or `selective_scan` (`selective_scan_cuda_core`) is not close enough to `selective_scan_ref` and the test fails, do not worry. Check whether `mamba_ssm` and `selective_scan` are close enough to each other [instead](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba\u002Fpull\u002F161).\n\n* ***If you are interested in selective scan, see [mamba](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba), [mamba-mini](https:\u002F\u002Fgithub.com\u002FMzeroMiko\u002Fmamba-mini), [mamba.py](https:\u002F\u002Fgithub.com\u002FalxndrTL\u002Fmamba.py), and [mamba-minimal](https:\u002F\u002Fgithub.com\u002Fjohnma2006\u002Fmamba-minimal) for more information.***\n\n***Dependencies for `Detection` and `Segmentation` (optional)***\n\n```bash\npip install mmengine==0.10.1 mmcv==2.1.0 opencv-python-headless ftfy regex\npip install mmdet==3.3.0 mmsegmentation==1.2.2 mmpretrain==1.2.0\n```\n\n### Model Training and Inference\n\n**Classification**\n\nTo train a VMamba model for classification on ImageNet, run the following command with the config of your choice:\n\n```bash\npython -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=8 --master_addr=\"127.0.0.1\" --master_port=29501 main.py --cfg \u003C\u002Fpath\u002Fto\u002Fconfig> --batch-size 128 --data-path \u003C\u002Fpath\u002Fof\u002Fdataset> --output \u002Ftmp\n```\n\nIf you only want 
to test the performance of a pretrained checkpoint (and report its params and FLOPs):\n\n```bash\npython -m torch.distributed.launch --nnodes=1 --node_rank=0 --nproc_per_node=1 --master_addr=\"127.0.0.1\" --master_port=29501 main.py --cfg \u003C\u002Fpath\u002Fto\u002Fconfig> --batch-size 128 --data-path \u003C\u002Fpath\u002Fof\u002Fdataset> --output \u002Ftmp --pretrained \u003C\u002Fpath\u002Fof\u002Fcheckpoint>\n```\n\n***Please refer to [modelcard](.\u002Fmodelcard.sh) for more details.***\n\n**Detection and Segmentation**\n\nTo evaluate with `mmdetection` or `mmsegmentation`:\n```bash\nbash .\u002Ftools\u002Fdist_test.sh \u003C\u002Fpath\u002Fto\u002Fconfig> \u003C\u002Fpath\u002Fto\u002Fcheckpoint> 1\n```\n*Use `--tta` to get the `mIoU(MS)` in segmentation.*\n\nTo train with `mmdetection` or `mmsegmentation`:\n```bash\nbash .\u002Ftools\u002Fdist_train.sh \u003C\u002Fpath\u002Fto\u002Fconfig> 8\n```\n\nFor more information about detection and segmentation tasks, please refer to the manuals of [`mmdetection`](https:\u002F\u002Fmmdetection.readthedocs.io\u002Fen\u002Flatest\u002Fuser_guides\u002Ftrain.html) and [`mmsegmentation`](https:\u002F\u002Fmmsegmentation.readthedocs.io\u002Fen\u002Flatest\u002Fuser_guides\u002F4_train_test.html). Remember to use the appropriate backbone configurations in the `configs` directory.\n\n### Analysis Tools\n\nVMamba includes tools for visualizing Mamba \"attention\" and the effective receptive field, and for analysing throughput and train throughput. Use the following commands to perform analysis:\n\n```bash\n# Visualize Mamba \"Attention\"\nCUDA_VISIBLE_DEVICES=0 python analyze\u002Fattnmap.py\n\n# Analyze the effective receptive field\nCUDA_VISIBLE_DEVICES=0 python analyze\u002Ferf.py\n\n# Analyze the throughput and train throughput\nCUDA_VISIBLE_DEVICES=0 python analyze\u002Ftp.py\n```\n\n***We also include other analysis tools that we may use in this project. 
Thanks to all who have contributed to these tools.***\n\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=MzeroMiko\u002FVMamba&type=Date)](https:\u002F\u002Fstar-history.com\u002F#MzeroMiko\u002FVMamba&Date)\n\n## Citation\n\n```\n@article{liu2024vmamba,\n  title={VMamba: Visual State Space Model},\n  author={Liu, Yue and Tian, Yunjie and Zhao, Yuzhong and Yu, Hongtian and Xie, Lingxi and Wang, Yaowei and Ye, Qixiang and Liu, Yunfan},\n  journal={arXiv preprint arXiv:2401.10166},\n  year={2024}\n}\n```\n\n## Acknowledgment\n\nThis project is based on Mamba ([paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.00752), [code](https:\u002F\u002Fgithub.com\u002Fstate-spaces\u002Fmamba)), Swin-Transformer ([paper](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2103.14030.pdf), [code](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FSwin-Transformer)), ConvNeXt ([paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2201.03545), [code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FConvNeXt)), and [OpenMMLab](https:\u002F\u002Fgithub.com\u002Fopen-mmlab);\n`analyze\u002Fget_erf.py` is adapted from [replknet](https:\u002F\u002Fgithub.com\u002FDingXiaoH\u002FRepLKNet-pytorch\u002Ftree\u002Fmain\u002Ferf). We thank the authors for their excellent work.\n\n* **We recently released [Fast-iTPN](https:\u002F\u002Fgithub.com\u002Fsunsmarterjie\u002FiTPN\u002Ftree\u002Fmain\u002Ffast_itpn), which, as far as we know, reports the best performance on ImageNet-1K among Tiny\u002FSmall\u002FBase-level models. (Tiny-24M-86.5%, Small-40M-87.8%, Base-85M-88.75%)**\n","VMamba is a Mamba-based visual state space model that aims to provide efficient network architectures for computer vision tasks. Its core features include linear-time processing through Visual State-Space (VSS) blocks and the 2D Selective Scan (SS2D) module, which improves the capture of contextual information in 2D visual data while remaining efficient. The project is written in Python and takes only a few steps to get started. VMamba is particularly suited to applications that need to process image or video data efficiently, such as real-time surveillance and autonomous driving, where it can significantly improve the speed and accuracy of vision tasks.","2026-06-11 03:41:27","high_star"]