[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72019":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":15,"stars30d":17,"stars90d":16,"forks30d":16,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":22,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":16,"starSnapshotCount":16,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},72019,"samurai","yangchris11\u002Fsamurai","yangchris11","Official repository of \"SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory\"","https:\u002F\u002Fyangchris11.github.io\u002Fsamurai\u002F",null,"Python",7073,498,59,4,0,12,39.09,"Apache License 2.0",false,"master",true,[],"2026-06-12 02:02:57","\u003Cdiv align=\"center\">\n\u003Cimg align=\"left\" width=\"100\" height=\"100\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F1834fc25-42ef-4237-9feb-53a01c137e83\" alt=\"\">\n\n# SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory\n\n[Cheng-Yen Yang](https:\u002F\u002Fyangchris11.github.io), [Hsiang-Wei Huang](https:\u002F\u002Fhsiangwei0903.github.io\u002F), [Wenhao Chai](https:\u002F\u002Frese1f.github.io\u002F), [Zhongyu Jiang](https:\u002F\u002Fzhyjiang.github.io\u002F#\u002F), [Jenq-Neng Hwang](https:\u002F\u002Fpeople.ece.uw.edu\u002Fhwang\u002F)\n\n[Information Processing Lab, University of Washington](https:\u002F\u002Fipl-uw.github.io\u002F) 
\n\u003C\u002Fdiv>\n\n\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fsamurai-adapting-segment-anything-model-for-1\u002Fvisual-object-tracking-on-lasot-ext)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fvisual-object-tracking-on-lasot-ext?p=samurai-adapting-segment-anything-model-for-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fsamurai-adapting-segment-anything-model-for-1\u002Fvisual-object-tracking-on-got-10k)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fvisual-object-tracking-on-got-10k?p=samurai-adapting-segment-anything-model-for-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fsamurai-adapting-segment-anything-model-for-1\u002Fvisual-object-tracking-on-needforspeed)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fvisual-object-tracking-on-needforspeed?p=samurai-adapting-segment-anything-model-for-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fsamurai-adapting-segment-anything-model-for-1\u002Fvisual-object-tracking-on-lasot)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fvisual-object-tracking-on-lasot?p=samurai-adapting-segment-anything-model-for-1)\n[![PWC](https:\u002F\u002Fimg.shields.io\u002Fendpoint.svg?url=https:\u002F\u002Fpaperswithcode.com\u002Fbadge\u002Fsamurai-adapting-segment-anything-model-for-1\u002Fvisual-object-tracking-on-otb-2015)](https:\u002F\u002Fpaperswithcode.com\u002Fsota\u002Fvisual-object-tracking-on-otb-2015?p=samurai-adapting-segment-anything-model-for-1)\n\n[[Arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.11922) [[Project Page]](https:\u002F\u002Fyangchris11.github.io\u002Fsamurai\u002F) [[Raw 
Results]](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1ssiDmsC7mw5AiItYQG4poiR1JgRq305y?usp=sharing) \n\nThis repository is the official implementation of SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory.\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F9d368ca7-2e9b-4fed-9da0-d2efbf620d88\n\nAll rights are reserved to the copyright owners (TM & © Universal (2019)). This clip is not intended for commercial use and is solely for academic demonstration in a research paper. Original source can be found [here](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=cwUzUzpG8aM&t=4s).\n\n## News\n- [ ] **Incoming**: Support vot-challenge toolkit integration.\n- [ ] **Incoming**: Release demo script to support inference on video (with mask prompt).\n- [x] **2025\u002F02\u002F18**: Release multi-GPU inference script.\n- [x] **2025\u002F01\u002F27**: Release [inference script](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai\u002Fblob\u002Fmaster\u002Fsam2\u002Ftools\u002FREADME.md#samurai-vos-inference) on VOS task (SA-V)!\n- [x] **2024\u002F11\u002F21**: Release [demo script](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai?tab=readme-ov-file#demo-on-custom-video) to support inference on video (bounding box prompt).\n- [x] **2024\u002F11\u002F20**: Release [inference script](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai?tab=readme-ov-file#main-inference) on VOT task (LaSOT, LaSOT-ext, GOT-10k, UAV123, TrackingNet, OTB100)!\n- [x] **2024\u002F11\u002F19**: Release [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.11922), [code](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai), and [raw results](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1ssiDmsC7mw5AiItYQG4poiR1JgRq305y?usp=sharing)!\n\n## Getting Started\n\n#### SAMURAI Installation \n\nSAM 2 needs to be installed before use. 
The code requires `python>=3.10`, as well as `torch>=2.3.1` and `torchvision>=0.18.1`. Please follow the instructions [here](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam2?tab=readme-ov-file) to install both PyTorch and TorchVision dependencies. You can install **the SAMURAI version** of SAM 2 on a GPU machine using:\n```\ncd sam2\npip install -e .\npip install -e \".[notebooks]\"\n```\n\nPlease see [INSTALL.md](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam2\u002Fblob\u002Fmain\u002FINSTALL.md) from the original SAM 2 repository for FAQs on potential issues and solutions.\n\nInstall other requirements:\n```\npip install matplotlib==3.7 tikzplotlib jpeg4py opencv-python lmdb pandas scipy loguru\n```\n\n#### SAM 2.1 Checkpoint Download\n\n```\ncd checkpoints && \\\n.\u002Fdownload_ckpts.sh && \\\ncd ..\n```\n\n#### Data Preparation\n\nPlease prepare the data in the following format:\n```\ndata\u002FLaSOT\n├── airplane\u002F\n│   ├── airplane-1\u002F\n│   │   ├── full_occlusion.txt\n│   │   ├── groundtruth.txt\n│   │   ├── img\n│   │   ├── nlp.txt\n│   │   └── out_of_view.txt\n│   ├── airplane-2\u002F\n│   ├── airplane-3\u002F\n│   ├── ...\n├── basketball\n├── bear\n├── bicycle\n...\n├── training_set.txt\n└── testing_set.txt\n```\n\n#### Main Inference\n```\npython scripts\u002Fmain_inference.py \n```\n\n## Demo on Custom Video\n\nTo run the demo with your custom video or frame directory, use the following examples:\n\n**Note:** The `.txt` file contains a single line with the bounding box of the first frame in `x,y,w,h` format while the SAM 2 takes `x1,y1,x2,y2` format as bbox input.\n\n### Input is Video File\n\n```\npython scripts\u002Fdemo.py --video_path \u003Cyour_video.mp4> --txt_path \u003Cpath_to_first_frame_bbox.txt>\n```\n\n### Input is Frame Folder\n```\n# Only JPG images are supported\npython scripts\u002Fdemo.py --video_path \u003Cyour_frame_directory> --txt_path \u003Cpath_to_first_frame_bbox.txt>\n```\n\n## FAQs\n**Question 
1:** Does SAMURAI need training? [issue 34](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai\u002Fissues\u002F34)\n\n**Answer 1:** Unlike real-life samurai, the proposed SAMURAI does not require additional training. It is a zero-shot method: we directly use the weights from SAM 2.1 to conduct VOT experiments. The Kalman filter estimates the current and future state (bounding box location and scale in our case) of a moving object from measurements over time. It is a common approach that has long been used in the tracking field, and it does not require any training. Please refer to the code for more detail.\n\n**Question 2:** Does SAMURAI support streaming input (e.g. webcam)?\n\n**Answer 2:** Not yet. The existing code doesn't support live\u002Fstreaming video as we inherit most of the codebase from the amazing SAM 2. Some discussions that you might be interested in: facebookresearch\u002Fsam2#90, facebookresearch\u002Fsam2#388 (comment).\n\n**Question 3:** How to use SAMURAI on longer videos?\n\n**Answer 3:** See the discussion from sam2 https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam2\u002Fissues\u002F264.\n\n**Question 4:** How do you run the evaluation on the VOT benchmarks?\n\n**Answer 4:** For LaSOT, LaSOT-ext, OTB, NFS please refer to [issue 74](https:\u002F\u002Fgithub.com\u002Fyangchris11\u002Fsamurai\u002Fissues\u002F74) for more details. 
For GOT-10k-test and TrackingNet, please refer to the official portal for submission.\n\n## Acknowledgment\n\nSAMURAI is built on top of [SAM 2](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fsam2?tab=readme-ov-file) by Meta FAIR.\n\nThe VOT evaluation code is modified from [VOT Toolkit](https:\u002F\u002Fgithub.com\u002Fvotchallenge\u002Ftoolkit) by Luka Čehovin Zajc.\n\n## Citation\n\nPlease consider citing our paper and the wonderful `SAM 2` if you find our work interesting and useful.\n```\n@article{ravi2024sam2,\n  title={SAM 2: Segment Anything in Images and Videos},\n  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\\\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\\'a}r, Piotr and Feichtenhofer, Christoph},\n  journal={arXiv preprint arXiv:2408.00714},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.00714},\n  year={2024}\n}\n\n@misc{yang2024samurai,\n  title={SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory}, \n  author={Cheng-Yen Yang and Hsiang-Wei Huang and Wenhao Chai and Zhongyu Jiang and Jenq-Neng Hwang},\n  year={2024},\n  eprint={2411.11922},\n  archivePrefix={arXiv},\n  primaryClass={cs.CV},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.11922}, \n}\n```\n","The SAMURAI project adapts the Segment Anything Model to zero-shot visual tracking and introduces a motion-aware memory mechanism. Its core feature is motion-aware memory management, which improves tracking performance across different scenes and helps maintain accurate tracking when the target is occluded or moving quickly. The project is written in Python, is designed to be extensible and easy to use, and supports multi-GPU inference. It is suited to video surveillance, autonomous driving, and other applications that need precise object tracking in real time or near real time.",2,"2026-06-11 03:39:59","high_star"]