[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-70735":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":29,"discoverSource":30},70735,"detr","facebookresearch\u002Fdetr","facebookresearch","End-to-End Object Detection with Transformers",null,"Python",15301,2666,2,240,0,6,12,36,18,45,"Apache License 2.0",true,false,"main",[],"2026-06-12 02:02:42","**DE⫶TR**: End-to-End Object Detection with Transformers\n========\n\n[![Support Ukraine](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSupport-Ukraine-FFD500?style=flat&labelColor=005BBB)](https:\u002F\u002Fopensource.fb.com\u002Fsupport-ukraine)\n\nPyTorch training code and pretrained models for **DETR** (**DE**tection **TR**ansformer).\nWe replace the full complex hand-crafted object detection pipeline with a Transformer, and match Faster R-CNN with a ResNet-50, obtaining **42 AP** on COCO using half the computation power (FLOPs) and the same number of parameters. Inference in 50 lines of PyTorch.\n\n![DETR](.github\u002FDETR.png)\n\n**What it is**. Unlike traditional computer vision techniques, DETR approaches object detection as a direct set prediction problem. It consists of a set-based global loss, which forces unique predictions via bipartite matching, and a Transformer encoder-decoder architecture. 
\nGiven a fixed small set of learned object queries, DETR reasons about the relations of the objects and the global image context to directly output the final set of predictions in parallel. Due to this parallel nature, DETR is very fast and efficient.\n\n**About the code**. We believe that object detection should not be more difficult than classification,\nand should not require complex libraries for training and inference.\nDETR is very simple to implement and experiment with, and we provide a\n[standalone Colab Notebook](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002Fdetr\u002Fblob\u002Fcolab\u002Fnotebooks\u002Fdetr_demo.ipynb)\nshowing how to do inference with DETR in only a few lines of PyTorch code.\nTraining code follows this idea - it is not a library,\nbut simply a [main.py](main.py) importing model and criterion\ndefinitions with standard training loops.\n\nAdditionally, we provide a Detectron2 wrapper in the d2\u002F folder. See the readme there for more information.\n\nFor details see [End-to-End Object Detection with Transformers](https:\u002F\u002Fai.facebook.com\u002Fresearch\u002Fpublications\u002Fend-to-end-object-detection-with-transformers) by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko.\n\nSee our [blog post](https:\u002F\u002Fai.facebook.com\u002Fblog\u002Fend-to-end-object-detection-with-transformers\u002F) to learn more about end-to-end object detection with transformers.\n# Model Zoo\nWe provide baseline DETR and DETR-DC5 models, and plan to include more in the future.\nAP is computed on COCO 2017 val5k, and inference time is measured over the first 100 val5k COCO images,\nwith torchscript transformer.\n\n\u003Ctable>\n  \u003Cthead>\n    \u003Ctr style=\"text-align: right;\">\n      \u003Cth>\u003C\u002Fth>\n      \u003Cth>name\u003C\u002Fth>\n      \u003Cth>backbone\u003C\u002Fth>\n      \u003Cth>schedule\u003C\u002Fth>\n      
\u003Cth>inf_time\u003C\u002Fth>\n      \u003Cth>box AP\u003C\u002Fth>\n      \u003Cth>url\u003C\u002Fth>\n      \u003Cth>size\u003C\u002Fth>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctbody>\n    \u003Ctr>\n      \u003Cth>0\u003C\u002Fth>\n      \u003Ctd>DETR\u003C\u002Ftd>\n      \u003Ctd>R50\u003C\u002Ftd>\n      \u003Ctd>500\u003C\u002Ftd>\n      \u003Ctd>0.036\u003C\u002Ftd>\n      \u003Ctd>42.0\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r50-e632da11.pth\">model\u003C\u002Fa>&nbsp;|&nbsp;\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Flogs\u002Fdetr-r50_log.txt\">logs\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>159Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Cth>1\u003C\u002Fth>\n      \u003Ctd>DETR-DC5\u003C\u002Ftd>\n      \u003Ctd>R50\u003C\u002Ftd>\n      \u003Ctd>500\u003C\u002Ftd>\n      \u003Ctd>0.083\u003C\u002Ftd>\n      \u003Ctd>43.3\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r50-dc5-f0fb7ef5.pth\">model\u003C\u002Fa>&nbsp;|&nbsp;\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Flogs\u002Fdetr-r50-dc5_log.txt\">logs\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>159Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Cth>2\u003C\u002Fth>\n      \u003Ctd>DETR\u003C\u002Ftd>\n      \u003Ctd>R101\u003C\u002Ftd>\n      \u003Ctd>500\u003C\u002Ftd>\n      \u003Ctd>0.050\u003C\u002Ftd>\n      \u003Ctd>43.5\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r101-2c7b67e5.pth\">model\u003C\u002Fa>&nbsp;|&nbsp;\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Flogs\u002Fdetr-r101_log.txt\">logs\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>232Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Cth>3\u003C\u002Fth>\n      \u003Ctd>DETR-DC5\u003C\u002Ftd>\n      
\u003Ctd>R101\u003C\u002Ftd>\n      \u003Ctd>500\u003C\u002Ftd>\n      \u003Ctd>0.097\u003C\u002Ftd>\n      \u003Ctd>44.9\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r101-dc5-a2e86def.pth\">model\u003C\u002Fa>&nbsp;|&nbsp;\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Flogs\u002Fdetr-r101-dc5_log.txt\">logs\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>232Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\nCOCO val5k evaluation results can be found in this [gist](https:\u002F\u002Fgist.github.com\u002Fszagoruyko\u002F9c9ebb8455610958f7deaa27845d7918).\n\nThe models are also available via torch hub,\nto load DETR R50 with pretrained weights simply do:\n```python\nmodel = torch.hub.load('facebookresearch\u002Fdetr:main', 'detr_resnet50', pretrained=True)\n```\n\n\nCOCO panoptic val5k models:\n\u003Ctable>\n  \u003Cthead>\n    \u003Ctr style=\"text-align: right;\">\n      \u003Cth>\u003C\u002Fth>\n      \u003Cth>name\u003C\u002Fth>\n      \u003Cth>backbone\u003C\u002Fth>\n      \u003Cth>box AP\u003C\u002Fth>\n      \u003Cth>segm AP\u003C\u002Fth>\n      \u003Cth>PQ\u003C\u002Fth>\n      \u003Cth>url\u003C\u002Fth>\n      \u003Cth>size\u003C\u002Fth>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctbody>\n    \u003Ctr>\n      \u003Cth>0\u003C\u002Fth>\n      \u003Ctd>DETR\u003C\u002Ftd>\n      \u003Ctd>R50\u003C\u002Ftd>\n      \u003Ctd>38.8\u003C\u002Ftd>\n      \u003Ctd>31.1\u003C\u002Ftd>\n      \u003Ctd>43.4\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r50-panoptic-00ce5173.pth\">download\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>165Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Cth>1\u003C\u002Fth>\n      \u003Ctd>DETR-DC5\u003C\u002Ftd>\n      \u003Ctd>R50\u003C\u002Ftd>\n      \u003Ctd>40.2\u003C\u002Ftd>\n      \u003Ctd>31.9\u003C\u002Ftd>\n      
\u003Ctd>44.6\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r50-dc5-panoptic-da08f1b1.pth\">download\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>165Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Cth>2\u003C\u002Fth>\n      \u003Ctd>DETR\u003C\u002Ftd>\n      \u003Ctd>R101\u003C\u002Ftd>\n      \u003Ctd>40.1\u003C\u002Ftd>\n      \u003Ctd>33\u003C\u002Ftd>\n      \u003Ctd>45.1\u003C\u002Ftd>\n      \u003Ctd>\u003Ca href=\"https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r101-panoptic-40021d53.pth\">download\u003C\u002Fa>\u003C\u002Ftd>\n      \u003Ctd>237Mb\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\nCheck out our [panoptic colab](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002Fdetr\u002Fblob\u002Fcolab\u002Fnotebooks\u002FDETR_panoptic.ipynb)\nto see how to use and visualize DETR's panoptic segmentation predictions.\n\n# Notebooks\n\nWe provide a few notebooks in Colab to help you get a grasp on DETR:\n* [DETR's hands-on Colab Notebook](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002Fdetr\u002Fblob\u002Fcolab\u002Fnotebooks\u002Fdetr_attention.ipynb): Shows how to load a model from hub, generate predictions, then visualize the attention of the model (similar to the figures of the paper)\n* [Standalone Colab Notebook](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002Fdetr\u002Fblob\u002Fcolab\u002Fnotebooks\u002Fdetr_demo.ipynb): In this notebook, we demonstrate how to implement a simplified version of DETR from the ground up in 50 lines of Python, then visualize the predictions. 
It is a good starting point if you want to gain a better understanding of the architecture and poke around before diving into the codebase.\n* [Panoptic Colab Notebook](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Ffacebookresearch\u002Fdetr\u002Fblob\u002Fcolab\u002Fnotebooks\u002FDETR_panoptic.ipynb): Demonstrates how to use DETR for panoptic segmentation and plot the predictions.\n\n\n# Usage - Object detection\nThere are no extra compiled components in DETR and package dependencies are minimal,\nso the code is very simple to use. We provide instructions for installing dependencies via conda.\nFirst, clone the repository locally:\n```\ngit clone https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fdetr.git\n```\nThen, install PyTorch 1.5+ and torchvision 0.6+:\n```\nconda install -c pytorch pytorch torchvision\n```\nInstall pycocotools (for evaluation on COCO) and scipy (for training):\n```\nconda install cython scipy\npip install -U 'git+https:\u002F\u002Fgithub.com\u002Fcocodataset\u002Fcocoapi.git#subdirectory=PythonAPI'\n```\nThat's it; you should now be able to train and evaluate detection models.\n\n(Optional) To work with panoptic segmentation, install panopticapi:\n```\npip install git+https:\u002F\u002Fgithub.com\u002Fcocodataset\u002Fpanopticapi.git\n```\n\n## Data preparation\n\nDownload and extract COCO 2017 train and val images with annotations from\n[http:\u002F\u002Fcocodataset.org](http:\u002F\u002Fcocodataset.org\u002F#download).\nWe expect the directory structure to be the following:\n```\npath\u002Fto\u002Fcoco\u002F\n  annotations\u002F  # annotation json files\n  train2017\u002F    # train images\n  val2017\u002F      # val images\n```\n\n## Training\nTo train baseline DETR on a single node with 8 GPUs for 300 epochs, run:\n```\npython -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path \u002Fpath\u002Fto\u002Fcoco \n```\nA single epoch takes 28 minutes, so 300-epoch training\ntakes around 6 days on a single machine with 8 V100 
cards.\nTo ease reproduction of our results we provide\n[results and training logs](https:\u002F\u002Fgist.github.com\u002Fszagoruyko\u002Fb4c3b2c3627294fc369b899987385a3f)\nfor a 150-epoch schedule (3 days on a single machine), achieving 39.5\u002F60.3 AP\u002FAP50.\n\nWe train DETR with AdamW, setting the learning rate to 1e-4 in the transformer and 1e-5 in the backbone.\nHorizontal flips, scales and crops are used for augmentation.\nImages are rescaled to have min size 800 and max size 1333.\nThe transformer is trained with dropout of 0.1, and the whole model is trained with gradient clipping of 0.1.\n\n\n## Evaluation\nTo evaluate DETR R50 on COCO val5k with a single GPU, run:\n```\npython main.py --batch_size 2 --no_aux_loss --eval --resume https:\u002F\u002Fdl.fbaipublicfiles.com\u002Fdetr\u002Fdetr-r50-e632da11.pth --coco_path \u002Fpath\u002Fto\u002Fcoco\n```\nWe provide results for all DETR detection models in this\n[gist](https:\u002F\u002Fgist.github.com\u002Fszagoruyko\u002F9c9ebb8455610958f7deaa27845d7918).\nNote that numbers vary depending on batch size (number of images) per GPU.\nNon-DC5 models were trained with batch size 2, and DC5 with 1,\nso DC5 models show a significant drop in AP if evaluated with more\nthan 1 image per GPU.\n\n## Multinode training\nDistributed training is available via Slurm and [submitit](https:\u002F\u002Fgithub.com\u002Ffacebookincubator\u002Fsubmitit):\n```\npip install submitit\n```\nTrain baseline DETR-6-6 model on 4 nodes for 300 epochs:\n```\npython run_with_submitit.py --timeout 3000 --coco_path \u002Fpath\u002Fto\u002Fcoco\n```\n\n# Usage - Segmentation\n\nWe show that it is relatively straightforward to extend DETR to predict segmentation masks. We mainly demonstrate strong panoptic segmentation results.\n\n## Data preparation\n\nFor panoptic segmentation, you need the panoptic annotations in addition to the COCO dataset (see above for the COCO dataset). 
You need to download and extract the [annotations](http:\u002F\u002Fimages.cocodataset.org\u002Fannotations\u002Fpanoptic_annotations_trainval2017.zip).\nWe expect the directory structure to be the following:\n```\npath\u002Fto\u002Fcoco_panoptic\u002F\n  annotations\u002F  # annotation json files\n  panoptic_train2017\u002F    # train panoptic annotations\n  panoptic_val2017\u002F      # val panoptic annotations\n```\n\n## Training\n\nWe recommend training segmentation in two stages: first train DETR to detect all the boxes, and then train the segmentation head.\nFor panoptic segmentation, DETR must learn to detect boxes for both stuff and things classes. You can train it on a single node with 8 GPUs for 300 epochs with:\n```\npython -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --coco_path \u002Fpath\u002Fto\u002Fcoco  --coco_panoptic_path \u002Fpath\u002Fto\u002Fcoco_panoptic --dataset_file coco_panoptic --output_dir \u002Foutput\u002Fpath\u002Fbox_model\n```\nFor instance segmentation, you can simply train a normal box model (or use a pre-trained one we provide).\n\nOnce you have a box model checkpoint, you need to freeze it, and train the segmentation head in isolation.\nFor panoptic segmentation you can train on a single node with 8 GPUs for 25 epochs:\n```\npython -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --masks --epochs 25 --lr_drop 15 --coco_path \u002Fpath\u002Fto\u002Fcoco  --coco_panoptic_path \u002Fpath\u002Fto\u002Fcoco_panoptic  --dataset_file coco_panoptic --frozen_weights \u002Foutput\u002Fpath\u002Fbox_model\u002Fcheckpoint.pth --output_dir \u002Foutput\u002Fpath\u002Fsegm_model\n```\nFor instance segmentation only, simply remove the `dataset_file` and `coco_panoptic_path` arguments from the above command line.\n\n# License\nDETR is released under the Apache 2.0 license. Please see the [LICENSE](LICENSE) file for more information.\n\n# Contributing\nWe actively welcome your pull requests! 
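Please see [CONTRIBUTING.md](.github\u002FCONTRIBUTING.md) and [CODE_OF_CONDUCT.md](.github\u002FCODE_OF_CONDUCT.md) for more info.\n","DETR is a Transformer-based end-to-end object detection framework. It replaces the traditionally complex object detection pipeline with a Transformer architecture and outputs detections directly through a global loss function and parallel prediction, which simplifies the model structure. It reaches 42 AP on the COCO dataset while using fewer computational resources. The model is particularly well suited to applications that need efficient and accurate object detection, such as autonomous driving and video surveillance. The project provides concise PyTorch implementation code and pretrained models that are easy to understand and experiment with, lowering the barrier to entry for object detection tasks.","2026-06-11 03:33:54","high_star"]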