Real-time Object Detection using Detection Transformer | Implemented from Scratch using PyTorch

Published: 03 June 2026
Channel: Raphael Senn

I implemented DETR from scratch in PyTorch and trained it on the Pascal VOC dataset using a single RTX 3060 Ti GPU. Training took about 63 hours (2.6 days). The model was trained on Pascal VOC trainval07+12 (16,551 images) and evaluated on Pascal VOC test07 (4,952 images), where it achieved an mAP50 of 0.7563.
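For readers unfamiliar with the metric, here is a minimal sketch of how AP at an IoU threshold of 0.5 can be computed for a single class. It is not taken from the linked repository. The function names, the input layout, and the use of all-point interpolation are assumptions made for illustration, and the sketch ignores the "difficult" flags in the VOC annotations.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(detections, ground_truth, iou_thresh=0.5):
    """AP for one class (hypothetical helper, all-point interpolation assumed).

    detections:   list of (image_id, score, box)
    ground_truth: dict mapping image_id -> list of boxes
    """
    n_gt = sum(len(boxes) for boxes in ground_truth.values())
    if n_gt == 0:
        return 0.0

    # Each ground-truth box may be matched by at most one detection.
    matched = {img: [False] * len(boxes) for img, boxes in ground_truth.items()}

    # Visit detections from highest to lowest confidence.
    tp = []
    for img, _score, box in sorted(detections, key=lambda d: -d[1]):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(ground_truth.get(img, [])):
            iou = box_iou(box, gt_box)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_iou >= iou_thresh and not matched[img][best_j]:
            matched[img][best_j] = True
            tp.append(1)  # true positive
        else:
            tp.append(0)  # false positive (low IoU or duplicate match)

    # Cumulative precision and recall after each detection.
    precisions, recalls, cum_tp = [], [], 0
    for k, t in enumerate(tp, start=1):
        cum_tp += t
        precisions.append(cum_tp / k)
        recalls.append(cum_tp / n_gt)

    # Make precision monotonically non-increasing, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

mAP50 is then the mean of this per-class value over the 20 Pascal VOC classes. The official VOC2007 protocol uses an 11-point interpolation instead, so numbers from this sketch may differ slightly from those produced by the official evaluation code.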

Code:
https://github.com/raphaelsenn/DETR

Original paper:
End-to-End Object Detection with Transformers, Carion et al., 2020
https://arxiv.org/abs/2005.12872