‘we boost the YOLOv3 to 47.3% AP (YOLOX-DarkNet53) on COCO with 640 × 640 resolution, surpassing the current best practice of YOLOv3 (44.3% AP, ultralytics version) by a large margin.’
Compared with YOLOv5 at 640 × 640: YOLOX-L reaches 50.0% AP, surpassing YOLOv5-L by 1.8% AP.
YOLOX
Implementation details
300 epochs in total, including 5 epochs of warmup
lr = init_lr * (batch_size / 64) (linear scaling), init_lr = 0.01, cosine lr schedule (8 GPUs, batch size 128)
Input size: drawn evenly from 448 to 832 in steps of 32
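The settings above can be sketched as a small schedule function. This is only an illustration under stated assumptions: the function name `learning_rate`, the per-epoch granularity, and the linear warmup shape are not from the paper notes, which give only the scaling rule, the initial lr, the warmup length, and the cosine schedule.

```python
import math

# Candidate training resolutions: 448, 480, ..., 832 (steps of 32).
INPUT_SIZES = list(range(448, 832 + 1, 32))


def learning_rate(epoch, batch_size=128, init_lr=0.01,
                  total_epochs=300, warmup_epochs=5):
    """Learning rate for a 0-indexed epoch: linear scaling, warmup, cosine decay."""
    base_lr = init_lr * batch_size / 64  # linear scaling rule from the notes
    if epoch < warmup_epochs:
        # Assumption: linear ramp during warmup (the notes do not give its shape).
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr toward 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))
```

With the default batch size of 128, the scaled base rate is 0.01 * 128 / 64 = 0.02, which is the value reached at the end of warmup.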
YOLOv3 baseline
DarkNet53 + SPP layer
==Adds EMA weight updating, a cosine lr schedule, IoU loss, and an IoU-aware branch. BCE loss is used for training the cls and obj branches, and IoU loss for training the reg branch.==
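Two of the baseline ingredients are easy to illustrate in isolation. The sketch below is not the authors' implementation: the function names, the `decay` default, the `(x1, y1, x2, y2)` box format, and the plain `1 - IoU` form of the loss are all assumptions made for illustration.

```python
def ema_update(ema_weights, model_weights, decay=0.9998):
    """One EMA step over a dict of scalar weights (decay value is an assumption)."""
    return {
        name: decay * ema_weights[name] + (1.0 - decay) * model_weights[name]
        for name in ema_weights
    }


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Regression loss in the assumed form 1 - IoU (0 for a perfect match)."""
    return 1.0 - iou(pred_box, target_box)
```

The EMA copy changes slowly when `decay` is close to 1, which is why it is typically the copy used for evaluation rather than the raw training weights.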
Decoupled head
Strong data augmentation
Mosaic and MixUp, turned off for the last 15 epochs
With strong data augmentation, ImageNet pre-training is no longer beneficial, ==so all the following models are trained from scratch.==
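The "off for the last 15 epochs" rule amounts to a simple epoch check. A minimal sketch, assuming 0-indexed epochs and the 300-epoch schedule from the implementation details; the function name `use_strong_augmentation` is hypothetical.

```python
def use_strong_augmentation(epoch, total_epochs=300, no_aug_epochs=15):
    """True while Mosaic and MixUp should still be applied (0-indexed epochs)."""
    return epoch < total_epochs - no_aug_epochs
```

Under these assumptions, epochs 0 through 284 use strong augmentation and epochs 285 through 299 train without it.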
Anchor-free
Multi positives
Not only the center location is assigned as positive: the center 3×3 area is assigned as positives (named ‘center sampling’ in FCOS), which helps balance positive/negative sampling.
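The 3×3 rule can be sketched on a single feature-map grid. This is an illustration only, not the authors' code: the function name, the `(col, row)` cell convention, and the clipping at the map border are assumptions.

```python
def center_positives(cx, cy, stride, grid_w, grid_h, radius=1):
    """Grid cells (col, row) in the (2*radius+1) x (2*radius+1) area
    around the cell that contains the box center (cx, cy), in pixels."""
    col, row = int(cx // stride), int(cy // stride)
    cells = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            # Drop cells that fall outside the feature map.
            if 0 <= c < grid_w and 0 <= r < grid_h:
                cells.append((c, r))
    return cells
```

With `radius=1`, an object whose center is away from the border gets 9 positive locations instead of 1; setting `radius=0` recovers the single-center assignment.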
SimOTA:
==Zheng Ge, Songtao Liu, Zeming Li, Osamu Yoshie, and Jian Sun. OTA: Optimal transport assignment for object detection. In CVPR, 2021.==