Instance Localization for Self-supervised Detection Pretraining
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Prior research on self-supervised learning has led to considerable progress
on image classification, but often with degraded transfer performance on object
detection. The objective of this paper is to advance self-supervised pretrained
models specifically for object detection. Based on the inherent difference
between classification and detection, we propose a new self-supervised pretext
task, called instance localization. Image instances are pasted at various
locations and scales onto background images. The pretext task is to predict the
instance category given the composited images as well as the foreground
bounding boxes. We show that integration of bounding boxes into pretraining
promotes better task alignment and architecture alignment for transfer
learning. In addition, we propose an augmentation method on the bounding boxes
to further enhance the feature alignment. As a result, our model becomes weaker
at ImageNet semantic classification but stronger at image patch localization,
with an overall stronger pretrained model for object detection. Experimental
results demonstrate that our approach yields state-of-the-art transfer learning
results for object detection on PASCAL VOC and MSCOCO. |
---|---|
DOI: | 10.48550/arxiv.2102.08318 |
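
The compositing step described in the summary (pasting an image instance at a random location and scale onto a background, and keeping the foreground bounding box as the localization target) can be illustrated with a small sketch. This is not the authors' implementation: the function names `resize_nearest` and `paste_instance`, the scale range, the square foreground size, the list-of-rows image format, and the sample label are all assumptions made for illustration. The bounding-box augmentation mentioned in the summary is not sketched here, because the summary does not describe how it works.

```python
import random


def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of an image stored as a list of pixel rows."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def paste_instance(foreground, background, rng, min_scale=0.3, max_scale=0.8):
    """Paste `foreground` onto a copy of `background` at a random scale and place.

    Hypothetical helper: the scale range is an assumed default, and the
    foreground is resized relative to the background without preserving
    its aspect ratio. Returns the composite image and the foreground box
    as (x0, y0, x1, y1) with exclusive end coordinates.
    """
    bg_h, bg_w = len(background), len(background[0])
    scale = rng.uniform(min_scale, max_scale)
    fg_h = max(1, int(bg_h * scale))
    fg_w = max(1, int(bg_w * scale))
    patch = resize_nearest(foreground, fg_h, fg_w)

    # Choose a top-left corner so that the patch lies fully inside the background.
    y0 = rng.randint(0, bg_h - fg_h)
    x0 = rng.randint(0, bg_w - fg_w)

    composite = [row[:] for row in background]  # leave the input untouched
    for r in range(fg_h):
        composite[y0 + r][x0:x0 + fg_w] = patch[r]
    return composite, (x0, y0, x0 + fg_w, y0 + fg_h)


# Toy usage: an 8x8 background of zeros and a 4x4 foreground of ones.
rng = random.Random(0)
background = [[0] * 8 for _ in range(8)]
foreground = [[1] * 4 for _ in range(4)]
composite, box = paste_instance(foreground, background, rng)

# One pretraining sample: the composited image, the foreground box, and the
# instance category to predict (the label value here is an arbitrary example).
sample = {"image": composite, "box": box, "label": 7}
```

In this toy setup the pretext task would take `sample["image"]` and `sample["box"]` as input and predict `sample["label"]`; a real pipeline would work on RGB images and draw backgrounds and foregrounds from the training set.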