Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Saved in:
Main authors:
Format: Article
Language: English
Keywords:
Online access: Order full text
Abstract: To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images. Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only …)
DOI: 10.48550/arxiv.2011.09157
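
The abstract only describes the idea at a high level: a contrastive loss applied per pixel (local feature) between two augmented views, rather than per image. Below is a minimal sketch of such a pixel-level loss in PyTorch. It is not the authors' released implementation; the argmax-based cross-view matching, the memory-bank negatives, the feature shapes, and the temperature value are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def dense_contrastive_loss(feat_q, feat_k, queue, temperature=0.2):
    """Pixel-level InfoNCE-style loss between two views (illustrative sketch).

    feat_q, feat_k: (B, C, H, W) dense features of two augmented views of the
                    same images (e.g. from a backbone plus a dense projection head).
    queue:          (K, C) memory bank of local features from other images,
                    used as negatives (assumed here, in the spirit of MoCo-v2).
    """
    B, C, H, W = feat_q.shape
    q = F.normalize(feat_q.flatten(2), dim=1)        # (B, C, N), N = H*W
    k = F.normalize(feat_k.flatten(2), dim=1)        # (B, C, N)
    queue = F.normalize(queue, dim=1)                # (K, C)

    # Pair each query location with its most similar location in the other
    # view -- a simple stand-in for the cross-view correspondence between
    # local features mentioned in the abstract.
    sim = torch.einsum('bci,bcj->bij', q, k)         # (B, N, N)
    match = sim.argmax(dim=2)                        # (B, N)
    k_pos = torch.gather(k, 2, match.unsqueeze(1).expand(-1, C, -1))

    # Positive logit: query vs. its matched location; negatives from the queue.
    l_pos = (q * k_pos).sum(dim=1).unsqueeze(2)      # (B, N, 1)
    l_neg = torch.einsum('bcn,kc->bnk', q, queue)    # (B, N, K)
    logits = torch.cat([l_pos, l_neg], dim=2) / temperature

    # Each location's positive sits at index 0 of its logit row.
    labels = torch.zeros(B * q.shape[2], dtype=torch.long, device=q.device)
    return F.cross_entropy(logits.reshape(-1, logits.shape[2]), labels)
```

In practice the two feature maps would come from a query encoder and a momentum encoder applied to two augmentations of the same image batch, exactly as in image-level contrastive frameworks; the only structural change sketched here is that the (dis)similarity is computed and optimized per local feature instead of per pooled image embedding.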