DropCluster: A structured dropout for convolutional networks
Saved in:

Main author: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Abstract: | Dropout as a regularizer in deep neural networks has been less effective in
convolutional layers than in fully connected layers. This is due to the fact
that dropout drops features randomly. When features are spatially correlated as
in the case of convolutional layers, information about the dropped pixels can
still propagate to the next layers via neighboring pixels. In order to address
this problem, more structured forms of dropout have been proposed. A drawback
of these methods is that they do not adapt to the data. In this work, we
introduce a novel structured regularization for convolutional layers, which we
call DropCluster. Our regularizer relies on data-driven structure. It finds
clusters of correlated features in convolutional layer outputs and drops the
clusters randomly at each iteration. The clusters are learned and updated
during model training so that they adapt both to the data and to the model
weights. Our experiments on the ResNet-50 architecture demonstrate that our
approach achieves better performance than DropBlock or other existing
structured dropout variants. We also demonstrate the robustness of our approach
when the size of training data is limited and when there is corruption in the
data at test time. |
DOI: | 10.48550/arxiv.2002.02997 |
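The abstract describes finding clusters of correlated features in convolutional layer outputs and dropping whole clusters at random. In the paper the clusters are learned and updated during training; as a rough illustration only, the following hypothetical NumPy sketch groups channels by a simple activation-correlation threshold and zeroes entire groups. The function names and the greedy threshold clustering are simplifications of ours, not the authors' algorithm.

```python
import numpy as np

def find_channel_clusters(feats, threshold=0.7):
    """Greedily group channels whose activations are highly correlated.

    feats: (batch, channels, H, W) feature maps.
    Returns a list of channel-index lists covering all channels.
    NOTE: a stand-in for the learned clustering described in the paper.
    """
    b, c, h, w = feats.shape
    flat = feats.transpose(1, 0, 2, 3).reshape(c, -1)  # one row per channel
    corr = np.corrcoef(flat)
    clusters, assigned = [], set()
    for i in range(c):
        if i in assigned:
            continue
        # every unassigned channel strongly correlated with channel i
        members = [j for j in range(c)
                   if j not in assigned and corr[i, j] >= threshold]
        assigned.update(members)
        clusters.append(members)
    return clusters

def drop_cluster(feats, clusters, drop_prob=0.1, rng=None):
    """Zero out whole clusters of channels at random (inverted dropout)."""
    rng = rng or np.random.default_rng()
    mask = np.ones(feats.shape[1], dtype=feats.dtype)
    for members in clusters:
        if rng.random() < drop_prob:
            mask[members] = 0.0
    keep = mask.mean()
    if keep > 0:
        mask = mask / keep  # rescale survivors so expected activation is unchanged
    return feats * mask[None, :, None, None]
```

Because an entire correlated group vanishes together, information about the dropped features cannot leak to the next layer through correlated neighbors, which is the motivation the abstract gives for structured dropout.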