An Attention Module for Convolutional Neural Networks
Format: Article
Language: English
Abstract: The attention mechanism is regarded as an advanced technique for capturing long-range feature interactions and boosting the representational capability of convolutional neural networks. However, we identify two overlooked problems in current attentional-activations-based models: the approximation problem and the insufficient-capacity problem of the attention maps. To solve both problems together, we propose an attention module for convolutional neural networks built on an AW-convolution, in which the shape of the attention maps matches that of the weights rather than that of the activations. Our attention module is complementary to previous attention-based schemes, such as those that apply the attention mechanism to explore the relationship between channel-wise and spatial features. Experiments on several datasets for image classification and object detection show the effectiveness of the proposed module. In particular, it improves Top-1 accuracy on ImageNet classification by 1.00% over a ResNet101 baseline, and COCO-style Average Precision on COCO object detection by 0.63 on top of a Faster R-CNN baseline with a ResNet101-FPN backbone. When integrated with previous attentional-activations-based models, it further increases their Top-1 accuracy on ImageNet classification by up to 0.57% and their COCO-style Average Precision on COCO object detection by up to 0.45. Code and pre-trained models will be publicly available.
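The abstract gives no implementation details, but the core idea, an attention map shaped like the convolution weights (out_channels, in_channels, k, k) rather than like the activations, can be sketched. Below is a minimal, hypothetical PyTorch sketch, not the authors' released code: the `AWConv2d` name, the pooled-context attention network, and the reduction ratio are all illustrative assumptions.

```python
# Hypothetical sketch of a weight-shaped attention convolution, assuming the
# attention tensor is predicted from globally pooled input statistics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AWConv2d(nn.Module):
    """Convolution whose weights are modulated, per sample, by an
    input-conditioned attention map with the same shape as the weights."""

    def __init__(self, in_ch, out_ch, k=3, padding=1, reduction=4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.padding = padding
        self.shape = (out_ch, in_ch, k, k)
        # Small bottleneck MLP mapping a per-sample context vector to a
        # weight-shaped attention tensor (a design assumption, not from paper).
        self.fc = nn.Sequential(
            nn.Linear(in_ch, max(in_ch // reduction, 1)),
            nn.ReLU(inplace=True),
            nn.Linear(max(in_ch // reduction, 1), out_ch * in_ch * k * k),
        )

    def forward(self, x):
        b, in_ch, h, w = x.shape
        out_ch, _, k, _ = self.shape
        # Global context vector per sample: (B, in_ch).
        ctx = F.adaptive_avg_pool2d(x, 1).flatten(1)
        # Attention with the shape of the weights, one map per sample.
        attn = torch.sigmoid(self.fc(ctx)).view(b, *self.shape)
        # Attend the shared weights per sample, then apply all per-sample
        # filters in one call via a grouped convolution.
        wgt = (attn * self.weight.unsqueeze(0)).reshape(b * out_ch, in_ch, k, k)
        x = x.reshape(1, b * in_ch, h, w)
        y = F.conv2d(x, wgt, padding=self.padding, groups=b)
        return y.reshape(b, out_ch, y.shape[-2], y.shape[-1])

# Usage: a (2, 16, 32, 32) input yields a (2, 32, 32, 32) output.
x = torch.randn(2, 16, 32, 32)
y = AWConv2d(16, 32)(x)
```

The grouped-convolution trick (batch folded into the channel dimension, `groups=b`) is one common way to apply per-sample weights efficiently; the actual AW-convolution may compute or apply its attention differently.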
DOI: 10.48550/arxiv.2108.08205