Depthwise grouped convolution for object detection
Published in: | Machine Vision and Applications 2021-11, Vol.32 (6), Article 115 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Object detection usually adopts two-stage end-to-end networks, which use a backbone network (such as VGG or ResNet) for feature extraction and combine it with a region proposal network (RPN) for object localization and classification. In this paper, we explore a novel depthwise grouped convolution (DGC) in the backbone network that integrates channel grouping with depthwise separable convolution. DGC shares convolution parameters across channels, which reduces the number of parameters and speeds up training. In particular, channel split and shuffle strategies are introduced to enhance information exchange between the channel groups in the DGC block, which can prevent the performance drop caused by insufficient object samples. Furthermore, a non-local block is adopted in the RPN to focus on small objects that are hard to identify. Finally, we introduce a margin-based loss that guides model training together with the classification and regression losses. Experiments conducted on the VOC2007, VOC2012 and COCO2017 datasets demonstrate the efficiency and effectiveness of our method for object detection. |
---|---|
ISSN: | 0932-8092, 1432-1769 |
DOI: | 10.1007/s00138-021-01243-0 |
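
The abstract attributes the parameter savings to channel grouping combined with depthwise separable convolution, and names channel shuffle as the means of exchanging information between groups. The sketch below illustrates those generic building blocks only. It is not the paper's DGC block, whose exact layout the abstract does not give. The function names (`conv_params`, `depthwise_separable_params`, `channel_shuffle`) and the 256-channel, 3x3 example are assumptions chosen for illustration, and the shuffle follows the common reshape-and-transpose pattern rather than anything confirmed by this record.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count (bias ignored) of a k x k convolution split into `groups` channel groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per channel) plus a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # c_in * k * k weights
    pointwise = conv_params(c_in, c_out, 1)              # c_in * c_out weights
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """Interleave channels across groups: view as (groups, n), transpose, flatten."""
    n, remainder = divmod(len(channels), groups)
    if remainder:
        raise ValueError("number of channels must be divisible by groups")
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical layer: 256 input channels, 256 output channels, 3 x 3 kernel.
    print("standard conv:       ", conv_params(256, 256, 3))
    print("grouped conv (g=4):  ", conv_params(256, 256, 3, groups=4))
    print("depthwise separable: ", depthwise_separable_params(256, 256, 3))
    # Six channels in two groups: [0, 1, 2] and [3, 4, 5].
    print("shuffled channels:   ", channel_shuffle(list(range(6)), groups=2))
```

For this hypothetical layer, grouping with four groups cuts the weight count to a quarter of the standard convolution, and the depthwise separable variant cuts it further. Without a shuffle, each group would only ever see its own channels in stacked grouped layers, which is the information-exchange problem the abstract refers to.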