Adaptive Long-neck Network with Atrous-Residual Structure for Instance Segmentation


Bibliographic Details
Published in: IEEE Sensors Journal, 2023-04, Vol. 23 (7), p. 1-1
Authors: Geng, Wenjie, Cao, Zhiqiang, Guan, Peiyu, Ren, Guangli, Yu, Junzhi, Jing, Fengshui
Format: Article
Language: English
Description
Abstract: Instance segmentation is an important yet challenging task in the field of computer vision. Existing mainstream single-stage solutions with parameterized mask representations design neck modules to fuse features from different layers; however, instance segmentation performance is still restricted by the layer-by-layer transmission scheme. In this paper, an instance segmentation framework with an adaptive long-neck network and an atrous-residual structure is proposed. The long-neck network is composed of two bi-directional fusion units, which are cascaded to facilitate information exchange among features of different layers along top-down and bottom-up pathways. Specifically, a new cross-layer transmission scheme is introduced in the top-down pathway to achieve hybrid dense fusion of multi-scale features, and the weights of the different features are learned adaptively according to their respective contributions, which promotes network convergence. Meanwhile, a bottom-up pathway further complements the features with more location cues. In this way, high-level semantic information and low-level location information are tightly integrated. Furthermore, an atrous-residual structure is added to the mask prototype branch of the instance prediction to capture more contextual information, which contributes to the generation of high-quality masks. Experimental results indicate that the proposed method achieves effective segmentation and that the output masks match object contours well.
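As a rough illustration of two ideas named in the abstract, the sketch below shows (a) an adaptively weighted fusion of multi-scale features, where the contribution of each feature map is a learned parameter, and (b) an atrous-residual block built from a dilated convolution plus a skip connection. This is a minimal PyTorch sketch under assumed module names, channel sizes, and dilation rates; it is not the authors' implementation and does not reproduce the exact long-neck architecture.

```python
# Hypothetical sketch (not the authors' code): adaptively weighted multi-scale
# fusion and an atrous-residual block with a dilated convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusion(nn.Module):
    """Fuse several same-shaped feature maps with learned, softmax-normalized weights."""
    def __init__(self, num_inputs: int):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))

    def forward(self, feats):
        # Normalized weights reflect each feature's learned contribution.
        w = F.softmax(self.weights, dim=0)
        return sum(w[i] * f for i, f in enumerate(feats))

class AtrousResidualBlock(nn.Module):
    """Residual block whose first convolution is dilated to capture more context."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3,
                               padding=dilation, dilation=dilation)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = self.conv2(out)
        # The skip connection preserves local detail alongside the dilated context.
        return F.relu(out + x)

# Example: fuse three resized pyramid levels, then refine with an atrous-residual block.
if __name__ == "__main__":
    p3 = torch.randn(1, 256, 64, 64)
    p4 = F.interpolate(torch.randn(1, 256, 32, 32), size=(64, 64))
    p5 = F.interpolate(torch.randn(1, 256, 16, 16), size=(64, 64))
    fused = AdaptiveFusion(3)([p3, p4, p5])
    refined = AtrousResidualBlock(256)(fused)
    print(refined.shape)  # torch.Size([1, 256, 64, 64])
```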
ISSN: 1530-437X, 1558-1748
DOI: 10.1109/JSEN.2023.3244818