Searching sharing relationship for instance segmentation decoder

Detailed description

Bibliographic details
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-09, Vol.53 (18), p.20938-20949
Main authors: Xi, Yuling, Wang, Ning, Wan, Shaohua, Wang, Xiaoming, Wang, Peng, Zhang, Yanning
Format: Article
Language: eng
Online access: Full text
Description
Summary: Instance segmentation is a typical visual task that requires per-pixel mask prediction with a category label for each instance. In the decoder of an instance segmentation network, parallel branches or towers are commonly adopted to handle instance-level and dense-level predictions. However, this parallelism ignores inter-branch and intra-branch relationships. Moreover, how the different branches should be connected is unclear and difficult to explore manually in practice. To address these issues, we introduce Neural Architecture Search (NAS) to automatically search for hardware- and memory-friendly feature-sharing branches. Concretely, for instance segmentation, we design a search space that covers both the operations and the sharing connections of parallel branches. Through a tailored reinforcement learning (RL) paradigm, we can efficiently search multiple architectures with different sharing patterns and tap more feature selection possibilities. Our method is generically useful and can be transferred to analogous multi-task networks. The searched architecture shares features in the middle of the head branches and utilizes instance-level head features to generate pixel-level predictions. Extensive experiments demonstrate the effectiveness of the method, which surpasses classical parallel decoder networks, exceeding BlendMask by 1.2% on bounding box mAP and 0.9% on segmentation mAP.
ISSN: 0924-669X, 1573-7497
DOI:10.1007/s10489-022-04434-y
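
Illustrative sketch: the summary describes a search space that covers both per-branch operations and sharing connections between parallel branches, explored with a reinforcement learning controller. The toy example below shows what such a loop could look like in its simplest form. It is not the authors' implementation: the operation names, the sharing options, the branch depth, the REINFORCE-style update and above all the reward function are assumptions made for illustration. A real search would replace the reward with a measured validation score and a hardware or memory cost.

```python
import math
import random

# Hypothetical search space (names are illustrative, not taken from the article).
OPS = ["conv3x3", "conv5x5", "dilated_conv3x3", "skip"]
SHARE = ["none", "box_to_mask", "mask_to_box"]
NUM_LAYERS = 4  # assumed depth of each head branch


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Controller:
    """One independent categorical policy per decision, updated with REINFORCE."""

    def __init__(self, choice_sizes, lr=0.1):
        self.logits = [[0.0] * n for n in choice_sizes]
        self.lr = lr
        self.baseline = 0.0  # moving-average reward baseline

    def sample(self, rng):
        return [rng.choices(range(len(l)), weights=softmax(l))[0]
                for l in self.logits]

    def update(self, actions, reward):
        advantage = reward - self.baseline
        self.baseline = 0.9 * self.baseline + 0.1 * reward
        for logits, a in zip(self.logits, actions):
            probs = softmax(logits)
            for i in range(len(logits)):
                # Gradient of log p(a) with respect to logit i.
                grad = (1.0 if i == a else 0.0) - probs[i]
                logits[i] += self.lr * advantage * grad


def decode(actions):
    """Turn a flat action list into one dict per layer."""
    arch = []
    for k in range(NUM_LAYERS):
        box_op, mask_op, share = actions[3 * k: 3 * k + 3]
        arch.append({"box_op": OPS[box_op], "mask_op": OPS[mask_op],
                     "share": SHARE[share]})
    return arch


def toy_reward(arch):
    """Stand-in reward (assumption): favors mid-branch sharing, penalizes cost."""
    cost = {"conv3x3": 0.1, "conv5x5": 0.3, "dilated_conv3x3": 0.15, "skip": 0.0}
    score = 0.0
    for depth, layer in enumerate(arch):
        if layer["share"] != "none" and 0 < depth < len(arch) - 1:
            score += 1.0
        score -= cost[layer["box_op"]] + cost[layer["mask_op"]]
    return score


def search(steps=300, seed=0):
    """Sample architectures, score them, and keep the best one seen."""
    rng = random.Random(seed)
    controller = Controller([len(OPS), len(OPS), len(SHARE)] * NUM_LAYERS)
    best_arch, best_reward = None, -math.inf
    for _ in range(steps):
        actions = controller.sample(rng)
        arch = decode(actions)
        reward = toy_reward(arch)
        controller.update(actions, reward)
        if reward > best_reward:
            best_arch, best_reward = arch, reward
    return best_arch, best_reward
```

Calling `search()` returns the best sampled architecture as a list of per-layer dictionaries together with its toy reward. Each decision gets its own independent policy here purely for brevity; a controller that conditions later decisions on earlier ones would be a natural alternative.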