PointINS: Point-Based Instance Segmentation

In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features. Differentiating multiple potential instances within a single PoI feature is challenging, because learning a high-dimensional mask feature for each instance using vanilla convolution dema...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:IEEE transactions on pattern analysis and machine intelligence 2022-10, Vol.44 (10), p.6377-6392
Hauptverfasser: Qi, Lu, Wang, Yi, Chen, Yukang, Chen, Ying-Cong, Zhang, Xiangyu, Sun, Jian, Jia, Jiaya
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features. Differentiating multiple potential instances within a single PoI feature is challenging, because learning a high-dimensional mask feature for each instance using vanilla convolution demands a heavy computing burden. To address this challenge, we propose an instance-aware convolution. It decomposes this mask representation learning task into two tractable modules as instance-aware weights and instance-agnostic features. The former is to parametrize convolution for producing mask features corresponding to different instances, improving mask learning efficiency by avoiding employing several independent convolutions. Meanwhile, the latter serves as mask templates in a single point. Together, instance-aware mask features are computed by convolving the template with dynamic weights, used for the mask prediction. Along with instance-aware convolution, we propose PointINS, a simple and practical instance segmentation approach, building upon dense one-stage detectors. Through extensive experiments, we evaluated the effectiveness of our framework built upon RetinaNet and FCOS. PointINS in ResNet101 backbone achieves a 38.3 mask mean average precision (mAP) on COCO dataset, outperforming existing point-based methods by a large margin. It gives a comparable performance to the region-based Mask R-CNN K. He, G. Gkioxari, P. Dollár, and R. Girshick, "Mask R-CNN," in Proc. IEEE Int. Conf. Comput. Vis. , 2017, pp. 2980-2988 with faster inference.
ISSN:0162-8828
2160-9292
1939-3539
DOI:10.1109/TPAMI.2021.3085295