Adapting Segment Anything Model for Unseen Object Instance Segmentation

Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments. Previous approaches require full supervision on large-scale tabletop datasets for effective pretraining. In this paper, we propose UOIS-SAM, a data-efficient solution for the UOIS task...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2024-09
Hauptverfasser: Cao, Rui, Song, Chuanxin, Yang, Biqi, Wang, Jiangliu, Pheng-Ann Heng, Yun-Hui, Liu
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments. Previous approaches require full supervision on large-scale tabletop datasets for effective pretraining. In this paper, we propose UOIS-SAM, a data-efficient solution for the UOIS task that leverages SAM's high accuracy and strong generalization capabilities. UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder, mitigating issues introduced by the SAM baseline, such as background confusion and over-segmentation, especially in scenarios involving occlusion and texture-rich objects. Extensive experimental results on OCID, OSD, and additional photometrically challenging datasets including PhoCAL and HouseCat6D, demonstrate that, even using only 10% of the training samples compared to previous methods, UOIS-SAM achieves state-of-the-art performance in unseen object segmentation, highlighting its effectiveness and robustness in various tabletop scenes.
ISSN:2331-8422