Deep Texton-Coherence Network for Camouflaged Object Detection

Bibliographic details
Published in:	IEEE Transactions on Multimedia 2023, Vol. 25, pp. 5155-5165
Main authors:	Zhai, Wei; Cao, Yang; Xie, HaiYong; Zha, Zheng-Jun
Format:	Article
Language:	English
Subjects:	Ablation; Camouflaged object detection; Coherence; Convolution; Deep learning; Feature extraction; Formability; Modules; Object detection; Object recognition; Organizations; Representations; Semantics; Spatial coherence; Texture representation; Visual tasks

Description
Summary:	Camouflaged object detection is a challenging visual task because the appearance and morphology of foreground objects and background regions are highly similar. Recent CNN-based studies have gradually integrated the high-level semantic information and the low-level local features of images through hierarchical and progressive structures to achieve camouflaged object detection. However, these methods ignore the spatial statistical properties of the local context, which are a critical cue for distinguishing and describing camouflaged objects. To address this problem, we propose a novel Deep Texton-Coherence Network (DTC-Net) that leverages the spatial organization of textons in the foreground and background regions as discriminative cues for camouflaged object detection. Specifically, a Local Bilinear module (LB) is devised to obtain a texton representation that is robust to trivial details and illumination changes, by replacing the classic first-order linear operations with bilinear second-order statistical operations in the convolution process. Next, these texton representations are associated with a Spatial Coherence Organization module (SCO) to capture irregular spatial coherence via a deformable convolutional strategy, and the descriptions of the textons extracted by the LB module are then used as weights to suppress features that are spatially adjacent but have different representations. Finally, the texton-coherence representation is integrated with the original features at different levels to achieve camouflaged object detection. Evaluation on the three most challenging camouflaged object detection datasets demonstrates the superiority of the proposed model over state-of-the-art methods. Furthermore, our ablation studies and performance analyses demonstrate the effectiveness of the texton-coherence module.
ISSN:	1520-9210 1941-0077
DOI:	10.1109/TMM.2022.3188401
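
Note on the summary: the contrast it draws between first-order linear operations and bilinear second-order statistics can be illustrated with a toy example. The sketch below is not the authors' LB module; the function names, the two-channel features, and the four-pixel patches are all hypothetical and chosen only to show why a second-order summary can separate textures that a first-order summary cannot.

```python
from typing import List

Vector = List[float]


def first_order(patch: List[Vector]) -> Vector:
    """Mean feature vector over a patch (a linear, first-order summary)."""
    n = len(patch)
    c = len(patch[0])
    return [sum(v[i] for v in patch) / n for i in range(c)]


def second_order(patch: List[Vector]) -> List[Vector]:
    """Average outer product v * v^T over a patch (a bilinear summary)."""
    n = len(patch)
    c = len(patch[0])
    return [
        [sum(v[i] * v[j] for v in patch) / n for j in range(c)]
        for i in range(c)
    ]


# Two hypothetical 4-pixel patches with 2 feature channels per pixel.
# "striped" alternates between the channels; "flat" is uniform.
striped = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
flat = [[0.5, 0.5]] * 4

# Both patches have the same mean, so a first-order summary cannot tell
# them apart.
print(first_order(striped), first_order(flat))

# The channel co-occurrence matrices differ, so a second-order summary can.
print(second_order(striped))
print(second_order(flat))
```

In this toy case both patches share the mean [0.5, 0.5], while the striped patch has zero cross-channel terms and the flat patch does not. How DTC-Net embeds such statistics inside the convolution itself is not specified in the summary and is not modelled here.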