CCDFormer: A dual-backbone complex crack detection network with transformer


Bibliographic details
Published in: Pattern recognition, 2025-05, Vol. 161, p. 111251, Article 111251
Main authors: Hu, Xiangkun; Li, Hua; Feng, Yixiong; Qian, Songrong; Li, Jian; Li, Shaobo
Format: Article
Language: English
Description
Summary: Concrete crack detection is a critical aspect of infrastructure maintenance. However, existing methods often fail to deliver satisfactory results in real-world scenarios where various detection challenges coexist. We propose a Transformer-based model to enhance feature extraction for complex crack detection (CCDFormer). CCDFormer employs a dual-backbone U-shaped structure to independently capture crack features from different perspectives, avoiding interference. Deformable linear convolution aligns with crack structures, while the proposed feature enhancement module enriches semantic features by boosting local features at multiple scales. The pyramid-shaped Transformer models long-range dependencies across different scales. A carefully designed feature fusion module addresses the shortcomings of local and contextual features, generating robust crack features. On a challenging public dataset for concrete crack detection, CCDFormer improves accuracy, recall, F-measure, and IoU by 3.54%, 0.71%, 2.17%, and 1.48%, respectively, compared to existing models. CCDFormer demonstrates higher precision and crack detection rates across various challenges, proving practical for real-world crack detection.
Highlights:
• A dual-backbone model combining a Convolutional Neural Network and a Transformer in parallel is proposed.
• A deformable linear convolution method is introduced to align with the topological structure of cracks.
• A method for enhancing crack semantic features is presented.
• A more flexible positional encoding scheme and a lightweight design approach are employed to optimize the Transformer.
• A global and local fusion module is proposed to capture more robust crack features.
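The summary reports gains in recall, F-measure, and IoU without defining them. As a rough illustration, the sketch below computes these metrics pixel-wise from a predicted and a ground-truth binary crack mask. It uses the common textbook definitions, which this record does not confirm as the authors' evaluation protocol; the function name `segmentation_metrics` and the toy masks are hypothetical.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for two equally sized binary masks (rows of 0/1).

    Assumption: standard confusion-matrix definitions, not necessarily
    the exact protocol used in the CCDFormer paper.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # crack pixel correctly detected
            elif p and not t:
                fp += 1          # background flagged as crack
            elif t:
                fn += 1          # crack pixel missed
            else:
                tn += 1          # background correctly ignored

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_measure": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
    }


# Hypothetical 2x3 masks: 1 marks a crack pixel, 0 marks background.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
metrics = segmentation_metrics(pred, truth)
```

IoU ignores true negatives, so it is not inflated by the large background area typical of crack images, whereas pixel accuracy is.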
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.111251