PD-DETR: towards efficient parallel hybrid matching with transformer for photovoltaic cell defects detection

Published in: Complex & Intelligent Systems 2024-12, Vol.10 (6), p.7421-7434
Main authors: Zhao, Langyue; Wu, Yiquan; Yuan, Yubin
Format: Article
Language: English
Summary: Defect detection for photovoltaic (PV) cell images is a challenging task due to the small size of the defect features and the complexity of the background characteristics. Modern detectors rely mostly on proxy learning objectives for prediction and on manual post-processing components. One-to-one set matching is a critical design choice in the DEtection TRansformer (DETR): it provides end-to-end capability, so the detector does not need hand-crafted non-maximum suppression (NMS). To detect PV cell defects faster and more accurately, a method called the PV cell Defects DEtection Transformer (PD-DETR) is proposed. To address the slow convergence caused by DETR's direct translation of image feature maps into detection results, we designed a hybrid feature module. To balance performance and computation, the image features are passed through a scoring network and a dilated convolution, respectively, to obtain the foreground fine feature and the contour high-frequency feature. The two features are then adaptively intercepted and fused. Adding high-frequency information improves the model's capacity to detect small-scale defects against complex backgrounds. Furthermore, one-to-one set matching assigns too few positive queries to each defect target, which results in sparse supervision of the encoder and impairs the decoder's ability to learn attention. Consequently, we improved detection by combining the original DETR with a one-to-many matching branch. Specifically, two Faster R-CNN detection heads are added during training. To maintain the end-to-end benefits of DETR, inference is still performed using the original one-to-one set matching. Our model achieves 64.7% AP on the PVEL-AD dataset.
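The difference between the two matching schemes named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the IoU-only matching score, the brute-force search (DETR uses the Hungarian algorithm on a combined class and box cost), and the 0.5 threshold are all assumptions chosen to keep the example small and dependency-free.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def one_to_one_match(preds, gts):
    """Assign exactly one prediction to each ground-truth box.

    Brute-force stand-in for Hungarian matching (assumption): try every
    injective assignment and keep the one with the highest total IoU.
    Returns {prediction_index: ground_truth_index}.
    """
    best, best_score = (), -1.0
    for perm in permutations(range(len(preds)), len(gts)):
        score = sum(iou(preds[p], gts[g]) for g, p in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return {p: g for g, p in enumerate(best)}


def one_to_many_match(preds, gts, thr=0.5):
    """Mark every prediction whose best IoU reaches thr as positive.

    This mirrors IoU-threshold assignment in anchor-based detectors;
    thr=0.5 is an illustrative value, not taken from the paper.
    """
    if not gts:
        return {}
    matches = {}
    for p, box in enumerate(preds):
        ious = [iou(box, gt) for gt in gts]
        g = max(range(len(gts)), key=ious.__getitem__)
        if ious[g] >= thr:
            matches[p] = g
    return matches


# One defect box and four candidate predictions (made-up coordinates).
gts = [(10, 10, 30, 30)]
preds = [(11, 11, 31, 31), (8, 8, 28, 28), (12, 10, 32, 30), (60, 60, 80, 80)]

o2o = one_to_one_match(preds, gts)   # a single positive query
o2m = one_to_many_match(preds, gts)  # several positive queries
print(len(o2o), len(o2m))
```

With a single defect box, one-to-one matching yields one positive prediction, while the threshold rule yields three. This is the sparse-supervision effect the summary describes, and it suggests why auxiliary one-to-many heads can provide denser training signal while inference keeps the one-to-one branch.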
ISSN: 2199-4536, 2198-6053
DOI:10.1007/s40747-024-01559-0