Small and overlapping worker detection at construction sites
Published in: | Automation in Construction 2023-07, Vol.151, p.104856, Article 104856 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • A novel feature extraction architecture, SOC-YOLO, is proposed to improve the challenging task of detecting workers at construction sites (including vision occlusion, small targets, and low light). • Experimental results show that detection accuracy is improved by using the proposed attention mechanism and by replacing MaxPool with SoftPool in SPPF. • The detection accuracy of the proposed SOC-YOLO model for construction site workers is compared with that of state-of-the-art models.
Although worker detection using computer vision (CV) has been studied for construction site safety, it is still challenging to identify workers who are occluded or poorly visible. To solve these problems, we propose SOC-YOLO, a method for detecting small and overlapping targets (workers) at complex construction sites. The method is based on YOLOv5 and uses distance intersection over union (DIoU) non-maximum suppression (NMS), incorporating weighted triplet attention, an expanded feature level, and SoftPool. Because workers are often captured overlapping one another, particularly at large-scale construction sites, a DIoU-based loss function and NMS are used, which contributed to improved accuracy. Next, we propose a weighted triplet attention mechanism that extracts spatial and channel attention information more effectively when training object detection networks, in contrast to the simple equal-weight averaging of the existing triplet attention. Next, we propose a model that adds prediction heads and residual connections to address the poor detection accuracy for workers photographed from long distances. Extending the feature level makes use of a low-level feature map that contains more information about small targets. Finally, SoftPool-spatial pyramid pooling fast (SoftPool-SPPF) is proposed to solve the problem of inconsistent input image sizes. SoftPool-SPPF performs the spatial pyramid pooling (SPP) function while preserving more feature information for accurate small target detection. Experiments were conducted using published worker detection datasets and custom-made datasets. The results showed increases from 81.26% to 84.63% average precision (AP) for small objects, from 67.52% to 73.88% mAP for minute objects, and from 74.56% to 77.57% for overlapping objects. The proposed method is expected to be useful for safety monitoring when applied to construction site worker tracking models. |
---|---|
ISSN: | 0926-5805 1872-7891 |
DOI: | 10.1016/j.autcon.2023.104856 |
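
The abstract names DIoU-based NMS as the step that helps keep overlapping workers apart. As an illustration only, the following is a minimal pure-Python sketch of the general DIoU-NMS idea, not the authors' implementation. The `(x1, y1, x2, y2)` box format, the function names `diou` and `diou_nms`, and the threshold of 0.3 are assumptions chosen for this example. DIoU subtracts the normalized squared distance between box centers from the IoU, so two boxes whose centers are far apart are less likely to suppress each other.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def diou(a: Box, b: Box) -> float:
    """Distance-IoU: IoU minus (center distance / enclosing diagonal) squared."""
    # Intersection and union areas.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    d2 = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw * cw + ch * ch

    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.3) -> List[int]:
    """Greedy NMS that suppresses a box when its DIoU with a kept box >= threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)          # highest-scoring remaining box
        keep.append(best)
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


# Hypothetical detections: two workers standing side by side plus a near-duplicate.
boxes = [(0, 0, 10, 20), (5, 0, 15, 20), (0.5, 0, 10.5, 20)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores, threshold=0.3)
```

In this toy case the first two boxes have an IoU of about 0.33, which plain IoU-based NMS at a 0.3 threshold would treat as one detection, while their DIoU is about 0.29, so both are kept. The third box nearly coincides with the first and is suppressed.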