A lightweight method for apple-on-tree detection based on improved YOLOv5

Bibliographic details
Published in: Signal, image and video processing, 2024-09, Vol.18 (10), p.6713-6727
Main authors: Li, Mei; Zhang, Jiachuang; Liu, Hubin; Yuan, Yuhui; Li, Junhui; Zhao, Longlian
Format: Article
Language: English
Description
Summary: Once apples mature, the optimal harvest window is short, and picking robots are expected to improve harvesting efficiency. However, apples on the tree often overlap one another and are occluded by branches and leaves, which makes robotic harvesting difficult. Precise and fast identification and localization of the target fruit is therefore crucial. To this end, this paper proposes a lightweight apple detection method, YOLOv5s-ShuffleNetV2-DWconv-Add, or “YOLOv5s-SDA” for short. Images of red and green apples in natural environments were collected with a mobile phone and divided into four categories: red apples and green apples that can be grasped directly, and red apples and green apples that cannot, a distinction made to avoid damage to the robotic arm. Several deep learning object detection models were compared, and the YOLOv5s algorithm provided the best recognition performance. To improve harvesting efficiency and portability across hardware devices, the YOLOv5s algorithm was modified as follows: the Focus, C3, and Conv structures in the backbone were replaced with 3 × 3 Conv structures and ShuffleNetV2, and the SPP and C3 structures were removed; the C3 modules in the neck were replaced with DWConv modules; and two Concat layers in the PANet structure were replaced with computationally cheaper Add layers. The resulting model achieved a mAP of 94.6% on the test set, doubled the detection speed, and compressed the model weights to 11.8% of their original size while maintaining accuracy. The method shows promising performance for fruit recognition in natural scenes and provides an effective means of visual acquisition for fruit-picking robots.
ISSN: 1863-1703, 1863-1711
DOI: 10.1007/s11760-024-03346-3
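
The abstract attributes the smaller model to two kinds of substitution: grouped (depthwise) convolutions in place of heavier blocks, and Add layers in place of Concat layers. The following is a minimal sketch of why each substitution reduces cost. It is not the authors' code and does not reproduce the reported 11.8% figure. The channel counts and helper names (`conv_weight_count`, `concat_channels`, `add_channels`) are hypothetical, chosen only for illustration. It assumes bias-free convolutions and a depthwise layer whose group count is the greatest common divisor of its input and output channels.

```python
from math import gcd


def conv_weight_count(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weights in a bias-free 2D convolution with square k x k kernels."""
    if c_in % groups or c_out % groups:
        raise ValueError("groups must divide both c_in and c_out")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def concat_channels(a: int, b: int) -> int:
    """Channel count after concatenating two feature maps along channels."""
    return a + b


def add_channels(a: int, b: int) -> int:
    """Channel count after element-wise addition (shapes must match)."""
    if a != b:
        raise ValueError("element-wise Add needs equal channel counts")
    return a


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 128, 128, 3

standard = conv_weight_count(c_in, c_out, k)                            # 147456
depthwise = conv_weight_count(c_in, c_out, k, groups=gcd(c_in, c_out))  # 1152

print(f"standard 3x3 conv weights:  {standard}")
print(f"depthwise 3x3 conv weights: {depthwise}")
print(f"channels after Concat: {concat_channels(c_in, c_in)}")
print(f"channels after Add:    {add_channels(c_in, c_in)}")
```

With these illustrative sizes the depthwise layer holds 1/128 of the weights of the standard one. Add also keeps the channel count at 128 where Concat would double it to 256, so any convolution that follows the fusion point has fewer input channels to process.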