End-to-End lightweight Transformer-Based neural network for grasp detection towards fruit robotic handling
Published in: | Computers and electronics in agriculture 2024-06, Vol.221, p.109014, Article 109014 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • An efficient and accurate grasp detection model, MDETR, was developed. • Local features from a CNN and the global representation from a transformer were fused. • Transfer learning and a multi-stage training strategy were used to accelerate model training. • The MDETR algorithm was tested in a simulation test environment.
Robotic picking and placing are common operations in grading, sorting, and packaging systems for fruits and vegetables. However, because fruits and vegetables have diverse shapes and irregular surfaces, improper handling during picking can cause detachment or damage. Ensuring correct grasping positions for intelligent sorting therefore requires purpose-built neural network algorithms. This study develops a deep learning-based grasp detection model for 20 common fruit and vegetable products. By combining local features from convolutional neural networks with global features from Transformers, a lightweight end-to-end fruit and vegetable grasp detection network, MDETR, is constructed. Experimental results demonstrate that the MDETR algorithm not only achieves high accuracy in fruit and vegetable grasp detection but also improves the speed of pose detection. Detecting a single image takes approximately 29.6 ms on average, which meets real-time requirements. The algorithm achieves a pose detection accuracy of 96 %, enabling precise detection and localization of fruit and vegetable poses and fast, accurate picking and placing. Additionally, a PyBullet simulation platform is developed for grasping experiments, in which the MDETR model achieves a grasping success rate of 88.9 %. These results support the robustness and generalization capability of the proposed detection model, which is designed specifically for fruit and vegetable grasping tasks. |
---|---|
ISSN: | 0168-1699 |
DOI: | 10.1016/j.compag.2024.109014 |
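
The summary reports an average detection time of about 29.6 ms per image and describes this as meeting real-time requirements. As a rough illustration only, the sketch below converts that latency into throughput. The 30 images/s threshold is an assumption chosen here for comparison; the record does not state which real-time criterion the authors used, and the function and constant names are hypothetical.

```python
def frames_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Average per-image detection time reported in the summary.
MDETR_LATENCY_MS = 29.6

# ASSUMPTION: 30 images/s is used here as a typical camera frame rate.
# The record does not define the authors' real-time threshold.
ASSUMED_REALTIME_FPS = 30.0

fps = frames_per_second(MDETR_LATENCY_MS)
print(f"{fps:.1f} images/s")        # prints "33.8 images/s"
print(fps >= ASSUMED_REALTIME_FPS)  # prints "True"
```

Under that assumed threshold, the reported latency leaves a margin of a few images per second; a stricter criterion would need to be checked against the full text.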