Autoencoder-based 3D representation learning for industrial seedling abnormality detection
Published in: | Computers and electronics in agriculture 2023-03, Vol.206, p.107619, Article 107619 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Industrial seedling quality assessment, such as attempting to find abnormal seedlings, is a challenging task where assessment methods must contend with the natural variability of seedlings, as well as the subjective nature of expert judgements. Furthermore, obtaining expert judgements is expensive and time-consuming, so machine learning approaches that rely on fewer judgements would be useful in practice. We investigate autoencoders operating on 3D point clouds obtained from 6732 seedlings to address this challenge, exploiting such systems’ ability to work with partially labelled data. Point clouds from tomato seedlings are recorded using a 3D data capture platform, MARVIN™, and the quality of each seedling is determined by expert judgement. An existing system is used to establish baseline performance scores using a rule-based expert system and machine learning with handcrafted features. Autoencoders are trained on the point clouds to learn representations for subsequent use in classification. We examine scenarios where large amounts of partially labelled data are available, and compare with the case where fully labelled data is available. To improve performance, we compare the architectural subcomponents based on PointNet and PointNet++, as well as the effect of different training strategies. We find that, with 13.6% of training data labelled, our model has correct classification rates of 97.7% and 82.7% for normals and abnormals respectively. With further improvements and fully labelled data, we find that correct classification rates of 97.6% and 96.1% can be reached. The results demonstrate that semi-supervised learning supported by partially labelled data has the potential to greatly reduce the cost of data curation, with minimal impact on overall accuracy.
• A dataset of 6732 tomato seedling 3D point clouds is collected and labelled by experts.
• Autoencoders perform abnormality detection on 3D point clouds of plants.
• Partially labelled data can be used, lowering the labelling burden substantially.
• Accuracy is 97.7% (normals) and 82.7% (abnormals) with 13.6% of data labelled.
• Accuracy of 97.6% (normals) and 96.1% (abnormals) possible with fully labelled data. |
---|---|
ISSN: | 0168-1699 1872-7107 |
DOI: | 10.1016/j.compag.2023.107619 |
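
The summary reports results as separate correct classification rates for normal and abnormal seedlings rather than as a single accuracy figure. As a minimal sketch of how such per-class rates could be computed, assuming the rate for a class is the fraction of samples with that true label that were predicted correctly: the function name `per_class_correct_rate` and the toy labels below are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_correct_rate(y_true, y_pred):
    """Return {label: fraction of samples with that true label predicted correctly}."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Number of samples per true class.
    totals = Counter(y_true)
    # Number of correctly predicted samples per true class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Toy labels invented purely for illustration (not data from the paper).
y_true = ["normal"] * 8 + ["abnormal"] * 4
y_pred = ["normal"] * 7 + ["abnormal"] + ["abnormal"] * 3 + ["normal"]

rates = per_class_correct_rate(y_true, y_pred)
print(rates)
```

Reporting the two rates separately keeps the rarer abnormal class from being hidden by the larger normal class, which a single pooled accuracy would tend to do.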