Self-supervised transformer-based pre-training method using latent semantic masking auto-encoder for pest and disease classification

Bibliographic details
Published in: Computers and Electronics in Agriculture, 2022-12, Vol. 203, p. 107448, Article 107448
Authors: Liu, Honglin; Zhan, Yongzhao; Xia, Huifen; Mao, Qirong; Tan, Yixin
Format: Article
Language: English
Online access: Full text
Description
Abstract: Pest and disease classification is a challenging issue in agriculture. Currently, classification algorithms for pests and diseases based on CNN models have become popular. However, these methods offer limited performance improvement because they lack global information interaction and discriminative feature representation. Therefore, we propose a self-supervised transformer-based pre-training method using a latent semantic masking auto-encoder (LSMAE). In this method, a feature relationship conditional filtering (FRCF) scheme based on a k-NN graph is proposed to filter irrelevant data from the source domain and generate a subset of the source domain. The data in this subset are similar to the target-domain data and can supplement feature learning for the target domain. To further improve performance, a novel auto-encoder based on latent semantic masking is proposed for transformer model pre-training. This auto-encoder selects key patches of each image in the source-domain subset and lets the transformer model learn a more discriminative feature representation. Finally, the target-domain data are used to fine-tune the pre-trained transformer model. Experiments conducted on public datasets such as IP102, CPB, and Plant Village show that our method outperforms state-of-the-art methods. For example, our method achieves 74.69%/76.99%/99.93% accuracy on IP102/CPB/Plant Village, demonstrating that the proposed self-supervised transformer-based pre-training method is more effective for pest and disease classification than CNN-based methods.
•A transformer-based method is proposed for pest and disease classification.
•Conditional filtering is used to select relevant data from the source domain.
•Latent semantic masking benefits the accuracy of pest and disease classification.
•Our method surpasses the CNN-based methods by a large margin on IP102.
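The abstract describes FRCF as filtering source-domain data with a k-NN graph so that only samples similar to the target domain are kept for pre-training. The sketch below is a minimal, illustrative interpretation of that idea based only on the abstract: the feature extractor, the value of k, and the neighbour-voting criterion (`min_votes`) are assumptions, not the paper's exact FRCF rule.

```python
# Illustrative sketch of k-NN-based source-domain filtering in the spirit of FRCF.
# Assumptions: features come from any pre-trained backbone; a source sample is kept
# if enough of its k nearest neighbours in feature space are target-domain samples.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def filter_source_domain(source_feats, target_feats, k=5, min_votes=1):
    """Return indices of source samples whose k nearest neighbours
    include at least `min_votes` target-domain samples."""
    feats = np.vstack([source_feats, target_feats])
    # 0 marks source samples, 1 marks target samples
    domain = np.array([0] * len(source_feats) + [1] * len(target_feats))

    nn = NearestNeighbors(n_neighbors=k + 1).fit(feats)
    _, idx = nn.kneighbors(source_feats)   # neighbours of each source sample
    idx = idx[:, 1:]                       # drop the sample itself (distance 0)

    votes = domain[idx].sum(axis=1)        # number of target-domain neighbours
    return np.where(votes >= min_votes)[0] # indices of retained source samples

# Usage (hypothetical embeddings from a frozen backbone):
# subset_ids = filter_source_domain(src_embeddings, tgt_embeddings, k=5)
```

The retained subset would then serve as pre-training data for the latent semantic masking auto-encoder, with the target-domain data used afterwards for fine-tuning, as outlined in the abstract.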
ISSN: 0168-1699, 1872-7107
DOI: 10.1016/j.compag.2022.107448