A methods guideline for deep learning for tabular data in agriculture with a case study to forecast cereal yield

Bibliographic details
Published in: Computers and electronics in agriculture, 2023-02, Vol. 205, p. 107642, Article 107642
Main authors: Richetti, Jonathan, Diakogianis, Foivos I., Bender, Asher, Colaço, André F., Lawes, Roger A.
Format: Article
Language: eng
Description
Summary:
• Deep learning concepts and how they apply to agricultural tabular data are clarified.
• Guidelines and recommendations are provided for applying deep learning to tabular agricultural datasets.
• RF, XGBoost, MLP and TabNet were compared for forecasting cereal yield based on remote sensing and weather data.
• Deep learning and traditional machine learning algorithms performed equally well when forecasting cereal yields.

Machine learning (ML) and its branch, deep learning (DL), are rapidly evolving and gaining popularity as they outperform more traditional methods in different areas of agriculture. However, ML and DL techniques must be applied correctly to a problem to produce an acceptable solution. This article provides guidelines for using DL techniques, illustrated by a case study that compares different models and methods for forecasting cereal yields; some of the concepts presented are also applicable to ML more broadly. The objective is to give new users clarity on the use of DL techniques to solve agronomic problems. DL concepts are introduced, and best practices for data pre-processing steps and metrics are recommended. Cross-validation is clarified and its importance is highlighted. It is shown that DL performance can vary with architecture and that the optimal choice is task-dependent. Practical aspects of applying DL models to agricultural datasets are emphasised, such as dataset size (26 representative samples in each field sufficed) and cross-validation (indispensable on small datasets). Lastly, a standard guideline for DL applied to tabular data is recommended.
ISSN: 0168-1699, 1872-7107
DOI: 10.1016/j.compag.2023.107642