Predicting spatial and temporal variability in soybean yield using deep learning and open source data

Bibliographic details
Published in: European journal of agronomy 2025-03, Vol.164, p.127498, Article 127498
Main authors: Gaso, Deborah V., Cue La Rosa, Laura Elena, Puntel, Laila A., Rattalino Edreira, Juan I., de Wit, Allard, Kooistra, Lammert
Format: Article
Language: English
Description
Abstract: Spatial crop yield prediction provides valuable insights for supporting sustainable and precise crop management decisions. This study assessed the capabilities of advanced Deep Learning (DL) architectures in predicting within-field soybean yields using spectral bands from Sentinel-2 (RS-Inputs), weather (W-Inputs), and topographic attributes (TA-Inputs). DL architectures included 1-D convolutional neural network (1D-CNN), long short-term memory (LSTM) and transformer-based (TRFM). We used an extensive dataset (∼700 K) of yield observations, collected with a combine harvester, from 310 fields across three growing seasons (2020, 2021 and 2022) in Uruguay. The DL architectures were assessed under two testing strategies: across-fields and across-years. We compared results from DL architectures against a baseline that uses a process-based method with data assimilation. Our results revealed that DL architectures outperformed the baseline in testing across-fields only (RRMSE 35 % vs 40 %). The DL architectures encountered more challenges with temporal extrapolation (RRMSE 51 % in across-years). There were no substantial differences in performance among the DL architectures. The TA-Inputs enhanced accuracy in 1D-CNN (reduced RRMSE by ∼13 %), while W-Inputs led to a small improvement in 1D-CNN and LSTM (reduced RRMSE by ∼2 %) when tested across-years. All combinations of DL architectures and input settings encountered challenges in predicting the tails of the yield distribution (mean bias ∼1000 kg ha⁻¹). We discussed the current limitations of DL architectures in capturing crop yield complexity using openly available spatial data and provided further directions for improving the reliability and interpretability of data-driven models by integrating process-based approaches. Beyond this, the performance of data-driven methods alone is expected to improve with the increasing availability of collected and stored data. Incorporating historical yield maps and in-season crop management data into open-source datasets will facilitate continuous training and enhancement for tailored models.

Highlights:
• A process-based approach was superior to DL in temporal extrapolation.
• DL is more promising for predicting unseen fields than years (RRMSE 35 % vs 51 %).
• Performance was comparable among DL architectures.
• Adding weather and topography inputs had a minor impact on accuracy.
• Errors increased towards the tails of the distribution across all models and inputs.
ISSN: 1161-0301
DOI: 10.1016/j.eja.2024.127498
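
The abstract names two testing strategies (across-fields and across-years) and reports errors as RRMSE and mean bias, but this record does not give their exact definitions. The following is a minimal sketch under stated assumptions, not the authors' code: RRMSE is taken as RMSE divided by the mean observed yield (a common convention), across-years is taken as holding out every observation from one season, and across-fields as holding out whole fields at random. All function names and the `test_fraction` and `seed` parameters are hypothetical.

```python
import numpy as np


def rrmse(y_true, y_pred):
    """Relative RMSE in percent (assumed definition: RMSE / mean observed yield)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return 100.0 * rmse / np.mean(y_true)


def mean_bias(y_true, y_pred):
    """Mean of (predicted - observed), in the units of the yield data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(y_pred - y_true))


def across_years_split(years, test_year):
    """Boolean train/test masks that hold out every observation from one season."""
    years = np.asarray(years)
    test = years == test_year
    return ~test, test


def across_fields_split(field_ids, test_fraction=0.2, seed=0):
    """Boolean train/test masks that hold out whole fields, never single pixels.

    Keeping all observations of a field on the same side of the split prevents
    neighbouring, highly correlated yield points from leaking into training.
    """
    field_ids = np.asarray(field_ids)
    unique_fields = np.unique(field_ids)
    rng = np.random.default_rng(seed)
    n_test = max(1, int(round(test_fraction * unique_fields.size)))
    test_fields = rng.choice(unique_fields, size=n_test, replace=False)
    test = np.isin(field_ids, test_fields)
    return ~test, test
```

With per-observation arrays `years` and `field_ids` aligned to the yield vector, `across_years_split(years, 2022)` would train on 2020 and 2021 and test on 2022, while `across_fields_split(field_ids)` tests on fields the model has never seen.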