Improving Yield Data Analysis Using Contextual Data

Bibliographic Details
Published in:	Applied engineering in agriculture 2023, Vol.39 (4), p.391-398
Main authors:	Hawkins, Elizabeth M., Buckmaster, Dennis R.
Format:	Article
Language:	English
Subjects:	Cleaning; Confidence intervals; Data analysis; Data collection; Data points; Datasets; Decision analysis; Estimates; Mass flow; Mean; Resource allocation; Soils; Statistical analysis; Topography
Online access:	Full text

Description
Summary:	Highlights: Context-driven yield data cleaning resulted in more accurate whole-field yield estimates. Using a context-driven yield data cleaning method can improve yield estimates for zones within fields. Identifying error-prone areas in a field where data quality is likely to be low and removing that data in bulk can reduce data cleaning bias. Abstract: As agriculture becomes more data driven, decision-making has become the focus of the industry and data quality will be increasingly important. Traditionally, yield data cleaning techniques have removed individual data points based on criteria primarily focused on the yield values themselves. However, when these methods are used, the underlying causes of the errors are often overlooked; as a result, these techniques may fail to remove all of the inaccurate (error-prone) data, may remove legitimate data, or both. In this research, an alternative to data cleaning was developed. Data integrity zones (DIZ) within each field were identified by evaluating metadata, which included data collected by the combine that reported the operating conditions of the machinery (i.e., travel speed, crop mass flow), data about the field environment (i.e., soil type, topography, weather), and data on field operations (e.g., field logs, as-applied maps). Data in DIZ were isolated using buffers, and the analysis of the reduced datasets was compared to that of the raw data. The amount of data removed depended on the amount of variability (e.g., soil characteristics, topography) in the field. Statistical comparisons of the data showed that the mean yield estimates for soil type polygons increased by an average of 1.4 Mg/ha for corn when DIZ data were used compared to raw data. On average, the confidence interval around the mean remained similar even with a large amount (70%) of data removed. Notably, none of the mean estimates derived from raw datasets were contained in the confidence intervals produced from DIZ data.
This metadata-based (context-driven) alternative to data cleaning effectively removed errors and artifacts from yield data that could only be identified by looking beyond the yield measurements themselves. When similarly reduced datasets are used to analyze historical yield data, they should provide a clearer picture of the true yield effects of treatments, management zones, soil types, etc.; this will improve decisions on input and resource allocation, support wiser adoption of precision agricultural technologies, and refine future data collection.
ISSN:	0883-8542, 1943-7838
DOI:	10.13031/aea.14655
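
The comparison described in the summary (keep only data outside buffers around error-prone areas, then compare mean yield and its confidence interval against the raw data) can be illustrated with a minimal sketch. It is not the authors' implementation: the `YieldPoint` record, the `dist_to_error_area_m` field, the 10 m buffer, the normal-approximation interval, and all data values are assumptions made up for illustration.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass
class YieldPoint:
    yield_mg_ha: float           # yield monitor reading in Mg/ha
    dist_to_error_area_m: float  # hypothetical: distance to nearest error-prone area


def keep_diz(points, buffer_m):
    """Keep only points farther than buffer_m from any error-prone area."""
    return [p for p in points if p.dist_to_error_area_m > buffer_m]


def mean_ci(points, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation 95% interval."""
    values = [p.yield_mg_ha for p in points]
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, m - half_width, m + half_width


# Made-up illustrative readings, not data from the article.
raw = [
    YieldPoint(4.0, 2.0), YieldPoint(5.5, 5.0), YieldPoint(6.0, 8.0),
    YieldPoint(11.8, 30.0), YieldPoint(12.1, 45.0), YieldPoint(12.4, 60.0),
    YieldPoint(12.0, 75.0), YieldPoint(11.9, 90.0),
]

diz = keep_diz(raw, buffer_m=10.0)      # assumed buffer distance
raw_mean, _, _ = mean_ci(raw)
diz_mean, diz_lo, diz_hi = mean_ci(diz)

# Does the raw-data mean fall inside the interval built from the retained data?
raw_mean_inside_diz_ci = diz_lo <= raw_mean <= diz_hi

print(f"raw mean: {raw_mean:.2f} Mg/ha")
print(f"DIZ mean: {diz_mean:.2f} Mg/ha, 95% CI ({diz_lo:.2f}, {diz_hi:.2f})")
print(f"raw mean inside DIZ CI: {raw_mean_inside_diz_ci}")
```

In this toy dataset the low readings near the error-prone area pull the raw mean down, so removing them in bulk raises the estimate and leaves the raw mean outside the interval from the retained points. A real analysis would derive the buffers from spatial layers (soil polygons, topography, field logs) rather than from a precomputed distance column.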