t-SNE: A study on reducing the dimensionality of hyperspectral data for the regression problem of estimating oenological parameters

Detailed description

Bibliographic details
Published in:	Artificial intelligence in agriculture 2023-03, Vol.7, p.58-68
Main authors:	Silva, Rui, Melo-Pinto, Pedro
Format:	Article
Language:	English
Subjects:	Dimensionality reduction; Hyperspectral images; Regression; Support vector machines; t-SNE; Wine grape berries
Online access:	Full text

Description
Abstract:	In recent years, machine learning techniques have become increasingly important for improving procedures in precision agriculture. In this work we study models capable of predicting oenological parameters from hyperspectral images of wine grape berries, a topic especially relevant to improving production tasks for winemakers. Specifically, we explore the capabilities of t-Distributed Stochastic Neighbor Embedding (t-SNE), a novel technique mostly used for visualization, for reducing the dimensionality of the highly complex hyperspectral data, and compare its performance with the Principal Component Analysis (PCA) method, which, despite the introduction of many nonlinear dimensionality reduction techniques over the years, has achieved the best results for real-world data across several studies in the literature. Additionally, we explore the potential of Kernel t-SNE, an extension of the t-SNE method that allows the technique to be used on streaming data or in online scenarios. Our results show that, in a direct comparison, t-SNE achieves better metrics than PCA for most of the data sets in this work and that the regressor (Support Vector Regression, SVR) performs better with the t-SNE reduced features as inputs, producing predictions with lower error. Comparing the results with the current literature, our shallow learning model paired with t-SNE achieves results better than or on par with those reported, even competing with more advanced models that use deep learning techniques, which should encourage the use of t-SNE in more studies that require dimensionality reduction.
• t-SNE outperforms PCA in trustworthiness and continuity metrics.
• The ML model with t-SNE as a DR step obtains better estimates than a model with PCA.
• Kernel t-SNE achieves strong results and presents itself as an alternative to PCA.
• Kernel t-SNE can be used as a substitute for t-SNE in online/streaming data problems.
• A regressor combined with t-SNE as a DR step achieved better results than DL models.
ISSN:	2589-7217
DOI:	10.1016/j.aiia.2023.02.003
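
Note on the metrics: the highlights compare t-SNE and PCA by trustworthiness and continuity, but this record does not define them. The sketch below assumes the commonly used rank-based definition (Venna and Kaski), which may differ in detail from what the article computes. The function names `trustworthiness`, `continuity`, and `_ranks` are illustrative choices, not taken from the article.

```python
import math


def _ranks(points):
    """ranks[i][j] = rank of point j among the neighbours of point i,
    ordered by Euclidean distance (1 = nearest). ranks[i][i] stays 0."""
    n = len(points)
    ranks = []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        row = [0] * n
        for rank, j in enumerate(order, start=1):
            row[j] = rank
        ranks.append(row)
    return ranks


def trustworthiness(high, low, k):
    """Penalise points that are among the k nearest neighbours in the
    embedding `low` but not in the original space `high`.
    Assumed definition: 1 - 2 / (n k (2n - 3k - 1)) * sum(rank_high - k)."""
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")
    r_high, r_low = _ranks(high), _ranks(low)
    penalty = 0
    for i in range(n):
        for j in range(n):
            if j != i and r_low[i][j] <= k and r_high[i][j] > k:
                penalty += r_high[i][j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


def continuity(high, low, k):
    """Same measure with the roles of the two spaces swapped: penalise
    original-space neighbours that are pushed apart in the embedding."""
    return trustworthiness(low, high, k)


# Toy data (made up for illustration): six collinear 2-D points.
original = [(float(x), 0.0) for x in range(6)]
faithful = [(float(x),) for x in range(6)]           # keeps every neighbourhood
scrambled = [(0.0,), (3.0,), (1.0,), (4.0,), (2.0,), (5.0,)]

t_good = trustworthiness(original, faithful, k=2)
c_good = continuity(original, faithful, k=2)
t_bad = trustworthiness(original, scrambled, k=2)
c_bad = continuity(original, scrambled, k=2)
```

Both scores lie in [0, 1], and an embedding that preserves every k-neighbourhood scores 1 on both. This pure-Python version is quadratic in the number of points and is meant only to make the definition concrete.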