Application of the combination method based on RF and LE in near infrared spectral modeling

[Display omitted] •A combined dimensionality reduction method (RF + LE) for NIRS data was proposed.•RF + LE can significantly improve model performance for near-infrared applications.•Combined method has certain advantages over the single dimension reduction method. The dimensionality of near-infrar...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Spectrochimica acta. Part A, Molecular and biomolecular spectroscopy Molecular and biomolecular spectroscopy, 2023-03, Vol.289, p.122247, Article 122247
Hauptverfasser: Zhang, Xiao-Wen, Chen, Zheng-Guang, Jiao, Feng
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:[Display omitted] •A combined dimensionality reduction method (RF + LE) for NIRS data was proposed.•RF + LE can significantly improve model performance for near-infrared applications.•Combined method has certain advantages over the single dimension reduction method. The dimensionality of near-infrared (NIR) spectral data is often extremely large. Dimensionality reduction of spectral data can effectively reduce the redundant information and correlation between spectral variables and simplify the model, which is crucial to increasing the model's performance. As a nonlinear feature extraction method, Laplacian Eigenmaps (LE) may preserve the local neighborhood information of the dataset, has high robustness, and is simple to compute. However, when the LE algorithm maps the data from high-dimensional space to low-dimensional space, it is often disturbed by irrelevant information and multicollinearity in the spectral data, which lowers the model's prediction performance. Random Frog (RF) algorithm can eliminate noise and collinearity in the spectrum. Therefore, before using the LE algorithm, we use the RF algorithm to eliminate irrelevant information in the spectrum and reduce the correlation between the spectra variables to increase the efficiency of the LE algorithm. We used the RF + LE algorithm to reduce the dimensionality of two public NIRS datasets (soil datasets and pharmaceutical tablets datasets) and compared it with RF and LE algorithms alone. We utilized Partial Least Squares Regression (PLSR) and Support Vector Regression (SVR) to establish regression models. The experimental findings demonstrate that compared with the RF algorithm and LE algorithm, the RF + LE combination method can reduce the dimension of spectral variables and model complexity, and improve regression models' prediction accuracy and stability. It is an effective dimensionality reduction method for the near-infrared spectrum.
ISSN:1386-1425
1873-3557
DOI:10.1016/j.saa.2022.122247