Application of stacking ensemble learning model in quantitative analysis of biomaterial activity

Detailed description

Bibliographic details
Published in:	Microchemical Journal, 2022-12, Vol. 183, p. 108075, Article 108075
Main authors:	Cao, Hao; Gu, Youlin; Fang, Jiajie; Hu, Yihua; Ding, Wanying; He, Haihao; Chen, Guolong
Format:	Article
Language:	English
Subjects:	Biomaterial activity detection; Mid-infrared spectroscopy; Principal component analysis; Quantitative determination; Stacking ensemble learning model
Online access:	Full text

Description
Summary:
• A rapid non-destructive testing method for biomaterial activity is proposed.
• A quantitative detection model of biomaterial activity based on stacking ensemble learning was constructed.
• The stacking ensemble learning model has better prediction performance than the single models.
• The method was successfully applied to biological analysis.

Quantitative analysis techniques based on attenuated total reflection Fourier transform infrared spectroscopy (ATR FT-IR) are widely used to detect the components of cells rather than their activity levels. In this study, a rapid non-destructive method for detecting the activity of biomaterials is proposed. The method uses infrared spectroscopy to analyze the infrared absorption peaks of three different biomaterials before and after inactivation, and thereby identifies the changes in their surface functional groups caused by inactivation. Based on the regular differences in their absorption spectra, a stacking ensemble learning model is used to accurately determine the activity ratio of the biomaterials. In the two-level fusion framework of the ensemble model, partial least squares regression (PLSR), gradient boosted decision tree (GBDT), random forest (RF) and extra trees (ET) serve as primary learners, and linear regression serves as the secondary learner. Redundant and interfering information in the raw spectra can be eliminated by multiplicative scatter correction (MSC) and principal component analysis (PCA). The coefficients of determination of the prediction set (R²p) for the three biomaterials were 0.9641, 0.9946 and 0.9939, respectively, and the root mean square errors of prediction (RMSEP) were 5.7%, 2.1% and 2.3%, respectively. Compared with the single-algorithm models, the stacking ensemble learning model has the highest R²p and the lowest RMSEP. The results show that the fusion model can improve generalization ability and prediction accuracy when detecting biomaterial activity under the influence of various factors. This study provides not only a new method for detecting biomaterial activity but also guidance for fusion algorithms.
ISSN:	0026-265X 1095-9149
DOI:	10.1016/j.microc.2022.108075