Benchtop volatilomics supercharged: How machine learning based design of experiment helps optimizing untargeted GC-IMS gas phase metabolomics

Bibliographic details
Published in:	Talanta (Oxford) 2024-05, Vol.272, p.125788-125788, Article 125788
Main authors:	Parastar, Hadi; Weller, Philipp
Format:	Article
Language:	English
Subjects:	Chemometrics; Design of experiment; Flavoromics; Gas chromatography-ion mobility spectrometry; Machine learning; Volatilomics
Online access:	Full text

Description
Abstract:	Gas chromatography-ion mobility spectrometry (GC-IMS) plays a significant role in both targeted and non-targeted analyses. However, the non-linear behavior of IMS and its complex ion chemistry make it difficult to find optimal experimental conditions with existing methodologies. Machine learning (ML) strategies offer a promising way to address these issues. In this study, we propose a hybrid strategy that combines design of experiment (DOE) and ML to optimize GC-IMS conditions in non-targeted volatilomic/flavoromic analysis, with saffron volatiles as a case study. First, a rotatable circumscribed central composite design (CCD) is set up for five influential GC-IMS factors: sample amount, headspace temperature, incubation time, injection volume, and split ratio. Subsequently, two ML models are used to predict two GC-IMS response variables, the total peak areas (PAs) and the number of detected peaks (PNs): multiple linear regression (MLR) as a linear model and a Bayesian regularized artificial neural network (BR-ANN) as a nonlinear model. The findings show a direct correlation between the GC-IMS factors and the PNs, suggesting that MLR is a suitable modeling approach for this response. The PAs, however, exhibit nonlinear behavior, suggesting that BR-ANN is better suited to capture this complexity. Notably, when Derringer's desirability function is used to combine the PAs and PNs into a single response, MLR models the GC-IMS factors satisfactorily.
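To make the design step in the abstract concrete, the following is a minimal sketch of how the coded points of a rotatable circumscribed CCD for five factors could be generated. It is not the authors' code. It assumes a full 2^5 factorial core and six center points, neither of which is stated in this record; the factor names and the function `rotatable_ccd` are illustrative only.

```python
import itertools

# Factor names taken from the abstract; the identifiers themselves are illustrative.
FACTORS = [
    "sample_amount",
    "headspace_temperature",
    "incubation_time",
    "injection_volume",
    "split_ratio",
]


def rotatable_ccd(k, n_center=6):
    """Return (points, alpha) for a circumscribed, rotatable CCD in coded units.

    Assumptions (not from the source): a full 2**k factorial core and
    n_center replicated center points.
    """
    # Factorial (cube) points at the coded levels -1 and +1.
    factorial = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]

    # Rotatability condition: alpha = F**(1/4), with F the number of factorial points.
    alpha = len(factorial) ** 0.25

    # Axial (star) points: one factor at -alpha or +alpha, all others at 0.
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)

    # Center points: all factors at the coded level 0.
    center = [[0.0] * k for _ in range(n_center)]

    return factorial + axial + center, alpha


design, alpha = rotatable_ccd(len(FACTORS))
print(f"{len(design)} runs, alpha = {alpha:.3f}")
```

Under these assumptions the design has 32 factorial, 10 axial, and 6 center runs. Coded levels would still have to be mapped to real instrument settings, and a fractional factorial core would change both the run count and alpha.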
•Integrating DOE with ML for GC-IMS optimization for non-targeted metabolomics.
•MLR for linear optimization of GC-IMS factors with peak numbers as response.
•Desirability function combining peak areas and peak numbers is used for GC-IMS optimization.
•BR-ANN is proposed for nonlinear optimization of GC-IMS factors with peak areas as response.
•Saffron volatiles are used as an example to explore the novel optimization strategy.
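As an illustration of how a desirability function can merge peak areas and peak numbers into one optimization target, here is a minimal sketch of the larger-is-better form of Derringer's desirability with a geometric-mean overall score. The limits, targets, weights, and response values are hypothetical; this record does not report the settings used in the study.

```python
import math


def desirability_maximize(y, low, target, weight=1.0):
    """Larger-is-better desirability: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def overall_desirability(values):
    """Geometric mean of individual desirabilities (0 if any single one is 0)."""
    if any(d == 0.0 for d in values):
        return 0.0
    return math.exp(sum(math.log(d) for d in values) / len(values))


# Hypothetical responses for one experimental run (arbitrary units).
d_pa = desirability_maximize(7.5, low=5.0, target=10.0)    # total peak area
d_pn = desirability_maximize(80.0, low=40.0, target=120.0)  # number of peaks
D = overall_desirability([d_pa, d_pn])
print(d_pa, d_pn, round(D, 3))
```

Because the overall score is a geometric mean, a run that fails completely on either response scores zero overall, which is why this form is often chosen when both responses must be acceptable at once.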
ISSN:	0039-9140 1873-3573
DOI:	10.1016/j.talanta.2024.125788