Retention Time Prediction with Message-Passing Neural Networks

Retention time prediction, facilitated by advances in machine learning, has become a useful tool in untargeted LC-MS applications. State-of-the-art approaches include graph neural networks and 1D-convolutional neural networks that are trained on the METLIN small molecule retention time dataset (SMRT...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Separations 2022-10, Vol.9 (10), p.291
Hauptverfasser: Osipenko, Sergey, Nikolaev, Eugene, Kostyukevich, Yury
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Retention time prediction, facilitated by advances in machine learning, has become a useful tool in untargeted LC-MS applications. State-of-the-art approaches include graph neural networks and 1D-convolutional neural networks that are trained on the METLIN small molecule retention time dataset (SMRT). These approaches demonstrate accurate predictions comparable with the experimental error for the training set. The weak point of retention time prediction approaches is the transfer of predictions to various systems. The accuracy of this step depends both on the method of mapping and on the accuracy of the general model trained on SMRT. Therefore, improvements to both parts of prediction workflows may lead to improved compound annotations. Here, we evaluate capabilities of message-passing neural networks (MPNN) that have demonstrated outstanding performance on many chemical tasks to accurately predict retention times. The model was initially trained on SMRT, providing mean and median absolute cross-validation errors of 32 and 16 s, respectively. The pretrained MPNN was further fine-tuned on five publicly available small reversed-phase retention sets in a transfer learning mode and demonstrated up to 30% improvement of prediction accuracy for these sets compared with the state-of-the-art methods. We demonstrated that filtering isomeric candidates by predicted retention with the thresholds obtained from ROC curves eliminates up to 50% of false identities.
ISSN:2297-8739
2297-8739
DOI:10.3390/separations9100291