UNIQUE: A Framework for Uncertainty Quantification Benchmarking

Bibliographic Details
Published in: Journal of Chemical Information and Modeling, 2024-11, Vol. 64 (22), p. 8379-8386
Main Authors: Lanini, Jessica; Huynh, Minh Tam Davide; Scebba, Gaetano; Schneider, Nadine; Rodríguez-Pérez, Raquel
Format: Article
Language: English
Online Access: Full text
Abstract

Machine learning (ML) models have become key in decision-making for many disciplines, including drug discovery and medicinal chemistry. ML models are generally evaluated prior to their usage in high-stakes decisions, such as compound synthesis or experimental testing. However, no ML model is robust or predictive in all real-world scenarios. Therefore, uncertainty quantification (UQ) in ML predictions has gained importance in recent years. Many investigations have focused on developing methodologies that provide accurate uncertainty estimates for ML-based predictions. Unfortunately, there is no UQ strategy that consistently provides robust estimates of a model's applicability to new samples. Depending on the dataset, prediction task, and algorithm, accurate uncertainty estimates may be infeasible to obtain. Moreover, the optimal UQ metric also varies across applications, and previous investigations have shown a lack of consistency across benchmarks. Herein, the UNIQUE (UNcertaInty QUantification bEnchmarking) framework is introduced to facilitate the comparison of UQ strategies in ML-based predictions. This Python library unifies the benchmarking of multiple UQ metrics, including the calculation of nonstandard UQ metrics (combining information from the dataset and model), and provides a comprehensive evaluation. In this framework, UQ metrics are evaluated for different application scenarios, e.g., eliminating the predictions with the lowest confidence or obtaining a reliable uncertainty estimate for an acquisition function. Taken together, this library will help to standardize UQ investigations and evaluate new methodologies.
ISSN: 1549-9596, 1549-960X
DOI: 10.1021/acs.jcim.4c01578
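
The abstract names one evaluation scenario concretely: discarding the predictions with the lowest confidence and checking whether the error on the retained subset decreases. As an illustration of that scenario only (this is a minimal sketch, not the UNIQUE library's actual API; the function name, toy data, and rejection fractions are hypothetical), the following Python snippet computes the mean absolute error after rejecting increasing fractions of the most uncertain predictions:

    # Illustrative sketch of confidence-based rejection -- not the UNIQUE API.
    import numpy as np

    def error_vs_rejection(y_true, y_pred, uncertainty,
                           fractions=(0.0, 0.1, 0.2, 0.3, 0.4, 0.5)):
        """MAE on the subset kept after rejecting the most uncertain
        fraction of predictions (hypothetical helper for illustration)."""
        order = np.argsort(uncertainty)  # most confident predictions first
        n = len(y_true)
        results = {}
        for f in fractions:
            keep = order[: max(1, int(round(n * (1 - f))))]
            results[f] = float(np.mean(np.abs(y_true[keep] - y_pred[keep])))
        return results

    # Toy data with heteroscedastic noise: the noise scale doubles as a
    # "ground-truth" uncertainty estimate for each prediction.
    rng = np.random.default_rng(0)
    y_true = rng.normal(size=200)
    noise_scale = rng.uniform(0.1, 1.0, size=200)
    y_pred = y_true + rng.normal(scale=noise_scale)
    for f, mae in error_vs_rejection(y_true, y_pred, noise_scale).items():
        print(f"reject {f:.0%}: MAE = {mae:.3f}")

For a well-calibrated uncertainty estimate, the retained-set error should shrink as more low-confidence predictions are rejected; a flat or rising curve signals that the UQ method carries little information about the model's errors.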