vEXP: A virtual enhanced cross screen panel for off-target pharmacology alerts

Bibliographic details
Published in: Computational toxicology 2024-09, Vol.31, p.100324, Article 100324
Main authors: Lumley, James A.; Fallon, David; Whatling, Ryan; Coupry, Damien; Brown, Andrew
Format: Article
Language: English
Description
Abstract: We describe the development of the GSK vEXP (virtual enhanced cross screen panel) for off-target pharmacology alerts. Deriving a panel of machine learning classification models, or QSAR (Quantitative Structure-Activity Relationship) models, for off-target safety assessment allows early alerting to risk factors in candidate drugs. The models are matched to an internal in vitro biochemical screening panel described previously, with some updates reported here. We show the extreme imbalance of some internal GSK datasets, and of most of the related external ChEMBL datasets, when considering potency thresholds relevant to in vitro screening. Their small size and bias towards the active class make many ChEMBL datasets unmodellable at such thresholds. Although larger, many GSK datasets remain too imbalanced to give a performant model. We demonstrate the value of merging internal and external data to help rebalance datasets and improve the domain of applicability, with improvements in model performance frequently seen on merged data. Efforts to collate public datasets with a far better balance of the missing inactives would likely do more to improve open-source models than simply increasing dataset size. We investigate moving the probability threshold and applying imbalanced learners to help overcome the imbalance problem. Both methods can produce models with improved performance when applied to imbalanced datasets. Datasets with class imbalance 95:5 % or with ...
ISSN: 2468-1113
DOI:10.1016/j.comtox.2024.100324
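
The abstract mentions moving the probability threshold as one way to cope with class imbalance. The following is a minimal Python sketch of that general idea, not the authors' implementation: the toy labels and probabilities are invented for illustration (a 19:1 split, mirroring a 95:5 imbalance), the function names are hypothetical, and balanced accuracy is an assumed selection metric, since the record does not say which metric the paper optimises.

```python
def confusion(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when 'active' means probability >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity (assumed metric, see lead-in)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def best_threshold(y_true, y_prob, candidates=None):
    """Scan candidate thresholds and keep the first one with the top score."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_score = 0.5, -1.0
    for t in candidates:
        score = balanced_accuracy(*confusion(y_true, y_prob, t))
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Toy data (invented): 19 inactives and 1 active. A model trained on such
# data tends to give low probabilities to everything, including the active.
y_true = [0] * 19 + [1]
y_prob = [k / 100 for k in range(1, 20)] + [0.30]

# At the default 0.5 cut-off the single active is missed entirely.
default_score = balanced_accuracy(*confusion(y_true, y_prob, 0.5))

# Moving the threshold recovers the active without adding false positives.
tuned_threshold, tuned_score = best_threshold(y_true, y_prob)
```

In practice the threshold would be chosen on held-out validation data rather than on the data used for scoring, to avoid an optimistic estimate.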