Systematic Improvement of the Performance of Machine Learning Scoring Functions by Incorporating Features of Protein-Bound Water Molecules

Bibliographic details
Published in:	Journal of chemical information and modeling 2022-09, Vol.62 (18), p.4369-4379
Main authors:	Qu, Xiaoyang; Dong, Lina; Zhang, Jinyan; Si, Yubing; Wang, Binju
Format:	Article
Language:	English
Subjects:	Crystal structure; Feature extraction; Ligands; Machine learning; Machine Learning and Deep Learning; Performance enhancement; Performance prediction; Proteins; Water chemistry; Water distribution; Water engineering
Online access:	Full text

Description
Abstract:	Water molecules at the ligand–protein interfaces play crucial roles in the binding of the ligands, but the behavior of protein-bound water is largely ignored in many currently used machine learning (ML)-based scoring functions (SFs). In an attempt to improve the prediction performance of existing ML-based SFs, we estimated the water distribution with a HydraMap (HM) method and then incorporated the features extracted from protein-bound waters obtained in this way into three ML-based SFs: RF-Score, ECIF, and PLEC. It was found that a combination of HM-based features can consistently improve the performance of all three SFs, including their scoring, ranking, and docking power. HydraMap-based features show consistently good performance with both crystal structures and docked structures, demonstrating their robustness for SFs. Overall, HM-based features, which are a statistical representation of hydration sites at protein–ligand interfaces, are expected to improve the prediction performance for diverse SFs.
ISSN:	1549-9596; 1549-960X
DOI:	10.1021/acs.jcim.2c00916