PHACTboost: A Phylogeny-Aware Pathogenicity Predictor for Missense Mutations via Boosting

Abstract Most algorithms that are used to predict the effects of variants rely on evolutionary conservation. However, a majority of such techniques compute evolutionary conservation by solely using the alignment of multiple sequences while overlooking the evolutionary context of substitution events....

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Molecular biology and evolution 2024-07, Vol.41 (7)
Hauptverfasser: Dereli, Onur, Kuru, Nurdan, Akkoyun, Emrah, Bircan, Aylin, Tastan, Oznur, Adebali, Ogün
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Abstract Most algorithms that are used to predict the effects of variants rely on evolutionary conservation. However, a majority of such techniques compute evolutionary conservation by solely using the alignment of multiple sequences while overlooking the evolutionary context of substitution events. We had introduced PHACT, a scoring-based pathogenicity predictor for missense mutations that can leverage phylogenetic trees, in our previous study. By building on this foundation, we now propose PHACTboost, a gradient boosting tree–based classifier that combines PHACT scores with information from multiple sequence alignments, phylogenetic trees, and ancestral reconstruction. By learning from data, PHACTboost outperforms PHACT. Furthermore, the results of comprehensive experiments on carefully constructed sets of variants demonstrated that PHACTboost can outperform 40 prevalent pathogenicity predictors reported in the dbNSFP, including conventional tools, metapredictors, and deep learning–based approaches as well as more recent tools such as AlphaMissense, EVE, and CPT-1. The superiority of PHACTboost over these methods was particularly evident in case of hard variants for which different pathogenicity predictors offered conflicting results. We provide predictions of 215 million amino acid alterations over 20,191 proteins. PHACTboost is available at https://github.com/CompGenomeLab/PHACTboost. PHACTboost can improve our understanding of genetic diseases and facilitate more accurate diagnoses.
ISSN:0737-4038
1537-1719
1537-1719
DOI:10.1093/molbev/msae136