Pathogenicity classification of missense mutations based on deep generative model
Published in: | Computers in Biology and Medicine 2024-03, Vol.170, p.107980, Article 107980 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Missense mutations affect the function of human proteins and are closely associated with multiple acute and chronic diseases. Identifying disease-associated missense mutations and classifying their pathogenicity can provide insights into the genetic basis of disease and into protein function. This paper proposes MLAE (Method based on LSTM-Ladder AutoEncoder), a deep learning classification model built on the Variational AutoEncoder (VAE) framework that identifies disease-associated missense mutations and classifies their pathogenicity. MLAE addresses limitations of the VAE framework by introducing a Ladder structure combined with LSTM networks, which reduces the loss of original information as it passes through the network and makes learning more effective. In the experiments, MLAE classified all 27,572 possible missense variants of the three input proteins with an average classification AUC of 0.941, providing evidence that MLAE is effective in predicting pathogenicity. MLAE also produces multi-label classification results, with an average Hamming loss of 0.196, supporting the classification of complex variants. The proposed method offers an effective way to capture amino acid sequence information and accurately predict the pathogenicity of mutations, providing an analytical basis for the study and prevention of related diseases.
• MLAE studies association of 27,572 missense mutations with disease.
• MLAE introduces an extendable multi-label framework for missense mutations.
• This paper studies LSTM, CNN, and Ladder in processing biological sequence.
• MLAE aligns with ACMG and ClinVar, providing multi-label classification results.
• MLAE enhances the VAE architecture with LSTM and ladder structure. |
---|---|
ISSN: | 0010-4825 1879-0534 |
DOI: | 10.1016/j.compbiomed.2024.107980 |
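
The abstract reports two evaluation metrics: AUC for pathogenicity classification and Hamming loss for multi-label classification. As a reading aid, the sketch below shows how these two quantities are commonly computed. It is not the authors' evaluation code: the function names `hamming_loss` and `roc_auc` and all toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of (sample, label) entries where prediction and truth differ."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of samples")
    total = 0
    wrong = 0
    for t_row, p_row in zip(y_true, y_pred):
        if len(t_row) != len(p_row):
            raise ValueError("each sample must have the same number of labels")
        for t, p in zip(t_row, p_row):
            total += 1
            wrong += int(t != p)
    if total == 0:
        raise ValueError("no labels to evaluate")
    return wrong / total


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy multi-label data (hypothetical): 3 variants, 3 labels each.
toy_true = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
toy_pred = [[1, 0, 1], [0, 1, 1], [0, 1, 1]]
print(f"Hamming loss: {hamming_loss(toy_true, toy_pred):.3f}")

# Toy binary data (hypothetical): 1 = pathogenic, 0 = benign.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(f"AUC: {roc_auc(toy_labels, toy_scores):.2f}")
```

A lower Hamming loss is better (0 means every label of every variant is predicted correctly), while a higher AUC is better (1.0 means every pathogenic variant is scored above every benign one). The pairwise AUC shown here is quadratic in the number of samples, which is fine for illustration but slow for large variant sets.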