DeMaSk: a deep mutational scanning substitution matrix and its use for variant impact prediction
Abstract Motivation Accurately predicting the quantitative impact of a substitution on a protein’s molecular function would be a great aid in understanding the effects of observed genetic variants across populations. While this remains a challenging task, new approaches can leverage data from the in...
Gespeichert in:
Veröffentlicht in: | Bioinformatics (Oxford, England) England), 2021-04, Vol.36 (22-23), p.5322-5329 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Abstract
Motivation
Accurately predicting the quantitative impact of a substitution on a protein’s molecular function would be a great aid in understanding the effects of observed genetic variants across populations. While this remains a challenging task, new approaches can leverage data from the increasing numbers of comprehensive deep mutational scanning (DMS) studies that systematically mutate proteins and measure fitness.
Results
We introduce DeMaSk, an intuitive and interpretable method based only upon DMS datasets and sequence homologs that predicts the impact of missense mutations within any protein. DeMaSk first infers a directional amino acid substitution matrix from DMS datasets and then fits a linear model that combines these substitution scores with measures of per-position evolutionary conservation and variant frequency across homologs. Despite its simplicity, DeMaSk has state-of-the-art performance in predicting the impact of amino acid substitutions, and can easily and rapidly be applied to any protein sequence.
Availability and implementation
https://demask.princeton.edu generates fitness impact predictions and visualizations for any user-submitted protein sequence.
Supplementary information
Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803 1367-4811 |
DOI: | 10.1093/bioinformatics/btaa1030 |