A percolation theory analysis of continuous functional paths in protein sequence space affirms previous insights on the optimization of proteins for adaptability
Published in: | PloS one 2024-12, Vol.19 (12), p.e0314929 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | A key question in protein evolution and protein engineering is the prevalence of evolutionary paths between distinct proteins. An evolutionary path corresponds to a continuous path of functional sequences in sequence space leading from one protein to another. Natural selection could direct a mutating coding region in DNA along a continuous functional path (CFP), so a new protein could arise far more easily than if a coding region were mutating randomly without any constraints. The distribution and length of CFPs undergird theories on the origin of natural proteins and strategies for engineering artificial proteins. This study examined the distribution of long CFPs within the framework of percolation theory, which addresses the proportion of randomly filled sites in a lattice above which long continuous paths of neighboring filled sites become common (the percolation threshold). It also used a simulation to demonstrate that the percolation threshold in protein sequence space approximates the reciprocal of the average number of protein variants that could result from a single mutation. For diverse proteins, the study calculated the ratio of the percolation threshold to the proportion of sequences of a protein's length that are reported to perform that protein's function. This ratio measures the biasing in the distribution of functional sequences required for evolutionary paths to exist, so it provides a means to quantify the specificity in protein sequence and structure required for a protein to develop new catalytic functions. The consistently high ratio demonstrates that CFPs can only connect distinct proteins if the biasing in the distribution of functional sequences in sequence space is often extremely large. Regions in sequence space are identified where the biasing is sufficient to allow for extensive CFPs.
The calculated levels of required biasing and the identified regions of high biasing reinforce the conclusion of previous studies that some proteins are highly optimized, so mutations can enable or enhance catalytic functions while maintaining the protein's structure. The conclusions of this study also challenge the results of a previous application of percolation theory to sequence space that did not properly incorporate the percolation threshold. Steps are outlined for integrating the percolation threshold and the biasing measure into studies of protein sequence space. |
---|---|
ISSN: | 1932-6203 |
DOI: | 10.1371/journal.pone.0314929 |
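
The two quantities described in the summary, the percolation threshold approximation and the biasing ratio, can be illustrated with a minimal Python sketch. It is not the authors' simulation or code. It rests on assumptions not stated in this record: a "single mutation" is taken to mean one amino acid substitution (so a sequence of length L has 19L neighbors), the ratio is taken as threshold divided by functional fraction, and the functional fraction of 10^-11 is a placeholder rather than a value from the article. The function names `percolation_threshold`, `log10_biasing_ratio`, and `cluster_size` are hypothetical.

```python
import math
import random
from collections import deque


def percolation_threshold(length, alphabet_size=20):
    """Approximate threshold as 1 / (number of single-substitution neighbors).

    Assumption: each position can change to any of the other letters,
    giving length * (alphabet_size - 1) neighbors per sequence.
    """
    return 1.0 / (length * (alphabet_size - 1))


def log10_biasing_ratio(length, log10_functional_fraction, alphabet_size=20):
    """log10 of (percolation threshold / fraction of functional sequences).

    Working in log10 avoids underflow for very small functional fractions.
    """
    threshold = percolation_threshold(length, alphabet_size)
    return math.log10(threshold) - log10_functional_fraction


def cluster_size(length, alphabet_size, p, max_size, rng):
    """Grow a connected cluster of functional sequences from one start sequence.

    Each sequence is independently functional with probability p (no biasing).
    Sites are sampled lazily on first visit; growth stops at max_size.
    """
    start = tuple([0] * length)
    status = {start: True}  # the start sequence is assumed to be functional
    queue = deque([start])
    size = 1
    while queue and size < max_size:
        seq = queue.popleft()
        for pos in range(length):
            for letter in range(alphabet_size):
                if letter == seq[pos]:
                    continue
                neighbor = seq[:pos] + (letter,) + seq[pos + 1:]
                if neighbor in status:
                    continue
                status[neighbor] = rng.random() < p
                if status[neighbor]:
                    size += 1
                    queue.append(neighbor)
    return min(size, max_size)


if __name__ == "__main__":
    # Placeholder inputs: a 100-residue protein and a functional fraction of 1e-11.
    print("threshold:", percolation_threshold(100))
    print("log10 ratio:", log10_biasing_ratio(100, -11.0))

    # Toy space (length 8, 4 letters): compare cluster growth below and above
    # the approximate threshold of 1 / (8 * 3).
    rng = random.Random(0)
    p_c = percolation_threshold(8, alphabet_size=4)
    for p in (0.25 * p_c, 4.0 * p_c):
        sizes = [cluster_size(8, 4, p, 200, rng) for _ in range(50)]
        print("p =", round(p, 4), "mean cluster size:", sum(sizes) / len(sizes))
```

In the toy space, clusters grown well below the approximate threshold would be expected to stay small, while those grown well above it more often reach the size cap. The toy alphabet and length are chosen only to keep the run short and say nothing about real proteins.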