PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure

Bibliographic details
Published in: Bioinformatics Advances 2022, Vol. 2 (1), p. vbac058
Main authors: Townsley, Thomas D; Wilson, James T; Akers, Harrison; Bryant, Timothy; Cordova, Salvador; Wallace, T L; Durston, Kirk K; Deweese, Joseph E
Format: Article
Language: English
Description
Summary:
Motivation: AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins.
Results: We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal interdependent sites, clusters them into pairwise (second-order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II, including between the unstructured C-terminal domain and the N-terminal domain.
Availability and implementation: https://github.com/jdeweeselab/psicalc-package
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
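The pairwise (second-order) case mentioned in the summary can be illustrated with a small sketch. This is not PSICalc's implementation and does not use its k-modes clustering. It swaps in plain mutual information between alignment columns as one common way to score how strongly two sites co-vary, whether or not they are close in the 3D structure. The function names and the toy alignment are hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(msa, j):
    """Return column j of an alignment given as equal-length strings."""
    return [seq[j] for seq in msa]


def entropy(symbols):
    """Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_a, col_b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two alignment columns."""
    joint = list(zip(col_a, col_b))
    return entropy(col_a) + entropy(col_b) - entropy(joint)


def rank_column_pairs(msa):
    """Score every pair of columns; return (score, i, j), strongest first."""
    width = len(msa[0])
    cols = [column(msa, j) for j in range(width)]
    scored = [
        (mutual_information(cols[i], cols[j]), i, j)
        for i, j in combinations(range(width), 2)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy alignment: columns 0 and 1 always change together,
# column 2 is fully conserved, column 3 varies independently of 0 and 1.
toy_msa = [
    "AKLD",
    "AKLE",
    "GRLD",
    "GRLE",
]

ranking = rank_column_pairs(toy_msa)
top_score, top_i, top_j = ranking[0]
print(f"strongest pair: columns {top_i} and {top_j} ({top_score:.2f} bits)")
```

In this toy alignment only columns 0 and 1 share information, so that pair ranks first. Real alignments would also need gap handling, sequence weighting and a correction for small-sample bias, none of which is attempted here.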
ISSN: 2635-0041
DOI: 10.1093/bioadv/vbac058