United Neighborhood Closeness Centrality and Orthology for Predicting Essential Proteins

Bibliographic Details
Published in:	IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2020-07, Vol. 17 (4), pp. 1451-1458
Main authors:	Li, Gaoshi; Li, Min; Wang, Jianxin; Li, Yaohang; Pan, Yi
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Computational Biology - methods; Computer applications; Databases, Protein; Drug development; E coli; Eigenvectors; EPOC; Escherichia coli Proteins - chemistry; Escherichia coli Proteins - metabolism; essential proteins; Experimental methods; Fuses; Gene expression; Information sources; Integrated circuit modeling; Models, Biological; Neighborhood closeness centrality; Neighborhoods; orthologous; Orthology; PPI network; Predictions; Protein interaction; Protein Interaction Maps; Proteins; Random walk; Research methodology; Saccharomyces cerevisiae Proteins - chemistry; Saccharomyces cerevisiae Proteins - metabolism

Description
Abstract:	Identifying essential proteins plays an important role in disease study, drug design, and understanding the minimal requirements for cellular life. Computational methods for essential protein discovery overcome the disadvantages of biological experimental methods, which are often time-consuming, expensive, and inefficient. Topological features of protein-protein interaction (PPI) networks are often used to design computational prediction methods, such as Degree Centrality (DC), Betweenness Centrality (BC), Closeness Centrality (CC), Subgraph Centrality (SC), Eigenvector Centrality (EC), Information Centrality (IC), and Neighborhood Centrality (NC). However, the prediction accuracy of these individual methods still leaves room for improvement. Studies show that additional information, such as orthologous relations, helps discover essential proteins, and many researchers have proposed methods that combine multiple information sources to improve prediction accuracy. In this study, we find that essential proteins appear in triangular structures in PPI networks significantly more often than nonessential ones. Based on this observation, we propose a novel pure centrality measure, Neighborhood Closeness Centrality (NCC). We then develop a new combination model, the Extended Pareto Optimality Consensus model (EPOC), to fuse NCC with orthology information, yielding a novel essential protein identification method, NCCO. Compared with seven classic centrality methods (DC, BC, IC, CC, SC, EC, and NC) and three consensus methods (PeC, ION, and CSC), NCCO shows clear advantages on S. cerevisiae and E. coli datasets. As a consensus method, EPOC also performs better than the random walk model.
ISSN:	1545-5963, 1557-9964
DOI:	10.1109/TCBB.2018.2889978
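
Illustrative note: the abstract reports that essential proteins sit in triangular structures of the PPI network more often than nonessential ones. The sketch below shows one simple way to count, for each protein, the triangles it takes part in, alongside plain Degree Centrality (DC) as a raw neighbor count. It is not the paper's NCC formula, which this record does not state. The function names and the toy edge list are assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj[u].add(v)
        adj[v].add(u)
    return adj


def triangle_counts(edges):
    """Count, per node, the triangles that node participates in.

    A triangle through a node is a pair of its neighbors that are
    themselves connected. This is a generic graph statistic, not NCC.
    """
    adj = build_adjacency(edges)
    counts = {}
    for node, nbrs in adj.items():
        counts[node] = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return counts


def degree_centrality(edges):
    """Unnormalized Degree Centrality (DC): number of distinct neighbors."""
    adj = build_adjacency(edges)
    return {node: len(nbrs) for node, nbrs in adj.items()}


# Hypothetical toy network: A, B, C form a triangle; D and E hang off C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]

if __name__ == "__main__":
    print(triangle_counts(toy_edges))
    print(degree_centrality(toy_edges))
```

On the toy network, D has the same degree as A and B but belongs to no triangle. That is the kind of distinction a triangle-aware measure can draw and degree alone cannot.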