Decision tree classifier based on topological characteristics of subgraph for the mining of protein complexes from large scale PPI networks

Bibliographic details
Published in: Computational biology and chemistry, 2023-10, Vol. 106, Article 107935
Main authors: Sahoo, Tushar Ranjan; Patra, Sabyasachi; Vipsita, Swati
Format: Article
Language: English
Description
Abstract: The growing accessibility of large-scale protein interaction data demands extensive research to understand cell organization and its functioning at the network level. Bioinformatics and data mining researchers have extensively studied network clustering to examine the structural and operational features of protein-protein interaction (PPI) networks. Over the past two decades, numerous studies have shown that clustering PPI networks is useful for identifying functional modules, understanding the roles of previously unknown proteins, and other purposes. Protein complexes are among the essential cellular components that carry out biological activities. Inferring protein complexes has been made more accessible by experimental approaches. We propose a novel method that integrates a classification model with local topological information, which makes complex detection more reliable and efficient. This article describes a decision tree classifier based on topological characteristics of the subgraph for mining protein complexes. The proposed graph-based algorithm is an effective and efficient way to identify protein complexes from large-scale PPI networks. The algorithm is evaluated on yeast and human protein-protein interaction networks from the Database of Interacting Proteins (DIP) and the Biological General Repository for Interaction Datasets (BioGRID), using widely accepted benchmark protein complexes from the comprehensive resource of mammalian protein complexes (CORUM) and the comprehensive catalogue of yeast protein complexes (CYC2008). The results show that the method can outperform the best-performing supervised, semi-supervised, and unsupervised approaches to detecting protein complexes.
• Complex prediction by decision tree and topological characteristics of subgraph (DPMC).
• An effective and efficient way to predict complexes from large-scale PPI networks.
• Datasets: DIP and BioGRID for PPI network; CORUM and CYC2008 for benchmark complexes.
• The proposed algorithm outperforms MCODE, DPClus, RNSC, COACH, and ClusterONE.
• The proposed algorithm is at par with CART, ClusterSS, BHHO, PCPCRO, PROCOP and ESCC.
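The abstract does not say which topological characteristics the classifier uses, so the following is only a minimal sketch of the general idea: computing a few common subgraph descriptors that a decision tree learner could take as input. The feature set (size, density, mean degree, maximum degree), the function name `subgraph_features`, and the toy protein names are all assumptions for illustration and are not taken from the article.

```python
def subgraph_features(edges, nodes):
    """Return simple topological features of the subgraph induced by `nodes`.

    The chosen features are illustrative assumptions, not the article's list.
    """
    nodes = set(nodes)
    # Keep only undirected, non-self-loop edges with both endpoints in `nodes`.
    inner = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    n = len(nodes)
    m = len(inner)

    # Degree of each node counted inside the induced subgraph only.
    degree = {v: 0 for v in nodes}
    for edge in inner:
        for v in edge:
            degree[v] += 1

    max_edges = n * (n - 1) / 2  # edge count of a complete graph on n nodes
    return {
        "size": n,
        "density": m / max_edges if max_edges else 0.0,
        "mean_degree": 2 * m / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


# Toy PPI network with hypothetical protein names:
# a triangle A-B-C plus a pendant protein D attached to C.
ppi_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

triangle = subgraph_features(ppi_edges, ["A", "B", "C"])  # dense candidate
path = subgraph_features(ppi_edges, ["B", "C", "D"])      # sparser candidate
```

Feature dictionaries like these, computed for known benchmark complexes and for non-complex subgraphs, could serve as training rows for any decision tree implementation; how the article builds its training set and tree is not described in this record.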
ISSN: 1476-9271, 1476-928X
DOI: 10.1016/j.compbiolchem.2023.107935