Decision tree classifier based on topological characteristics of subgraph for the mining of protein complexes from large scale PPI networks
Published in: | Computational biology and chemistry 2023-10, Vol.106, p.107935-107935, Article 107935 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | The growing accessibility of large-scale protein interaction data demands extensive research to understand cell organization and its functioning at the network level. Bioinformatics and data mining researchers have extensively studied network clustering to examine the structural and operational features of protein–protein interaction (PPI) networks. Over the past two decades, clustering PPI networks has proven useful in numerous studies for identifying functional modules, understanding the roles of previously unknown proteins, and other purposes. Protein complexes are among the essential cellular components that carry out biological activities. Inferring protein complexes has been made more accessible by experimental approaches. We propose a novel method that integrates a classification model with local topological information, which makes it more reliable and efficient. This article describes a decision tree classifier based on topological characteristics of the subgraph for mining protein complexes. The proposed graph-based algorithm is an effective and efficient way to identify protein complexes from large-scale PPI networks. The performance of the proposed algorithm is evaluated on protein–protein interaction networks of yeast and human from the Database of Interacting Proteins (DIP) and the Biological General Repository for Interaction Datasets (BioGRID), using widely accepted benchmark protein complexes from the comprehensive resource of mammalian protein complexes (CORUM) and the comprehensive catalogue of yeast protein complexes (CYC2008). The outcomes demonstrate that our method can outperform the best-performing supervised, semi-supervised, and unsupervised approaches to detecting protein complexes.
• Complex prediction by decision tree and topological characteristics of subgraph (DPMC).
• An effective and efficient way to predict complexes from large-scale PPI networks.
• Datasets: DIP and BioGRID for PPI networks; CORUM and CYC2008 for benchmark complexes.
• The proposed algorithm outperforms MCODE, DPClus, RNSC, COACH, and ClusterONE.
• The proposed algorithm is on par with CART, ClusterSS, BHHO, PCPCRO, PROCOP and ESCC. |
---|---|
ISSN: | 1476-9271 1476-928X |
DOI: | 10.1016/j.compbiolchem.2023.107935 |
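
The summary describes classifying candidate subgraphs by their topological characteristics but does not list the features used. As a rough illustration only, the sketch below computes a few common subgraph descriptors (size, density, degree statistics, mean clustering coefficient) in plain Python. The function name `subgraph_features`, the chosen features, and the toy edge list are all assumptions made for this example, not the article's method. Feature vectors of this kind could then be passed to a decision tree classifier trained on benchmark complexes.

```python
from itertools import combinations


def subgraph_features(edges, nodes):
    """Topological features of the subgraph induced by `nodes`.

    Hypothetical feature set for illustration; the article's actual
    features are not given in the summary.
    """
    nodes = set(nodes)
    # Adjacency restricted to the candidate subgraph (undirected, no self-loops).
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            adj[u].add(v)
            adj[v].add(u)

    n = len(nodes)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge counted twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    degrees = [len(adj[x]) for x in nodes]

    def clustering(x):
        # Fraction of neighbour pairs of x that are themselves connected.
        k = len(adj[x])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[x], 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    return {
        "size": n,
        "edges": m,
        "density": density,
        "mean_degree": sum(degrees) / n if n else 0.0,
        "max_degree": max(degrees, default=0),
        "mean_clustering": sum(clustering(x) for x in nodes) / n if n else 0.0,
    }


# Toy PPI network (made-up protein names): a triangle A-B-C with a pendant D.
ppi_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
triangle = subgraph_features(ppi_edges, ["A", "B", "C"])
path = subgraph_features(ppi_edges, ["B", "C", "D"])
```

In this toy network the fully connected triangle gets the highest possible density and clustering, while the path B-C-D scores lower on both, which is the kind of contrast a tree-based classifier could split on.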