SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM
Published in: | Analytical biochemistry 2020-01, Vol.589, p.113494-113494, Article 113494 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Keywords: | |
Online access: | Full text |
Abstract: | Identification of DNA-binding proteins (DNA-BPs) is an active topic in protein science because these proteins play key roles in various biological processes. These processes depend strongly on the type of DNA-binding protein. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs are mainly involved in DNA recombination, replication, and repair, while DSBs regulate transcription, DNA cleavage, and chromosome packaging. Despite this significance, few methods have been proposed for discriminating SSBs from DSBs, so more predictors with favorable performance are needed. In this work, we present a predictor called SDBP-Pred, built on a novel feature descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in the PSSM via a K-segmentation strategy and the global features by applying the notion of the consensus sequence. The resulting feature vector is then input to a support vector machine (SVM) with linear, polynomial, and radial basis function (RBF) kernels. Our model with SVM-RBF achieved higher accuracies than the most recent method on all three tests: jackknife, 10-fold cross-validation, and independent testing. These results indicate that SDBP-Pred outperforms the existing methods in the literature.
A novel sequence-based predictor, called SDBP-Pred, is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by a descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM), and the classification is performed with a support vector machine.
• Designed a novel predictor, SDBP-Pred, for discrimination of SSBs and DSBs.
• The local discriminative information in the PSSM was extracted by K-segmentation.
• The consensus sequence notion was applied to extract the global feature.
• SVM-LNR, SVM-POL, and SVM-RBF were used as classification algorithms.
• Our feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate. |
---|---|
ISSN: | 0003-2697 1096-0309 |
DOI: | 10.1016/j.ab.2019.113494 |
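
The abstract names two feature families (local features from K-segmentation of the PSSM, global features from a consensus sequence) without giving their formulas. The Python sketch below illustrates one plausible reading and is not the authors' CSKS-PSSM definition. It assumes that the PSSM is a list of rows with 20 scores each in the column order `ARNDCQEGHILKMFPSTWYV`, that K-segmentation means splitting the rows into K contiguous blocks and averaging each column per block, and that the consensus sequence takes the highest-scoring residue at each position and is summarized by its amino-acid composition. All function names are hypothetical.

```python
# Assumed PSSM column order (the common PSI-BLAST convention).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def k_segment_features(pssm, k):
    """Local features: split the PSSM rows into k contiguous segments
    and return the column means of each segment (k * 20 values)."""
    n = len(pssm)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and the number of PSSM rows")
    features = []
    for s in range(k):
        # Integer boundaries keep segments contiguous and non-empty.
        segment = pssm[s * n // k:(s + 1) * n // k]
        for col in range(len(AMINO_ACIDS)):
            features.append(sum(row[col] for row in segment) / len(segment))
    return features


def consensus_sequence(pssm):
    """Consensus residue per position: the column with the highest score
    (ties resolve to the first such column)."""
    return "".join(
        AMINO_ACIDS[max(range(len(row)), key=row.__getitem__)] for row in pssm
    )


def consensus_composition(pssm):
    """Global features (assumed form): amino-acid composition of the
    consensus sequence (20 values that sum to 1)."""
    seq = consensus_sequence(pssm)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def csks_like_features(pssm, k):
    """Concatenate local and global features into one vector of length
    k * 20 + 20."""
    return k_segment_features(pssm, k) + consensus_composition(pssm)


if __name__ == "__main__":
    # Toy 4-residue PSSM; real matrices come from a PSI-BLAST search.
    toy = [[0] * 20 for _ in range(4)]
    toy[0][0], toy[1][0], toy[2][1], toy[3][1] = 5, 3, 4, 2
    print(consensus_sequence(toy))
    print(len(csks_like_features(toy, k=2)))
```

A vector of this kind could then be passed to any SVM implementation with a linear, polynomial, or RBF kernel, which is the classification step the abstract describes.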