SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM

Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to its key role in various biological processes. These processes are highly concerned with DNA-binding protein types. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA...

Detailed description

Bibliographic details
Published in: Analytical biochemistry, 2020-01, Vol. 589, p. 113494-113494, Article 113494
Main authors: Ali, Farman; Arif, Muhammad; Khan, Zaheer Ullah; Kabir, Muhammad; Ahmed, Saeed; Yu, Dong-Jun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 113494
container_issue
container_start_page 113494
container_title Analytical biochemistry
container_volume 589
creator Ali, Farman
Arif, Muhammad
Khan, Zaheer Ullah
Kabir, Muhammad
Ahmed, Saeed
Yu, Dong-Jun
description Identification of DNA-binding proteins (DNA-BPs) is an active topic in protein science because of their key role in various biological processes. These processes depend strongly on the type of DNA-binding protein. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs are mainly involved in DNA recombination, replication, and repair, while DSBs regulate transcription, DNA cleavage, and chromosome packaging. Despite this significance, few methods have been proposed for discriminating SSBs from DSBs, so more predictors with favorable performance are needed. In this work, we present a predictor called SDBP-Pred with a novel feature descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in the PSSM via a K-segmentation strategy and the global features by applying the notion of the consensus sequence. The resulting feature vector is then input to a support vector machine (SVM) with linear, polynomial, and radial basis function (RBF) kernels. Our model with SVM-RBF achieved higher accuracies than the recent method on three tests: jackknife, 10-fold cross-validation, and an independent test. These results indicate that SDBP-Pred outperforms existing studies in the literature. A novel sequence-based predictor called SDBP-Pred is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by a descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM), and the classification is performed with a support vector machine.
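The description names two ideas: K-segmentation of the PSSM for local features and a consensus sequence for global features. The record does not give the formulas, so the following is only a minimal sketch under stated assumptions: the PSSM is an L x 20 list of score rows, columns follow the common `ARNDCQEGHILKMFPSTWYV` order, segments are contiguous row blocks averaged per column, and the consensus residue at each position is the highest-scoring column. The function names `k_segment_features`, `consensus_sequence`, and `composition` are hypothetical and are not taken from SDBP-Pred.

```python
# Illustrative sketch only: the averaging scheme, column order, and all
# function names are assumptions, not the published CSKS-PSSM definition.

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # assumed PSSM column order


def k_segment_features(pssm, k):
    """Split the PSSM rows into k contiguous blocks and average each
    column within each block, giving k * 20 local features."""
    n = len(pssm)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of PSSM rows")
    features = []
    for s in range(k):
        # Integer arithmetic keeps blocks contiguous and non-empty.
        start = s * n // k
        end = (s + 1) * n // k
        block = pssm[start:end]
        for col in range(len(pssm[0])):
            features.append(sum(row[col] for row in block) / len(block))
    return features


def consensus_sequence(pssm):
    """Build a consensus sequence by taking, at every position, the
    residue whose PSSM score is highest (first one wins on ties)."""
    return "".join(
        AMINO_ACIDS[max(range(len(row)), key=row.__getitem__)]
        for row in pssm
    )


def composition(sequence):
    """Global feature: fraction of each of the 20 residues in a sequence."""
    return [sequence.count(a) / len(sequence) for a in AMINO_ACIDS]


if __name__ == "__main__":
    # Toy 4-residue PSSM: row i scores 5 in column i and 0 elsewhere.
    toy = [[5 if c == i else 0 for c in range(20)] for i in range(4)]
    local = k_segment_features(toy, k=2)
    cons = consensus_sequence(toy)
    vector = local + composition(cons)
    print(cons, len(vector))
```

In this sketch the final vector simply concatenates the local block averages with the residue composition of the consensus sequence; how the original work combines the two parts is not stated in this record.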
• Designed a novel predictor SDBP-Pred for discrimination of SSBs and DSBs.
• The local discriminative information from PSSM was discovered by K-segmentation.
• Consensus sequence notion was applied to extract the global feature.
• SVM-LNR, SVM-POL, and SVM-RBF used as classification algorithms.
• Our innovative feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate.
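The description reports accuracy under jackknife and 10-fold tests. As a rough illustration of how those two protocols differ, the sketch below generates the train/test index splits for each; jackknife is treated as leave-one-out, that is, k-fold with one sample per fold. The helper names `k_fold_splits`, `jackknife_splits`, and `accuracy` are hypothetical, and the folds are contiguous and unshuffled for simplicity, which is an assumption rather than the protocol used in the paper.

```python
# Illustrative sketch only: helper names and the unshuffled, contiguous
# folds are assumptions, not the evaluation code behind SDBP-Pred.


def k_fold_splits(n_samples, n_folds):
    """Yield (train_indices, test_indices) for near-equal contiguous folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    for f in range(n_folds):
        start = f * n_samples // n_folds
        end = (f + 1) * n_samples // n_folds
        test = list(range(start, end))
        train = list(range(0, start)) + list(range(end, n_samples))
        yield train, test


def jackknife_splits(n_samples):
    """Jackknife (leave-one-out): each sample is the test set exactly once."""
    return k_fold_splits(n_samples, n_samples)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    for train, test in k_fold_splits(10, 5):
        print("train:", train, "test:", test)
    print(len(list(jackknife_splits(10))), "jackknife rounds")
```

With a real classifier, each round would fit the model on the `train` indices and score it on the `test` indices; the per-round predictions can then be pooled and passed to `accuracy`.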
doi_str_mv 10.1016/j.ab.2019.113494
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2312807835</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0003269719300703</els_id><sourcerecordid>2312807835</sourcerecordid><originalsourceid>FETCH-LOGICAL-c350t-d8de748b216df2d90bfbd87773c9c476139946198adb199f455c507061c3793d3</originalsourceid><addsrcrecordid>eNp1kcFu1DAQhi0Eokvhzgn5yMXLOE7suLfSAkUUWGnhbMX2ZOXVrl1iB7WPwtuSNAVx4TKWRt989vgn5CWHNQcu3-zXnV1XwPWac1Hr-hFZcdCSgQD9mKwAQLBKanVCnuW8B-C8buRTciK41KJV1Yr82l6-3bDNgP6MzjW4ElKkqac5xN0BWS5DFz16OlXq02j_7V1-OWc2RD-h9GZIBUPM1N5RvC24dF2KGWMeM834Y8To8F70iWXcHTGW7v66WVhwFzDTEEuim-3283PypO8OGV88nKfk-_t33y6u2PXXDx8vzq-ZEw0U5luPqm5txaXvK6_B9ta3SinhtKuV5ELrWnLddt5yrfu6aVwDCiR3QmnhxSl5vXinBaYX5mKOITs8HLqIacymErxqQbWimVBYUDeknAfszc0Qjt1wZziYORCzN501cyBmCWQaefVgH-0R_d-BPwlMwNkC4LTjz4CDyS7M_-TDgK4Yn8L_7b8Bxg-bwg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2312807835</pqid></control><display><type>article</type><title>SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM</title><source>Elsevier ScienceDirect Journals</source><creator>Ali, Farman ; Arif, Muhammad ; Khan, Zaheer Ullah ; Kabir, Muhammad ; Ahmed, Saeed ; Yu, Dong-Jun</creator><creatorcontrib>Ali, Farman ; Arif, Muhammad ; Khan, Zaheer Ullah ; Kabir, Muhammad ; Ahmed, Saeed ; Yu, Dong-Jun</creatorcontrib><description>Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to its key role in various biological processes. These processes are highly concerned with DNA-binding protein types. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). 
SSBs mainly involved in DNA recombination, replication, and repair, while DSBs regulate transcription process, DNA cleavage, and chromosome packaging. In spite of the aforementioned significance, few methods have been proposed for discrimination of SSBs and DSBs. Therefore, more predictors with favorable performance are indispensable. In this work, we present an innovative predictor, called SDBP-Pred with a novel feature descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in PSSM via K-segmentation strategy and the global potential features by applying the notion of the consensus sequence. The obtained feature vector then input to support vector machine (SVM) with linear, polynomial and radial base function (RBF) kernels. Our model with SVM-RBF achieved the highest accuracies on three tests namely jackknife, 10-fold, and independent tests, respectively than the recent method. The obtained prediction results illustrate the superlative prediction performance of SDBP-Pred over existing studies in the literature so far. A novel sequence-based predictor, called SDBP-Pred is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM) and the classification is performed with support vector machine. 
[Display omitted] •Designed a novel predictor SDBP-Pred for discrimination of SSBs and DSBs.•The local discriminative information from PSSM was discovered by K-segmentation.•Consensus sequence notion was applied to extract the global feature.•SVM-LNR, SVM-POL, and SVM-RBF used as classification algorithms.•Our innovative feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate.</description><identifier>ISSN: 0003-2697</identifier><identifier>EISSN: 1096-0309</identifier><identifier>DOI: 10.1016/j.ab.2019.113494</identifier><identifier>PMID: 31693872</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Consensus sequence ; DNA-binding proteins ; Double-stranded DNA-binding proteins ; K-segmentation ; Single-stranded DNA-binding proteins ; Support vector machine</subject><ispartof>Analytical biochemistry, 2020-01, Vol.589, p.113494-113494, Article 113494</ispartof><rights>2019 Elsevier Inc.</rights><rights>Copyright © 2019 Elsevier Inc. 
All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c350t-d8de748b216df2d90bfbd87773c9c476139946198adb199f455c507061c3793d3</citedby><cites>FETCH-LOGICAL-c350t-d8de748b216df2d90bfbd87773c9c476139946198adb199f455c507061c3793d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.ab.2019.113494$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,776,780,3536,27903,27904,45974</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31693872$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ali, Farman</creatorcontrib><creatorcontrib>Arif, Muhammad</creatorcontrib><creatorcontrib>Khan, Zaheer Ullah</creatorcontrib><creatorcontrib>Kabir, Muhammad</creatorcontrib><creatorcontrib>Ahmed, Saeed</creatorcontrib><creatorcontrib>Yu, Dong-Jun</creatorcontrib><title>SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM</title><title>Analytical biochemistry</title><addtitle>Anal Biochem</addtitle><description>Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to its key role in various biological processes. These processes are highly concerned with DNA-binding protein types. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs mainly involved in DNA recombination, replication, and repair, while DSBs regulate transcription process, DNA cleavage, and chromosome packaging. In spite of the aforementioned significance, few methods have been proposed for discrimination of SSBs and DSBs. Therefore, more predictors with favorable performance are indispensable. 
In this work, we present an innovative predictor, called SDBP-Pred with a novel feature descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in PSSM via K-segmentation strategy and the global potential features by applying the notion of the consensus sequence. The obtained feature vector then input to support vector machine (SVM) with linear, polynomial and radial base function (RBF) kernels. Our model with SVM-RBF achieved the highest accuracies on three tests namely jackknife, 10-fold, and independent tests, respectively than the recent method. The obtained prediction results illustrate the superlative prediction performance of SDBP-Pred over existing studies in the literature so far. A novel sequence-based predictor, called SDBP-Pred is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM) and the classification is performed with support vector machine. 
[Display omitted] •Designed a novel predictor SDBP-Pred for discrimination of SSBs and DSBs.•The local discriminative information from PSSM was discovered by K-segmentation.•Consensus sequence notion was applied to extract the global feature.•SVM-LNR, SVM-POL, and SVM-RBF used as classification algorithms.•Our innovative feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate.</description><subject>Consensus sequence</subject><subject>DNA-binding proteins</subject><subject>Double-stranded DNA-binding proteins</subject><subject>K-segmentation</subject><subject>Single-stranded DNA-binding proteins</subject><subject>Support vector machine</subject><issn>0003-2697</issn><issn>1096-0309</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp1kcFu1DAQhi0Eokvhzgn5yMXLOE7suLfSAkUUWGnhbMX2ZOXVrl1iB7WPwtuSNAVx4TKWRt989vgn5CWHNQcu3-zXnV1XwPWac1Hr-hFZcdCSgQD9mKwAQLBKanVCnuW8B-C8buRTciK41KJV1Yr82l6-3bDNgP6MzjW4ElKkqac5xN0BWS5DFz16OlXq02j_7V1-OWc2RD-h9GZIBUPM1N5RvC24dF2KGWMeM834Y8To8F70iWXcHTGW7v66WVhwFzDTEEuim-3283PypO8OGV88nKfk-_t33y6u2PXXDx8vzq-ZEw0U5luPqm5txaXvK6_B9ta3SinhtKuV5ELrWnLddt5yrfu6aVwDCiR3QmnhxSl5vXinBaYX5mKOITs8HLqIacymErxqQbWimVBYUDeknAfszc0Qjt1wZziYORCzN501cyBmCWQaefVgH-0R_d-BPwlMwNkC4LTjz4CDyS7M_-TDgK4Yn8L_7b8Bxg-bwg</recordid><startdate>20200115</startdate><enddate>20200115</enddate><creator>Ali, Farman</creator><creator>Arif, Muhammad</creator><creator>Khan, Zaheer Ullah</creator><creator>Kabir, Muhammad</creator><creator>Ahmed, Saeed</creator><creator>Yu, Dong-Jun</creator><general>Elsevier Inc</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20200115</creationdate><title>SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM</title><author>Ali, Farman ; Arif, Muhammad ; Khan, Zaheer Ullah ; 
Kabir, Muhammad ; Ahmed, Saeed ; Yu, Dong-Jun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c350t-d8de748b216df2d90bfbd87773c9c476139946198adb199f455c507061c3793d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Consensus sequence</topic><topic>DNA-binding proteins</topic><topic>Double-stranded DNA-binding proteins</topic><topic>K-segmentation</topic><topic>Single-stranded DNA-binding proteins</topic><topic>Support vector machine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ali, Farman</creatorcontrib><creatorcontrib>Arif, Muhammad</creatorcontrib><creatorcontrib>Khan, Zaheer Ullah</creatorcontrib><creatorcontrib>Kabir, Muhammad</creatorcontrib><creatorcontrib>Ahmed, Saeed</creatorcontrib><creatorcontrib>Yu, Dong-Jun</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Analytical biochemistry</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ali, Farman</au><au>Arif, Muhammad</au><au>Khan, Zaheer Ullah</au><au>Kabir, Muhammad</au><au>Ahmed, Saeed</au><au>Yu, Dong-Jun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM</atitle><jtitle>Analytical biochemistry</jtitle><addtitle>Anal Biochem</addtitle><date>2020-01-15</date><risdate>2020</risdate><volume>589</volume><spage>113494</spage><epage>113494</epage><pages>113494-113494</pages><artnum>113494</artnum><issn>0003-2697</issn><eissn>1096-0309</eissn><abstract>Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to its key role in various biological processes. 
These processes are highly concerned with DNA-binding protein types. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs mainly involved in DNA recombination, replication, and repair, while DSBs regulate transcription process, DNA cleavage, and chromosome packaging. In spite of the aforementioned significance, few methods have been proposed for discrimination of SSBs and DSBs. Therefore, more predictors with favorable performance are indispensable. In this work, we present an innovative predictor, called SDBP-Pred with a novel feature descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in PSSM via K-segmentation strategy and the global potential features by applying the notion of the consensus sequence. The obtained feature vector then input to support vector machine (SVM) with linear, polynomial and radial base function (RBF) kernels. Our model with SVM-RBF achieved the highest accuracies on three tests namely jackknife, 10-fold, and independent tests, respectively than the recent method. The obtained prediction results illustrate the superlative prediction performance of SDBP-Pred over existing studies in the literature so far. A novel sequence-based predictor, called SDBP-Pred is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by descriptor, named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM) and the classification is performed with support vector machine. 
[Display omitted] •Designed a novel predictor SDBP-Pred for discrimination of SSBs and DSBs.•The local discriminative information from PSSM was discovered by K-segmentation.•Consensus sequence notion was applied to extract the global feature.•SVM-LNR, SVM-POL, and SVM-RBF used as classification algorithms.•Our innovative feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>31693872</pmid><doi>10.1016/j.ab.2019.113494</doi><tpages>1</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0003-2697
ispartof Analytical biochemistry, 2020-01, Vol.589, p.113494-113494, Article 113494
issn 0003-2697
1096-0309
language eng
recordid cdi_proquest_miscellaneous_2312807835
source Elsevier ScienceDirect Journals
subjects Consensus sequence
DNA-binding proteins
Double-stranded DNA-binding proteins
K-segmentation
Single-stranded DNA-binding proteins
Support vector machine
title SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T00%3A55%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SDBP-Pred:%20Prediction%20of%20single-stranded%20and%20double-stranded%20DNA-binding%20proteins%20by%20extending%20consensus%20sequence%20and%20K-segmentation%20strategies%20into%20PSSM&rft.jtitle=Analytical%20biochemistry&rft.au=Ali,%20Farman&rft.date=2020-01-15&rft.volume=589&rft.spage=113494&rft.epage=113494&rft.pages=113494-113494&rft.artnum=113494&rft.issn=0003-2697&rft.eissn=1096-0309&rft_id=info:doi/10.1016/j.ab.2019.113494&rft_dat=%3Cproquest_cross%3E2312807835%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2312807835&rft_id=info:pmid/31693872&rft_els_id=S0003269719300703&rfr_iscdi=true