SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM
Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to its key role in various biological processes. These processes are highly concerned with DNA-binding protein types. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA...
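Saved in:
Published in: | Analytical biochemistry 2020-01, Vol.589, p.113494-113494, Article 113494 |
---|---|
Main authors: | Ali, Farman; Arif, Muhammad; Khan, Zaheer Ullah; Kabir, Muhammad; Ahmed, Saeed; Yu, Dong-Jun |
Format: | Article |
Language: | eng |
Subjects: | Consensus sequence; DNA-binding proteins; Double-stranded DNA-binding proteins; K-segmentation; Single-stranded DNA-binding proteins; Support vector machine |
Online access: | Full text |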
container_end_page | 113494 |
---|---|
container_issue | |
container_start_page | 113494 |
container_title | Analytical biochemistry |
container_volume | 589 |
creator | Ali, Farman; Arif, Muhammad; Khan, Zaheer Ullah; Kabir, Muhammad; Ahmed, Saeed; Yu, Dong-Jun |
description | Identification of DNA-binding proteins (DNA-BPs) is a hot issue in protein science due to their key role in various biological processes. These processes depend closely on the type of DNA-binding protein. DNA-BPs are classified into single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). SSBs are mainly involved in DNA recombination, replication, and repair, while DSBs regulate the transcription process, DNA cleavage, and chromosome packaging. Despite this significance, few methods have been proposed for discriminating SSBs from DSBs, so more predictors with favorable performance are needed. In this work, we present an innovative predictor, called SDBP-Pred, with a novel feature descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM). We encoded the local discriminative features concealed in the PSSM via a K-segmentation strategy and the global potential features by applying the notion of the consensus sequence. The resulting feature vector is then input to a support vector machine (SVM) with linear, polynomial, and radial basis function (RBF) kernels. Our model with SVM-RBF achieved higher accuracies than the most recent method on three tests: jackknife, 10-fold cross-validation, and an independent test. The prediction results illustrate the superior performance of SDBP-Pred over existing studies in the literature so far.
A novel sequence-based predictor, called SDBP-Pred, is designed for discrimination of single-stranded DNA-binding proteins (SSBs) and double-stranded DNA-binding proteins (DSBs). The features are extracted by a descriptor named consensus sequence-based K-segmentation position-specific scoring matrix (CSKS-PSSM), and the classification is performed with a support vector machine.
• Designed a novel predictor, SDBP-Pred, for discrimination of SSBs and DSBs.
• The local discriminative information from PSSM was discovered by K-segmentation.
• The consensus sequence notion was applied to extract the global feature.
• SVM-LNR, SVM-POL, and SVM-RBF were used as classification algorithms.
• Our innovative feature encoder CSKS-PSSM with SVM-RBF achieved the highest success rate. |
doi_str_mv | 10.1016/j.ab.2019.113494 |
format | Article |
pmid | 31693872 |
publisher | United States: Elsevier Inc |
rights | Copyright © 2019 Elsevier Inc. All rights reserved. |
date | 2020-01-15 |
addtitle | Anal Biochem |
toplevel | peer_reviewed; online_resources |
fulltext | fulltext |
identifier | ISSN: 0003-2697 |
ispartof | Analytical biochemistry, 2020-01, Vol.589, p.113494-113494, Article 113494 |
issn | 0003-2697 1096-0309 |
language | eng |
recordid | cdi_proquest_miscellaneous_2312807835 |
source | Elsevier ScienceDirect Journals |
subjects | Consensus sequence; DNA-binding proteins; Double-stranded DNA-binding proteins; K-segmentation; Single-stranded DNA-binding proteins; Support vector machine |
title | SDBP-Pred: Prediction of single-stranded and double-stranded DNA-binding proteins by extending consensus sequence and K-segmentation strategies into PSSM |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T00%3A55%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SDBP-Pred:%20Prediction%20of%20single-stranded%20and%20double-stranded%20DNA-binding%20proteins%20by%20extending%20consensus%20sequence%20and%20K-segmentation%20strategies%20into%20PSSM&rft.jtitle=Analytical%20biochemistry&rft.au=Ali,%20Farman&rft.date=2020-01-15&rft.volume=589&rft.spage=113494&rft.epage=113494&rft.pages=113494-113494&rft.artnum=113494&rft.issn=0003-2697&rft.eissn=1096-0309&rft_id=info:doi/10.1016/j.ab.2019.113494&rft_dat=%3Cproquest_cross%3E2312807835%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2312807835&rft_id=info:pmid/31693872&rft_els_id=S0003269719300703&rfr_iscdi=true |
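
The abstract in the description field describes encoding local features from a PSSM by K-segmentation and global features from a consensus sequence. This record does not give the paper's exact formulas, so the following Python sketch is only one plausible reading under stated assumptions: the PSSM is an L x 20 list of rows in the usual PSI-BLAST column order, each of the K segments is summarized by its per-column mean, and the consensus sequence is summarized by its amino acid composition. All function names here are hypothetical and are not taken from SDBP-Pred.

```python
# Column order commonly used in PSI-BLAST PSSM output (assumed here).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def k_segment_features(pssm, k):
    """Split the PSSM rows into k contiguous segments and average each
    of the 20 columns within every segment (k * 20 local features)."""
    n = len(pssm)
    if k < 1 or k > n:
        raise ValueError("k must be between 1 and the number of PSSM rows")
    features = []
    for s in range(k):
        # Integer boundaries keep every segment non-empty when k <= n.
        segment = pssm[s * n // k:(s + 1) * n // k]
        for col in range(len(AMINO_ACIDS)):
            features.append(sum(row[col] for row in segment) / len(segment))
    return features


def consensus_sequence(pssm):
    """At each position, pick the residue with the highest PSSM score
    (assumed definition of the consensus sequence)."""
    return "".join(
        AMINO_ACIDS[max(range(len(AMINO_ACIDS)), key=lambda c: row[c])]
        for row in pssm
    )


def consensus_composition(pssm):
    """Fraction of each amino acid in the consensus (20 global features)."""
    seq = consensus_sequence(pssm)
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def csks_like_features(pssm, k):
    """Concatenate local (segment) and global (consensus) features."""
    return k_segment_features(pssm, k) + consensus_composition(pssm)
```

A vector built this way has k * 20 + 20 entries and could then be passed to an SVM with a linear, polynomial, or RBF kernel, for example scikit-learn's `SVC(kernel="rbf")`. The segment summary and the consensus summary are illustrative choices, not the published descriptor.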