BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins

The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This st...

Detailed description

Bibliographic details
Published in: Advances in Bioinformatics, 2016, Vol. 2016, p. 207-217
Main authors: Selvaraj, MuthuKrishnan; Puri, Munish; Dikshit, Kanak L.; Lefevre, Christophe
Format: Article
Language: eng
Online access: Full text
container_end_page 217
container_issue
container_start_page 207
container_title Advances in Bioinformatics
container_volume 2016
creator Selvaraj, MuthuKrishnan
Puri, Munish
Dikshit, Kanak L.
Lefevre, Christophe
description The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL-encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), a hybrid method (AC + DC), and the position-specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max-to-min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a prospective predictor for determination of HbL-related proteins. BacHbpred, a web tool, has been developed for HbL prediction.
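The AC, DC, and hybrid (AC + DC) feature encodings named in the description can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and normalizing to fractions (rather than percentages) and skipping non-standard residues are assumptions, since the abstract does not state those details.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """AC: 20 values, the fraction of each standard residue in the sequence.

    Assumption: non-standard letters (e.g. X) are ignored.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    n = len(residues)
    return [residues.count(a) / n for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """DC: 400 values, the fraction of each ordered pair of adjacent residues.

    Assumption: a pair is counted only if both residues are standard.
    """
    seq = seq.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    total = len(pairs)
    return [counts.get(a + b, 0) / total for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq):
    """Hybrid (AC + DC): the 20 AC values followed by the 400 DC values."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A fixed-length vector of this kind (20, 400, or 420 values per protein) is what a classifier such as an SVM would take as input, regardless of the length of the original sequence.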
doi_str_mv 10.1155/2016/8150784
format Article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4789356</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A491909914</galeid><airiti_id>16878027_201612_201702070002_201702070002_207_217</airiti_id><sourcerecordid>A491909914</sourcerecordid><originalsourceid>FETCH-LOGICAL-a4954-76a3f8b8a92d9756d7c5f04dae8faacb3e6ca784fee0c2f9bc79c5730ccc7d4b3</originalsourceid><addsrcrecordid>eNqFktuL1DAUxoso7kXffJaCL4J2N0lz9UFYF3WEWRS8vIY0PZnJ2klq2q7435s64zCrguQhycnvfOFLvqJ4hNEZxoydE4T5ucQMCUnvFMeYS1FJVLO7-zURR8XJMFwjxInC9f3iiAhUU87pcdG8MnbR9AnaF-XHqe9jGssvYMeYyitj1z5AeQXjOrZD6XJtXEP5IcPejj6GMroy94-QvOnKBWziqouND9XSf525OIIPw4PinjPdAA9382nx-c3rT5eLavn-7bvLi2VlqGK0EtzUTjbSKNIqwXgrLHOItgakM8Y2NXBrskcHgCxxqrFCWSZqZK0VLW3q0-LlVrefmg20FsKYTKf75Dcm_dDReH37JPi1XsUbTYVUNeNZ4OlOIMVvEwyj3vjBQteZAHEaNBZCKMkUmdEnf6DXcUoh25spIiTH_IBamQ60Dy7me-0sqi-owgophWmmzv5B5dHCxtsYwPlcv9XwfNtgUxyGBG7vESM9Z0LPmdC7TGT88eG77OHfIcjAsy2Qv7s13_3_5BZb2vjkR39gPKdtDtsvHJN5EogggRD6a5MhLOqfQbzRyg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1772786166</pqid></control><display><type>article</type><title>BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins</title><source>PubMed Central Open Access</source><source>Wiley-Blackwell Open Access Titles</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Selvaraj, MuthuKrishnan ; Puri, Munish ; Dikshit, Kanak L. ; Lefevre, Christophe</creator><contributor>Harrison, Paul</contributor><creatorcontrib>Selvaraj, MuthuKrishnan ; Puri, Munish ; Dikshit, Kanak L. 
; Lefevre, Christophe ; Harrison, Paul</creatorcontrib><description>The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), hybrid method (AC + DC), and position specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. 
BacHbpred, a web tool, has been developed for HbL prediction.</description><identifier>ISSN: 1687-8027</identifier><identifier>EISSN: 1687-8035</identifier><identifier>DOI: 10.1155/2016/8150784</identifier><identifier>PMID: 27034664</identifier><language>eng</language><publisher>Egypt: Hindawi Limiteds</publisher><subject>Genetic aspects ; Genetic vectors ; Hemoglobin ; Physiological aspects</subject><ispartof>Advances in Bioinformatics, 2016, Vol.2016, p.207-217</ispartof><rights>Copyright © 2016 MuthuKrishnan Selvaraj et al.</rights><rights>COPYRIGHT 2016 John Wiley &amp; Sons, Inc.</rights><rights>Copyright © 2016 MuthuKrishnan Selvaraj et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2016 MuthuKrishnan Selvaraj et al. 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a4954-76a3f8b8a92d9756d7c5f04dae8faacb3e6ca784fee0c2f9bc79c5730ccc7d4b3</citedby><cites>FETCH-LOGICAL-a4954-76a3f8b8a92d9756d7c5f04dae8faacb3e6ca784fee0c2f9bc79c5730ccc7d4b3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4789356/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC4789356/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,4024,27923,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/27034664$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Harrison, Paul</contributor><creatorcontrib>Selvaraj, 
MuthuKrishnan</creatorcontrib><creatorcontrib>Puri, Munish</creatorcontrib><creatorcontrib>Dikshit, Kanak L.</creatorcontrib><creatorcontrib>Lefevre, Christophe</creatorcontrib><title>BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins</title><title>Advances in Bioinformatics</title><addtitle>Adv Bioinformatics</addtitle><description>The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), hybrid method (AC + DC), and position specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. 
BacHbpred, a web tool, has been developed for HbL prediction.</description><subject>Genetic aspects</subject><subject>Genetic vectors</subject><subject>Hemoglobin</subject><subject>Physiological aspects</subject><issn>1687-8027</issn><issn>1687-8035</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>RHX</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNqFktuL1DAUxoso7kXffJaCL4J2N0lz9UFYF3WEWRS8vIY0PZnJ2klq2q7435s64zCrguQhycnvfOFLvqJ4hNEZxoydE4T5ucQMCUnvFMeYS1FJVLO7-zURR8XJMFwjxInC9f3iiAhUU87pcdG8MnbR9AnaF-XHqe9jGssvYMeYyitj1z5AeQXjOrZD6XJtXEP5IcPejj6GMroy94-QvOnKBWziqouND9XSf525OIIPw4PinjPdAA9382nx-c3rT5eLavn-7bvLi2VlqGK0EtzUTjbSKNIqwXgrLHOItgakM8Y2NXBrskcHgCxxqrFCWSZqZK0VLW3q0-LlVrefmg20FsKYTKf75Dcm_dDReH37JPi1XsUbTYVUNeNZ4OlOIMVvEwyj3vjBQteZAHEaNBZCKMkUmdEnf6DXcUoh25spIiTH_IBamQ60Dy7me-0sqi-owgophWmmzv5B5dHCxtsYwPlcv9XwfNtgUxyGBG7vESM9Z0LPmdC7TGT88eG77OHfIcjAsy2Qv7s13_3_5BZb2vjkR39gPKdtDtsvHJN5EogggRD6a5MhLOqfQbzRyg</recordid><startdate>2016</startdate><enddate>2016</enddate><creator>Selvaraj, MuthuKrishnan</creator><creator>Puri, Munish</creator><creator>Dikshit, Kanak L.</creator><creator>Lefevre, Christophe</creator><general>Hindawi Limiteds</general><general>Hindawi Publishing Corporation</general><general>John Wiley &amp; Sons, Inc</general><general>Hindawi 
Limited</general><scope>188</scope><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QO</scope><scope>7XB</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>CWDGH</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>KB.</scope><scope>LK8</scope><scope>M0N</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PDBOC</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>2016</creationdate><title>BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins</title><author>Selvaraj, MuthuKrishnan ; Puri, Munish ; Dikshit, Kanak L. 
; Lefevre, Christophe</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a4954-76a3f8b8a92d9756d7c5f04dae8faacb3e6ca784fee0c2f9bc79c5730ccc7d4b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Genetic aspects</topic><topic>Genetic vectors</topic><topic>Hemoglobin</topic><topic>Physiological aspects</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Selvaraj, MuthuKrishnan</creatorcontrib><creatorcontrib>Puri, Munish</creatorcontrib><creatorcontrib>Dikshit, Kanak L.</creatorcontrib><creatorcontrib>Lefevre, Christophe</creatorcontrib><collection>Airiti Library</collection><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access Journals</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science 
Collection</collection><collection>ProQuest One Community College</collection><collection>Middle East &amp; Africa Database</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Materials Science Database</collection><collection>ProQuest Biological Science Collection</collection><collection>Computing Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Materials Science Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Advances in Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Selvaraj, MuthuKrishnan</au><au>Puri, Munish</au><au>Dikshit, Kanak L.</au><au>Lefevre, Christophe</au><au>Harrison, Paul</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins</atitle><jtitle>Advances in Bioinformatics</jtitle><addtitle>Adv 
Bioinformatics</addtitle><date>2016</date><risdate>2016</risdate><volume>2016</volume><spage>207</spage><epage>217</epage><pages>207-217</pages><issn>1687-8027</issn><eissn>1687-8035</eissn><abstract>The recent upsurge in microbial genome data has revealed that hemoglobin-like (HbL) proteins may be widely distributed among bacteria and that some organisms may carry more than one HbL encoding gene. However, the discovery of HbL proteins has been limited to a small number of bacteria only. This study describes the prediction of HbL proteins and their domain classification using a machine learning approach. Support vector machine (SVM) models were developed for predicting HbL proteins based upon amino acid composition (AC), dipeptide composition (DC), hybrid method (AC + DC), and position specific scoring matrix (PSSM). In addition, we introduce for the first time a new prediction method based on max to min amino acid residue (MM) profiles. The average accuracy, standard deviation (SD), false positive rate (FPR), confusion matrix, and receiver operating characteristic (ROC) were analyzed. We also compared the performance of our proposed models in homology detection databases. The performance of the different approaches was estimated using fivefold cross-validation techniques. Prediction accuracy was further investigated through confusion matrix and ROC curve analysis. All experimental results indicate that the proposed BacHbpred can be a perspective predictor for determination of HbL related proteins. BacHbpred, a web tool, has been developed for HbL prediction.</abstract><cop>Egypt</cop><pub>Hindawi Limiteds</pub><pmid>27034664</pmid><doi>10.1155/2016/8150784</doi><tpages>11</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1687-8027
ispartof Advances in Bioinformatics, 2016, Vol.2016, p.207-217
issn 1687-8027
1687-8035
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4789356
source PubMed Central Open Access; Wiley-Blackwell Open Access Titles; EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Genetic aspects
Genetic vectors
Hemoglobin
Physiological aspects
title BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T13%3A17%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=BacHbpred:%20Support%20Vector%20Machine%20Methods%20for%20the%20Prediction%20of%20Bacterial%20Hemoglobin-Like%20Proteins&rft.jtitle=Advances%20in%20Bioinformatics&rft.au=Selvaraj,%20MuthuKrishnan&rft.date=2016&rft.volume=2016&rft.spage=207&rft.epage=217&rft.pages=207-217&rft.issn=1687-8027&rft.eissn=1687-8035&rft_id=info:doi/10.1155/2016/8150784&rft_dat=%3Cgale_pubme%3EA491909914%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1772786166&rft_id=info:pmid/27034664&rft_galeid=A491909914&rft_airiti_id=16878027_201612_201702070002_201702070002_207_217&rfr_iscdi=true