Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer

Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecul...

Bibliographic details
Published in: Molecular bioSystems 2012-06, Vol.8 (6), p.1716-1722
Main authors: Aguiar-Pulido, Vanessa; Munteanu, Cristian R; Seoane, José A; Fernández-Blanco, Enrique; Pérez-Montoto, Lázaro G; González-Díaz, Humberto; Dorado, Julián
Format: Article
Language: eng
Online access: Full text
container_end_page 1722
container_issue 6
container_start_page 1716
container_title Molecular bioSystems
container_volume 8
creator Aguiar-Pulido, Vanessa
Munteanu, Cristian R
Seoane, José A
Fernández-Blanco, Enrique
Pérez-Montoto, Lázaro G
González-Díaz, Humberto
Dorado, Julián
description Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence. An algorithm for colon cancer prediction is presented. Amino acid sequences are coded using spiral graph topological and Shannon entropy indices. These indices are then classified using a Naïve Bayes model.
doi_str_mv 10.1039/c2mb25039j
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmed_primary_22466084</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1012207235</sourcerecordid><originalsourceid>FETCH-LOGICAL-c335t-122441537a11ab4ecbc585c74d430601479341f48fa39aac34cd2511e316f4e93</originalsourceid><addsrcrecordid>eNp9kMtOwzAQRS0EoqWwYQ8yO4QU8CuPLqE8pQoEBYldNHEc6pLEwU6Q-lV8BD-GUUu7Y-Vrz5nrmYvQPiWnlPDhmWRVxkKvZhuoT2PBAkZCurnS0WsP7Tg3I4QngpJt1GNMRBFJRB919_D99anwBcyVw4-TyycsS3BOF1pCq02NM3Aqx164RlsogzcLzRRPplDX_lHVrTWN9r2FsbixplXa92hTgX1X1mF_m3YV1Fia0vMSaqnsLtoqoHRqb3kO0Mv11fPoNhg_3NyNzseB5DxsA-rnFDTkMVAKmVAyk2ESyljkgpOIUBEPuaCFSArgQwDJhcxZSKniNCqEGvIBOl74-sE-OuXatNJOqrKEWpnOpZT4L0jMeOjRkwUqrXHOqiJtrPZLzD2U_sacrmP28OHSt8sqla_Qv1w9cLQArJOr6togbfLCMwf_MfwHlYiOpw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1012207235</pqid></control><display><type>article</type><title>Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer</title><source>MEDLINE</source><source>Royal Society Of Chemistry Journals 2008-</source><source>Alma/SFX Local Collection</source><creator>Aguiar-Pulido, Vanessa ; Munteanu, Cristian R ; Seoane, José A ; Fernández-Blanco, Enrique ; Pérez-Montoto, Lázaro G ; González-Díaz, Humberto ; Dorado, Julián</creator><creatorcontrib>Aguiar-Pulido, Vanessa ; Munteanu, Cristian R ; Seoane, José A ; Fernández-Blanco, Enrique ; Pérez-Montoto, Lázaro G ; González-Díaz, Humberto ; Dorado, Julián</creatorcontrib><description>Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. 
This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence. An algorithm for colon cancer prediction is presented. Amino acid sequences are coded using spiral graph topological and Shannon entropy indices. These indices are then classified using a Naïve Bayes model.</description><identifier>ISSN: 1742-206X</identifier><identifier>EISSN: 1742-2051</identifier><identifier>DOI: 10.1039/c2mb25039j</identifier><identifier>PMID: 22466084</identifier><language>eng</language><publisher>England</publisher><subject>Amino Acid Sequence ; Area Under Curve ; Bayes Theorem ; Biomarkers, Tumor - analysis ; Biomarkers, Tumor - chemistry ; Colonic Neoplasms - chemistry ; Colonic Neoplasms - diagnosis ; Computational Biology - methods ; Entropy ; Humans ; Models, Biological ; Molecular Sequence Data ; Proteins - analysis ; Proteins - chemistry ; Quantitative Structure-Activity Relationship ; ROC Curve ; Sequence Analysis, Protein - methods</subject><ispartof>Molecular bioSystems, 2012-06, Vol.8 (6), 
p.1716-1722</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c335t-122441537a11ab4ecbc585c74d430601479341f48fa39aac34cd2511e316f4e93</citedby><cites>FETCH-LOGICAL-c335t-122441537a11ab4ecbc585c74d430601479341f48fa39aac34cd2511e316f4e93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22466084$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Aguiar-Pulido, Vanessa</creatorcontrib><creatorcontrib>Munteanu, Cristian R</creatorcontrib><creatorcontrib>Seoane, José A</creatorcontrib><creatorcontrib>Fernández-Blanco, Enrique</creatorcontrib><creatorcontrib>Pérez-Montoto, Lázaro G</creatorcontrib><creatorcontrib>González-Díaz, Humberto</creatorcontrib><creatorcontrib>Dorado, Julián</creatorcontrib><title>Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer</title><title>Molecular bioSystems</title><addtitle>Mol Biosyst</addtitle><description>Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. 
It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence. An algorithm for colon cancer prediction is presented. Amino acid sequences are coded using spiral graph topological and Shannon entropy indices. These indices are then classified using a Naïve Bayes model.</description><subject>Amino Acid Sequence</subject><subject>Area Under Curve</subject><subject>Bayes Theorem</subject><subject>Biomarkers, Tumor - analysis</subject><subject>Biomarkers, Tumor - chemistry</subject><subject>Colonic Neoplasms - chemistry</subject><subject>Colonic Neoplasms - diagnosis</subject><subject>Computational Biology - methods</subject><subject>Entropy</subject><subject>Humans</subject><subject>Models, Biological</subject><subject>Molecular Sequence Data</subject><subject>Proteins - analysis</subject><subject>Proteins - chemistry</subject><subject>Quantitative Structure-Activity Relationship</subject><subject>ROC Curve</subject><subject>Sequence Analysis, Protein - 
methods</subject><issn>1742-206X</issn><issn>1742-2051</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kMtOwzAQRS0EoqWwYQ8yO4QU8CuPLqE8pQoEBYldNHEc6pLEwU6Q-lV8BD-GUUu7Y-Vrz5nrmYvQPiWnlPDhmWRVxkKvZhuoT2PBAkZCurnS0WsP7Tg3I4QngpJt1GNMRBFJRB919_D99anwBcyVw4-TyycsS3BOF1pCq02NM3Aqx164RlsogzcLzRRPplDX_lHVrTWN9r2FsbixplXa92hTgX1X1mF_m3YV1Fia0vMSaqnsLtoqoHRqb3kO0Mv11fPoNhg_3NyNzseB5DxsA-rnFDTkMVAKmVAyk2ESyljkgpOIUBEPuaCFSArgQwDJhcxZSKniNCqEGvIBOl74-sE-OuXatNJOqrKEWpnOpZT4L0jMeOjRkwUqrXHOqiJtrPZLzD2U_sacrmP28OHSt8sqla_Qv1w9cLQArJOr6togbfLCMwf_MfwHlYiOpw</recordid><startdate>201206</startdate><enddate>201206</enddate><creator>Aguiar-Pulido, Vanessa</creator><creator>Munteanu, Cristian R</creator><creator>Seoane, José A</creator><creator>Fernández-Blanco, Enrique</creator><creator>Pérez-Montoto, Lázaro G</creator><creator>González-Díaz, Humberto</creator><creator>Dorado, Julián</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>201206</creationdate><title>Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer</title><author>Aguiar-Pulido, Vanessa ; Munteanu, Cristian R ; Seoane, José A ; Fernández-Blanco, Enrique ; Pérez-Montoto, Lázaro G ; González-Díaz, Humberto ; Dorado, Julián</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c335t-122441537a11ab4ecbc585c74d430601479341f48fa39aac34cd2511e316f4e93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Amino Acid Sequence</topic><topic>Area Under Curve</topic><topic>Bayes Theorem</topic><topic>Biomarkers, Tumor - analysis</topic><topic>Biomarkers, Tumor - chemistry</topic><topic>Colonic 
Neoplasms - chemistry</topic><topic>Colonic Neoplasms - diagnosis</topic><topic>Computational Biology - methods</topic><topic>Entropy</topic><topic>Humans</topic><topic>Models, Biological</topic><topic>Molecular Sequence Data</topic><topic>Proteins - analysis</topic><topic>Proteins - chemistry</topic><topic>Quantitative Structure-Activity Relationship</topic><topic>ROC Curve</topic><topic>Sequence Analysis, Protein - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Aguiar-Pulido, Vanessa</creatorcontrib><creatorcontrib>Munteanu, Cristian R</creatorcontrib><creatorcontrib>Seoane, José A</creatorcontrib><creatorcontrib>Fernández-Blanco, Enrique</creatorcontrib><creatorcontrib>Pérez-Montoto, Lázaro G</creatorcontrib><creatorcontrib>González-Díaz, Humberto</creatorcontrib><creatorcontrib>Dorado, Julián</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Molecular bioSystems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Aguiar-Pulido, Vanessa</au><au>Munteanu, Cristian R</au><au>Seoane, José A</au><au>Fernández-Blanco, Enrique</au><au>Pérez-Montoto, Lázaro G</au><au>González-Díaz, Humberto</au><au>Dorado, Julián</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer</atitle><jtitle>Molecular bioSystems</jtitle><addtitle>Mol Biosyst</addtitle><date>2012-06</date><risdate>2012</risdate><volume>8</volume><issue>6</issue><spage>1716</spage><epage>1722</epage><pages>1716-1722</pages><issn>1742-206X</issn><eissn>1742-2051</eissn><abstract>Fast cancer diagnosis 
represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence. An algorithm for colon cancer prediction is presented. Amino acid sequences are coded using spiral graph topological and Shannon entropy indices. These indices are then classified using a Naïve Bayes model.</abstract><cop>England</cop><pmid>22466084</pmid><doi>10.1039/c2mb25039j</doi><tpages>7</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1742-206X
ispartof Molecular bioSystems, 2012-06, Vol.8 (6), p.1716-1722
issn 1742-206X
1742-2051
language eng
recordid cdi_pubmed_primary_22466084
source MEDLINE; Royal Society Of Chemistry Journals 2008-; Alma/SFX Local Collection
subjects Amino Acid Sequence
Area Under Curve
Bayes Theorem
Biomarkers, Tumor - analysis
Biomarkers, Tumor - chemistry
Colonic Neoplasms - chemistry
Colonic Neoplasms - diagnosis
Computational Biology - methods
Entropy
Humans
Models, Biological
Molecular Sequence Data
Proteins - analysis
Proteins - chemistry
Quantitative Structure-Activity Relationship
ROC Curve
Sequence Analysis, Protein - methods
title Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T21%3A14%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Na%C3%AFve%20Bayes%20QSDR%20classification%20based%20on%20spiral-graph%20Shannon%20entropies%20for%20protein%20biomarkers%20in%20human%20colon%20cancer&rft.jtitle=Molecular%20bioSystems&rft.au=Aguiar-Pulido,%20Vanessa&rft.date=2012-06&rft.volume=8&rft.issue=6&rft.spage=1716&rft.epage=1722&rft.pages=1716-1722&rft.issn=1742-206X&rft.eissn=1742-2051&rft_id=info:doi/10.1039/c2mb25039j&rft_dat=%3Cproquest_pubme%3E1012207235%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1012207235&rft_id=info:pmid/22466084&rfr_iscdi=true
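
The description field in this record outlines a pipeline in which amino acid sequences are coded as Shannon entropy indices and the indices are then classified with a Naïve Bayes model. The sketch below illustrates that general idea only. It is not the authors' method: the spiral-graph topological indices are replaced by a much simpler stand-in (entropy of residue frequencies in consecutive windows), and the names shannon_entropy, windowed_entropies, and GaussianNaiveBayes are hypothetical, as is the choice of a Gaussian likelihood.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the symbol frequencies in a sequence."""
    total = len(sequence)
    if total == 0:
        return 0.0
    counts = Counter(sequence)
    # H = sum over symbols of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def windowed_entropies(sequence, n_windows=3):
    """Hypothetical feature vector: entropy of n_windows consecutive chunks.

    This is a simplified stand-in, NOT the spiral-graph indices of the paper.
    """
    size = max(1, math.ceil(len(sequence) / n_windows))
    chunks = [sequence[i * size:(i + 1) * size] for i in range(n_windows)]
    return [shannon_entropy(chunk) for chunk in chunks]


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes classifier (assumed likelihood model)."""

    def fit(self, X, y):
        self.priors = {}
        self.stats = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            self.priors[label] = len(rows) / len(X)
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # A small variance floor avoids division by zero.
            variances = [
                sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                for c, m in zip(cols, means)
            ]
            self.stats[label] = (means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            means, variances = self.stats[label]
            score = math.log(self.priors[label])
            # Features are treated as conditionally independent given the class.
            for v, m, var in zip(x, means, variances):
                score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            return score

        return max(self.stats, key=log_posterior)
```

In use, each sequence would be turned into a feature vector with windowed_entropies, a GaussianNaiveBayes instance would be fitted on labelled vectors, and predict would be called on the vector of a new sequence. The eleven entropy indices and the reported accuracy in the record refer to the published model, not to this sketch.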