Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches

In metabolomics researches using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from...

Bibliographic details
Published in: PloS one 2009-10, Vol.4 (10), p.e7490-e7490
Main authors: Matsuda, Fumio, Shinbo, Yoko, Oikawa, Akira, Hirai, Masami Yokota, Fiehn, Oliver, Kanaya, Shigehiko, Saito, Kazuki
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e7490
container_issue 10
container_start_page e7490
container_title PloS one
container_volume 4
creator Matsuda, Fumio
Shinbo, Yoko
Oikawa, Akira
Hirai, Masami Yokota
Fiehn, Oliver
Kanaya, Shigehiko
Saito, Kazuki
description In metabolomics researches using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rates (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching are discussed. The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using compound database with smaller but higher completeness entries. High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.
doi_str_mv 10.1371/journal.pone.0007490
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_1292466862</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A472844517</galeid><doaj_id>oai_doaj_org_article_2fa835191288442db982270f41be510d</doaj_id><sourcerecordid>A472844517</sourcerecordid><originalsourceid>FETCH-LOGICAL-c729t-3e37f3b4c8927eef30af4f990f7aafba22484364aa59fd2fff4da54c3768ff373</originalsourceid><addsrcrecordid>eNqNk02P0zAQhiMEYpeFf4AgEhKIQ4u_Eid7QKpWfFRaaSW-rtYkGbdZOXHXdip64q_jtAFaxAH5YMt-5h3Pq5kkeUrJnHJJ39zawfVg5hvb45wQIkVJ7iXntORsljPC7x-dz5JH3t8SkvEizx8mZ7QshOREnCc_Ft6j9x32IbU67TBAZY3tMIW-twFCa_v0bgDTht1lCiOwtk2qrUtxC2aIQL9KwxpTDcZj2rS-tlt0u9RBwFESDY7qYNLadhvr272kR3D1Gv3j5ME-8Mm0XyRf37_7cvVxdn3zYXm1uJ7VkpVhxpFLzStRFyWTiJoT0EKXJdESQFfAmCgEzwVAVuqGaa1FA5moucwLrbnkF8nzg-7GWK8m77yirGQiz4ucRWJ5IBoLt2rj2g7cTllo1f7CupUCF9raoGIaCp7RkrKiEII1VVkwJokWtMKMkiZqvZ2yDVWHTR3rd2BORE9f-natVnarmMxpJmgUeDUJOHs3oA-qi8aiMdCjHbySnOdlIclIvviL_Hdx8wO1gvj_ttc2pq3jarBr69hBuo33CyFZLCijo1-vTwIiE_B7WMHgvVp-_vT_7M23U_blEbtGMGHtrRnGpvCnoDiAtbPeO9S_3aNEjQPwq041DoCaBiCGPTt2_k_Q1PH8J7cGBAs</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1292466862</pqid></control><display><type>article</type><title>Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Public Library of Science (PLoS) Journals Open Access</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Free Full-Text Journals in Chemistry</source><creator>Matsuda, Fumio ; Shinbo, Yoko ; Oikawa, Akira ; Hirai, Masami Yokota ; Fiehn, Oliver ; Kanaya, Shigehiko ; Saito, Kazuki</creator><contributor>El-Shemy, Hany A.</contributor><creatorcontrib>Matsuda, Fumio ; Shinbo, Yoko ; Oikawa, Akira ; Hirai, Masami Yokota ; Fiehn, Oliver 
; Kanaya, Shigehiko ; Saito, Kazuki ; El-Shemy, Hany A.</creatorcontrib><description>In metabolomics researches using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rates (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching are discussed. The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR &gt;70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using compound database with smaller but higher completeness entries. High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR &lt;10%). 
In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.</description><identifier>ISSN: 1932-6203</identifier><identifier>EISSN: 1932-6203</identifier><identifier>DOI: 10.1371/journal.pone.0007490</identifier><identifier>PMID: 19847304</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Accuracy ; Analysis ; Annotations ; Arabidopsis ; Arabidopsis - metabolism ; Biotechnology ; Chemical composition ; Chromatography ; Computational Biology - methods ; Computational Biology/Metabolic Networks ; Computer simulation ; Data bases ; Data Interpretation, Statistical ; Databases, Factual ; False Positive Reactions ; Fourier Analysis ; Fourier transforms ; Genomes ; Genomics ; Identification ; Information science ; Ions ; Mass spectrometry ; Mass spectroscopy ; Metabolism ; Metabolites ; Metabolome ; Metabolomics ; Metabolomics - methods ; Methods ; Monte Carlo methods ; Monte Carlo simulation ; Oryza - metabolism ; Peptides ; Plant Biology/Agricultural Biotechnology ; Plant Proteins - metabolism ; Plant sciences ; Quality assessment ; Scientific imaging ; Searching ; Sequence Alignment - methods ; Sequence Analysis, Protein - methods ; Simulation ; Software ; Tandem Mass Spectrometry - methods ; Theoretical analysis ; Trends</subject><ispartof>PloS one, 2009-10, Vol.4 (10), p.e7490-e7490</ispartof><rights>COPYRIGHT 2009 Public Library of Science</rights><rights>2009 Matsuda et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>Matsuda et al. 
2009</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c729t-3e37f3b4c8927eef30af4f990f7aafba22484364aa59fd2fff4da54c3768ff373</citedby><cites>FETCH-LOGICAL-c729t-3e37f3b4c8927eef30af4f990f7aafba22484364aa59fd2fff4da54c3768ff373</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761541/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2761541/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,2102,2928,23866,27924,27925,53791,53793,79600,79601</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/19847304$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>El-Shemy, Hany A.</contributor><creatorcontrib>Matsuda, Fumio</creatorcontrib><creatorcontrib>Shinbo, Yoko</creatorcontrib><creatorcontrib>Oikawa, Akira</creatorcontrib><creatorcontrib>Hirai, Masami Yokota</creatorcontrib><creatorcontrib>Fiehn, Oliver</creatorcontrib><creatorcontrib>Kanaya, Shigehiko</creatorcontrib><creatorcontrib>Saito, Kazuki</creatorcontrib><title>Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches</title><title>PloS one</title><addtitle>PLoS One</addtitle><description>In metabolomics researches using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. 
To assess the quality of peak annotation information, a novel methodology for false discovery rates (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching are discussed. The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR &gt;70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using compound database with smaller but higher completeness entries. High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR &lt;10%). 
In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.</description><subject>Accuracy</subject><subject>Analysis</subject><subject>Annotations</subject><subject>Arabidopsis</subject><subject>Arabidopsis - metabolism</subject><subject>Biotechnology</subject><subject>Chemical composition</subject><subject>Chromatography</subject><subject>Computational Biology - methods</subject><subject>Computational Biology/Metabolic Networks</subject><subject>Computer simulation</subject><subject>Data bases</subject><subject>Data Interpretation, Statistical</subject><subject>Databases, Factual</subject><subject>False Positive Reactions</subject><subject>Fourier Analysis</subject><subject>Fourier transforms</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Identification</subject><subject>Information science</subject><subject>Ions</subject><subject>Mass spectrometry</subject><subject>Mass spectroscopy</subject><subject>Metabolism</subject><subject>Metabolites</subject><subject>Metabolome</subject><subject>Metabolomics</subject><subject>Metabolomics - methods</subject><subject>Methods</subject><subject>Monte Carlo methods</subject><subject>Monte Carlo simulation</subject><subject>Oryza - metabolism</subject><subject>Peptides</subject><subject>Plant Biology/Agricultural Biotechnology</subject><subject>Plant Proteins - metabolism</subject><subject>Plant sciences</subject><subject>Quality assessment</subject><subject>Scientific imaging</subject><subject>Searching</subject><subject>Sequence Alignment - methods</subject><subject>Sequence Analysis, Protein - methods</subject><subject>Simulation</subject><subject>Software</subject><subject>Tandem Mass Spectrometry - methods</subject><subject>Theoretical 
analysis</subject><subject>Trends</subject><issn>1932-6203</issn><issn>1932-6203</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2009</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>DOA</sourceid><recordid>eNqNk02P0zAQhiMEYpeFf4AgEhKIQ4u_Eid7QKpWfFRaaSW-rtYkGbdZOXHXdip64q_jtAFaxAH5YMt-5h3Pq5kkeUrJnHJJ39zawfVg5hvb45wQIkVJ7iXntORsljPC7x-dz5JH3t8SkvEizx8mZ7QshOREnCc_Ft6j9x32IbU67TBAZY3tMIW-twFCa_v0bgDTht1lCiOwtk2qrUtxC2aIQL9KwxpTDcZj2rS-tlt0u9RBwFESDY7qYNLadhvr272kR3D1Gv3j5ME-8Mm0XyRf37_7cvVxdn3zYXm1uJ7VkpVhxpFLzStRFyWTiJoT0EKXJdESQFfAmCgEzwVAVuqGaa1FA5moucwLrbnkF8nzg-7GWK8m77yirGQiz4ucRWJ5IBoLt2rj2g7cTllo1f7CupUCF9raoGIaCp7RkrKiEII1VVkwJokWtMKMkiZqvZ2yDVWHTR3rd2BORE9f-natVnarmMxpJmgUeDUJOHs3oA-qi8aiMdCjHbySnOdlIclIvviL_Hdx8wO1gvj_ttc2pq3jarBr69hBuo33CyFZLCijo1-vTwIiE_B7WMHgvVp-_vT_7M23U_blEbtGMGHtrRnGpvCnoDiAtbPeO9S_3aNEjQPwq041DoCaBiCGPTt2_k_Q1PH8J7cGBAs</recordid><startdate>20091016</startdate><enddate>20091016</enddate><creator>Matsuda, Fumio</creator><creator>Shinbo, Yoko</creator><creator>Oikawa, Akira</creator><creator>Hirai, Masami Yokota</creator><creator>Fiehn, Oliver</creator><creator>Kanaya, Shigehiko</creator><creator>Saito, Kazuki</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISR</scope><scope>3V.</scope><scope>7QG</scope><scope>7QL</scope><scope>7QO</scope><scope>7RV</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TG</scope><scope>7TM</scope><scope>7U9</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>KB.</scope><scope>KB0</scope><scope>KL.</scope><scope>L6V</scope><scope>LK8</scope><scope>M0K</scope><scope>M0S</scope><scope>M1P</scope><scope>M7N</scope><scope>M7P</scope><scope>M7S</scope><scope>NAPCQ</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PATMY</scope><scope>PDBOC</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20091016</creationdate><title>Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches</title><author>Matsuda, Fumio ; Shinbo, Yoko ; Oikawa, Akira ; Hirai, Masami Yokota ; Fiehn, Oliver ; Kanaya, Shigehiko ; Saito, 
Kazuki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c729t-3e37f3b4c8927eef30af4f990f7aafba22484364aa59fd2fff4da54c3768ff373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2009</creationdate><topic>Accuracy</topic><topic>Analysis</topic><topic>Annotations</topic><topic>Arabidopsis</topic><topic>Arabidopsis - metabolism</topic><topic>Biotechnology</topic><topic>Chemical composition</topic><topic>Chromatography</topic><topic>Computational Biology - methods</topic><topic>Computational Biology/Metabolic Networks</topic><topic>Computer simulation</topic><topic>Data bases</topic><topic>Data Interpretation, Statistical</topic><topic>Databases, Factual</topic><topic>False Positive Reactions</topic><topic>Fourier Analysis</topic><topic>Fourier transforms</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Identification</topic><topic>Information science</topic><topic>Ions</topic><topic>Mass spectrometry</topic><topic>Mass spectroscopy</topic><topic>Metabolism</topic><topic>Metabolites</topic><topic>Metabolome</topic><topic>Metabolomics</topic><topic>Metabolomics - methods</topic><topic>Methods</topic><topic>Monte Carlo methods</topic><topic>Monte Carlo simulation</topic><topic>Oryza - metabolism</topic><topic>Peptides</topic><topic>Plant Biology/Agricultural Biotechnology</topic><topic>Plant Proteins - metabolism</topic><topic>Plant sciences</topic><topic>Quality assessment</topic><topic>Scientific imaging</topic><topic>Searching</topic><topic>Sequence Alignment - methods</topic><topic>Sequence Analysis, Protein - methods</topic><topic>Simulation</topic><topic>Software</topic><topic>Tandem Mass Spectrometry - methods</topic><topic>Theoretical analysis</topic><topic>Trends</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Matsuda, Fumio</creatorcontrib><creatorcontrib>Shinbo, Yoko</creatorcontrib><creatorcontrib>Oikawa, 
Akira</creatorcontrib><creatorcontrib>Hirai, Masami Yokota</creatorcontrib><creatorcontrib>Fiehn, Oliver</creatorcontrib><creatorcontrib>Kanaya, Shigehiko</creatorcontrib><creatorcontrib>Saito, Kazuki</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Opposing Viewpoints</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Nursing &amp; Allied Health Database</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology Abstracts</collection><collection>Meteorological &amp; Geoastrophysical Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering 
Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Materials Science Database</collection><collection>Nursing &amp; Allied Health Database (Alumni Edition)</collection><collection>Meteorological &amp; Geoastrophysical Abstracts - Academic</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Biological Science Collection</collection><collection>Agricultural Science Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Advanced Technologies &amp; Aerospace 
Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Environmental Science Database</collection><collection>Materials Science Collection</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PloS one</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Matsuda, Fumio</au><au>Shinbo, Yoko</au><au>Oikawa, Akira</au><au>Hirai, Masami Yokota</au><au>Fiehn, Oliver</au><au>Kanaya, Shigehiko</au><au>Saito, Kazuki</au><au>El-Shemy, Hany A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches</atitle><jtitle>PloS one</jtitle><addtitle>PLoS One</addtitle><date>2009-10-16</date><risdate>2009</risdate><volume>4</volume><issue>10</issue><spage>e7490</spage><epage>e7490</epage><pages>e7490-e7490</pages><issn>1932-6203</issn><eissn>1932-6203</eissn><abstract>In metabolomics researches using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. 
However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rates (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching are discussed. The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30-50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR &gt;70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using compound database with smaller but higher completeness entries. High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR &lt;10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>19847304</pmid><doi>10.1371/journal.pone.0007490</doi><tpages>e7490</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1932-6203
ispartof PloS one, 2009-10, Vol.4 (10), p.e7490-e7490
issn 1932-6203
1932-6203
language eng
recordid cdi_plos_journals_1292466862
source MEDLINE; DOAJ Directory of Open Access Journals; Public Library of Science (PLoS) Journals Open Access; EZB-FREE-00999 freely available EZB journals; PubMed Central; Free Full-Text Journals in Chemistry
subjects Accuracy
Analysis
Annotations
Arabidopsis
Arabidopsis - metabolism
Biotechnology
Chemical composition
Chromatography
Computational Biology - methods
Computational Biology/Metabolic Networks
Computer simulation
Data bases
Data Interpretation, Statistical
Databases, Factual
False Positive Reactions
Fourier Analysis
Fourier transforms
Genomes
Genomics
Identification
Information science
Ions
Mass spectrometry
Mass spectroscopy
Metabolism
Metabolites
Metabolome
Metabolomics
Metabolomics - methods
Methods
Monte Carlo methods
Monte Carlo simulation
Oryza - metabolism
Peptides
Plant Biology/Agricultural Biotechnology
Plant Proteins - metabolism
Plant sciences
Quality assessment
Scientific imaging
Searching
Sequence Alignment - methods
Sequence Analysis, Protein - methods
Simulation
Software
Tandem Mass Spectrometry - methods
Theoretical analysis
Trends
title Assessment of metabolome annotation quality: a method for evaluating the false discovery rate of elemental composition searches
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T19%3A51%3A55IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessment%20of%20metabolome%20annotation%20quality:%20a%20method%20for%20evaluating%20the%20false%20discovery%20rate%20of%20elemental%20composition%20searches&rft.jtitle=PloS%20one&rft.au=Matsuda,%20Fumio&rft.date=2009-10-16&rft.volume=4&rft.issue=10&rft.spage=e7490&rft.epage=e7490&rft.pages=e7490-e7490&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0007490&rft_dat=%3Cgale_plos_%3EA472844517%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1292466862&rft_id=info:pmid/19847304&rft_galeid=A472844517&rft_doaj_id=oai_doaj_org_article_2fa835191288442db982270f41be510d&rfr_iscdi=true