Statistical considerations and database limitations in NMR-based metabolic profiling studies

Introduction: Interpretation and analysis of NMR-based metabolic profiling studies are limited by substantially incomplete commercial and academic databases. Statistical significance measures, including p-values, VIP scores, AUC values and FC values, can be largely inconsistent with one another. Data normalization prior...

Bibliographic details
Published in: Metabolomics, 2023-06, Vol. 19 (7), p. 64, Article 64
Main authors: Ross, Imani L., Beardslee, Julie A., Steil, Maria M., Chihanga, Tafadzwa, Kennedy, Michael A.
Format: Article
Language: eng
Online access: Full text
container_end_page
container_issue 7
container_start_page 64
container_title Metabolomics
container_volume 19
creator Ross, Imani L.
Beardslee, Julie A.
Steil, Maria M.
Chihanga, Tafadzwa
Kennedy, Michael A.
description Introduction: Interpretation and analysis of NMR-based metabolic profiling studies are limited by substantially incomplete commercial and academic databases. Statistical significance measures, including p-values, variable importance in projection (VIP) scores, area under the ROC curve (AUC) values and fold change (FC) values, can be largely inconsistent with one another. Data normalization prior to statistical analysis can cause erroneous outcomes.
Objectives: The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can affect statistical significance outcomes, (3) to determine how completely resonance peaks can be assigned using commonly used databases and (4) to analyze the intersection and uniqueness of metabolite space in these databases.
Methods: P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in an orthotopic mouse model of pancreatic cancer and in two human pancreatic cancer cell lines. Completeness of resonance assignments was evaluated using Chenomx, the Human Metabolome Database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases were quantified.
Results: P-values and AUC values were strongly correlated with each other, more so than either was with VIP scores or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. 40–45% of peaks had either no database match or an ambiguous one. 9–22% of metabolites were unique to each database.
Conclusions: Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation.
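The abstract's point that normalization can change which bins look significant can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy intensities, the helper names (`auc`, `fold_change`, `normalize_total`, `bin_stats`) and the choice of total-sum normalization are all assumptions made here for illustration, and p-values and VIP scores are left out because they would need a t-distribution and a PLS-DA model.

```python
from statistics import mean

def auc(group_a, group_b):
    """Fraction of (a, b) pairs in which b exceeds a; ties count as 0.5."""
    wins = sum((b > a) + 0.5 * (b == a) for a in group_a for b in group_b)
    return wins / (len(group_a) * len(group_b))

def fold_change(group_a, group_b):
    """Ratio of the group_b mean to the group_a mean."""
    return mean(group_b) / mean(group_a)

def normalize_total(spectrum):
    """Total-sum normalization (assumed here): divide each bin by the spectrum total."""
    total = sum(spectrum)
    return [x / total for x in spectrum]

def bin_stats(a_rows, b_rows):
    """Return a rounded (AUC, FC) pair for every bin (column)."""
    stats = []
    for j in range(len(a_rows[0])):
        a = [row[j] for row in a_rows]
        b = [row[j] for row in b_rows]
        stats.append((round(auc(a, b), 2), round(fold_change(a, b), 2)))
    return stats

# Hypothetical toy data: rows are samples, columns are three spectral bins.
# Only the third bin truly differs between the groups.
control = [[10.0, 5.0, 20.0], [11.0, 5.5, 19.0], [9.0, 4.5, 21.0]]
treated = [[10.5, 5.2, 60.0], [9.5, 4.8, 58.0], [10.0, 5.0, 62.0]]

raw = bin_stats(control, treated)
norm = bin_stats([normalize_total(r) for r in control],
                 [normalize_total(r) for r in treated])

for j, (r, n) in enumerate(zip(raw, norm)):
    print(f"bin {j}: raw AUC/FC = {r}, normalized AUC/FC = {n}")
```

In this toy case the first two bins are unchanged in the raw data (AUC 0.5, FC 1.0), yet after total-sum normalization they appear strongly decreased, because the large third bin inflates the denominator in the treated samples only.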
doi_str_mv 10.1007/s11306-023-02027-5
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2830494349</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2830494349</sourcerecordid><originalsourceid>FETCH-LOGICAL-c293t-60205cb6edc535e697bccfd14816cec54e9c9668169dcd182d4a0085fbf9f1bc3</originalsourceid><addsrcrecordid>eNp9kMtKAzEUhoMotlZfwIUEXI8mk8tMllKsClXBy04ImSRTUuZSk8zCtzd16mXlIuRc_vOfwwfAKUYXGKHiMmBMEM9QTtJDeZGxPTDFrCAZKQXa_xNPwFEIa4QoFQU6BBNSkKLkJZqCt-eoogvRadVA3XfBGetTJUVQdQYaFVWlgoWNa13cNVwHH-6fsm3dwNYmRd84DTe-r13juhUMcTDOhmNwUKsm2JPdPwOvi-uX-W22fLy5m18tM50LEjOerme64tZoRpjloqi0rg2mJebaakat0ILzlAmjDS5zQxVCJaurWtS40mQGzkffdMH7YEOU637wXVop85IgKiihIqnyUaV9H4K3tdx41yr_ITGSW6ByBCoTUPkFVLI0dLazHqrWmp-Rb4JJQEZBSK1uZf3v7n9sPwG03IIk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2830494349</pqid></control><display><type>article</type><title>Statistical considerations and database limitations in NMR-based metabolic profiling studies</title><source>MEDLINE</source><source>SpringerLink Journals</source><creator>Ross, Imani L. ; Beardslee, Julie A. ; Steil, Maria M. ; Chihanga, Tafadzwa ; Kennedy, Michael A.</creator><creatorcontrib>Ross, Imani L. ; Beardslee, Julie A. ; Steil, Maria M. ; Chihanga, Tafadzwa ; Kennedy, Michael A.</creatorcontrib><description>Introduction Interpretation and analysis of NMR-based metabolic profiling studies is limited by substantially incomplete commercial and academic databases. Statistical significance tests, including p-values, VIP scores, AUC values and FC values, can be largely inconsistent. Data normalization prior to statistical analysis can cause erroneous outcomes. 
Objectives The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can impact statistical significance outcomes, (3) to determine resonance peak assignment completion potential using commonly used databases and (4) to analyze intersection and uniqueness of metabolite space in these databases. Methods P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in orthotopic mouse model of pancreatic cancer and two human pancreatic cancer cell lines. Completeness of resonance assignments were evaluated using Chenomx, the human metabolite database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases was quantified. Results P-values and AUC values were strongly correlated compared to VIP or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. 40–45% of peaks had either no or ambiguous database matches. 9–22% of metabolites were unique to each database. Conclusions Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 
1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation.</description><identifier>ISSN: 1573-3890</identifier><identifier>ISSN: 1573-3882</identifier><identifier>EISSN: 1573-3890</identifier><identifier>DOI: 10.1007/s11306-023-02027-5</identifier><identifier>PMID: 37378680</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Animals ; Biochemistry ; Biomedical and Life Sciences ; Biomedicine ; Cell Biology ; Cell Line ; Databases, Factual ; Developmental Biology ; Humans ; Life Sciences ; Magnetic Resonance Imaging ; Magnetic Resonance Spectroscopy ; Metabolism ; Metabolites ; Metabolomics ; Mice ; Molecular Medicine ; NMR ; Nuclear magnetic resonance ; Original Article ; Pancreatic cancer ; Statistical analysis ; Statistical significance ; Tumor cell lines</subject><ispartof>Metabolomics, 2023-06, Vol.19 (7), p.64, Article 64</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><rights>2023. 
The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c293t-60205cb6edc535e697bccfd14816cec54e9c9668169dcd182d4a0085fbf9f1bc3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11306-023-02027-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11306-023-02027-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37378680$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ross, Imani L.</creatorcontrib><creatorcontrib>Beardslee, Julie A.</creatorcontrib><creatorcontrib>Steil, Maria M.</creatorcontrib><creatorcontrib>Chihanga, Tafadzwa</creatorcontrib><creatorcontrib>Kennedy, Michael A.</creatorcontrib><title>Statistical considerations and database limitations in NMR-based metabolic profiling studies</title><title>Metabolomics</title><addtitle>Metabolomics</addtitle><addtitle>Metabolomics</addtitle><description>Introduction Interpretation and analysis of NMR-based metabolic profiling studies is limited by substantially incomplete commercial and academic databases. Statistical significance tests, including p-values, VIP scores, AUC values and FC values, can be largely inconsistent. Data normalization prior to statistical analysis can cause erroneous outcomes. 
Objectives The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can impact statistical significance outcomes, (3) to determine resonance peak assignment completion potential using commonly used databases and (4) to analyze intersection and uniqueness of metabolite space in these databases. Methods P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in orthotopic mouse model of pancreatic cancer and two human pancreatic cancer cell lines. Completeness of resonance assignments were evaluated using Chenomx, the human metabolite database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases was quantified. Results P-values and AUC values were strongly correlated compared to VIP or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. 40–45% of peaks had either no or ambiguous database matches. 9–22% of metabolites were unique to each database. Conclusions Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 
1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation.</description><subject>Animals</subject><subject>Biochemistry</subject><subject>Biomedical and Life Sciences</subject><subject>Biomedicine</subject><subject>Cell Biology</subject><subject>Cell Line</subject><subject>Databases, Factual</subject><subject>Developmental Biology</subject><subject>Humans</subject><subject>Life Sciences</subject><subject>Magnetic Resonance Imaging</subject><subject>Magnetic Resonance Spectroscopy</subject><subject>Metabolism</subject><subject>Metabolites</subject><subject>Metabolomics</subject><subject>Mice</subject><subject>Molecular Medicine</subject><subject>NMR</subject><subject>Nuclear magnetic resonance</subject><subject>Original Article</subject><subject>Pancreatic cancer</subject><subject>Statistical analysis</subject><subject>Statistical significance</subject><subject>Tumor cell lines</subject><issn>1573-3890</issn><issn>1573-3882</issn><issn>1573-3890</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><recordid>eNp9kMtKAzEUhoMotlZfwIUEXI8mk8tMllKsClXBy04ImSRTUuZSk8zCtzd16mXlIuRc_vOfwwfAKUYXGKHiMmBMEM9QTtJDeZGxPTDFrCAZKQXa_xNPwFEIa4QoFQU6BBNSkKLkJZqCt-eoogvRadVA3XfBGetTJUVQdQYaFVWlgoWNa13cNVwHH-6fsm3dwNYmRd84DTe-r13juhUMcTDOhmNwUKsm2JPdPwOvi-uX-W22fLy5m18tM50LEjOerme64tZoRpjloqi0rg2mJebaakat0ILzlAmjDS5zQxVCJaurWtS40mQGzkffdMH7YEOU637wXVop85IgKiihIqnyUaV9H4K3tdx41yr_ITGSW6ByBCoTUPkFVLI0dLazHqrWmp-Rb4JJQEZBSK1uZf3v7n9sPwG03IIk</recordid><startdate>20230628</startdate><enddate>20230628</enddate><creator>Ross, Imani L.</creator><creator>Beardslee, Julie A.</creator><creator>Steil, Maria M.</creator><creator>Chihanga, Tafadzwa</creator><creator>Kennedy, Michael A.</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M7P</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope></search><sort><creationdate>20230628</creationdate><title>Statistical considerations and database limitations in NMR-based metabolic profiling studies</title><author>Ross, Imani L. ; Beardslee, Julie A. ; Steil, Maria M. ; Chihanga, Tafadzwa ; Kennedy, Michael A.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c293t-60205cb6edc535e697bccfd14816cec54e9c9668169dcd182d4a0085fbf9f1bc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Animals</topic><topic>Biochemistry</topic><topic>Biomedical and Life Sciences</topic><topic>Biomedicine</topic><topic>Cell Biology</topic><topic>Cell Line</topic><topic>Databases, Factual</topic><topic>Developmental Biology</topic><topic>Humans</topic><topic>Life Sciences</topic><topic>Magnetic Resonance Imaging</topic><topic>Magnetic Resonance Spectroscopy</topic><topic>Metabolism</topic><topic>Metabolites</topic><topic>Metabolomics</topic><topic>Mice</topic><topic>Molecular Medicine</topic><topic>NMR</topic><topic>Nuclear magnetic resonance</topic><topic>Original Article</topic><topic>Pancreatic cancer</topic><topic>Statistical analysis</topic><topic>Statistical significance</topic><topic>Tumor cell 
lines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ross, Imani L.</creatorcontrib><creatorcontrib>Beardslee, Julie A.</creatorcontrib><creatorcontrib>Steil, Maria M.</creatorcontrib><creatorcontrib>Chihanga, Tafadzwa</creatorcontrib><creatorcontrib>Kennedy, Michael A.</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Biological Science Database</collection><collection>ProQuest One Academic Eastern Edition 
(DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>Metabolomics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ross, Imani L.</au><au>Beardslee, Julie A.</au><au>Steil, Maria M.</au><au>Chihanga, Tafadzwa</au><au>Kennedy, Michael A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Statistical considerations and database limitations in NMR-based metabolic profiling studies</atitle><jtitle>Metabolomics</jtitle><stitle>Metabolomics</stitle><addtitle>Metabolomics</addtitle><date>2023-06-28</date><risdate>2023</risdate><volume>19</volume><issue>7</issue><spage>64</spage><pages>64-</pages><artnum>64</artnum><issn>1573-3890</issn><issn>1573-3882</issn><eissn>1573-3890</eissn><abstract>Introduction Interpretation and analysis of NMR-based metabolic profiling studies is limited by substantially incomplete commercial and academic databases. Statistical significance tests, including p-values, VIP scores, AUC values and FC values, can be largely inconsistent. Data normalization prior to statistical analysis can cause erroneous outcomes. Objectives The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can impact statistical significance outcomes, (3) to determine resonance peak assignment completion potential using commonly used databases and (4) to analyze intersection and uniqueness of metabolite space in these databases. Methods P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in orthotopic mouse model of pancreatic cancer and two human pancreatic cancer cell lines. 
Completeness of resonance assignments were evaluated using Chenomx, the human metabolite database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases was quantified. Results P-values and AUC values were strongly correlated compared to VIP or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. 40–45% of peaks had either no or ambiguous database matches. 9–22% of metabolites were unique to each database. Conclusions Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation.</abstract><cop>New York</cop><pub>Springer US</pub><pmid>37378680</pmid><doi>10.1007/s11306-023-02027-5</doi></addata></record>
fulltext fulltext
identifier ISSN: 1573-3890
ispartof Metabolomics, 2023-06, Vol.19 (7), p.64, Article 64
issn 1573-3890
1573-3882
language eng
recordid cdi_proquest_journals_2830494349
source MEDLINE; SpringerLink Journals
subjects Animals
Biochemistry
Biomedical and Life Sciences
Biomedicine
Cell Biology
Cell Line
Databases, Factual
Developmental Biology
Humans
Life Sciences
Magnetic Resonance Imaging
Magnetic Resonance Spectroscopy
Metabolism
Metabolites
Metabolomics
Mice
Molecular Medicine
NMR
Nuclear magnetic resonance
Original Article
Pancreatic cancer
Statistical analysis
Statistical significance
Tumor cell lines
title Statistical considerations and database limitations in NMR-based metabolic profiling studies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T10%3A03%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Statistical%20considerations%20and%20database%20limitations%20in%20NMR-based%20metabolic%20profiling%20studies&rft.jtitle=Metabolomics&rft.au=Ross,%20Imani%20L.&rft.date=2023-06-28&rft.volume=19&rft.issue=7&rft.spage=64&rft.pages=64-&rft.artnum=64&rft.issn=1573-3890&rft.eissn=1573-3890&rft_id=info:doi/10.1007/s11306-023-02027-5&rft_dat=%3Cproquest_cross%3E2830494349%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2830494349&rft_id=info:pmid/37378680&rfr_iscdi=true