Statistical considerations and database limitations in NMR-based metabolic profiling studies
Published in: | Metabolomics 2023-06, Vol.19 (7), p.64, Article 64 |
---|---|
Main authors: | Ross, Imani L.; Beardslee, Julie A.; Steil, Maria M.; Chihanga, Tafadzwa; Kennedy, Michael A. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 7 |
container_start_page | 64 |
container_title | Metabolomics |
container_volume | 19 |
creator | Ross, Imani L. ; Beardslee, Julie A. ; Steil, Maria M. ; Chihanga, Tafadzwa ; Kennedy, Michael A. |
description | Introduction
Interpretation and analysis of NMR-based metabolic profiling studies are limited by substantially incomplete commercial and academic databases. Statistical significance measures, including p-values, variable importance in projection (VIP) scores, area under the curve (AUC) values and fold change (FC) values, can be largely inconsistent with one another. Data normalization prior to statistical analysis can cause erroneous outcomes.
Objectives
The objectives were (1) to quantitatively assess consistency among p-values, VIP scores, AUC values and FC values in representative NMR-based metabolic profiling datasets, (2) to assess how data normalization can impact statistical significance outcomes, (3) to determine resonance peak assignment completion potential using commonly used databases and (4) to analyze intersection and uniqueness of metabolite space in these databases.
Methods
P-values, VIP scores, AUC values and FC values, and their dependence on data normalization, were determined in an orthotopic mouse model of pancreatic cancer and in two human pancreatic cancer cell lines. Completeness of resonance assignments was evaluated using Chenomx, the Human Metabolome Database (HMDB) and the COLMAR database. The intersection and uniqueness of the databases were quantified.
Results
P-values and AUC values were strongly correlated with each other, more so than either was with VIP scores or FC values. Distributions of statistically significant bins depended strongly on whether or not datasets were normalized. For 40–45% of peaks, database matches were either absent or ambiguous. 9–22% of metabolites were unique to each database.
Conclusions
Lack of consistency in statistical analyses of metabolomics data can lead to misleading or inconsistent interpretation. Data normalization can have large effects on statistical analysis and should be justified. About 40% of peak assignments remain ambiguous or impossible with current databases. 1D and 2D databases should be made consistent to maximize metabolite assignment confidence and validation. |
doi_str_mv | 10.1007/s11306-023-02027-5 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2830494349</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2830494349</sourcerecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2830494349</pqid></control><display><type>article</type><title>Statistical considerations and database limitations in NMR-based metabolic profiling studies</title><source>MEDLINE</source><source>SpringerLink Journals</source><creator>Ross, Imani L. ; Beardslee, Julie A. ; Steil, Maria M. ; Chihanga, Tafadzwa ; Kennedy, Michael A.</creator><identifier>ISSN: 1573-3890</identifier><identifier>ISSN: 1573-3882</identifier><identifier>EISSN: 1573-3890</identifier><identifier>DOI: 10.1007/s11306-023-02027-5</identifier><identifier>PMID: 37378680</identifier><language>eng</language><publisher>New York: Springer US</publisher><ispartof>Metabolomics, 2023-06, Vol.19 (7), p.64, Article 64</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><rights>2023. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11306-023-02027-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11306-023-02027-5$$EHTML$$P50$$Gspringer$$H</linktohtml><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37378680$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><addata><au>Ross, Imani L.</au><au>Beardslee, Julie A.</au><au>Steil, Maria M.</au><au>Chihanga, Tafadzwa</au><au>Kennedy, Michael A.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Statistical considerations and database limitations in NMR-based metabolic profiling studies</atitle><jtitle>Metabolomics</jtitle><stitle>Metabolomics</stitle><date>2023-06-28</date><risdate>2023</risdate><volume>19</volume><issue>7</issue><spage>64</spage><pages>64-</pages><artnum>64</artnum><issn>1573-3890</issn><issn>1573-3882</issn><eissn>1573-3890</eissn><cop>New York</cop><pub>Springer US</pub><pmid>37378680</pmid><doi>10.1007/s11306-023-02027-5</doi></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1573-3890 |
ispartof | Metabolomics, 2023-06, Vol.19 (7), p.64, Article 64 |
issn | 1573-3890 1573-3882 1573-3890 |
language | eng |
recordid | cdi_proquest_journals_2830494349 |
source | MEDLINE; SpringerLink Journals |
subjects | Animals ; Biochemistry ; Biomedical and Life Sciences ; Biomedicine ; Cell Biology ; Cell Line ; Databases, Factual ; Developmental Biology ; Humans ; Life Sciences ; Magnetic Resonance Imaging ; Magnetic Resonance Spectroscopy ; Metabolism ; Metabolites ; Metabolomics ; Mice ; Molecular Medicine ; NMR ; Nuclear magnetic resonance ; Original Article ; Pancreatic cancer ; Statistical analysis ; Statistical significance ; Tumor cell lines |
title | Statistical considerations and database limitations in NMR-based metabolic profiling studies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T10%3A03%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Statistical%20considerations%20and%20database%20limitations%20in%20NMR-based%20metabolic%20profiling%20studies&rft.jtitle=Metabolomics&rft.au=Ross,%20Imani%20L.&rft.date=2023-06-28&rft.volume=19&rft.issue=7&rft.spage=64&rft.pages=64-&rft.artnum=64&rft.issn=1573-3890&rft.eissn=1573-3890&rft_id=info:doi/10.1007/s11306-023-02027-5&rft_dat=%3Cproquest_cross%3E2830494349%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2830494349&rft_id=info:pmid/37378680&rfr_iscdi=true |
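
The abstract's point that normalization can change which bins look significant can be illustrated with a small sketch. Everything below is an assumption made for illustration: the spectra are synthetic, the function names are hypothetical, total-sum normalization stands in for whatever normalization the authors applied, and an exact permutation test stands in for their significance test. VIP scores are omitted because they require a PLS-DA model.

```python
from itertools import combinations
from statistics import mean


def fold_change(a, b):
    """Ratio of group means (group b relative to group a)."""
    return mean(b) / mean(a)


def auc(a, b):
    """Probability that a value from b exceeds a value from a (ties count 0.5)."""
    wins = sum(1.0 if y > x else 0.5 if y == x else 0.0 for x in a for y in b)
    return wins / (len(a) * len(b))


def permutation_p(a, b):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = a + b
    n = len(pooled)
    observed = abs(mean(b) - mean(a))
    hits = total = 0
    for idx in combinations(range(n), len(a)):
        chosen = set(idx)
        ga = [pooled[i] for i in idx]
        gb = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance so the observed split always counts as a hit.
        if abs(mean(gb) - mean(ga)) >= observed - 1e-12:
            hits += 1
    return hits / total


def sum_normalize(spectrum):
    """Scale one spectrum so that its bin intensities sum to 1."""
    total = sum(spectrum)
    return [v / total for v in spectrum]


def bin_stats(group_a, group_b, bin_index):
    """p-value, AUC and fold change for one bin across two groups of spectra."""
    a = [s[bin_index] for s in group_a]
    b = [s[bin_index] for s in group_b]
    return {"p": permutation_p(a, b), "auc": auc(a, b), "fc": fold_change(a, b)}


# Synthetic binned spectra (three bins each). Bin 0 is unchanged between
# groups, while bin 1 is strongly elevated in the treated group.
control = [[10.0, 20.0, 5.0], [11.0, 21.0, 5.0], [9.0, 19.0, 5.0], [10.5, 20.5, 5.0]]
treated = [[10.2, 60.0, 5.0], [10.8, 62.0, 5.0], [9.4, 58.0, 5.0], [10.1, 61.0, 5.0]]

raw = bin_stats(control, treated, 0)
norm = bin_stats([sum_normalize(s) for s in control],
                 [sum_normalize(s) for s in treated], 0)
print("bin 0, raw:       ", raw)
print("bin 0, normalized:", norm)
```

In this toy data, bin 0 does not differ between groups before normalization, yet it separates the groups completely afterwards, because the large change in bin 1 inflates each treated spectrum's total. This is one mechanism by which normalization can alter significance outcomes; it is not a reconstruction of the study's analysis.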