An approach for feature selection with data modelling in LC-MS metabolomics

A data processing workflow for LC-MS-based metabolomics studies is proposed, comprising signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The proposed approach requires only an annotation-free peak table and produces an extremely reduced set o...

Bibliographic details
Published in: Analytical methods 2020-07, Vol.12 (28), p.3582-3591
Main authors: Plyushchenko, Ivan, Shakhmatov, Dmitry, Bolotnik, Timofey, Baygildiev, Timur, Nesterenko, Pavel N, Rodin, Igor
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 3591
container_issue 28
container_start_page 3582
container_title Analytical methods
container_volume 12
creator Plyushchenko, Ivan
Shakhmatov, Dmitry
Bolotnik, Timofey
Baygildiev, Timur
Nesterenko, Pavel N
Rodin, Igor
description A data processing workflow for LC-MS-based metabolomics studies is proposed, comprising signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features, validated via Receiver Operating Characteristic (ROC) analysis of the selected predictors, cross-validation and unsupervised projection. The workflow was first optimised on the authors' own experimental dataset and then successfully tested on 36 datasets from 21 publicly available metabolomics projects. It can be used for classification in high-dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion.
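As a rough illustration of the univariate analysis, feature selection and ROC steps named in the description, the following minimal sketch ranks the features of a small peak table by single-feature ROC AUC and keeps the strongest ones. It is not the authors' implementation: the peak table, the feature names, the `top_k` cut-off and the AUC-based ranking rule are all assumptions made for illustration, and signal drift correction, supervised learning and unsupervised projection are left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (univariate analysis)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_features(peak_table, labels, top_k=2):
    """Keep the top_k features whose single-feature AUC is farthest from 0.5.

    This ranking rule is an illustrative assumption, not the published method.
    """
    ranked = []
    for name, intensities in peak_table.items():
        case = [x for x, y in zip(intensities, labels) if y == 1]
        ctrl = [x for x, y in zip(intensities, labels) if y == 0]
        ranked.append((name, roc_auc(intensities, labels), welch_t(case, ctrl)))
    ranked.sort(key=lambda r: abs(r[1] - 0.5), reverse=True)
    return ranked[:top_k]


# Hypothetical annotation-free peak table: feature id -> intensity per sample.
peak_table = {
    "mz101_rt1.2": [5.1, 4.9, 5.3, 8.0, 8.4, 7.9],
    "mz230_rt3.4": [2.0, 2.2, 1.9, 2.1, 2.0, 2.3],
    "mz415_rt6.7": [9.0, 9.5, 9.2, 6.1, 6.4, 5.8],
}
labels = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = case

selected = select_features(peak_table, labels)
for name, auc, t in selected:
    print(f"{name}: AUC={auc:.2f}, Welch t={t:.2f}")
```

Ranking by distance from 0.5 treats features that are lower in cases the same as features that are higher, so both directions of change can be selected. A real study would estimate these statistics inside cross-validation to avoid optimistic bias.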
doi_str_mv 10.1039/d0ay00204f
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1039_D0AY00204F</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2427308322</sourcerecordid><originalsourceid>FETCH-LOGICAL-c400t-12532b8e1b64f81eef1930b16fe3ac4204883961a417c9a83532776f3d311c953</originalsourceid><addsrcrecordid>eNp90UFPwyAYBmBiNG5OL941GC_GpPpROijHZTo1znhQD54aSsHVtGVCG7N_L3NzJh48QcKTj5cXhA4JXBCg4rIAuQCIITFbqE_4UESCcbG92TPooT3v3wGYoIzsoh6NORDgaR_djxos53NnpZphYx02Wrad09jrSqu2tA3-LNsZLmQrcW0LXVVl84bLBk_H0cMTrnUrc1vZulR-H-0YWXl9sF4H6GVy_Ty-jaaPN3fj0TRSCUAbkXhI4zzVJGeJSYnWhggKOWFGU6mS8I40pYIRmRCuhExp4JwzQwtKiBJDOkBnq7kh9kenfZvVpVchmWy07XwWJzEbUh6uCfT0D323nWtCuqXiFFIax0Gdr5Ry1nunTTZ3ZS3dIiOQLSvOrmD0-l3xJODj9cgur3WxoT-dBnCyAs6rzenvH2XzwgRz9J-hX0qAiLk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2427308322</pqid></control><display><type>article</type><title>An approach for feature selection with data modelling in LC-MS metabolomics</title><source>MEDLINE</source><source>Royal Society Of Chemistry Journals 2008-</source><creator>Plyushchenko, Ivan ; Shakhmatov, Dmitry ; Bolotnik, Timofey ; Baygildiev, Timur ; Nesterenko, Pavel N ; Rodin, Igor</creator><creatorcontrib>Plyushchenko, Ivan ; Shakhmatov, Dmitry ; Bolotnik, Timofey ; Baygildiev, Timur ; Nesterenko, Pavel N ; Rodin, Igor</creatorcontrib><description>The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The proposed approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features together with validation via Receiver Operating Characteristic analysis for selected predictors, cross-validation and unsupervised projection. 
The presented study was initially optimised by its own experimental set and then was successfully tested by using 36 datasets from 21 publicly available metabolomics projects. The suggested workflow can be used for classification purposes in high dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion. The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling.</description><identifier>ISSN: 1759-9660</identifier><identifier>EISSN: 1759-9679</identifier><identifier>DOI: 10.1039/d0ay00204f</identifier><identifier>PMID: 32701078</identifier><language>eng</language><publisher>England: Royal Society of Chemistry</publisher><subject>Annotations ; Biomarkers ; Chromatography, Liquid ; Data integration ; Data processing ; Feature selection ; Metabolomics ; Metabolomics - methods ; Modelling ; Models, Biological ; Reproducibility of Results ; Signal processing ; Software ; Tandem Mass Spectrometry ; Workflow</subject><ispartof>Analytical methods, 2020-07, Vol.12 (28), p.3582-3591</ispartof><rights>Copyright Royal Society of Chemistry 2020</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c400t-12532b8e1b64f81eef1930b16fe3ac4204883961a417c9a83532776f3d311c953</citedby><cites>FETCH-LOGICAL-c400t-12532b8e1b64f81eef1930b16fe3ac4204883961a417c9a83532776f3d311c953</cites><orcidid>0000-0003-3883-4695 ; 0000-0002-0588-6870</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32701078$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Plyushchenko, Ivan</creatorcontrib><creatorcontrib>Shakhmatov, Dmitry</creatorcontrib><creatorcontrib>Bolotnik, Timofey</creatorcontrib><creatorcontrib>Baygildiev, Timur</creatorcontrib><creatorcontrib>Nesterenko, Pavel N</creatorcontrib><creatorcontrib>Rodin, Igor</creatorcontrib><title>An approach for feature selection with data modelling in LC-MS metabolomics</title><title>Analytical methods</title><addtitle>Anal Methods</addtitle><description>The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The proposed approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features together with validation via Receiver Operating Characteristic analysis for selected predictors, cross-validation and unsupervised projection. The presented study was initially optimised by its own experimental set and then was successfully tested by using 36 datasets from 21 publicly available metabolomics projects. The suggested workflow can be used for classification purposes in high dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion. 
The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling.</description><subject>Annotations</subject><subject>Biomarkers</subject><subject>Chromatography, Liquid</subject><subject>Data integration</subject><subject>Data processing</subject><subject>Feature selection</subject><subject>Metabolomics</subject><subject>Metabolomics - methods</subject><subject>Modelling</subject><subject>Models, Biological</subject><subject>Reproducibility of Results</subject><subject>Signal processing</subject><subject>Software</subject><subject>Tandem Mass Spectrometry</subject><subject>Workflow</subject><issn>1759-9660</issn><issn>1759-9679</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp90UFPwyAYBmBiNG5OL941GC_GpPpROijHZTo1znhQD54aSsHVtGVCG7N_L3NzJh48QcKTj5cXhA4JXBCg4rIAuQCIITFbqE_4UESCcbG92TPooT3v3wGYoIzsoh6NORDgaR_djxos53NnpZphYx02Wrad09jrSqu2tA3-LNsZLmQrcW0LXVVl84bLBk_H0cMTrnUrc1vZulR-H-0YWXl9sF4H6GVy_Ty-jaaPN3fj0TRSCUAbkXhI4zzVJGeJSYnWhggKOWFGU6mS8I40pYIRmRCuhExp4JwzQwtKiBJDOkBnq7kh9kenfZvVpVchmWy07XwWJzEbUh6uCfT0D323nWtCuqXiFFIax0Gdr5Ry1nunTTZ3ZS3dIiOQLSvOrmD0-l3xJODj9cgur3WxoT-dBnCyAs6rzenvH2XzwgRz9J-hX0qAiLk</recordid><startdate>20200728</startdate><enddate>20200728</enddate><creator>Plyushchenko, Ivan</creator><creator>Shakhmatov, Dmitry</creator><creator>Bolotnik, Timofey</creator><creator>Baygildiev, Timur</creator><creator>Nesterenko, Pavel N</creator><creator>Rodin, Igor</creator><general>Royal Society of 
Chemistry</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SE</scope><scope>7SR</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>FR3</scope><scope>H8G</scope><scope>JG9</scope><scope>L7M</scope><scope>P64</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-3883-4695</orcidid><orcidid>https://orcid.org/0000-0002-0588-6870</orcidid></search><sort><creationdate>20200728</creationdate><title>An approach for feature selection with data modelling in LC-MS metabolomics</title><author>Plyushchenko, Ivan ; Shakhmatov, Dmitry ; Bolotnik, Timofey ; Baygildiev, Timur ; Nesterenko, Pavel N ; Rodin, Igor</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c400t-12532b8e1b64f81eef1930b16fe3ac4204883961a417c9a83532776f3d311c953</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Annotations</topic><topic>Biomarkers</topic><topic>Chromatography, Liquid</topic><topic>Data integration</topic><topic>Data processing</topic><topic>Feature selection</topic><topic>Metabolomics</topic><topic>Metabolomics - methods</topic><topic>Modelling</topic><topic>Models, Biological</topic><topic>Reproducibility of Results</topic><topic>Signal processing</topic><topic>Software</topic><topic>Tandem Mass Spectrometry</topic><topic>Workflow</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Plyushchenko, Ivan</creatorcontrib><creatorcontrib>Shakhmatov, Dmitry</creatorcontrib><creatorcontrib>Bolotnik, Timofey</creatorcontrib><creatorcontrib>Baygildiev, Timur</creatorcontrib><creatorcontrib>Nesterenko, Pavel N</creatorcontrib><creatorcontrib>Rodin, Igor</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Copper Technical Reference Library</collection><collection>Materials Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Analytical methods</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Plyushchenko, Ivan</au><au>Shakhmatov, Dmitry</au><au>Bolotnik, Timofey</au><au>Baygildiev, Timur</au><au>Nesterenko, Pavel N</au><au>Rodin, Igor</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An approach for feature selection with data modelling in LC-MS metabolomics</atitle><jtitle>Analytical methods</jtitle><addtitle>Anal Methods</addtitle><date>2020-07-28</date><risdate>2020</risdate><volume>12</volume><issue>28</issue><spage>3582</spage><epage>3591</epage><pages>3582-3591</pages><issn>1759-9660</issn><eissn>1759-9679</eissn><abstract>The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. 
The proposed approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features together with validation via Receiver Operating Characteristic analysis for selected predictors, cross-validation and unsupervised projection. The presented study was initially optimised by its own experimental set and then was successfully tested by using 36 datasets from 21 publicly available metabolomics projects. The suggested workflow can be used for classification purposes in high dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion. The data processing workflow for LC-MS based metabolomics study is suggested with signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling.</abstract><cop>England</cop><pub>Royal Society of Chemistry</pub><pmid>32701078</pmid><doi>10.1039/d0ay00204f</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0003-3883-4695</orcidid><orcidid>https://orcid.org/0000-0002-0588-6870</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1759-9660
ispartof Analytical methods, 2020-07, Vol.12 (28), p.3582-3591
issn 1759-9660
1759-9679
language eng
recordid cdi_crossref_primary_10_1039_D0AY00204F
source MEDLINE; Royal Society Of Chemistry Journals 2008-
subjects Annotations
Biomarkers
Chromatography, Liquid
Data integration
Data processing
Feature selection
Metabolomics
Metabolomics - methods
Modelling
Models, Biological
Reproducibility of Results
Signal processing
Software
Tandem Mass Spectrometry
Workflow
title An approach for feature selection with data modelling in LC-MS metabolomics
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T06%3A43%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20approach%20for%20feature%20selection%20with%20data%20modelling%20in%20LC-MS%20metabolomics&rft.jtitle=Analytical%20methods&rft.au=Plyushchenko,%20Ivan&rft.date=2020-07-28&rft.volume=12&rft.issue=28&rft.spage=3582&rft.epage=3591&rft.pages=3582-3591&rft.issn=1759-9660&rft.eissn=1759-9679&rft_id=info:doi/10.1039/d0ay00204f&rft_dat=%3Cproquest_cross%3E2427308322%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2427308322&rft_id=info:pmid/32701078&rfr_iscdi=true