Statistical approaches applicable in managing OMICS data: Urinary proteomics as exemplary case

Bibliographic details
Published in: Mass spectrometry reviews 2024-11, Vol.43 (6), p.1237-1254
Main authors: An, De-Wei; Yu, Yu-Ling; Martens, Dries S; Latosinska, Agnieszka; Zhang, Zhen-Yu; Mischak, Harald; Nawrot, Tim S; Staessen, Jan A
Format: Article
Language: English
Description
Summary: With urinary proteomics profiling (UPP) as exemplary omics technology, this review describes a workflow for the analysis of omics data in large study populations. The proposed workflow includes: (i) planning omics studies and sample size considerations; (ii) preparing the data for analysis; (iii) preprocessing the UPP data; (iv) the basic statistical steps required for data curation; (v) the selection of covariables; (vi) relating continuously distributed or categorical outcomes to a series of single markers (e.g., sequenced urinary peptide fragments identifying the parental proteins); (vii) showing the added diagnostic or prognostic value of the UPP markers over and beyond classical risk factors; and (viii) pathway analysis to identify targets for personalized intervention in disease prevention or treatment. Additionally, two short sections address multiomics studies and machine learning, respectively. In conclusion, the analysis of adverse health outcomes in relation to omics biomarkers rests on the same statistical principles as the analysis of any other data collected in large population or patient cohorts. The large number of biomarkers that have to be considered simultaneously requires planning ahead how the study database will be structured, curated, and imported into statistical software packages, and how the analysis results will be triaged for clinical relevance and presented.
ISSN: 0277-7037, 1098-2787
DOI: 10.1002/mas.21849