Intersession reproducibility of mass spectrometry profiles and its effect on accuracy of multivariate classification models

Bibliographic details
Published in: Bioinformatics 2007-11, Vol. 23 (22), p. 3065-3072
Main authors: Pelikan, Richard; Bigbee, William L.; Malehorn, David; Lyons-Weiler, James; Hauskrecht, Milos
Format: Article
Language: English
Description
Summary:
Motivation: The ‘reproducibility’ of mass spectrometry proteomic profiling has become an intensely controversial topic. The mere mention of concern over the ‘reproducibility’ of data generated from any particular platform can lead to the anxiety over the generalizability of its results and its role in the future of discovery proteomics. In this study, we examine the reproducibility of proteomic profiles generated by surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) across multiple data-generation sessions. We analyze the problem in terms of the reproducibility of signals, reproducibility of discriminative features and reproducibility of multivariate classification models on profiles for serum samples from early lung cancer and healthy control subjects.
Results: Proteomic profiles in individual data-generation sessions experience within-session variability. We show that combining data from multiple sessions introduces additional (inter-session) noise. While additional noise can affect the discriminative analysis, we show that its average effect on profiles in our study is relatively small. Moreover, for the purposes of prediction on future (previously unseen) data, classifiers trained on multi-session data are able to adapt to inter-session noise and improve their classification accuracy.
Contact: milos@cs.pitt.edu
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btm415