MZDASoft: a software architecture that enables large-scale comparison of protein expression levels over multiple samples based on liquid chromatography/tandem mass spectrometry

Bibliographic Details
Published in: Rapid communications in mass spectrometry, 2015-10, Vol. 29 (19), p. 1841-1848
Main authors: Ghanat Bari, Mehrab, Ramirez, Nelson, Wang, Zhiwei, Zhang, Jianqiu (Michelle)
Format: Article
Language: English
Description
Summary:
Rationale: Without accurate peak linking/alignment, only the expression levels of a small percentage of proteins can be compared across multiple samples in Liquid Chromatography/Mass Spectrometry/Tandem Mass Spectrometry (LC/MS/MS), due to the selective nature of tandem MS peptide identification. This greatly hampers biomedical research that aims at finding biomarkers for disease diagnosis and treatment and at understanding disease mechanisms. A recent algorithm, PeakLink, accurately links LC/MS peaks without tandem MS identifications to their corresponding peaks with identifications across multiple samples collected from different instruments, tissues and labs, which greatly enhances the ability to compare proteins. However, PeakLink cannot be implemented practically for large numbers of samples on existing software architectures, because it requires simultaneous access to peak elution profiles from multiple LC/MS/MS samples.

Methods: We propose a new architecture based on parallel processing, which extracts LC/MS peak features and saves them in database files to enable the implementation of PeakLink for multiple samples. The software has been deployed in High-Performance Computing (HPC) environments. The core part of the software, MZDASoft Parallel Peak Extractor (PPE), can be downloaded with a user and developer's guide, and it can be run directly at HPC centers. The quantification applications, MZDASoft TandemQuant and MZDASoft PeakLink, are written in Matlab and compiled with a Matlab runtime compiler. A sample script that incorporates all necessary processing steps of MZDASoft for LC/MS/MS quantification in a parallel processing environment is available. The project webpage is http://compgenomics.utsa.edu/zgroup/MZDASoft.

Results: The proposed architecture enables the implementation of PeakLink for multiple samples. In test cases, significantly more (100%–500%) proteins can be compared over multiple samples, with better quantification accuracy.

Conclusion: MZDASoft enables large-scale comparison of protein expression levels over multiple samples with much larger protein comparison coverage and better quantification accuracy. It is an efficient implementation based on parallel processing that can be used to process large amounts of data. Copyright © 2015 John Wiley & Sons, Ltd.
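
The architectural idea in the Methods summary (extract peak features per sample in parallel, store them in database files, then read several samples at once) can be illustrated with a minimal Python sketch. This is not MZDASoft code: the toy data, the SQLite schema, and the function names `extract_peaks`, `process_sample` and `load_peaks_near` are all hypothetical, the "peak detection" is a trivial grouping by m/z, and a thread pool stands in for the HPC parallelism described in the record.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy input: sample name -> (m/z, retention time, intensity) points.
SAMPLES = {
    "sample_a": [(500.25, 10.0, 1.0), (500.25, 10.5, 4.0), (500.25, 11.0, 2.0),
                 (622.10, 20.0, 3.0), (622.10, 20.5, 9.0)],
    "sample_b": [(500.25, 10.2, 2.0), (500.25, 10.7, 6.0),
                 (622.10, 20.1, 5.0), (622.10, 20.6, 1.0)],
}


def extract_peaks(points):
    """Group points by m/z and summarize each group as one peak feature.

    A deliberately simple stand-in for real peak extraction.
    """
    groups = {}
    for mz, rt, intensity in points:
        groups.setdefault(mz, []).append((rt, intensity))
    peaks = []
    for mz, profile in sorted(groups.items()):
        apex_rt, apex_intensity = max(profile, key=lambda p: p[1])
        summed_intensity = sum(i for _, i in profile)
        peaks.append((mz, apex_rt, apex_intensity, summed_intensity))
    return peaks


def process_sample(name, points, out_dir):
    """Extract peak features for one sample and save them to its own database file."""
    db_path = os.path.join(out_dir, name + ".sqlite")
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE peaks (mz REAL, apex_rt REAL, "
            "apex_intensity REAL, summed_intensity REAL)"
        )
        conn.executemany("INSERT INTO peaks VALUES (?, ?, ?, ?)", extract_peaks(points))
        conn.commit()
    finally:
        conn.close()
    return db_path


def load_peaks_near(db_paths, mz, tol=0.01):
    """Read peaks within tol of mz from several sample databases in one call."""
    result = {}
    for path in db_paths:
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute(
                "SELECT mz, apex_rt, apex_intensity, summed_intensity "
                "FROM peaks WHERE ABS(mz - ?) <= ?",
                (mz, tol),
            ).fetchall()
        finally:
            conn.close()
        result[os.path.basename(path)] = rows
    return result


# Step 1: process samples concurrently, one database file per sample.
out_dir = tempfile.mkdtemp()
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(process_sample, n, p, out_dir) for n, p in SAMPLES.items()]
    db_paths = [f.result() for f in futures]

# Step 2: a cross-sample step can now query every sample without reloading raw data.
matches = load_peaks_near(db_paths, 500.25)
for sample, rows in sorted(matches.items()):
    print(sample, rows)
```

Writing one database file per sample keeps the workers independent of each other, while a later cross-sample step only needs to open the small feature files rather than the raw LC/MS data.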
ISSN: 0951-4198, 1097-0231
DOI: 10.1002/rcm.7272