Integrated Proteomic Pipeline Using Multiple Search Engines for a Proteogenomic Study with a Controlled Protein False Discovery Rate

Bibliographic details
Published in:	Journal of proteome research 2016-11, Vol.15 (11), p.4082-4090
Main authors:	Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin
Format:	Article
Language:	English
Subjects:	Alternative Splicing; Computational Biology - methods; Databases, Protein; False Positive Reactions; Hippocampus - chemistry; Humans; Mass Spectrometry - methods; Proteogenomics - methods; Proteomics - methods; Search Engine

Description
Summary:	In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large-scale proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of four sequential steps as follows. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of
ISSN:	1535-3893 1535-3907
DOI:	10.1021/acs.jproteome.6b00376
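
Note:	The final filtering step named in the summary (proteins with two or more peptides, 1.0% FDR at the protein level) can be illustrated with a generic target-decoy calculation. The sketch below is a minimal illustration only and is not ProteinInferencer's algorithm or the authors' in-house code. The ProteinHit record, the filter_proteins function, the decoy flag, the "higher score is better" convention, and the decoys/targets FDR estimate are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinHit:
    """Hypothetical record for one inferred protein (not from the paper)."""
    accession: str    # illustrative identifier
    score: float      # assumed convention: higher score = better match
    n_peptides: int   # number of distinct peptides supporting the protein
    is_decoy: bool    # True if the hit comes from a decoy database entry


def filter_proteins(hits, max_fdr=0.01, min_peptides=2):
    """Return target proteins accepted under a target-decoy FDR threshold.

    Assumed estimator: FDR = decoys / targets among all hits ranked at or
    above a given score. The largest ranked prefix meeting max_fdr is kept.
    """
    # Keep only proteins supported by at least `min_peptides` peptides,
    # then rank them from best to worst score.
    candidates = sorted(
        (h for h in hits if h.n_peptides >= min_peptides),
        key=lambda h: h.score,
        reverse=True,
    )

    targets = decoys = 0
    cutoff = 0  # number of top-ranked candidates to accept
    for rank, hit in enumerate(candidates, start=1):
        if hit.is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is acceptable.
        if targets and decoys / targets <= max_fdr:
            cutoff = rank

    # Report only target proteins; decoys serve solely to estimate the FDR.
    return [h for h in candidates[:cutoff] if not h.is_decoy]


example = [
    ProteinHit("P1", 10.0, 3, False),
    ProteinHit("P2", 9.0, 1, False),   # dropped: only one peptide
    ProteinHit("P3", 8.0, 2, False),
    ProteinHit("D1", 7.0, 2, True),    # decoy hit
    ProteinHit("P4", 6.0, 5, False),
]
accepted = [h.accession for h in filter_proteins(example)]
print(accepted)
```

In this toy input, P2 is excluded by the two-peptide rule, and the decoy D1 pushes the estimated FDR above 1.0% for everything ranked at or below it, so only the two best-scoring targets remain. Real pipelines typically work with far larger hit lists, where a 1.0% threshold admits some decoys before the cutoff.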