NovoBoard: A Comprehensive Framework for Evaluating the False Discovery Rate and Accuracy of De Novo Peptide Sequencing
Published in: | Molecular & cellular proteomics 2024-11, Vol.23 (11), p.100849, Article 100849 |
---|---|
Main authors: | , , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | De novo peptide sequencing is one of the most fundamental research areas in mass spectrometry–based proteomics. Many methods have often been evaluated using a couple of simple metrics that do not fully reflect their overall performance. Moreover, there has not been an established method to estimate the false discovery rate (FDR) of de novo peptide-spectrum matches. Here we propose NovoBoard, a comprehensive framework to evaluate the performance of de novo peptide-sequencing methods. The framework consists of diverse benchmark datasets (including tryptic, nontryptic, immunopeptidomics, and different species) and a standard set of accuracy metrics to evaluate the fragment ions, amino acids, and peptides of the de novo results. More importantly, a new approach is designed to evaluate de novo peptide-sequencing methods on target-decoy spectra and to estimate and validate their FDRs. Our FDR estimation provides valuable information to assess the reliability of new peptides identified by de novo sequencing tools, especially when no ground-truth information is available to evaluate their accuracy. The FDR estimation can also be used to evaluate the capability of de novo peptide sequencing tools to distinguish between de novo peptide-spectrum matches and random matches. Our results thoroughly reveal the strengths and weaknesses of different de novo peptide-sequencing methods and how their performances depend on specific applications and the types of data.
• We proposed a new method to estimate the FDR of de novo peptide sequencing.
• A framework consisting of benchmark datasets and accuracy metrics was presented.
• The results revealed the strengths and weaknesses of de novo sequencing tools.
De novo peptide sequencing methods have often been evaluated using a couple of simple metrics that do not fully reflect their overall performance. Moreover, there has not been an established method to estimate their false discovery rate (FDR). We proposed NovoBoard, a comprehensive framework that includes diverse benchmark datasets, accuracy metrics, and a new approach to estimate and validate the FDR for de novo peptide sequencing. Our FDR estimation provides valuable information to assess the significance of de novo peptides. |
---|---|
ISSN: | 1535-9476 1535-9484 |
DOI: | 10.1016/j.mcpro.2024.100849 |