Cleaved and Missed Sites for Trypsin, Lys-C, and Lys-N Can Be Predicted with High Confidence on the Basis of Sequence Context


Bibliographic Details
Published in: Journal of Proteome Research, 2014-02, Vol. 13 (2), p. 702-709
Main Author: Gershon, Paul D
Format: Article
Language: English
Description
Summary: Trypsin, Lys-C, and Lys-N are the most broadly used enzymes in proteomics. Here, on the basis of large-scale peptide mass spectrometry (MS) data sets, an approach is described to confidently identify missed cleavage sites in either phosphorylated or unmodified substrates for these three proteases, or any protease, on the basis of side chain species present within 15 residues of the cleavage-specificity residue. Previously known effects of proline, negatively charged side chains, and phospho-modified residues have been quantified, and additional side chain effects were noted. By applying a set of quantitative side chain rules established for each of the three proteases, scissile and nonscissile sites could be established, on the basis of protein sequence alone, with near certainty for Lys-C, and with a high degree of confidence for trypsin or Lys-N. These rules were applicable to orthogonal peptide data sets, including the two largest in the PeptideAtlas database. The approach described here facilitates the comprehensive modeling of substrate recognition in proteolysis.
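
The summary describes judging each candidate site from the side chains within 15 residues of the cleavage-specificity residue. The Python sketch below illustrates only that first step: locating the specificity residues for each protease and collecting their flanking windows. It is not the article's method. The quantitative side chain rules are not given in this record, so none are applied, and the names `SPECIFICITY`, `Site`, `candidate_sites`, and `side_chain_counts` are hypothetical. The only biochemical facts assumed are the standard specificities: trypsin cuts C-terminal to K or R, Lys-C cuts C-terminal to K, and Lys-N cuts N-terminal to K.

```python
from collections import Counter
from typing import Dict, List, NamedTuple

# Specificity residues and the side of the residue on which each protease cuts.
# These are the standard textbook specificities, not rules taken from the article.
SPECIFICITY = {
    "trypsin": ("KR", "C"),  # cuts C-terminal to K or R
    "lys-c": ("K", "C"),     # cuts C-terminal to K
    "lys-n": ("K", "N"),     # cuts N-terminal to K
}


class Site(NamedTuple):
    position: int    # 0-based index of the specificity residue
    residue: str     # the specificity residue itself (K or R)
    upstream: str    # up to `flank` residues on the N-terminal side
    downstream: str  # up to `flank` residues on the C-terminal side


def candidate_sites(sequence: str, protease: str, flank: int = 15) -> List[Site]:
    """List every specificity residue together with its flanking window."""
    residues, _cut_side = SPECIFICITY[protease.lower()]
    sites = []
    for i, aa in enumerate(sequence):
        if aa in residues:
            upstream = sequence[max(0, i - flank):i]
            downstream = sequence[i + 1:i + 1 + flank]
            sites.append(Site(i, aa, upstream, downstream))
    return sites


def side_chain_counts(site: Site) -> Dict[str, int]:
    """Count proline (P) and acidic (D, E) residues in the window.

    Illustrative features only: the summary names these side chains as
    influential but does not state how they are weighted.
    """
    window = site.upstream + site.downstream
    return dict(Counter(aa for aa in window if aa in "PDE"))


if __name__ == "__main__":
    for site in candidate_sites("AKPDERGK", "trypsin"):
        print(site, side_chain_counts(site))
```

A scoring step that classifies each site as scissile or nonscissile would consume these windows, but it would need the per-protease rules from the full article.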
ISSN: 1535-3893, 1535-3907
DOI: 10.1021/pr400802z