pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization

MS-based de novo peptide sequencing has improved remarkably with significant developments in mass spectrometry and computational approaches but still lacks quality-control methods. Here we propose a novel algorithm, pSite, to evaluate the confidence of each amino acid rather than the full-length...

Bibliographic details
Published in: Journal of proteome research 2018-01, Vol.17 (1), p.119-128
Main authors: Yang, Hao; Chi, Hao; Zhou, Wen-Jing; Zeng, Wen-Feng; Liu, Chao; Wang, Rui-Min; Wang, Zhao-Wei; Niu, Xiu-Nan; Chen, Zhen-Lin; He, Si-Min
Format: Article
Language: eng
Online access: Full text
container_end_page 128
container_issue 1
container_start_page 119
container_title Journal of proteome research
container_volume 17
creator Yang, Hao
Chi, Hao
Zhou, Wen-Jing
Zeng, Wen-Feng
Liu, Chao
Wang, Rui-Min
Wang, Zhao-Wei
Niu, Xiu-Nan
Chen, Zhen-Lin
He, Si-Min
description MS-based de novo peptide sequencing has improved remarkably with significant developments in mass spectrometry and computational approaches but still lacks quality-control methods. Here we propose a novel algorithm, pSite, to evaluate the confidence of each amino acid rather than the full-length peptides obtained by de novo peptide sequencing. A semi-supervised learning approach was used to discriminate correct amino acids from random ones; then an expectation-maximization algorithm was used to adaptively control the false amino-acid rate (FAR). On three test data sets, pSite recalled 86% more amino acids on average than PEAKS at a FAR of 5%. pSite also showed superior performance on the modification site localization problem, which is essentially a special case of amino acid confidence evaluation. On three phosphopeptide data sets, at a false localization rate of 1%, the average recall of pSite was 91%, while those of Ascore and phosphoRS were 64% and 63%, respectively. pSite covered 98% of the Ascore and phosphoRS results and contributed 21% more phosphorylation sites. Further analyses show that the use of distinct fragmentation features in high-resolution MS/MS spectra, such as neutral loss ions, played an important role in improving the precision of pSite. In summary, the effective and universal model, together with the extensive use of spectral information, makes pSite an excellent quality control tool for both de novo peptide sequencing and modification site localization.
doi_str_mv 10.1021/acs.jproteome.7b00428
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1963477471</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1963477471</sourcerecordid><originalsourceid>FETCH-LOGICAL-a351t-f549ef47bef0b7429f49597459ddd42b91f20bede033154cd56a2764a5c507333</originalsourceid><addsrcrecordid>eNqFkMtOAyEUhonRaK0-goalm1YYoIi7ptZLUm9R1xMGDoZmZqjDjEmNDy-96NYVhPzffw4fQieUDCnJ6Lk2cThfNKGFUMFQFoTw7GIH9ahgYsAUkbu_9wvFDtBhjHNCqJCE7aODTFFGGCE99L148S1c4nHl64DHxls8CbXzFmoDePqpy063PtTYhQY_d7r07XKVaJtQ4uDwFeCH8BnwEyzaBOEX-OgS6ut3rGuL74P1zptNxWoSngWTSr7WL0doz-kywvH27KO36-nr5HYwe7y5m4xnA80EbQdOcAWOywIcKSTPlONKKMmFstbyrFDUZaQAC4QxKrixYqQzOeJaGEEkY6yPzja9yVdaL7Z55aOBstQ1hC7mVI0Yl5JLmqJiEzVNiLEBly8aX-lmmVOSr8TnSXz-Jz7fik_c6XZEV1Rg_6hf0ylAN4E1H7qmTj_-p_QHHaaUgw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1963477471</pqid></control><display><type>article</type><title>pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization</title><source>ACS Publications</source><creator>Yang, Hao ; Chi, Hao ; Zhou, Wen-Jing ; Zeng, Wen-Feng ; Liu, Chao ; Wang, Rui-Min ; Wang, Zhao-Wei ; Niu, Xiu-Nan ; Chen, Zhen-Lin ; He, Si-Min</creator><creatorcontrib>Yang, Hao ; Chi, Hao ; Zhou, Wen-Jing ; Zeng, Wen-Feng ; Liu, Chao ; Wang, Rui-Min ; Wang, Zhao-Wei ; Niu, Xiu-Nan ; Chen, Zhen-Lin ; He, Si-Min</creatorcontrib><description>MS-based de novo peptide sequencing has been improved remarkably with significant development of mass-spectrometry and computational approaches but still lacks quality-control methods. Here we proposed a novel algorithm pSite to evaluate the confidence of each amino acid rather than the full-length peptides obtained by de novo peptide sequencing. 
A semi-supervised learning approach was used to discriminate correct amino acids from random one; then, an expectation-maximization algorithm was used to adaptively control the false amino-acid rate (FAR). On three test data sets, pSite recalled 86% more amino acids on average than PEAKS at the FAR of 5%. pSite also performed superiorly on the modification site localization problem, which is essentially a special case of amino acid confidence evaluation. On three phosphopeptide data sets, at the false localization rate of 1%, the average recall of pSite was 91% while those of Ascore and phosphoRS were 64 and 63%, respectively. pSite covered 98% of Ascore and phosphoRS results and contributed 21% more phosphorylation sites. Further analyses show that the use of distinct fragmentation features in high-resolution MS/MS spectra, such as neutral loss ions, played an important role in improving the precision of pSite. In summary, the effective and universal model together with the extensive use of spectral information makes pSite an excellent quality control tool for both de novo peptide sequencing and modification site localization.</description><identifier>ISSN: 1535-3893</identifier><identifier>EISSN: 1535-3907</identifier><identifier>DOI: 10.1021/acs.jproteome.7b00428</identifier><identifier>PMID: 29130300</identifier><language>eng</language><publisher>United States: American Chemical Society</publisher><ispartof>Journal of proteome research, 2018-01, Vol.17 (1), p.119-128</ispartof><rights>Copyright © 2017 American Chemical Society</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a351t-f549ef47bef0b7429f49597459ddd42b91f20bede033154cd56a2764a5c507333</citedby><cites>FETCH-LOGICAL-a351t-f549ef47bef0b7429f49597459ddd42b91f20bede033154cd56a2764a5c507333</cites><orcidid>0000-0003-4325-2147 ; 
0000-0002-1277-2628</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/acs.jproteome.7b00428$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/acs.jproteome.7b00428$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>314,777,781,2752,27057,27905,27906,56719,56769</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29130300$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Yang, Hao</creatorcontrib><creatorcontrib>Chi, Hao</creatorcontrib><creatorcontrib>Zhou, Wen-Jing</creatorcontrib><creatorcontrib>Zeng, Wen-Feng</creatorcontrib><creatorcontrib>Liu, Chao</creatorcontrib><creatorcontrib>Wang, Rui-Min</creatorcontrib><creatorcontrib>Wang, Zhao-Wei</creatorcontrib><creatorcontrib>Niu, Xiu-Nan</creatorcontrib><creatorcontrib>Chen, Zhen-Lin</creatorcontrib><creatorcontrib>He, Si-Min</creatorcontrib><title>pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization</title><title>Journal of proteome research</title><addtitle>J. Proteome Res</addtitle><description>MS-based de novo peptide sequencing has been improved remarkably with significant development of mass-spectrometry and computational approaches but still lacks quality-control methods. Here we proposed a novel algorithm pSite to evaluate the confidence of each amino acid rather than the full-length peptides obtained by de novo peptide sequencing. A semi-supervised learning approach was used to discriminate correct amino acids from random one; then, an expectation-maximization algorithm was used to adaptively control the false amino-acid rate (FAR). On three test data sets, pSite recalled 86% more amino acids on average than PEAKS at the FAR of 5%. 
pSite also performed superiorly on the modification site localization problem, which is essentially a special case of amino acid confidence evaluation. On three phosphopeptide data sets, at the false localization rate of 1%, the average recall of pSite was 91% while those of Ascore and phosphoRS were 64 and 63%, respectively. pSite covered 98% of Ascore and phosphoRS results and contributed 21% more phosphorylation sites. Further analyses show that the use of distinct fragmentation features in high-resolution MS/MS spectra, such as neutral loss ions, played an important role in improving the precision of pSite. In summary, the effective and universal model together with the extensive use of spectral information makes pSite an excellent quality control tool for both de novo peptide sequencing and modification site localization.</description><issn>1535-3893</issn><issn>1535-3907</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNqFkMtOAyEUhonRaK0-goalm1YYoIi7ptZLUm9R1xMGDoZmZqjDjEmNDy-96NYVhPzffw4fQieUDCnJ6Lk2cThfNKGFUMFQFoTw7GIH9ahgYsAUkbu_9wvFDtBhjHNCqJCE7aODTFFGGCE99L148S1c4nHl64DHxls8CbXzFmoDePqpy063PtTYhQY_d7r07XKVaJtQ4uDwFeCH8BnwEyzaBOEX-OgS6ut3rGuL74P1zptNxWoSngWTSr7WL0doz-kywvH27KO36-nr5HYwe7y5m4xnA80EbQdOcAWOywIcKSTPlONKKMmFstbyrFDUZaQAC4QxKrixYqQzOeJaGEEkY6yPzja9yVdaL7Z55aOBstQ1hC7mVI0Yl5JLmqJiEzVNiLEBly8aX-lmmVOSr8TnSXz-Jz7fik_c6XZEV1Rg_6hf0ylAN4E1H7qmTj_-p_QHHaaUgw</recordid><startdate>20180105</startdate><enddate>20180105</enddate><creator>Yang, Hao</creator><creator>Chi, Hao</creator><creator>Zhou, Wen-Jing</creator><creator>Zeng, Wen-Feng</creator><creator>Liu, Chao</creator><creator>Wang, Rui-Min</creator><creator>Wang, Zhao-Wei</creator><creator>Niu, Xiu-Nan</creator><creator>Chen, Zhen-Lin</creator><creator>He, Si-Min</creator><general>American Chemical 
Society</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-4325-2147</orcidid><orcidid>https://orcid.org/0000-0002-1277-2628</orcidid></search><sort><creationdate>20180105</creationdate><title>pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization</title><author>Yang, Hao ; Chi, Hao ; Zhou, Wen-Jing ; Zeng, Wen-Feng ; Liu, Chao ; Wang, Rui-Min ; Wang, Zhao-Wei ; Niu, Xiu-Nan ; Chen, Zhen-Lin ; He, Si-Min</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a351t-f549ef47bef0b7429f49597459ddd42b91f20bede033154cd56a2764a5c507333</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Hao</creatorcontrib><creatorcontrib>Chi, Hao</creatorcontrib><creatorcontrib>Zhou, Wen-Jing</creatorcontrib><creatorcontrib>Zeng, Wen-Feng</creatorcontrib><creatorcontrib>Liu, Chao</creatorcontrib><creatorcontrib>Wang, Rui-Min</creatorcontrib><creatorcontrib>Wang, Zhao-Wei</creatorcontrib><creatorcontrib>Niu, Xiu-Nan</creatorcontrib><creatorcontrib>Chen, Zhen-Lin</creatorcontrib><creatorcontrib>He, Si-Min</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of proteome research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yang, Hao</au><au>Chi, Hao</au><au>Zhou, Wen-Jing</au><au>Zeng, Wen-Feng</au><au>Liu, Chao</au><au>Wang, Rui-Min</au><au>Wang, Zhao-Wei</au><au>Niu, Xiu-Nan</au><au>Chen, Zhen-Lin</au><au>He, Si-Min</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and 
Modification Site Localization</atitle><jtitle>Journal of proteome research</jtitle><addtitle>J. Proteome Res</addtitle><date>2018-01-05</date><risdate>2018</risdate><volume>17</volume><issue>1</issue><spage>119</spage><epage>128</epage><pages>119-128</pages><issn>1535-3893</issn><eissn>1535-3907</eissn><abstract>MS-based de novo peptide sequencing has been improved remarkably with significant development of mass-spectrometry and computational approaches but still lacks quality-control methods. Here we proposed a novel algorithm pSite to evaluate the confidence of each amino acid rather than the full-length peptides obtained by de novo peptide sequencing. A semi-supervised learning approach was used to discriminate correct amino acids from random one; then, an expectation-maximization algorithm was used to adaptively control the false amino-acid rate (FAR). On three test data sets, pSite recalled 86% more amino acids on average than PEAKS at the FAR of 5%. pSite also performed superiorly on the modification site localization problem, which is essentially a special case of amino acid confidence evaluation. On three phosphopeptide data sets, at the false localization rate of 1%, the average recall of pSite was 91% while those of Ascore and phosphoRS were 64 and 63%, respectively. pSite covered 98% of Ascore and phosphoRS results and contributed 21% more phosphorylation sites. Further analyses show that the use of distinct fragmentation features in high-resolution MS/MS spectra, such as neutral loss ions, played an important role in improving the precision of pSite. 
In summary, the effective and universal model together with the extensive use of spectral information makes pSite an excellent quality control tool for both de novo peptide sequencing and modification site localization.</abstract><cop>United States</cop><pub>American Chemical Society</pub><pmid>29130300</pmid><doi>10.1021/acs.jproteome.7b00428</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0003-4325-2147</orcidid><orcidid>https://orcid.org/0000-0002-1277-2628</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 1535-3893
ispartof Journal of proteome research, 2018-01, Vol.17 (1), p.119-128
issn 1535-3893
1535-3907
language eng
recordid cdi_proquest_miscellaneous_1963477471
source ACS Publications
title pSite: Amino Acid Confidence Evaluation for Quality Control of De Novo Peptide Sequencing and Modification Site Localization
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T10%3A09%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=pSite:%20Amino%20Acid%20Confidence%20Evaluation%20for%20Quality%20Control%20of%20De%20Novo%20Peptide%20Sequencing%20and%20Modification%20Site%20Localization&rft.jtitle=Journal%20of%20proteome%20research&rft.au=Yang,%20Hao&rft.date=2018-01-05&rft.volume=17&rft.issue=1&rft.spage=119&rft.epage=128&rft.pages=119-128&rft.issn=1535-3893&rft.eissn=1535-3907&rft_id=info:doi/10.1021/acs.jproteome.7b00428&rft_dat=%3Cproquest_cross%3E1963477471%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1963477471&rft_id=info:pmid/29130300&rfr_iscdi=true