Qscore: an algorithm for evaluating SEQUEST database search results

A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. The scoring system is essentially probabilistic and operates by estimating the probability that a protein ide...

Bibliographic details
Published in: Journal of the American Society for Mass Spectrometry 2002-04, Vol.13 (4), p.378-386
Main authors: Moore, Roger E., Young, Mary K., Lee, Terry D.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 386
container_issue 4
container_start_page 378
container_title Journal of the American Society for Mass Spectrometry
container_volume 13
creator Moore, Roger E.
Young, Mary K.
Lee, Terry D.
description A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. The scoring system is essentially probabilistic and operates by estimating the probability that a protein identification has come about by chance. The probability is based on the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides from the database that are present in the identified protein. The score is not strictly a probability, as it also incorporates information about the quality of the individual peptide matches. The result of using Qscore on a large test set of data was similar to that achieved using approaches that validate individual spectral matches, with only a narrow overlap in scores between identified proteins and false positive matches. In direct comparison with a published method of evaluating Sequest results, Qscore was able to identify an equivalent number of proteins without any identifiable false positive assignments. Qscore greatly reduces the number of Sequest protein identifications that have to be validated manually.
doi_str_mv 10.1016/S1044-0305(02)00352-5
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_71604467</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1044030502003525</els_id><sourcerecordid>71604467</sourcerecordid><originalsourceid>FETCH-LOGICAL-c485t-887dad4ade6b9cc745ece3756cd5b41af1ef6f63bbc162f7a6d099a6dd3c26ee3</originalsourceid><addsrcrecordid>eNqFkE1LxDAQhoMofqz-BKUgih6qSZuPjReRZf0AQWT1HKbJVCPdVpNW8N8b3RXBi5eZOTwzzPsQssvoCaNMns4Y5TynJRVHtDimtBRFLlbIJhsrnTNWlKtp_kE2yFaML5QyRbVaJxuMacG0kptkch9tF_AsgzaD5qkLvn-eZ3UXMnyHZoDet0_ZbHr_OJ09ZA56qCBiFhGCfc4CxqHp4zZZq6GJuLPsI_J4OX2YXOe3d1c3k4vb3PKx6PPxWDlwHBzKSluruECLpRLSOlFxBjXDWtayrCrLZFErkI5qnaorbSERyxE5XNx9Dd3bgLE3cx8tNg202A3RKCZTYKkSuP8HfOmG0KbfTApecC54EjYiYkHZ0MUYsDavwc8hfBhGzZdj8-3YfAk0tDDfjo1Ie3vL60M1R_e7tZSagIMlANFCUwdorY-_XCm0FFQn7nzBYZL27jGYaD22Fp0PaHvjOv_PK58YDpem</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1952445400</pqid></control><display><type>article</type><title>Qscore: an algorithm for evaluating SEQUEST database search results</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><source>SpringerLink Journals - AutoHoldings</source><creator>Moore, Roger E. ; Young, Mary K. ; Lee, Terry D.</creator><creatorcontrib>Moore, Roger E. ; Young, Mary K. ; Lee, Terry D.</creatorcontrib><description>A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. The scoring system is essentially probabilistic and operates by estimating the probability that a protein identification has come about by chance. 
The probability is based on the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides from the database that are present in the identified protein. The score is not strictly a probability, as it also incorporates information about the quality of the individual peptide matches. The result of using Qscore on a large test set of data was similar to that achieved using approaches that validate individual spectral matches, with only a narrow overlap in scores between identified proteins and false positive matches. In direct comparison with a published method of evaluating Sequest results, Qscore was able to identify an equivalent number of proteins without any identifiable false positive assignments. Qscore greatly reduces the number of Sequest protein identifications that have to be validated manually.</description><identifier>ISSN: 1044-0305</identifier><identifier>EISSN: 1879-1123</identifier><identifier>DOI: 10.1016/S1044-0305(02)00352-5</identifier><identifier>PMID: 11951976</identifier><language>eng</language><publisher>New York, NY: Elsevier Inc</publisher><subject>Algorithms ; Amino Acid Sequence ; Analytical, structural and metabolic biochemistry ; Biological and medical sciences ; Chromatography, High Pressure Liquid ; Databases, Factual ; Fundamental and applied biological sciences. 
Psychology ; General aspects, investigation methods ; Mass Spectrometry ; Molecular Sequence Data ; Peptides ; Peptides - chemistry ; Proteins ; Proteins - chemistry ; Software ; Spectrometry, Mass, Electrospray Ionization ; Statistical analysis</subject><ispartof>Journal of the American Society for Mass Spectrometry, 2002-04, Vol.13 (4), p.378-386</ispartof><rights>2002 American Society for Mass Spectrometry</rights><rights>2002 INIST-CNRS</rights><rights>American Society for Mass Spectrometry 2002</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c485t-887dad4ade6b9cc745ece3756cd5b41af1ef6f63bbc162f7a6d099a6dd3c26ee3</citedby><cites>FETCH-LOGICAL-c485t-887dad4ade6b9cc745ece3756cd5b41af1ef6f63bbc162f7a6d099a6dd3c26ee3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=13596509$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/11951976$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Moore, Roger E.</creatorcontrib><creatorcontrib>Young, Mary K.</creatorcontrib><creatorcontrib>Lee, Terry D.</creatorcontrib><title>Qscore: an algorithm for evaluating SEQUEST database search results</title><title>Journal of the American Society for Mass Spectrometry</title><addtitle>J Am Soc Mass Spectrom</addtitle><description>A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. 
The scoring system is essentially probabilistic and operates by estimating the probability that a protein identification has come about by chance. The probability is based on the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides from the database that are present in the identified protein. The score is not strictly a probability, as it also incorporates information about the quality of the individual peptide matches. The result of using Qscore on a large test set of data was similar to that achieved using approaches that validate individual spectral matches, with only a narrow overlap in scores between identified proteins and false positive matches. In direct comparison with a published method of evaluating Sequest results, Qscore was able to identify an equivalent number of proteins without any identifiable false positive assignments. Qscore greatly reduces the number of Sequest protein identifications that have to be validated manually.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>Biological and medical sciences</subject><subject>Chromatography, High Pressure Liquid</subject><subject>Databases, Factual</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>General aspects, investigation methods</subject><subject>Mass Spectrometry</subject><subject>Molecular Sequence Data</subject><subject>Peptides</subject><subject>Peptides - chemistry</subject><subject>Proteins</subject><subject>Proteins - chemistry</subject><subject>Software</subject><subject>Spectrometry, Mass, Electrospray Ionization</subject><subject>Statistical analysis</subject><issn>1044-0305</issn><issn>1879-1123</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqFkE1LxDAQhoMofqz-BKUgih6qSZuPjReRZf0AQWT1HKbJVCPdVpNW8N8b3RXBi5eZOTwzzPsQssvoCaNMns4Y5TynJRVHtDimtBRFLlbIJhsrnTNWlKtp_kE2yFaML5QyRbVaJxuMacG0kptkch9tF_AsgzaD5qkLvn-eZ3UXMnyHZoDet0_ZbHr_OJ09ZA56qCBiFhGCfc4CxqHp4zZZq6GJuLPsI_J4OX2YXOe3d1c3k4vb3PKx6PPxWDlwHBzKSluruECLpRLSOlFxBjXDWtayrCrLZFErkI5qnaorbSERyxE5XNx9Dd3bgLE3cx8tNg202A3RKCZTYKkSuP8HfOmG0KbfTApecC54EjYiYkHZ0MUYsDavwc8hfBhGzZdj8-3YfAk0tDDfjo1Ie3vL60M1R_e7tZSagIMlANFCUwdorY-_XCm0FFQn7nzBYZL27jGYaD22Fp0PaHvjOv_PK58YDpem</recordid><startdate>20020401</startdate><enddate>20020401</enddate><creator>Moore, Roger E.</creator><creator>Young, Mary K.</creator><creator>Lee, Terry D.</creator><general>Elsevier Inc</general><general>Elsevier Science</general><general>Springer Nature 
B.V</general><scope>6I.</scope><scope>AAFTH</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FE</scope><scope>8FG</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope></search><sort><creationdate>20020401</creationdate><title>Qscore: an algorithm for evaluating SEQUEST database search results</title><author>Moore, Roger E. ; Young, Mary K. ; Lee, Terry D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c485t-887dad4ade6b9cc745ece3756cd5b41af1ef6f63bbc162f7a6d099a6dd3c26ee3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>Biological and medical sciences</topic><topic>Chromatography, High Pressure Liquid</topic><topic>Databases, Factual</topic><topic>Fundamental and applied biological sciences. 
Psychology</topic><topic>General aspects, investigation methods</topic><topic>Mass Spectrometry</topic><topic>Molecular Sequence Data</topic><topic>Peptides</topic><topic>Peptides - chemistry</topic><topic>Proteins</topic><topic>Proteins - chemistry</topic><topic>Software</topic><topic>Spectrometry, Mass, Electrospray Ionization</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Moore, Roger E.</creatorcontrib><creatorcontrib>Young, Mary K.</creatorcontrib><creatorcontrib>Lee, Terry D.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Research Library</collection><collection>Research Library (Corporate)</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of the American Society for Mass Spectrometry</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Moore, Roger E.</au><au>Young, Mary K.</au><au>Lee, Terry D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Qscore: an algorithm for evaluating SEQUEST database search results</atitle><jtitle>Journal of the American Society for Mass Spectrometry</jtitle><addtitle>J Am Soc Mass Spectrom</addtitle><date>2002-04-01</date><risdate>2002</risdate><volume>13</volume><issue>4</issue><spage>378</spage><epage>386</epage><pages>378-386</pages><issn>1044-0305</issn><eissn>1879-1123</eissn><abstract>A scoring procedure is described for measuring the quality of the results for protein identifications obtained from spectral matching of MS/MS data using the Sequest database search program. 
The scoring system is essentially probabilistic and operates by estimating the probability that a protein identification has come about by chance. The probability is based on the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides from the database that are present in the identified protein. The score is not strictly a probability, as it also incorporates information about the quality of the individual peptide matches. The result of using Qscore on a large test set of data was similar to that achieved using approaches that validate individual spectral matches, with only a narrow overlap in scores between identified proteins and false positive matches. In direct comparison with a published method of evaluating Sequest results, Qscore was able to identify an equivalent number of proteins without any identifiable false positive assignments. Qscore greatly reduces the number of Sequest protein identifications that have to be validated manually.</abstract><cop>New York, NY</cop><pub>Elsevier Inc</pub><pmid>11951976</pmid><doi>10.1016/S1044-0305(02)00352-5</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1044-0305
ispartof Journal of the American Society for Mass Spectrometry, 2002-04, Vol.13 (4), p.378-386
issn 1044-0305
1879-1123
language eng
recordid cdi_proquest_miscellaneous_71604467
source MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry; SpringerLink Journals - AutoHoldings
subjects Algorithms
Amino Acid Sequence
Analytical, structural and metabolic biochemistry
Biological and medical sciences
Chromatography, High Pressure Liquid
Databases, Factual
Fundamental and applied biological sciences. Psychology
General aspects, investigation methods
Mass Spectrometry
Molecular Sequence Data
Peptides
Peptides - chemistry
Proteins
Proteins - chemistry
Software
Spectrometry, Mass, Electrospray Ionization
Statistical analysis
title Qscore: an algorithm for evaluating SEQUEST database search results
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T05%3A57%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Qscore:%20an%20algorithm%20for%20evaluating%20SEQUEST%20database%20search%20results&rft.jtitle=Journal%20of%20the%20American%20Society%20for%20Mass%20Spectrometry&rft.au=Moore,%20Roger%20E.&rft.date=2002-04-01&rft.volume=13&rft.issue=4&rft.spage=378&rft.epage=386&rft.pages=378-386&rft.issn=1044-0305&rft.eissn=1879-1123&rft_id=info:doi/10.1016/S1044-0305(02)00352-5&rft_dat=%3Cproquest_cross%3E71604467%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1952445400&rft_id=info:pmid/11951976&rft_els_id=S1044030502003525&rfr_iscdi=true
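
The description field in this record says the score starts from the probability that a protein identification arose by chance, computed from the number of identified peptides from the protein, the total number of identified peptides, and the fraction of distinct tryptic peptides in the database that belong to the protein. The record does not give the formula, so the following is only a minimal sketch that assumes a binomial upper-tail model. The function name `chance_probability` and its parameter names are hypothetical, and the peptide-quality weighting that the abstract says is folded into the final Qscore is not reproduced here.

```python
from math import comb


def chance_probability(k_protein: int, n_total: int, frac_db: float) -> float:
    """Probability of seeing at least k_protein peptides from one protein by chance.

    ASSUMPTION: a binomial model, which this catalog record does not confirm.
    k_protein -- number of identified peptides assigned to the protein
    n_total   -- total number of identified peptides in the search
    frac_db   -- fraction of distinct tryptic peptides in the database
                 that come from this protein
    """
    if not 0.0 <= frac_db <= 1.0:
        raise ValueError("frac_db must lie in [0, 1]")
    if not 0 <= k_protein <= n_total:
        raise ValueError("k_protein must lie in [0, n_total]")
    # Upper tail of the binomial distribution: P(X >= k_protein).
    return sum(
        comb(n_total, i) * frac_db**i * (1.0 - frac_db) ** (n_total - i)
        for i in range(k_protein, n_total + 1)
    )


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    p = chance_probability(k_protein=3, n_total=200, frac_db=0.0005)
    print(f"chance probability: {p:.3e}")
```

Under this assumed model, a protein that contributes several peptides while accounting for only a tiny fraction of the database gets a very small chance probability, which matches the qualitative behaviour the abstract describes.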