A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics
Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the...
Published in: | Analytical chemistry (Washington) 2004-07, Vol.76 (14), p.4193-4201 |
---|---|
Main authors: | Liu, Hongbin; Sadygov, Rovshan G; Yates, John R |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 4201 |
---|---|
container_issue | 14 |
container_start_page | 4193 |
container_title | Analytical chemistry (Washington) |
container_volume | 76 |
creator | Liu, Hongbin; Sadygov, Rovshan G; Yates, John R |
description | Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein. |
doi_str_mv | 10.1021/ac0498563 |
format | Article |
pmid | 15253663 |
coden | ANCHAM |
publisher | Washington, DC: American Chemical Society |
rights | Copyright © 2004 American Chemical Society |
date | 2004-07-15 |
fulltext | fulltext |
identifier | ISSN: 0003-2700 |
ispartof | Analytical chemistry (Washington), 2004-07, Vol.76 (14), p.4193-4201 |
issn | 0003-2700; 1520-6882 |
language | eng |
recordid | cdi_proquest_miscellaneous_66707683 |
source | MEDLINE; ACS Publications |
subjects | Analytical biochemistry: general aspects, technics, instrumentation; Analytical chemistry; Analytical, structural and metabolic biochemistry; Biological and medical sciences; Chemistry; Data Collection - methods; Exact sciences and technology; Fundamental and applied biological sciences. Psychology; Models, Statistical; Proteins; Proteins - analysis; Proteomics - methods; Proteomics - statistics & numerical data; Sampling techniques; Spectrometric and optical methods; Spectrum analysis |
title | A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A11%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Model%20for%20Random%20Sampling%20and%20Estimation%20of%20Relative%20Protein%20Abundance%20in%20Shotgun%20Proteomics&rft.jtitle=Analytical%20chemistry%20(Washington)&rft.au=Liu,%20Hongbin&rft.date=2004-07-15&rft.volume=76&rft.issue=14&rft.spage=4193&rft.epage=4201&rft.pages=4193-4201&rft.issn=0003-2700&rft.eissn=1520-6882&rft.coden=ANCHAM&rft_id=info:doi/10.1021/ac0498563&rft_dat=%3Cproquest_cross%3E66707683%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=217883835&rft_id=info:pmid/15253663&rfr_iscdi=true |
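
The abstract describes two quantities: the number of repeated analyses needed to reach a given saturation level, and relative abundance estimated from the number of spectra acquired per protein (spectral sampling). The record does not give the authors' statistical model, so the following is only a minimal illustrative sketch under stated assumptions. It assumes each analysis independently identifies a given protein with the same probability `p_single` (a simple geometric model, not the model from the paper), and it normalizes spectral counts to fractions of the total. The function names, the parameter `p_single`, and the example counts are hypothetical.

```python
import math


def runs_for_saturation(p_single: float, target: float = 0.95) -> int:
    """Smallest number of runs n with 1 - (1 - p_single)**n >= target.

    ASSUMPTION: every run is an independent trial with the same
    per-run identification probability. This is an illustrative
    geometric model, not the model described in the paper.
    """
    if not 0.0 < p_single < 1.0:
        raise ValueError("p_single must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


def relative_abundance(spectral_counts: dict[str, int]) -> dict[str, float]:
    """Normalize per-protein spectral counts to fractions of all spectra.

    ASSUMPTION: the count of spectra is taken as directly proportional
    to relative abundance, with no correction for protein length.
    """
    total = sum(spectral_counts.values())
    if total == 0:
        raise ValueError("no spectra to normalize")
    return {name: count / total for name, count in spectral_counts.items()}


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    print(runs_for_saturation(0.26))
    print(relative_abundance({"protein_A": 30, "protein_B": 10}))
```

Under this simplified model, a per-run identification probability of about 0.26 would correspond to roughly 10 analyses for 95% saturation, the figure quoted in the abstract; that probability is back-calculated here for illustration and is not reported in the record.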