A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics

Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the...

Bibliographic details
Published in:	Analytical chemistry (Washington) 2004-07, Vol.76 (14), p.4193-4201
Main authors:	Liu, Hongbin; Sadygov, Rovshan G; Yates, John R
Format:	Article
Language:	eng
Subjects:	Analytical biochemistry: general aspects, technics, instrumentation; Analytical chemistry; Analytical, structural and metabolic biochemistry; Biological and medical sciences; Chemistry; Data Collection - methods; Exact sciences and technology; Fundamental and applied biological sciences. Psychology; Models, Statistical; Proteins; Proteins - analysis; Proteomics - methods; Proteomics - statistics & numerical data; Sampling techniques; Spectrometric and optical methods; Spectrum analysis
Online access:	Full text

container_end_page	4201
container_issue	14
container_start_page	4193
container_title	Analytical chemistry (Washington)
container_volume	76
creator	Liu, Hongbin; Sadygov, Rovshan G; Yates, John R
description	Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.
doi_str_mv	10.1021/ac0498563
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_66707683</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>66707683</sourcerecordid><originalsourceid>FETCH-LOGICAL-a503t-a8ffe9feb172936c483f9e075a62920c5a3338ef42f615fdd32c1bb5992c36e53</originalsourceid><addsrcrecordid>eNqF0U1vEzEQBmALgWgoHPgDyEICqYcF2xN_7DGKCkVqS5SUK5bXa5ctu3awdxH8exwlaio4cLKtefTKM4PQS0reUcLoe2PJvFZcwCM0o5yRSijFHqMZIQQqJgk5Qc9yviOEUkLFU3RSEAchYIa-LvBVbF2PfUx4bUIbB7wxw7bvwi0uT3yex24wYxcDjh6vXV_uPx1epTi6LuBFM4XWBOtweWy-xfF2CvtiHDqbn6Mn3vTZvTicp-jLh_Ob5UV1-fnjp-XisjKcwFgZ5b2rvWuoZDUIO1fga0ckN4LVjFhuAEA5P2deUO7bFpilTcPrmlkQjsMpervP3ab4Y3J51EOXret7E1ycshZCEikU_BdSCYRJukt8_Re8i1MKpQnNqFQKFOzQ2R7ZFHNOzuttKtNKvzUlercafb-aYl8dAqdmcO1RHnZRwJsDMNma3qcy1i4_cHWJkaq4au-6PLpf93WTvmshQXJ9s9ro9bWai6sl6NUx19h8bOLfD_4B1HWuyQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>217883835</pqid></control><display><type>article</type><title>A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics</title><source>MEDLINE</source><source>ACS Publications</source><creator>Liu, Hongbin ; Sadygov, Rovshan G ; Yates, John R</creator><creatorcontrib>Liu, Hongbin ; Sadygov, Rovshan G ; Yates, John R</creatorcontrib><description>Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. 
We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.</description><identifier>ISSN: 0003-2700</identifier><identifier>EISSN: 1520-6882</identifier><identifier>DOI: 10.1021/ac0498563</identifier><identifier>PMID: 15253663</identifier><identifier>CODEN: ANCHAM</identifier><language>eng</language><publisher>Washington, DC: American Chemical Society</publisher><subject>Analytical biochemistry: general aspects, technics, instrumentation ; Analytical chemistry ; Analytical, structural and metabolic biochemistry ; Biological and medical sciences ; Chemistry ; Data Collection - methods ; Exact sciences and technology ; Fundamental and applied biological sciences. 
Psychology ; Models, Statistical ; Proteins ; Proteins - analysis ; Proteomics - methods ; Proteomics - statistics & numerical data ; Sampling techniques ; Spectrometric and optical methods ; Spectrum analysis</subject><ispartof>Analytical chemistry (Washington), 2004-07, Vol.76 (14), p.4193-4201</ispartof><rights>Copyright © 2004 American Chemical Society</rights><rights>2005 INIST-CNRS</rights><rights>Copyright American Chemical Society Jul 15, 2004</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a503t-a8ffe9feb172936c483f9e075a62920c5a3338ef42f615fdd32c1bb5992c36e53</citedby><cites>FETCH-LOGICAL-a503t-a8ffe9feb172936c483f9e075a62920c5a3338ef42f615fdd32c1bb5992c36e53</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://pubs.acs.org/doi/pdf/10.1021/ac0498563$$EPDF$$P50$$Gacs$$H</linktopdf><linktohtml>$$Uhttps://pubs.acs.org/doi/10.1021/ac0498563$$EHTML$$P50$$Gacs$$H</linktohtml><link.rule.ids>314,777,781,2752,27057,27905,27906,56719,56769</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=15956378$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/15253663$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Liu, Hongbin</creatorcontrib><creatorcontrib>Sadygov, Rovshan G</creatorcontrib><creatorcontrib>Yates, John R</creatorcontrib><title>A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics</title><title>Analytical chemistry (Washington)</title><addtitle>Anal. 
Chem</addtitle><description>Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.</description><subject>Analytical biochemistry: general aspects, technics, instrumentation</subject><subject>Analytical chemistry</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>Biological and medical sciences</subject><subject>Chemistry</subject><subject>Data Collection - methods</subject><subject>Exact sciences and technology</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>Models, Statistical</subject><subject>Proteins</subject><subject>Proteins - analysis</subject><subject>Proteomics - methods</subject><subject>Proteomics - statistics & numerical data</subject><subject>Sampling techniques</subject><subject>Spectrometric and optical methods</subject><subject>Spectrum analysis</subject><issn>0003-2700</issn><issn>1520-6882</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqF0U1vEzEQBmALgWgoHPgDyEICqYcF2xN_7DGKCkVqS5SUK5bXa5ctu3awdxH8exwlaio4cLKtefTKM4PQS0reUcLoe2PJvFZcwCM0o5yRSijFHqMZIQQqJgk5Qc9yviOEUkLFU3RSEAchYIa-LvBVbF2PfUx4bUIbB7wxw7bvwi0uT3yex24wYxcDjh6vXV_uPx1epTi6LuBFM4XWBOtweWy-xfF2CvtiHDqbn6Mn3vTZvTicp-jLh_Ob5UV1-fnjp-XisjKcwFgZ5b2rvWuoZDUIO1fga0ckN4LVjFhuAEA5P2deUO7bFpilTcPrmlkQjsMpervP3ab4Y3J51EOXret7E1ycshZCEikU_BdSCYRJukt8_Re8i1MKpQnNqFQKFOzQ2R7ZFHNOzuttKtNKvzUlercafb-aYl8dAqdmcO1RHnZRwJsDMNma3qcy1i4_cHWJkaq4au-6PLpf93WTvmshQXJ9s9ro9bWai6sl6NUx19h8bOLfD_4B1HWuyQ</recordid><startdate>20040715</startdate><enddate>20040715</enddate><creator>Liu, Hongbin</creator><creator>Sadygov, Rovshan G</creator><creator>Yates, John R</creator><general>American Chemical 
Society</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7U5</scope><scope>7U7</scope><scope>7U9</scope><scope>8BQ</scope><scope>8FD</scope><scope>C1K</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20040715</creationdate><title>A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics</title><author>Liu, Hongbin ; Sadygov, Rovshan G ; Yates, John R</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a503t-a8ffe9feb172936c483f9e075a62920c5a3338ef42f615fdd32c1bb5992c36e53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Analytical biochemistry: general aspects, technics, instrumentation</topic><topic>Analytical chemistry</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>Biological and medical sciences</topic><topic>Chemistry</topic><topic>Data Collection - methods</topic><topic>Exact sciences and technology</topic><topic>Fundamental and applied biological sciences. 
Psychology</topic><topic>Models, Statistical</topic><topic>Proteins</topic><topic>Proteins - analysis</topic><topic>Proteomics - methods</topic><topic>Proteomics - statistics & numerical data</topic><topic>Sampling techniques</topic><topic>Spectrometric and optical methods</topic><topic>Spectrum analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Liu, Hongbin</creatorcontrib><creatorcontrib>Sadygov, Rovshan G</creatorcontrib><creatorcontrib>Yates, John R</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research 
Abstracts</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Analytical chemistry (Washington)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Liu, Hongbin</au><au>Sadygov, Rovshan G</au><au>Yates, John R</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics</atitle><jtitle>Analytical chemistry (Washington)</jtitle><addtitle>Anal. Chem</addtitle><date>2004-07-15</date><risdate>2004</risdate><volume>76</volume><issue>14</issue><spage>4193</spage><epage>4201</epage><pages>4193-4201</pages><issn>0003-2700</issn><eissn>1520-6882</eissn><coden>ANCHAM</coden><abstract>Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. 
The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.</abstract><cop>Washington, DC</cop><pub>American Chemical Society</pub><pmid>15253663</pmid><doi>10.1021/ac0498563</doi><tpages>9</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0003-2700
ispartof	Analytical chemistry (Washington), 2004-07, Vol.76 (14), p.4193-4201
issn	0003-2700 1520-6882
language	eng
recordid	cdi_proquest_miscellaneous_66707683
source	MEDLINE; ACS Publications
subjects	Analytical biochemistry: general aspects, technics, instrumentation; Analytical chemistry; Analytical, structural and metabolic biochemistry; Biological and medical sciences; Chemistry; Data Collection - methods; Exact sciences and technology; Fundamental and applied biological sciences. Psychology; Models, Statistical; Proteins; Proteins - analysis; Proteomics - methods; Proteomics - statistics & numerical data; Sampling techniques; Spectrometric and optical methods; Spectrum analysis
title	A Model for Random Sampling and Estimation of Relative Protein Abundance in Shotgun Proteomics
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A11%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Model%20for%20Random%20Sampling%20and%20Estimation%20of%20Relative%20Protein%20Abundance%20in%20Shotgun%20Proteomics&rft.jtitle=Analytical%20chemistry%20(Washington)&rft.au=Liu,%20Hongbin&rft.date=2004-07-15&rft.volume=76&rft.issue=14&rft.spage=4193&rft.epage=4201&rft.pages=4193-4201&rft.issn=0003-2700&rft.eissn=1520-6882&rft.coden=ANCHAM&rft_id=info:doi/10.1021/ac0498563&rft_dat=%3Cproquest_cross%3E66707683%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=217883835&rft_id=info:pmid/15253663&rfr_iscdi=true