Implementation and application of a versatile clustering tool for tandem mass spectrometry data
High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss...
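Published in: | Proteomics (Weinheim) 2007-09, Vol.7 (18), p.3245-3258 |
---|---|
Main authors: | Flikka, Kristian; Meukens, Jeroen; Helsens, Kenny; Vandekerckhove, Joël; Eidhammer, Ingvar; Gevaert, Kris; Martens, Lennart |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |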
container_end_page | 3258 |
---|---|
container_issue | 18 |
container_start_page | 3245 |
container_title | Proteomics (Weinheim) |
container_volume | 7 |
creator | Flikka, Kristian; Meukens, Jeroen; Helsens, Kenny; Vandekerckhove, Joël; Eidhammer, Ingvar; Gevaert, Kris; Martens, Lennart |
description | High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach. |
doi_str_mv | 10.1002/pmic.200700160 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_68283049</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>68283049</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4350-a997e3948006063174b2289ed2a78214dc11673282161f7c95d1dc7282361a953</originalsourceid><addsrcrecordid>eNqFkE1v1DAQhiNERT_gyhF8obdsx3bijyPa0mXVFpCgQuJiTR2nCiRxsL3A_nu8ymrLjdPMSM87M3qK4iWFBQVgF9PQ2QUDkABUwJPihApal1oJ-vTQ1_y4OI3xe0ak0vJZcUylBFVrflKY9TD1bnBjwtT5keDYEJymvrPz7FuC5JcLMY-9I7bfxORCNz6Q5H1PWh9Iyhk3kAFjJHFyNgU_uBS2pMGEz4ujFvvoXuzrWXF39e7L8n1583G1Xr69KW3FayhRa-m4rhSAAMGprO4ZU9o1DKVitGospUJylntBW2l13dDGyjxzQVHX_Kw4n_dOwf_cuJjM0EXr-h5H5zfRCMUUh0pncDGDNvgYg2vNFLoBw9ZQMDulZqfUHJTmwKv95s394JpHfO8wA2_2AEaLfRtwtF185DToutK7y3rmfmeT2_-cNZ9u18t_nyjnbJf1_zlkMfwwWYuszdcPK3N1udKX19-UYZl_PfMteoMPIf9z95kB5QAKlKgF_wsmt6eG</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>68283049</pqid></control><display><type>article</type><title>Implementation and application of a versatile clustering tool for tandem mass spectrometry data</title><source>MEDLINE</source><source>Access via Wiley Online Library</source><creator>Flikka, Kristian ; Meukens, Jeroen ; Helsens, Kenny ; Vandekerckhove, Joël ; Eidhammer, Ingvar ; Gevaert, Kris ; Martens, Lennart</creator><creatorcontrib>Flikka, Kristian ; Meukens, Jeroen ; Helsens, Kenny ; Vandekerckhove, Joël ; Eidhammer, Ingvar ; Gevaert, Kris ; Martens, Lennart</creatorcontrib><description>High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. 
To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.</description><identifier>ISSN: 1615-9853</identifier><identifier>EISSN: 1615-9861</identifier><identifier>DOI: 10.1002/pmic.200700160</identifier><identifier>PMID: 17708593</identifier><language>eng</language><publisher>Weinheim: Wiley-VCH Verlag</publisher><subject>Algorithms ; Amino Acid Sequence ; Analytical, structural and metabolic biochemistry ; Bioinformatics ; Biological and medical sciences ; Cell Line, Tumor ; Cluster Analysis ; Fundamental and applied biological sciences. Psychology ; Humans ; Mass spectrometry ; Miscellaneous ; Molecular Sequence Data ; Proteins ; Proteomics ; Spectrum clustering ; Tandem Mass Spectrometry - methods</subject><ispartof>Proteomics (Weinheim), 2007-09, Vol.7 (18), p.3245-3258</ispartof><rights>Copyright © 2007 WILEY‐VCH Verlag GmbH & Co. 
KGaA, Weinheim</rights><rights>2007 INIST-CNRS</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4350-a997e3948006063174b2289ed2a78214dc11673282161f7c95d1dc7282361a953</citedby><cites>FETCH-LOGICAL-c4350-a997e3948006063174b2289ed2a78214dc11673282161f7c95d1dc7282361a953</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fpmic.200700160$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fpmic.200700160$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=19095499$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/17708593$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Flikka, Kristian</creatorcontrib><creatorcontrib>Meukens, Jeroen</creatorcontrib><creatorcontrib>Helsens, Kenny</creatorcontrib><creatorcontrib>Vandekerckhove, Joël</creatorcontrib><creatorcontrib>Eidhammer, Ingvar</creatorcontrib><creatorcontrib>Gevaert, Kris</creatorcontrib><creatorcontrib>Martens, Lennart</creatorcontrib><title>Implementation and application of a versatile clustering tool for tandem mass spectrometry data</title><title>Proteomics (Weinheim)</title><addtitle>Proteomics</addtitle><description>High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. 
We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.</description><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>Bioinformatics</subject><subject>Biological and medical sciences</subject><subject>Cell Line, Tumor</subject><subject>Cluster Analysis</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>Humans</subject><subject>Mass spectrometry</subject><subject>Miscellaneous</subject><subject>Molecular Sequence Data</subject><subject>Proteins</subject><subject>Proteomics</subject><subject>Spectrum clustering</subject><subject>Tandem Mass Spectrometry - methods</subject><issn>1615-9853</issn><issn>1615-9861</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkE1v1DAQhiNERT_gyhF8obdsx3bijyPa0mXVFpCgQuJiTR2nCiRxsL3A_nu8ymrLjdPMSM87M3qK4iWFBQVgF9PQ2QUDkABUwJPihApal1oJ-vTQ1_y4OI3xe0ak0vJZcUylBFVrflKY9TD1bnBjwtT5keDYEJymvrPz7FuC5JcLMY-9I7bfxORCNz6Q5H1PWh9Iyhk3kAFjJHFyNgU_uBS2pMGEz4ujFvvoXuzrWXF39e7L8n1583G1Xr69KW3FayhRa-m4rhSAAMGprO4ZU9o1DKVitGospUJylntBW2l13dDGyjxzQVHX_Kw4n_dOwf_cuJjM0EXr-h5H5zfRCMUUh0pncDGDNvgYg2vNFLoBw9ZQMDulZqfUHJTmwKv95s394JpHfO8wA2_2AEaLfRtwtF185DToutK7y3rmfmeT2_-cNZ9u18t_nyjnbJf1_zlkMfwwWYuszdcPK3N1udKX19-UYZl_PfMteoMPIf9z95kB5QAKlKgF_wsmt6eG</recordid><startdate>20070901</startdate><enddate>20070901</enddate><creator>Flikka, Kristian</creator><creator>Meukens, Jeroen</creator><creator>Helsens, Kenny</creator><creator>Vandekerckhove, Joël</creator><creator>Eidhammer, Ingvar</creator><creator>Gevaert, Kris</creator><creator>Martens, Lennart</creator><general>Wiley-VCH Verlag</general><general>WILEY-VCH Verlag</general><general>WILEY‐VCH Verlag</general><general>Wiley-VCH</general><scope>FBQ</scope><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20070901</creationdate><title>Implementation and application of a versatile clustering tool for tandem mass spectrometry data</title><author>Flikka, Kristian ; Meukens, Jeroen ; Helsens, Kenny ; Vandekerckhove, Joël ; Eidhammer, Ingvar ; Gevaert, Kris ; Martens, 
Lennart</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4350-a997e3948006063174b2289ed2a78214dc11673282161f7c95d1dc7282361a953</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>Bioinformatics</topic><topic>Biological and medical sciences</topic><topic>Cell Line, Tumor</topic><topic>Cluster Analysis</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Humans</topic><topic>Mass spectrometry</topic><topic>Miscellaneous</topic><topic>Molecular Sequence Data</topic><topic>Proteins</topic><topic>Proteomics</topic><topic>Spectrum clustering</topic><topic>Tandem Mass Spectrometry - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Flikka, Kristian</creatorcontrib><creatorcontrib>Meukens, Jeroen</creatorcontrib><creatorcontrib>Helsens, Kenny</creatorcontrib><creatorcontrib>Vandekerckhove, Joël</creatorcontrib><creatorcontrib>Eidhammer, Ingvar</creatorcontrib><creatorcontrib>Gevaert, Kris</creatorcontrib><creatorcontrib>Martens, Lennart</creatorcontrib><collection>AGRIS</collection><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Proteomics (Weinheim)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Flikka, Kristian</au><au>Meukens, Jeroen</au><au>Helsens, Kenny</au><au>Vandekerckhove, Joël</au><au>Eidhammer, Ingvar</au><au>Gevaert, Kris</au><au>Martens, 
Lennart</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Implementation and application of a versatile clustering tool for tandem mass spectrometry data</atitle><jtitle>Proteomics (Weinheim)</jtitle><addtitle>Proteomics</addtitle><date>2007-09-01</date><risdate>2007</risdate><volume>7</volume><issue>18</issue><spage>3245</spage><epage>3258</epage><pages>3245-3258</pages><issn>1615-9853</issn><eissn>1615-9861</eissn><abstract>High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.</abstract><cop>Weinheim</cop><pub>Wiley-VCH Verlag</pub><pmid>17708593</pmid><doi>10.1002/pmic.200700160</doi><tpages>14</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1615-9853 |
ispartof | Proteomics (Weinheim), 2007-09, Vol.7 (18), p.3245-3258 |
issn | 1615-9853; 1615-9861 |
language | eng |
recordid | cdi_proquest_miscellaneous_68283049 |
source | MEDLINE; Access via Wiley Online Library |
subjects | Algorithms; Amino Acid Sequence; Analytical, structural and metabolic biochemistry; Bioinformatics; Biological and medical sciences; Cell Line, Tumor; Cluster Analysis; Fundamental and applied biological sciences. Psychology; Humans; Mass spectrometry; Miscellaneous; Molecular Sequence Data; Proteins; Proteomics; Spectrum clustering; Tandem Mass Spectrometry - methods |
title | Implementation and application of a versatile clustering tool for tandem mass spectrometry data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T05%3A05%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Implementation%20and%20application%20of%20a%20versatile%20clustering%20tool%20for%20tandem%20mass%20spectrometry%20data&rft.jtitle=Proteomics%20(Weinheim)&rft.au=Flikka,%20Kristian&rft.date=2007-09-01&rft.volume=7&rft.issue=18&rft.spage=3245&rft.epage=3258&rft.pages=3245-3258&rft.issn=1615-9853&rft.eissn=1615-9861&rft_id=info:doi/10.1002/pmic.200700160&rft_dat=%3Cproquest_cross%3E68283049%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=68283049&rft_id=info:pmid/17708593&rfr_iscdi=true |