NovoLign: metaproteomics by sequence alignment

Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagen...

Bibliographic details
Published in:	ISME Communications 2024-01, Vol.4 (1), p.ycae121
Main authors:	Kleikamp, Hugo B C; van der Zwaan, Ramon; van Valderen, Ramon; van Ede, Jitske M; Pronk, Mario; Schaasberg, Pim; Allaart, Maximilienne T; van Loosdrecht, Mark C M; Pabst, Martin
Format:	Article
Language:	eng
Subjects:	Original
Online access:	Full text

container_end_page
container_issue	1
container_start_page	ycae121
container_title	ISME Communications
container_volume	4
creator	Kleikamp, Hugo B C; van der Zwaan, Ramon; van Valderen, Ramon; van Ede, Jitske M; Pronk, Mario; Schaasberg, Pim; Allaart, Maximilienne T; van Loosdrecht, Mark C M; Pabst, Martin
description	Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.
doi_str_mv	10.1093/ismeco/ycae121
format	Article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11530927</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3123807386</sourcerecordid><originalsourceid>FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</originalsourceid><addsrcrecordid>eNpVkM1PwzAMxSMEYmjsyhH1yKVbnLRNwwWhiS9pggucozR1RlHTjKabtP-eoI1pnGzJP_v5PUKugE6BSj5rgkPjZ1ujERickAsmOE0LyOH0qB-RSQhflFKWA2cA52TEZSZ5IeCCTF_9xi-aZXebOBz0qvcDeteYkFTbJOD3GjuDiW4j4bAbLsmZ1W3Ayb6Oycfjw_v8OV28Pb3M7xepYaIYUilR1nWmtWHWSgrC2lwgSJpDJrMiKgPQvLSiplUmgZbWVpqVLJPG1KBrPiZ3u7urdeWwNlG6161a9Y3T_VZ53aj_k675VEu_UQA5pzJ6H5Ob_YXeRxdhUK4JBttWd-jXQXFgvKSCl0VEpzvU9D6EHu1BB6j6DVrtglb7oOPC9fF3B_wvVv4DyfR8Bg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3123807386</pqid></control><display><type>article</type><title>NovoLign: metaproteomics by sequence alignment</title><source>Springer Nature OA Free Journals</source><source>Oxford Journals Open Access Collection</source><source>Nature Free</source><source>PubMed Central</source><creator>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</creator><creatorcontrib>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</creatorcontrib><description>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. 
However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</description><identifier>ISSN: 2730-6151</identifier><identifier>EISSN: 2730-6151</identifier><identifier>DOI: 10.1093/ismeco/ycae121</identifier><identifier>PMID: 39493671</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Original</subject><ispartof>ISME Communications, 2024-01, Vol.4 (1), p.ycae121</ispartof><rights>The Author(s) 2024. Published by Oxford University Press on behalf of the International Society for Microbial Ecology.</rights><rights>The Author(s) 2024. Published by Oxford University Press on behalf of the International Society for Microbial Ecology. 
2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530927/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530927/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39493671$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kleikamp, Hugo B C</creatorcontrib><creatorcontrib>van der Zwaan, Ramon</creatorcontrib><creatorcontrib>van Valderen, Ramon</creatorcontrib><creatorcontrib>van Ede, Jitske M</creatorcontrib><creatorcontrib>Pronk, Mario</creatorcontrib><creatorcontrib>Schaasberg, Pim</creatorcontrib><creatorcontrib>Allaart, Maximilienne T</creatorcontrib><creatorcontrib>van Loosdrecht, Mark C M</creatorcontrib><creatorcontrib>Pabst, Martin</creatorcontrib><title>NovoLign: metaproteomics by sequence alignment</title><title>ISME Communications</title><addtitle>ISME Commun</addtitle><description>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. 
This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. 
The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</description><subject>Original</subject><issn>2730-6151</issn><issn>2730-6151</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpVkM1PwzAMxSMEYmjsyhH1yKVbnLRNwwWhiS9pggucozR1RlHTjKabtP-eoI1pnGzJP_v5PUKugE6BSj5rgkPjZ1ujERickAsmOE0LyOH0qB-RSQhflFKWA2cA52TEZSZ5IeCCTF_9xi-aZXebOBz0qvcDeteYkFTbJOD3GjuDiW4j4bAbLsmZ1W3Ayb6Oycfjw_v8OV28Pb3M7xepYaIYUilR1nWmtWHWSgrC2lwgSJpDJrMiKgPQvLSiplUmgZbWVpqVLJPG1KBrPiZ3u7urdeWwNlG6161a9Y3T_VZ53aj_k675VEu_UQA5pzJ6H5Ob_YXeRxdhUK4JBttWd-jXQXFgvKSCl0VEpzvU9D6EHu1BB6j6DVrtglb7oOPC9fF3B_wvVv4DyfR8Bg</recordid><startdate>202401</startdate><enddate>202401</enddate><creator>Kleikamp, Hugo B C</creator><creator>van der Zwaan, Ramon</creator><creator>van Valderen, Ramon</creator><creator>van Ede, Jitske M</creator><creator>Pronk, Mario</creator><creator>Schaasberg, Pim</creator><creator>Allaart, Maximilienne T</creator><creator>van Loosdrecht, Mark C M</creator><creator>Pabst, Martin</creator><general>Oxford University Press</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>202401</creationdate><title>NovoLign: metaproteomics by sequence alignment</title><author>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Original</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kleikamp, Hugo B C</creatorcontrib><creatorcontrib>van der Zwaan, 
Ramon</creatorcontrib><creatorcontrib>van Valderen, Ramon</creatorcontrib><creatorcontrib>van Ede, Jitske M</creatorcontrib><creatorcontrib>Pronk, Mario</creatorcontrib><creatorcontrib>Schaasberg, Pim</creatorcontrib><creatorcontrib>Allaart, Maximilienne T</creatorcontrib><creatorcontrib>van Loosdrecht, Mark C M</creatorcontrib><creatorcontrib>Pabst, Martin</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>ISME Communications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kleikamp, Hugo B C</au><au>van der Zwaan, Ramon</au><au>van Valderen, Ramon</au><au>van Ede, Jitske M</au><au>Pronk, Mario</au><au>Schaasberg, Pim</au><au>Allaart, Maximilienne T</au><au>van Loosdrecht, Mark C M</au><au>Pabst, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>NovoLign: metaproteomics by sequence alignment</atitle><jtitle>ISME Communications</jtitle><addtitle>ISME Commun</addtitle><date>2024-01</date><risdate>2024</risdate><volume>4</volume><issue>1</issue><spage>ycae121</spage><pages>ycae121-</pages><issn>2730-6151</issn><eissn>2730-6151</eissn><abstract>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. 
Here, we present NovoLign, a metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>39493671</pmid><doi>10.1093/ismeco/ycae121</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2730-6151
ispartof	ISME Communications, 2024-01, Vol.4 (1), p.ycae121
issn	2730-6151 2730-6151
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11530927
source	Springer Nature OA Free Journals; Oxford Journals Open Access Collection; Nature Free; PubMed Central
subjects	Original
title	NovoLign: metaproteomics by sequence alignment
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T16%3A31%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NovoLign:%20metaproteomics%20by%20sequence%20alignment&rft.jtitle=ISME%20Communications&rft.au=Kleikamp,%20Hugo%20B%20C&rft.date=2024-01&rft.volume=4&rft.issue=1&rft.spage=ycae121&rft.pages=ycae121-&rft.issn=2730-6151&rft.eissn=2730-6151&rft_id=info:doi/10.1093/ismeco/ycae121&rft_dat=%3Cproquest_pubme%3E3123807386%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3123807386&rft_id=info:pmid/39493671&rfr_iscdi=true
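
The url field above is an OpenURL link whose query string repeats the citation as percent-encoded key/value pairs (rft.atitle, rft.jtitle, rft_id, and so on). As a minimal sketch, assuming only the Python standard library, the pairs can be decoded as shown below. The helper name openurl_fields and the shortened example link are illustrative assumptions, not part of the record; the example reuses the record's host and a subset of its parameters.

```python
from urllib.parse import urlsplit, parse_qs


def openurl_fields(url: str) -> dict[str, list[str]]:
    """Return the decoded query parameters of an OpenURL link.

    Values are lists because a key such as rft_id may occur more than once.
    """
    return parse_qs(urlsplit(url).query)


# Shortened, illustrative variant of the link in the record (assumption:
# same host and keys, but only a subset of the original parameters).
example = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.atitle=NovoLign:%20metaproteomics%20by%20sequence%20alignment"
    "&rft.jtitle=ISME%20Communications"
    "&rft.date=2024-01&rft.volume=4&rft.issue=1&rft.spage=ycae121"
    "&rft_id=info:doi/10.1093/ismeco/ycae121"
    "&rft_id=info:pmid/39493671"
)

fields = openurl_fields(example)
article_title = fields["rft.atitle"][0]  # percent-encoding is decoded
identifiers = fields["rft_id"]           # repeated key keeps both values

print(article_title)
print(identifiers)
```

Returning lists rather than single values keeps both rft_id entries (DOI and PubMed ID), which a plain dictionary of strings would silently collapse into one.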