NovoLign: metaproteomics by sequence alignment

Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagen...

Bibliographic details
Published in: ISME Communications 2024-01, Vol.4 (1), p.ycae121
Main authors: Kleikamp, Hugo B C, van der Zwaan, Ramon, van Valderen, Ramon, van Ede, Jitske M, Pronk, Mario, Schaasberg, Pim, Allaart, Maximilienne T, van Loosdrecht, Mark C M, Pabst, Martin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 1
container_start_page ycae121
container_title ISME Communications
container_volume 4
creator Kleikamp, Hugo B C
van der Zwaan, Ramon
van Valderen, Ramon
van Ede, Jitske M
Pronk, Mario
Schaasberg, Pim
Allaart, Maximilienne T
van Loosdrecht, Mark C M
Pabst, Martin
description Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a de novo metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a de novo metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.
doi_str_mv 10.1093/ismeco/ycae121
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11530927</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3123807386</sourcerecordid><originalsourceid>FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</originalsourceid><addsrcrecordid>eNpVkM1PwzAMxSMEYmjsyhH1yKVbnLRNwwWhiS9pggucozR1RlHTjKabtP-eoI1pnGzJP_v5PUKugE6BSj5rgkPjZ1ujERickAsmOE0LyOH0qB-RSQhflFKWA2cA52TEZSZ5IeCCTF_9xi-aZXebOBz0qvcDeteYkFTbJOD3GjuDiW4j4bAbLsmZ1W3Ayb6Oycfjw_v8OV28Pb3M7xepYaIYUilR1nWmtWHWSgrC2lwgSJpDJrMiKgPQvLSiplUmgZbWVpqVLJPG1KBrPiZ3u7urdeWwNlG6161a9Y3T_VZ53aj_k675VEu_UQA5pzJ6H5Ob_YXeRxdhUK4JBttWd-jXQXFgvKSCl0VEpzvU9D6EHu1BB6j6DVrtglb7oOPC9fF3B_wvVv4DyfR8Bg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3123807386</pqid></control><display><type>article</type><title>NovoLign: metaproteomics by sequence alignment</title><source>Springer Nature OA Free Journals</source><source>Oxford Journals Open Access Collection</source><source>Nature Free</source><source>PubMed Central</source><creator>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</creator><creatorcontrib>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</creatorcontrib><description>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. 
However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a de novo metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a de novo metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</description><identifier>ISSN: 2730-6151</identifier><identifier>EISSN: 2730-6151</identifier><identifier>DOI: 10.1093/ismeco/ycae121</identifier><identifier>PMID: 39493671</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Original</subject><ispartof>ISME Communications, 2024-01, Vol.4 (1), p.ycae121</ispartof><rights>The Author(s) 2024. Published by Oxford University Press on behalf of the International Society for Microbial Ecology.</rights><rights>The Author(s) 2024. Published by Oxford University Press on behalf of the International Society for Microbial Ecology. 
2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530927/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530927/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39493671$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kleikamp, Hugo B C</creatorcontrib><creatorcontrib>van der Zwaan, Ramon</creatorcontrib><creatorcontrib>van Valderen, Ramon</creatorcontrib><creatorcontrib>van Ede, Jitske M</creatorcontrib><creatorcontrib>Pronk, Mario</creatorcontrib><creatorcontrib>Schaasberg, Pim</creatorcontrib><creatorcontrib>Allaart, Maximilienne T</creatorcontrib><creatorcontrib>van Loosdrecht, Mark C M</creatorcontrib><creatorcontrib>Pabst, Martin</creatorcontrib><title>NovoLign: metaproteomics by sequence alignment</title><title>ISME Communications</title><addtitle>ISME Commun</addtitle><description>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. 
This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. Here, we present NovoLign, a de novo metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a de novo metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. 
The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</description><subject>Original</subject><issn>2730-6151</issn><issn>2730-6151</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNpVkM1PwzAMxSMEYmjsyhH1yKVbnLRNwwWhiS9pggucozR1RlHTjKabtP-eoI1pnGzJP_v5PUKugE6BSj5rgkPjZ1ujERickAsmOE0LyOH0qB-RSQhflFKWA2cA52TEZSZ5IeCCTF_9xi-aZXebOBz0qvcDeteYkFTbJOD3GjuDiW4j4bAbLsmZ1W3Ayb6Oycfjw_v8OV28Pb3M7xepYaIYUilR1nWmtWHWSgrC2lwgSJpDJrMiKgPQvLSiplUmgZbWVpqVLJPG1KBrPiZ3u7urdeWwNlG6161a9Y3T_VZ53aj_k675VEu_UQA5pzJ6H5Ob_YXeRxdhUK4JBttWd-jXQXFgvKSCl0VEpzvU9D6EHu1BB6j6DVrtglb7oOPC9fF3B_wvVv4DyfR8Bg</recordid><startdate>202401</startdate><enddate>202401</enddate><creator>Kleikamp, Hugo B C</creator><creator>van der Zwaan, Ramon</creator><creator>van Valderen, Ramon</creator><creator>van Ede, Jitske M</creator><creator>Pronk, Mario</creator><creator>Schaasberg, Pim</creator><creator>Allaart, Maximilienne T</creator><creator>van Loosdrecht, Mark C M</creator><creator>Pabst, Martin</creator><general>Oxford University Press</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>202401</creationdate><title>NovoLign: metaproteomics by sequence alignment</title><author>Kleikamp, Hugo B C ; van der Zwaan, Ramon ; van Valderen, Ramon ; van Ede, Jitske M ; Pronk, Mario ; Schaasberg, Pim ; Allaart, Maximilienne T ; van Loosdrecht, Mark C M ; Pabst, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c276t-99e9dd4aac2ff9017ff57e19051494667111058f7d0b49108ffba28249ccd1ad3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Original</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kleikamp, Hugo B C</creatorcontrib><creatorcontrib>van der Zwaan, 
Ramon</creatorcontrib><creatorcontrib>van Valderen, Ramon</creatorcontrib><creatorcontrib>van Ede, Jitske M</creatorcontrib><creatorcontrib>Pronk, Mario</creatorcontrib><creatorcontrib>Schaasberg, Pim</creatorcontrib><creatorcontrib>Allaart, Maximilienne T</creatorcontrib><creatorcontrib>van Loosdrecht, Mark C M</creatorcontrib><creatorcontrib>Pabst, Martin</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>ISME Communications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kleikamp, Hugo B C</au><au>van der Zwaan, Ramon</au><au>van Valderen, Ramon</au><au>van Ede, Jitske M</au><au>Pronk, Mario</au><au>Schaasberg, Pim</au><au>Allaart, Maximilienne T</au><au>van Loosdrecht, Mark C M</au><au>Pabst, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>NovoLign: metaproteomics by sequence alignment</atitle><jtitle>ISME Communications</jtitle><addtitle>ISME Commun</addtitle><date>2024-01</date><risdate>2024</risdate><volume>4</volume><issue>1</issue><spage>ycae121</spage><pages>ycae121-</pages><issn>2730-6151</issn><eissn>2730-6151</eissn><abstract>Tremendous advances in mass spectrometric and bioinformatic approaches have expanded proteomics into the field of microbial ecology. The commonly used spectral annotation method for metaproteomics data relies on database searching, which requires sample-specific databases obtained from whole metagenome sequencing experiments. However, creating these databases is complex, time-consuming, and prone to errors, potentially biasing experimental outcomes and conclusions. This asks for alternative approaches that can provide rapid and orthogonal insights into metaproteomics data. 
Here, we present NovoLign, a de novo metaproteomics pipeline that performs sequence alignment of de novo sequences from complete metaproteomics experiments. The pipeline enables rapid taxonomic profiling of complex communities and evaluates the taxonomic coverage of metaproteomics outcomes obtained from database searches. Furthermore, the NovoLign pipeline supports the creation of reference sequence databases for database searching to ensure comprehensive coverage. We assessed the NovoLign pipeline for taxonomic coverage and false positive annotations using a wide range of in silico and experimental data, including pure reference strains, laboratory enrichment cultures, synthetic communities, and environmental microbial communities. In summary, we present NovoLign, a de novo metaproteomics pipeline that employs large-scale sequence alignment to enable rapid taxonomic profiling, evaluation of database searching outcomes, and the creation of reference sequence databases. The NovoLign pipeline is publicly available via: https://github.com/hbckleikamp/NovoLign.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>39493671</pmid><doi>10.1093/ismeco/ycae121</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2730-6151
ispartof ISME Communications, 2024-01, Vol.4 (1), p.ycae121
issn 2730-6151
2730-6151
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11530927
source Springer Nature OA Free Journals; Oxford Journals Open Access Collection; Nature Free; PubMed Central
subjects Original
title NovoLign: metaproteomics by sequence alignment
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T16%3A31%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=NovoLign:%20metaproteomics%20by%20sequence%20alignment&rft.jtitle=ISME%20Communications&rft.au=Kleikamp,%20Hugo%20B%20C&rft.date=2024-01&rft.volume=4&rft.issue=1&rft.spage=ycae121&rft.pages=ycae121-&rft.issn=2730-6151&rft.eissn=2730-6151&rft_id=info:doi/10.1093/ismeco/ycae121&rft_dat=%3Cproquest_pubme%3E3123807386%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3123807386&rft_id=info:pmid/39493671&rfr_iscdi=true