Inference of B cell clonal families using heavy/light chain pairing information

Next generation sequencing of B cell receptor (BCR) repertoires has become a ubiquitous tool for understanding the antibody-mediated immune response: it is now common to have large volumes of sequence data coding for both the heavy and light chain subunits of the BCR. However, until the recent devel...

Bibliographic details
Published in: PLoS computational biology 2022-11, Vol.18 (11), p.e1010723-e1010723
Main authors: Ralph, Duncan K; Matsen, 4th, Frederick A
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e1010723
container_issue 11
container_start_page e1010723
container_title PLoS computational biology
container_volume 18
creator Ralph, Duncan K
Matsen, 4th, Frederick A
description Next generation sequencing of B cell receptor (BCR) repertoires has become a ubiquitous tool for understanding the antibody-mediated immune response: it is now common to have large volumes of sequence data coding for both the heavy and light chain subunits of the BCR. However, until the recent development of high-throughput methods of preserving heavy/light chain pairing information, these samples contained no explicit information on which heavy chain sequence pairs with which light chain sequence. One of the first steps in analyzing such BCR repertoire samples is grouping sequences into clonally related families, where each stems from a single rearrangement event. Many methods of accomplishing this have been developed; however, none so far has taken full advantage of the newly available pairing information. This information can dramatically improve clustering performance, especially for the light chain. The light chain has traditionally been challenging for clonal family inference because of its low diversity and consequent abundance of non-clonal families with indistinguishable naive rearrangements. Here we present a method of incorporating this pairing information into the clustering process in order to arrive at a more accurate partition of the data into clonally related families. We also demonstrate two methods of fixing imperfect pairing information, which may allow for simplified sample preparation and increased sequencing depth. Finally, we describe several other improvements to the partis software package.
doi_str_mv 10.1371/journal.pcbi.1010723
format Article
addtitle PLoS Comput Biol
backlink https://www.ncbi.nlm.nih.gov/pubmed/36441808
contributor Regoes, Roland R.
date 2022-11-01
eissn 1553-7358
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9731466/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9731466/pdf/
oa free_for_read
orcidid https://orcid.org/0000-0003-0607-6025
https://orcid.org/0000-0002-2527-8610
pmid 36441808
publisher United States: Public Library of Science
rights Copyright: © 2022 Ralph, Matsen. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
toplevel peer_reviewed
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2022-11, Vol.18 (11), p.e1010723-e1010723
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_2755183592
source DOAJ Directory of Open Access Journals; Public Library of Science (PLoS) Journals Open Access; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Analysis
Antibodies
Antigens
B cells
B-cell receptor
Biology and Life Sciences
Chains
Clustering
Computer and Information Sciences
DNA sequencing
Genes
Genetic aspects
Identification and classification
Immune response
Inference
Information processing
Light
Medicine and Health Sciences
Methods
Mutation
Next-generation sequencing
Nucleotide sequencing
Physical sciences
Research and Analysis Methods
Sample preparation
Viral antibodies
title Inference of B cell clonal families using heavy/light chain pairing information
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T05%3A54%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Inference%20of%20B%20cell%20clonal%20families%20using%20heavy/light%20chain%20pairing%20information&rft.jtitle=PLoS%20computational%20biology&rft.au=Ralph,%20Duncan%20K&rft.date=2022-11-01&rft.volume=18&rft.issue=11&rft.spage=e1010723&rft.epage=e1010723&rft.pages=e1010723-e1010723&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1010723&rft_dat=%3Cgale_plos_%3EA728964557%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2755183592&rft_id=info:pmid/36441808&rft_galeid=A728964557&rft_doaj_id=oai_doaj_org_article_dd03343b526e4b24a4c7866dfddfb81e&rfr_iscdi=true