Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels
Abstract Over the last years, there has been a considerable expansion of genome-wide association studies (GWAS) for discovering biological pathways underlying pathological conditions or disease biomarkers. These GWAS are often limited to binary or quantitative traits analyzed through linear or logis...
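Published in: | NAR Genomics and Bioinformatics 2023-06, Vol.5 (2), p.lqad062-lqad062 |
---|---|
Main authors: | Munsch, Gaëlle; Proust, Carole; Labrouche-Colomer, Sylvie; Aïssi, Dylan; Boland, Anne; Morange, Pierre-Emmanuel; Roche, Anne; de Chaisemartin, Luc; Harroche, Annie; Olaso, Robert; Deleuze, Jean-François; James, Chloé; Emmerich, Joseph; Smadja, David M; Jacqmin-Gadda, Hélène; Trégouët, David-Alexandre |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |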
container_end_page | lqad062 |
---|---|
container_issue | 2 |
container_start_page | lqad062 |
container_title | NAR Genomics and Bioinformatics |
container_volume | 5 |
creator | Munsch, Gaëlle; Proust, Carole; Labrouche-Colomer, Sylvie; Aïssi, Dylan; Boland, Anne; Morange, Pierre-Emmanuel; Roche, Anne; de Chaisemartin, Luc; Harroche, Annie; Olaso, Robert; Deleuze, Jean-François; James, Chloé; Emmerich, Joseph; Smadja, David M; Jacqmin-Gadda, Hélène; Trégouët, David-Alexandre |
description | Abstract
Over the last years, there has been a considerable expansion of genome-wide association studies (GWAS) for discovering biological pathways underlying pathological conditions or disease biomarkers. These GWAS are often limited to binary or quantitative traits analyzed through linear or logistic models, respectively. In some situations, the distribution of the outcome may require more complex modeling, such as when the outcome exhibits a semicontinuous distribution characterized by an excess of zero values followed by a non-negative and right-skewed distribution. We here investigate three different modeling approaches for semicontinuous data: Tobit, Negative Binomial and Compound Poisson-Gamma. Using both simulated data and a real GWAS on Neutrophil Extracellular Traps (NETs), an emerging biomarker in immuno-thrombosis, we demonstrate that Compound Poisson-Gamma was the most robust model with respect to low allele frequencies and outliers. This model further identified the MIR155HG locus as significantly (P = 1.4 × 10⁻⁸) associated with NETs plasma levels in a sample of 657 participants, a locus recently highlighted to be involved in NETs formation in mice. This work highlights the importance of the modeling strategy for GWAS of a semicontinuous outcome and suggests Compound Poisson-Gamma as an elegant but neglected alternative to Negative Binomial for modeling semicontinuous outcome in the context of genomic investigations. |
doi_str_mv | 10.1093/nargab/lqad062 |
format | Article |
fullrecord | <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10304785</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A777499783</galeid><oup_id>10.1093/nargab/lqad062</oup_id><sourcerecordid>A777499783</sourcerecordid><originalsourceid>FETCH-LOGICAL-c526t-ddea601a7ffed883ea4f7716a5048625d0e4436fe4dec2e1b09a7fb9b3d3f8713</originalsourceid><addsrcrecordid>eNqFUk1v1DAQjRCIVqVXjshHOKT1RxI7XNCqKi3SCi7lbDnxJGvkxKmdbNlfxN_E2WyXhQvyweOZN28-_JLkLcFXBJfsule-VdW1fVQaF_RFck4LRtKSFuLliX2WXIbwA2NM8yzPMHmdnDHOhBCkPE9-3UHvOkifjAakQnC1UaNxPQrjpHfINUihAJ2pXT-afnJTQKNXZvyIjLVTiPYeHXHjBpDpBlWPz6_OabCmb9EeBu0uer2b2s0-eizwFabRu2FjLLr9GZE1RGarPHrwagjIwhZseJO8apQNcHm4L5Lvn28fbu7T9be7LzerdVrntBhTrUEVmCjeNKCFYKCyhnNSqBxnoqC5xpBlrGgg01BTIBUuI7YqK6ZZIzhhF8mnhXeYqg50DX3syMrBm075nXTKyL8jvdnI1m0lwQxnXOSR4cPCsPkn7361lrMPZ5RiXOLtXO39oZp3jxOEUXYmzPOrHuKqJRWM5pyRgkXo1QJtlQVp-sbNq4pHL78DjYn-Fec8K0suThJq70Lw0BybIVjO8pGLfORBPjHh3enoR_izWP5M5qbhf2S_AZ9w1nw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2832573163</pqid></control><display><type>article</type><title>Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels</title><source>DOAJ Directory of Open Access Journals</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Munsch, Gaëlle ; Proust, Carole ; Labrouche-Colomer, Sylvie ; Aïssi, Dylan ; Boland, Anne ; Morange, Pierre-Emmanuel ; Roche, Anne ; de Chaisemartin, Luc ; Harroche, Annie ; Olaso, Robert ; Deleuze, Jean-François ; James, Chloé ; Emmerich, Joseph ; Smadja, David M ; Jacqmin-Gadda, Hélène ; Trégouët, David-Alexandre</creator><creatorcontrib>Munsch, Gaëlle ; Proust, Carole ; Labrouche-Colomer, Sylvie ; Aïssi, Dylan ; Boland, Anne ; Morange, Pierre-Emmanuel ; 
Roche, Anne ; de Chaisemartin, Luc ; Harroche, Annie ; Olaso, Robert ; Deleuze, Jean-François ; James, Chloé ; Emmerich, Joseph ; Smadja, David M ; Jacqmin-Gadda, Hélène ; Trégouët, David-Alexandre</creatorcontrib><description>Abstract
Over the last years, there has been a considerable expansion of genome-wide association studies (GWAS) for discovering biological pathways underlying pathological conditions or disease biomarkers. These GWAS are often limited to binary or quantitative traits analyzed through linear or logistic models, respectively. In some situations, the distribution of the outcome may require more complex modeling, such as when the outcome exhibits a semicontinuous distribution characterized by an excess of zero values followed by a non-negative and right-skewed distribution. We here investigate three different modeling for semicontinuous data: Tobit, Negative Binomial and Compound Poisson-Gamma. Using both simulated data and a real GWAS on Neutrophil Extracellular Traps (NETs), an emerging biomarker in immuno-thrombosis, we demonstrate that Compound Poisson-Gamma was the most robust model with respect to low allele frequencies and outliers. This model further identified the MIR155HG locus as significantly (P = 1.4 × 10−8) associated with NETs plasma levels in a sample of 657 participants, a locus recently highlighted to be involved in NETs formation in mice. This work highlights the importance of the modeling strategy for GWAS of a semicontinuous outcome and suggests Compound Poisson-Gamma as an elegant but neglected alternative to Negative Binomial for modeling semicontinuous outcome in the context of genomic investigations.</description><identifier>ISSN: 2631-9268</identifier><identifier>EISSN: 2631-9268</identifier><identifier>DOI: 10.1093/nargab/lqad062</identifier><identifier>PMID: 37388819</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Analysis ; Genomes ; Genomics ; Life Sciences ; Santé publique et épidémiologie</subject><ispartof>NAR Genomics and Bioinformatics, 2023-06, Vol.5 (2), p.lqad062-lqad062</ispartof><rights>The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
2023</rights><rights>The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.</rights><rights>COPYRIGHT 2023 Oxford University Press</rights><rights>Attribution - NonCommercial</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c526t-ddea601a7ffed883ea4f7716a5048625d0e4436fe4dec2e1b09a7fb9b3d3f8713</citedby><cites>FETCH-LOGICAL-c526t-ddea601a7ffed883ea4f7716a5048625d0e4436fe4dec2e1b09a7fb9b3d3f8713</cites><orcidid>0000-0001-8973-1536 ; 0000-0002-5358-4463 ; 0000-0003-1564-9825</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10304785/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10304785/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,1604,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37388819$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-04220090$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Munsch, Gaëlle</creatorcontrib><creatorcontrib>Proust, Carole</creatorcontrib><creatorcontrib>Labrouche-Colomer, Sylvie</creatorcontrib><creatorcontrib>Aïssi, Dylan</creatorcontrib><creatorcontrib>Boland, Anne</creatorcontrib><creatorcontrib>Morange, Pierre-Emmanuel</creatorcontrib><creatorcontrib>Roche, Anne</creatorcontrib><creatorcontrib>de Chaisemartin, Luc</creatorcontrib><creatorcontrib>Harroche, Annie</creatorcontrib><creatorcontrib>Olaso, Robert</creatorcontrib><creatorcontrib>Deleuze, Jean-François</creatorcontrib><creatorcontrib>James, Chloé</creatorcontrib><creatorcontrib>Emmerich, 
Joseph</creatorcontrib><creatorcontrib>Smadja, David M</creatorcontrib><creatorcontrib>Jacqmin-Gadda, Hélène</creatorcontrib><creatorcontrib>Trégouët, David-Alexandre</creatorcontrib><title>Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels</title><title>NAR Genomics and Bioinformatics</title><addtitle>NAR Genom Bioinform</addtitle><description>Abstract
Over the last years, there has been a considerable expansion of genome-wide association studies (GWAS) for discovering biological pathways underlying pathological conditions or disease biomarkers. These GWAS are often limited to binary or quantitative traits analyzed through linear or logistic models, respectively. In some situations, the distribution of the outcome may require more complex modeling, such as when the outcome exhibits a semicontinuous distribution characterized by an excess of zero values followed by a non-negative and right-skewed distribution. We here investigate three different modeling for semicontinuous data: Tobit, Negative Binomial and Compound Poisson-Gamma. Using both simulated data and a real GWAS on Neutrophil Extracellular Traps (NETs), an emerging biomarker in immuno-thrombosis, we demonstrate that Compound Poisson-Gamma was the most robust model with respect to low allele frequencies and outliers. This model further identified the MIR155HG locus as significantly (P = 1.4 × 10−8) associated with NETs plasma levels in a sample of 657 participants, a locus recently highlighted to be involved in NETs formation in mice. 
This work highlights the importance of the modeling strategy for GWAS of a semicontinuous outcome and suggests Compound Poisson-Gamma as an elegant but neglected alternative to Negative Binomial for modeling semicontinuous outcome in the context of genomic investigations.</description><subject>Analysis</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Life Sciences</subject><subject>Santé publique et épidémiologie</subject><issn>2631-9268</issn><issn>2631-9268</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqFUk1v1DAQjRCIVqVXjshHOKT1RxI7XNCqKi3SCi7lbDnxJGvkxKmdbNlfxN_E2WyXhQvyweOZN28-_JLkLcFXBJfsule-VdW1fVQaF_RFck4LRtKSFuLliX2WXIbwA2NM8yzPMHmdnDHOhBCkPE9-3UHvOkifjAakQnC1UaNxPQrjpHfINUihAJ2pXT-afnJTQKNXZvyIjLVTiPYeHXHjBpDpBlWPz6_OabCmb9EeBu0uer2b2s0-eizwFabRu2FjLLr9GZE1RGarPHrwagjIwhZseJO8apQNcHm4L5Lvn28fbu7T9be7LzerdVrntBhTrUEVmCjeNKCFYKCyhnNSqBxnoqC5xpBlrGgg01BTIBUuI7YqK6ZZIzhhF8mnhXeYqg50DX3syMrBm075nXTKyL8jvdnI1m0lwQxnXOSR4cPCsPkn7361lrMPZ5RiXOLtXO39oZp3jxOEUXYmzPOrHuKqJRWM5pyRgkXo1QJtlQVp-sbNq4pHL78DjYn-Fec8K0suThJq70Lw0BybIVjO8pGLfORBPjHh3enoR_izWP5M5qbhf2S_AZ9w1nw</recordid><startdate>20230601</startdate><enddate>20230601</enddate><creator>Munsch, Gaëlle</creator><creator>Proust, Carole</creator><creator>Labrouche-Colomer, Sylvie</creator><creator>Aïssi, Dylan</creator><creator>Boland, Anne</creator><creator>Morange, Pierre-Emmanuel</creator><creator>Roche, Anne</creator><creator>de Chaisemartin, Luc</creator><creator>Harroche, Annie</creator><creator>Olaso, Robert</creator><creator>Deleuze, Jean-François</creator><creator>James, Chloé</creator><creator>Emmerich, Joseph</creator><creator>Smadja, David M</creator><creator>Jacqmin-Gadda, Hélène</creator><creator>Trégouët, David-Alexandre</creator><general>Oxford University 
Press</general><scope>TOX</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IAO</scope><scope>7X8</scope><scope>1XC</scope><scope>VOOES</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-8973-1536</orcidid><orcidid>https://orcid.org/0000-0002-5358-4463</orcidid><orcidid>https://orcid.org/0000-0003-1564-9825</orcidid></search><sort><creationdate>20230601</creationdate><title>Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels</title><author>Munsch, Gaëlle ; Proust, Carole ; Labrouche-Colomer, Sylvie ; Aïssi, Dylan ; Boland, Anne ; Morange, Pierre-Emmanuel ; Roche, Anne ; de Chaisemartin, Luc ; Harroche, Annie ; Olaso, Robert ; Deleuze, Jean-François ; James, Chloé ; Emmerich, Joseph ; Smadja, David M ; Jacqmin-Gadda, Hélène ; Trégouët, David-Alexandre</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c526t-ddea601a7ffed883ea4f7716a5048625d0e4436fe4dec2e1b09a7fb9b3d3f8713</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Analysis</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Life Sciences</topic><topic>Santé publique et épidémiologie</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Munsch, Gaëlle</creatorcontrib><creatorcontrib>Proust, Carole</creatorcontrib><creatorcontrib>Labrouche-Colomer, Sylvie</creatorcontrib><creatorcontrib>Aïssi, Dylan</creatorcontrib><creatorcontrib>Boland, Anne</creatorcontrib><creatorcontrib>Morange, Pierre-Emmanuel</creatorcontrib><creatorcontrib>Roche, Anne</creatorcontrib><creatorcontrib>de Chaisemartin, Luc</creatorcontrib><creatorcontrib>Harroche, Annie</creatorcontrib><creatorcontrib>Olaso, Robert</creatorcontrib><creatorcontrib>Deleuze, Jean-François</creatorcontrib><creatorcontrib>James, 
Chloé</creatorcontrib><creatorcontrib>Emmerich, Joseph</creatorcontrib><creatorcontrib>Smadja, David M</creatorcontrib><creatorcontrib>Jacqmin-Gadda, Hélène</creatorcontrib><creatorcontrib>Trégouët, David-Alexandre</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale Academic OneFile</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>NAR Genomics and Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Munsch, Gaëlle</au><au>Proust, Carole</au><au>Labrouche-Colomer, Sylvie</au><au>Aïssi, Dylan</au><au>Boland, Anne</au><au>Morange, Pierre-Emmanuel</au><au>Roche, Anne</au><au>de Chaisemartin, Luc</au><au>Harroche, Annie</au><au>Olaso, Robert</au><au>Deleuze, Jean-François</au><au>James, Chloé</au><au>Emmerich, Joseph</au><au>Smadja, David M</au><au>Jacqmin-Gadda, Hélène</au><au>Trégouët, David-Alexandre</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels</atitle><jtitle>NAR Genomics and Bioinformatics</jtitle><addtitle>NAR Genom Bioinform</addtitle><date>2023-06-01</date><risdate>2023</risdate><volume>5</volume><issue>2</issue><spage>lqad062</spage><epage>lqad062</epage><pages>lqad062-lqad062</pages><issn>2631-9268</issn><eissn>2631-9268</eissn><abstract>Abstract
Over the last years, there has been a considerable expansion of genome-wide association studies (GWAS) for discovering biological pathways underlying pathological conditions or disease biomarkers. These GWAS are often limited to binary or quantitative traits analyzed through linear or logistic models, respectively. In some situations, the distribution of the outcome may require more complex modeling, such as when the outcome exhibits a semicontinuous distribution characterized by an excess of zero values followed by a non-negative and right-skewed distribution. We here investigate three different modeling for semicontinuous data: Tobit, Negative Binomial and Compound Poisson-Gamma. Using both simulated data and a real GWAS on Neutrophil Extracellular Traps (NETs), an emerging biomarker in immuno-thrombosis, we demonstrate that Compound Poisson-Gamma was the most robust model with respect to low allele frequencies and outliers. This model further identified the MIR155HG locus as significantly (P = 1.4 × 10−8) associated with NETs plasma levels in a sample of 657 participants, a locus recently highlighted to be involved in NETs formation in mice. This work highlights the importance of the modeling strategy for GWAS of a semicontinuous outcome and suggests Compound Poisson-Gamma as an elegant but neglected alternative to Negative Binomial for modeling semicontinuous outcome in the context of genomic investigations.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>37388819</pmid><doi>10.1093/nargab/lqad062</doi><orcidid>https://orcid.org/0000-0001-8973-1536</orcidid><orcidid>https://orcid.org/0000-0002-5358-4463</orcidid><orcidid>https://orcid.org/0000-0003-1564-9825</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2631-9268 |
ispartof | NAR Genomics and Bioinformatics, 2023-06, Vol.5 (2), p.lqad062-lqad062 |
issn | 2631-9268 2631-9268 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10304785 |
source | DOAJ Directory of Open Access Journals; Oxford Journals Open Access Collection; PubMed Central |
subjects | Analysis; Genomes; Genomics; Life Sciences; Santé publique et épidémiologie |
title | Genome-wide association study of a semicontinuous trait: illustration of the impact of the modeling strategy through the study of Neutrophil Extracellular Traps levels |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A08%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genome-wide%20association%20study%20of%20a%20semicontinuous%20trait:%20illustration%20of%20the%20impact%20of%20the%20modeling%20strategy%20through%20the%20study%20of%20Neutrophil%20Extracellular%20Traps%20levels&rft.jtitle=NAR%20Genomics%20and%20Bioinformatics&rft.au=Munsch,%20Ga%C3%ABlle&rft.date=2023-06-01&rft.volume=5&rft.issue=2&rft.spage=lqad062&rft.epage=lqad062&rft.pages=lqad062-lqad062&rft.issn=2631-9268&rft.eissn=2631-9268&rft_id=info:doi/10.1093/nargab/lqad062&rft_dat=%3Cgale_pubme%3EA777499783%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2832573163&rft_id=info:pmid/37388819&rft_galeid=A777499783&rft_oup_id=10.1093/nargab/lqad062&rfr_iscdi=true |
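
The abstract in this record describes a semicontinuous outcome: an excess of exact zeros followed by a non-negative, right-skewed distribution. As a rough illustration of why a Compound Poisson-Gamma distribution produces that shape, the sketch below simulates one using only the Python standard library. It is not the authors' code; the function names and the parameter values (`lam`, `shape`, `scale`, the sample size and the seed) are assumptions chosen for illustration. Each observation is the sum of a Poisson-distributed number of Gamma-distributed terms, so it is exactly zero whenever the Poisson count is zero.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method.

    Suitable for small lam only; used here to avoid external dependencies.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_compound_poisson_gamma(rng, lam, shape, scale):
    """Sum of N Gamma(shape, scale) draws, with N ~ Poisson(lam).

    When N is 0 the sum is exactly 0.0, which creates the point mass at zero.
    """
    n_events = sample_poisson(rng, lam)
    return sum((rng.gammavariate(shape, scale) for _ in range(n_events)), 0.0)


def simulate(n, lam=1.0, shape=2.0, scale=0.5, seed=0):
    """Simulate n semicontinuous observations (illustrative parameters)."""
    rng = random.Random(seed)
    return [sample_compound_poisson_gamma(rng, lam, shape, scale) for _ in range(n)]


values = simulate(5000)
zero_fraction = sum(v == 0.0 for v in values) / len(values)
mean_value = sum(values) / len(values)

# Theory: P(Y = 0) = exp(-lam) and E[Y] = lam * shape * scale.
print(f"share of zeros: {zero_fraction:.3f} (theory {math.exp(-1.0):.3f})")
print(f"sample mean:    {mean_value:.3f} (theory {1.0 * 2.0 * 0.5:.3f})")
```

With these assumed parameters the theoretical share of zeros is exp(-1), about 0.37, and the theoretical mean is 1.0; the positive part is right-skewed because it is a sum of Gamma draws. The Compound Poisson-Gamma distribution is the Tweedie family member with power parameter strictly between 1 and 2, which is the form in which generalized linear model software usually exposes it.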