Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale
Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from m...
Published in: | Genetic epidemiology 2020-04, Vol.44 (3), p.248-260 |
---|---|
Main authors: | German, Christopher A.; Sinsheimer, Janet S.; Klimentidis, Yann C.; Zhou, Hua; Zhou, Jin J. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 260 |
---|---|
container_issue | 3 |
container_start_page | 248 |
container_title | Genetic epidemiology |
container_volume | 44 |
creator | German, Christopher A.; Sinsheimer, Janet S.; Klimentidis, Yann C.; Zhou, Hua; Zhou, Jin J. |
description | Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait. |
doi_str_mv | 10.1002/gepi.22276 |
format | Article |
fullrecord | PMID: 31879980; EISSN: 1098-2272; abbreviated journal title: Genet Epidemiol; publisher: United States: Wiley Subscription Services, Inc; rights: 2019 Wiley Periodicals, Inc., 2020 Wiley Periodicals, Inc.; peer reviewed; open access: free to read; source type: Open Access Repository; ORCID: https://orcid.org/0000-0001-7983-0274; total pages: 13 |
fulltext | fulltext |
identifier | ISSN: 0741-0395 |
ispartof | Genetic epidemiology, 2020-04, Vol.44 (3), p.248-260 |
issn | 0741-0395; 1098-2272 |
language | eng |
recordid | cdi_crossref_primary_10_1002_gepi_22276 |
source | Wiley-Blackwell Journals; MEDLINE |
subjects | Algorithms; Association analysis; Biobanks; Biological Specimen Banks; Case-Control Studies; Computer Simulation; electronic health record; Electronic health records; Electronic medical records; Genetic analysis; Genome-wide association studies; Genome-Wide Association Study; Genomes; Humans; Hypertension - genetics; Models, Genetic; ordered multinomial regression; Phenotype; Phenotypes; Phenotyping; Polymorphism, Single Nucleotide - genetics; Pulmonary Disease, Chronic Obstructive - genetics; Pulmonary Disease, Chronic Obstructive - physiopathology; Regression Analysis; Respiratory Function Tests; Statistical analysis |
title | Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T14%3A21%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Ordered%20multinomial%20regression%20for%20genetic%20association%20analysis%20of%20ordinal%20phenotypes%20at%20Biobank%20scale&rft.jtitle=Genetic%20epidemiology&rft.au=German,%20Christopher%20A.&rft.date=2020-04&rft.volume=44&rft.issue=3&rft.spage=248&rft.epage=260&rft.pages=248-260&rft.issn=0741-0395&rft.eissn=1098-2272&rft_id=info:doi/10.1002/gepi.22276&rft_dat=%3Cproquest_pubme%3E2374085999%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2374085999&rft_id=info:pmid/31879980&rfr_iscdi=true |