Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale

Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from m...

Bibliographic details
Published in: Genetic epidemiology, 2020-04, Vol. 44 (3), p. 248-260
Main authors: German, Christopher A.; Sinsheimer, Janet S.; Klimentidis, Yann C.; Zhou, Hua; Zhou, Jin J.
Format: Article
Language: eng
Online access: Full text
container_end_page 260
container_issue 3
container_start_page 248
container_title Genetic epidemiology
container_volume 44
creator German, Christopher A.
Sinsheimer, Janet S.
Klimentidis, Yann C.
Zhou, Hua
Zhou, Jin J.
description Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.
doi_str_mv 10.1002/gepi.22276
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_crossref_primary_10_1002_gepi_22276</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2374085999</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4486-63e60d1c1ad9c3ec92f32becfbdcafacb317a011876504af88add211634239d63</originalsourceid><addsrcrecordid>eNp9kU1rFTEUhoMo9lrd-AMk4EaEqfmYj2QjaKltoVAXug5nkjO3qTPJmMwo99-b662lunCV5OTh4ZzzEvKSsxPOmHi3xdmfCCG69hHZcKZVVe7iMdmwruYVk7o5Is9yvmWM81o3T8mR5KrTWrENGa6Tw4SOTuu4-BAnDyNNuE2Ys4-BDjHRLQZcvKWQc7Qeln0dAoy77DONA43J-fKk8w2GuOxmzBQW-tHHHsI3mi2M-Jw8GWDM-OLuPCZfP519Ob2orq7PL08_XFW2rlVbtRJb5rjl4LSVaLUYpOjRDr2zMIDtJe-gTKG6tmE1DEqBc4LzVtZCatfKY_L-4J3XfkJnMSwJRjMnP0HamQje_P0T_I3Zxh9GiaatG1YEb-4EKX5fMS9m8tniOELAuGYjpOSiUU2nCvr6H_Q2rqksYk91NVON1rpQbw-UTTHnhMN9M5yZfXxmH5_5HV-BXz1s_x79k1cB-AH46Ufc_Udlzs8-Xx6kvwC1FKg6</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2374085999</pqid></control><display><type>article</type><title>Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale</title><source>Wiley-Blackwell Journals</source><source>MEDLINE</source><creator>German, Christopher A. ; Sinsheimer, Janet S. ; Klimentidis, Yann C. ; Zhou, Hua ; Zhou, Jin J.</creator><creatorcontrib>German, Christopher A. ; Sinsheimer, Janet S. ; Klimentidis, Yann C. ; Zhou, Hua ; Zhou, Jin J.</creatorcontrib><description>Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. 
Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.</description><identifier>ISSN: 0741-0395</identifier><identifier>EISSN: 1098-2272</identifier><identifier>DOI: 10.1002/gepi.22276</identifier><identifier>PMID: 31879980</identifier><language>eng</language><publisher>United States: Wiley Subscription Services, Inc</publisher><subject>Algorithms ; Association analysis ; Biobanks ; Biological Specimen Banks ; Case-Control Studies ; Computer Simulation ; electronic health record ; Electronic health records ; Electronic medical records ; Genetic analysis ; Genome-wide association studies ; Genome-Wide Association Study ; Genomes ; Humans ; Hypertension - genetics ; Models, Genetic ; ordered multinomial regression ; Phenotype ; Phenotypes ; Phenotyping ; Polymorphism, Single Nucleotide - genetics ; Pulmonary Disease, Chronic Obstructive - genetics ; Pulmonary Disease, Chronic Obstructive - physiopathology ; Regression Analysis ; Respiratory Function Tests ; Statistical analysis</subject><ispartof>Genetic epidemiology, 2020-04, Vol.44 (3), p.248-260</ispartof><rights>2019 Wiley Periodicals, Inc.</rights><rights>2020 Wiley Periodicals, 
Inc.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c4486-63e60d1c1ad9c3ec92f32becfbdcafacb317a011876504af88add211634239d63</citedby><cites>FETCH-LOGICAL-c4486-63e60d1c1ad9c3ec92f32becfbdcafacb317a011876504af88add211634239d63</cites><orcidid>0000-0001-7983-0274</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fgepi.22276$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fgepi.22276$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>230,314,780,784,885,1417,27924,27925,45574,45575</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31879980$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>German, Christopher A.</creatorcontrib><creatorcontrib>Sinsheimer, Janet S.</creatorcontrib><creatorcontrib>Klimentidis, Yann C.</creatorcontrib><creatorcontrib>Zhou, Hua</creatorcontrib><creatorcontrib>Zhou, Jin J.</creatorcontrib><title>Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale</title><title>Genetic epidemiology</title><addtitle>Genet Epidemiol</addtitle><description>Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. 
Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.</description><subject>Algorithms</subject><subject>Association analysis</subject><subject>Biobanks</subject><subject>Biological Specimen Banks</subject><subject>Case-Control Studies</subject><subject>Computer Simulation</subject><subject>electronic health record</subject><subject>Electronic health records</subject><subject>Electronic medical records</subject><subject>Genetic analysis</subject><subject>Genome-wide association studies</subject><subject>Genome-Wide Association Study</subject><subject>Genomes</subject><subject>Humans</subject><subject>Hypertension - genetics</subject><subject>Models, Genetic</subject><subject>ordered multinomial regression</subject><subject>Phenotype</subject><subject>Phenotypes</subject><subject>Phenotyping</subject><subject>Polymorphism, Single Nucleotide - genetics</subject><subject>Pulmonary Disease, Chronic Obstructive - genetics</subject><subject>Pulmonary Disease, Chronic Obstructive - physiopathology</subject><subject>Regression Analysis</subject><subject>Respiratory Function Tests</subject><subject>Statistical 
analysis</subject><issn>0741-0395</issn><issn>1098-2272</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kU1rFTEUhoMo9lrd-AMk4EaEqfmYj2QjaKltoVAXug5nkjO3qTPJmMwo99-b662lunCV5OTh4ZzzEvKSsxPOmHi3xdmfCCG69hHZcKZVVe7iMdmwruYVk7o5Is9yvmWM81o3T8mR5KrTWrENGa6Tw4SOTuu4-BAnDyNNuE2Ys4-BDjHRLQZcvKWQc7Qeln0dAoy77DONA43J-fKk8w2GuOxmzBQW-tHHHsI3mi2M-Jw8GWDM-OLuPCZfP519Ob2orq7PL08_XFW2rlVbtRJb5rjl4LSVaLUYpOjRDr2zMIDtJe-gTKG6tmE1DEqBc4LzVtZCatfKY_L-4J3XfkJnMSwJRjMnP0HamQje_P0T_I3Zxh9GiaatG1YEb-4EKX5fMS9m8tniOELAuGYjpOSiUU2nCvr6H_Q2rqksYk91NVON1rpQbw-UTTHnhMN9M5yZfXxmH5_5HV-BXz1s_x79k1cB-AH46Ufc_Udlzs8-Xx6kvwC1FKg6</recordid><startdate>202004</startdate><enddate>202004</enddate><creator>German, Christopher A.</creator><creator>Sinsheimer, Janet S.</creator><creator>Klimentidis, Yann C.</creator><creator>Zhou, Hua</creator><creator>Zhou, Jin J.</creator><general>Wiley Subscription Services, Inc</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QP</scope><scope>7QR</scope><scope>7TK</scope><scope>8FD</scope><scope>FR3</scope><scope>K9.</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-7983-0274</orcidid></search><sort><creationdate>202004</creationdate><title>Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale</title><author>German, Christopher A. ; Sinsheimer, Janet S. ; Klimentidis, Yann C. 
; Zhou, Hua ; Zhou, Jin J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4486-63e60d1c1ad9c3ec92f32becfbdcafacb317a011876504af88add211634239d63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Association analysis</topic><topic>Biobanks</topic><topic>Biological Specimen Banks</topic><topic>Case-Control Studies</topic><topic>Computer Simulation</topic><topic>electronic health record</topic><topic>Electronic health records</topic><topic>Electronic medical records</topic><topic>Genetic analysis</topic><topic>Genome-wide association studies</topic><topic>Genome-Wide Association Study</topic><topic>Genomes</topic><topic>Humans</topic><topic>Hypertension - genetics</topic><topic>Models, Genetic</topic><topic>ordered multinomial regression</topic><topic>Phenotype</topic><topic>Phenotypes</topic><topic>Phenotyping</topic><topic>Polymorphism, Single Nucleotide - genetics</topic><topic>Pulmonary Disease, Chronic Obstructive - genetics</topic><topic>Pulmonary Disease, Chronic Obstructive - physiopathology</topic><topic>Regression Analysis</topic><topic>Respiratory Function Tests</topic><topic>Statistical analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>German, Christopher A.</creatorcontrib><creatorcontrib>Sinsheimer, Janet S.</creatorcontrib><creatorcontrib>Klimentidis, Yann C.</creatorcontrib><creatorcontrib>Zhou, Hua</creatorcontrib><creatorcontrib>Zhou, Jin J.</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Neurosciences 
Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genetic epidemiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>German, Christopher A.</au><au>Sinsheimer, Janet S.</au><au>Klimentidis, Yann C.</au><au>Zhou, Hua</au><au>Zhou, Jin J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale</atitle><jtitle>Genetic epidemiology</jtitle><addtitle>Genet Epidemiol</addtitle><date>2020-04</date><risdate>2020</risdate><volume>44</volume><issue>3</issue><spage>248</spage><epage>260</epage><pages>248-260</pages><issn>0741-0395</issn><eissn>1098-2272</eissn><abstract>Logistic regression is the primary analysis tool for binary traits in genome‐wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. 
To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case–control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.</abstract><cop>United States</cop><pub>Wiley Subscription Services, Inc</pub><pmid>31879980</pmid><doi>10.1002/gepi.22276</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-7983-0274</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0741-0395
ispartof Genetic epidemiology, 2020-04, Vol.44 (3), p.248-260
issn 0741-0395
1098-2272
language eng
recordid cdi_crossref_primary_10_1002_gepi_22276
source Wiley-Blackwell Journals; MEDLINE
subjects Algorithms
Association analysis
Biobanks
Biological Specimen Banks
Case-Control Studies
Computer Simulation
electronic health record
Electronic health records
Electronic medical records
Genetic analysis
Genome-wide association studies
Genome-Wide Association Study
Genomes
Humans
Hypertension - genetics
Models, Genetic
ordered multinomial regression
Phenotype
Phenotypes
Phenotyping
Polymorphism, Single Nucleotide - genetics
Pulmonary Disease, Chronic Obstructive - genetics
Pulmonary Disease, Chronic Obstructive - physiopathology
Regression Analysis
Respiratory Function Tests
Statistical analysis
title Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T14%3A21%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Ordered%20multinomial%20regression%20for%20genetic%20association%20analysis%20of%20ordinal%20phenotypes%20at%20Biobank%20scale&rft.jtitle=Genetic%20epidemiology&rft.au=German,%20Christopher%20A.&rft.date=2020-04&rft.volume=44&rft.issue=3&rft.spage=248&rft.epage=260&rft.pages=248-260&rft.issn=0741-0395&rft.eissn=1098-2272&rft_id=info:doi/10.1002/gepi.22276&rft_dat=%3Cproquest_pubme%3E2374085999%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2374085999&rft_id=info:pmid/31879980&rfr_iscdi=true