Evaluation of polygenic prediction methodology within a reference-standardized framework

The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of var...

Bibliographic details
Published in: PLoS genetics 2021-05, Vol.17 (5), p.e1009021
Main authors: Pain, Oliver, Glanville, Kylie P, Hagenaars, Saskia P, Selzam, Saskia, Fürtjes, Anna E, Gaspar, Héléna A, Coleman, Jonathan R I, Rimfeld, Kaili, Breen, Gerome, Plomin, Robert, Folkersen, Lasse, Lewis, Cathryn M
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 5
container_start_page e1009021
container_title PLoS genetics
container_volume 17
creator Pain, Oliver
Glanville, Kylie P
Hagenaars, Saskia P
Selzam, Saskia
Fürtjes, Anna E
Gaspar, Héléna A
Coleman, Jonathan R I
Rimfeld, Kaili
Breen, Gerome
Plomin, Robert
Folkersen, Lasse
Lewis, Cathryn M
description The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.
doi_str_mv 10.1371/journal.pgen.1009021
format Article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_2541858057</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A663948816</galeid><doaj_id>oai_doaj_org_article_16d89b298f4a47c48c8b77552d412603</doaj_id><sourcerecordid>A663948816</sourcerecordid><originalsourceid>FETCH-LOGICAL-c792t-2d7b52b1f3a9fe4ea805174fd1343a7bf9449ddf9c0a91aaa97f2a476e6efc33</originalsourceid><addsrcrecordid>eNqVk99r1TAUx4sobk7_A9GCIPpwr02aNs2LMMbUC8OBDvEtnOZHb2baXJN28_rXm-5241b2oOQh4eRzvic5P5LkOcqWKKfo3aUbfAd2uWlUt0RZxjKMHiSHqCjyBSUZebh3PkiehHCZZXlRMfo4OchzRuIdPky-n16BHaA3rkudTjfObqOeEenGK2nEjb1V_dpJZ12zTa9NvzZdCqlXWnnVCbUIPXQSvDS_lUy1h1ZdO__jafJIgw3q2bQfJRcfTi9OPi3Ozj-uTo7PFoIy3C-wpHWBa6RzYFoRBVVWIEq0RDnJgdaaEcKk1ExkwBAAMKoxEFqqUmmR50fJy53sxrrAp5wEjguCqiJq0UisdoR0cMk33rTgt9yB4TcG5xsOvjfCKo5KWbEas0qTGEKQSlQ1pUWBJUG4zMZo76doQ90qKVTXe7Az0flNZ9a8cVe8QhjhqogCbyYB734OKvS8NUEoa6FTbhjfjXHOSsRG9NVf6P2_m6gG4gdMp12MK0ZRflyWscxVhcpILe-h4pKqNcJ1Sptonzm8nTlEple_-gaGEPjq65f_YD__O3v-bc6-3mPXCmy_Ds4OY0-GOUh2oPAuhNiYdwVBGR9n5TZzfJwVPs1KdHuxX8w7p9vhyP8ARMQO-g</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2541858057</pqid></control><display><type>article</type><title>Evaluation of polygenic prediction methodology within a reference-standardized framework</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>PubMed Central</source><source>Public Library of Science (PLoS)</source><creator>Pain, Oliver ; Glanville, Kylie P ; Hagenaars, Saskia P ; Selzam, Saskia ; Fürtjes, Anna E ; Gaspar, Héléna A ; Coleman, Jonathan R I ; Rimfeld, Kaili ; Breen, Gerome ; Plomin, Robert ; Folkersen, Lasse ; Lewis, Cathryn M</creator><creatorcontrib>Pain, Oliver ; Glanville, Kylie P ; Hagenaars, Saskia P ; Selzam, Saskia ; Fürtjes, Anna E ; Gaspar, Héléna A ; Coleman, Jonathan R 
I ; Rimfeld, Kaili ; Breen, Gerome ; Plomin, Robert ; Folkersen, Lasse ; Lewis, Cathryn M</creatorcontrib><description>The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. 
This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.</description><identifier>ISSN: 1553-7404</identifier><identifier>ISSN: 1553-7390</identifier><identifier>EISSN: 1553-7404</identifier><identifier>DOI: 10.1371/journal.pgen.1009021</identifier><identifier>PMID: 33945532</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Biobanks ; Biology and Life Sciences ; Breast cancer ; Cardiovascular disease ; Computer Simulation ; Consortia ; Coronary artery ; Datasets as Topic ; Diabetes mellitus ; Estimates ; Ethics ; Genetic diversity ; Genetic research ; Genetic screening ; Genetic variation ; Genome-Wide Association Study ; Genotype ; Health risk assessment ; Heart diseases ; Humans ; Inflammatory bowel diseases ; Linkage disequilibrium ; Medicine and Health Sciences ; Methods ; Models, Genetic ; Multifactorial Inheritance - genetics ; Multiple sclerosis ; Physical Sciences ; Polymorphism, Single Nucleotide - genetics ; Precision Medicine ; Prostate cancer ; Reproducibility of Results ; Research and Analysis Methods ; Rheumatoid arthritis ; Sample size ; Statistics ; Twin Studies as Topic ; Twins - genetics ; United Kingdom</subject><ispartof>PLoS genetics, 2021-05, Vol.17 (5), p.e1009021</ispartof><rights>COPYRIGHT 2021 Public Library of Science</rights><rights>2021 Pain et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2021 Pain et al 2021 Pain et al</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c792t-2d7b52b1f3a9fe4ea805174fd1343a7bf9449ddf9c0a91aaa97f2a476e6efc33</citedby><cites>FETCH-LOGICAL-c792t-2d7b52b1f3a9fe4ea805174fd1343a7bf9449ddf9c0a91aaa97f2a476e6efc33</cites><orcidid>0000-0001-8321-9435 ; 0000-0002-8249-8476 ; 0000-0001-5680-3281 ; 0000-0003-0708-9530 ; 0000-0003-4985-8174 ; 0000-0002-0756-3629 ; 0000-0001-9697-8596 ; 0000-0001-6590-4957 ; 0000-0002-5540-2707 ; 0000-0002-6759-0944</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8121285/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC8121285/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,2096,2915,23845,27901,27902,53766,53768,79343,79344</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33945532$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Pain, Oliver</creatorcontrib><creatorcontrib>Glanville, Kylie P</creatorcontrib><creatorcontrib>Hagenaars, Saskia P</creatorcontrib><creatorcontrib>Selzam, Saskia</creatorcontrib><creatorcontrib>Fürtjes, Anna E</creatorcontrib><creatorcontrib>Gaspar, Héléna A</creatorcontrib><creatorcontrib>Coleman, Jonathan R I</creatorcontrib><creatorcontrib>Rimfeld, Kaili</creatorcontrib><creatorcontrib>Breen, Gerome</creatorcontrib><creatorcontrib>Plomin, Robert</creatorcontrib><creatorcontrib>Folkersen, Lasse</creatorcontrib><creatorcontrib>Lewis, Cathryn M</creatorcontrib><title>Evaluation of 
polygenic prediction methodology within a reference-standardized framework</title><title>PLoS genetics</title><addtitle>PLoS Genet</addtitle><description>The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. 
This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.</description><subject>Biobanks</subject><subject>Biology and Life Sciences</subject><subject>Breast cancer</subject><subject>Cardiovascular disease</subject><subject>Computer Simulation</subject><subject>Consortia</subject><subject>Coronary artery</subject><subject>Datasets as Topic</subject><subject>Diabetes mellitus</subject><subject>Estimates</subject><subject>Ethics</subject><subject>Genetic diversity</subject><subject>Genetic research</subject><subject>Genetic screening</subject><subject>Genetic variation</subject><subject>Genome-Wide Association Study</subject><subject>Genotype</subject><subject>Health risk assessment</subject><subject>Heart diseases</subject><subject>Humans</subject><subject>Inflammatory bowel diseases</subject><subject>Linkage disequilibrium</subject><subject>Medicine and Health Sciences</subject><subject>Methods</subject><subject>Models, Genetic</subject><subject>Multifactorial Inheritance - genetics</subject><subject>Multiple sclerosis</subject><subject>Physical Sciences</subject><subject>Polymorphism, Single Nucleotide - genetics</subject><subject>Precision Medicine</subject><subject>Prostate cancer</subject><subject>Reproducibility of Results</subject><subject>Research and Analysis Methods</subject><subject>Rheumatoid arthritis</subject><subject>Sample size</subject><subject>Statistics</subject><subject>Twin Studies as Topic</subject><subject>Twins - genetics</subject><subject>United 
Kingdom</subject><issn>1553-7404</issn><issn>1553-7390</issn><issn>1553-7404</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>BENPR</sourceid><sourceid>DOA</sourceid><recordid>eNqVk99r1TAUx4sobk7_A9GCIPpwr02aNs2LMMbUC8OBDvEtnOZHb2baXJN28_rXm-5241b2oOQh4eRzvic5P5LkOcqWKKfo3aUbfAd2uWlUt0RZxjKMHiSHqCjyBSUZebh3PkiehHCZZXlRMfo4OchzRuIdPky-n16BHaA3rkudTjfObqOeEenGK2nEjb1V_dpJZ12zTa9NvzZdCqlXWnnVCbUIPXQSvDS_lUy1h1ZdO__jafJIgw3q2bQfJRcfTi9OPi3Ozj-uTo7PFoIy3C-wpHWBa6RzYFoRBVVWIEq0RDnJgdaaEcKk1ExkwBAAMKoxEFqqUmmR50fJy53sxrrAp5wEjguCqiJq0UisdoR0cMk33rTgt9yB4TcG5xsOvjfCKo5KWbEas0qTGEKQSlQ1pUWBJUG4zMZo76doQ90qKVTXe7Az0flNZ9a8cVe8QhjhqogCbyYB734OKvS8NUEoa6FTbhjfjXHOSsRG9NVf6P2_m6gG4gdMp12MK0ZRflyWscxVhcpILe-h4pKqNcJ1Sptonzm8nTlEple_-gaGEPjq65f_YD__O3v-bc6-3mPXCmy_Ds4OY0-GOUh2oPAuhNiYdwVBGR9n5TZzfJwVPs1KdHuxX8w7p9vhyP8ARMQO-g</recordid><startdate>20210504</startdate><enddate>20210504</enddate><creator>Pain, Oliver</creator><creator>Glanville, Kylie P</creator><creator>Hagenaars, Saskia P</creator><creator>Selzam, Saskia</creator><creator>Fürtjes, Anna E</creator><creator>Gaspar, Héléna A</creator><creator>Coleman, Jonathan R I</creator><creator>Rimfeld, Kaili</creator><creator>Breen, Gerome</creator><creator>Plomin, Robert</creator><creator>Folkersen, Lasse</creator><creator>Lewis, Cathryn M</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISN</scope><scope>ISR</scope><scope>3V.</scope><scope>7QP</scope><scope>7QR</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7TO</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-8321-9435</orcidid><orcidid>https://orcid.org/0000-0002-8249-8476</orcidid><orcidid>https://orcid.org/0000-0001-5680-3281</orcidid><orcidid>https://orcid.org/0000-0003-0708-9530</orcidid><orcidid>https://orcid.org/0000-0003-4985-8174</orcidid><orcidid>https://orcid.org/0000-0002-0756-3629</orcidid><orcidid>https://orcid.org/0000-0001-9697-8596</orcidid><orcidid>https://orcid.org/0000-0001-6590-4957</orcidid><orcidid>https://orcid.org/0000-0002-5540-2707</orcidid><orcidid>https://orcid.org/0000-0002-6759-0944</orcidid></search><sort><creationdate>20210504</creationdate><title>Evaluation of polygenic prediction methodology within a reference-standardized framework</title><author>Pain, Oliver ; Glanville, Kylie P ; Hagenaars, Saskia P ; Selzam, Saskia ; Fürtjes, Anna E ; Gaspar, Héléna A ; Coleman, Jonathan R I ; Rimfeld, Kaili ; Breen, Gerome ; Plomin, Robert ; Folkersen, Lasse ; Lewis, 
Cathryn M</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c792t-2d7b52b1f3a9fe4ea805174fd1343a7bf9449ddf9c0a91aaa97f2a476e6efc33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Biobanks</topic><topic>Biology and Life Sciences</topic><topic>Breast cancer</topic><topic>Cardiovascular disease</topic><topic>Computer Simulation</topic><topic>Consortia</topic><topic>Coronary artery</topic><topic>Datasets as Topic</topic><topic>Diabetes mellitus</topic><topic>Estimates</topic><topic>Ethics</topic><topic>Genetic diversity</topic><topic>Genetic research</topic><topic>Genetic screening</topic><topic>Genetic variation</topic><topic>Genome-Wide Association Study</topic><topic>Genotype</topic><topic>Health risk assessment</topic><topic>Heart diseases</topic><topic>Humans</topic><topic>Inflammatory bowel diseases</topic><topic>Linkage disequilibrium</topic><topic>Medicine and Health Sciences</topic><topic>Methods</topic><topic>Models, Genetic</topic><topic>Multifactorial Inheritance - genetics</topic><topic>Multiple sclerosis</topic><topic>Physical Sciences</topic><topic>Polymorphism, Single Nucleotide - genetics</topic><topic>Precision Medicine</topic><topic>Prostate cancer</topic><topic>Reproducibility of Results</topic><topic>Research and Analysis Methods</topic><topic>Rheumatoid arthritis</topic><topic>Sample size</topic><topic>Statistics</topic><topic>Twin Studies as Topic</topic><topic>Twins - genetics</topic><topic>United Kingdom</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Pain, Oliver</creatorcontrib><creatorcontrib>Glanville, Kylie P</creatorcontrib><creatorcontrib>Hagenaars, Saskia P</creatorcontrib><creatorcontrib>Selzam, Saskia</creatorcontrib><creatorcontrib>Fürtjes, Anna E</creatorcontrib><creatorcontrib>Gaspar, Héléna A</creatorcontrib><creatorcontrib>Coleman, Jonathan R 
I</creatorcontrib><creatorcontrib>Rimfeld, Kaili</creatorcontrib><creatorcontrib>Breen, Gerome</creatorcontrib><creatorcontrib>Plomin, Robert</creatorcontrib><creatorcontrib>Folkersen, Lasse</creatorcontrib><creatorcontrib>Lewis, Cathryn M</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Opposing Viewpoints</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central 
Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Pain, Oliver</au><au>Glanville, Kylie P</au><au>Hagenaars, Saskia P</au><au>Selzam, Saskia</au><au>Fürtjes, Anna E</au><au>Gaspar, Héléna A</au><au>Coleman, Jonathan R I</au><au>Rimfeld, Kaili</au><au>Breen, Gerome</au><au>Plomin, Robert</au><au>Folkersen, Lasse</au><au>Lewis, Cathryn M</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Evaluation of polygenic prediction methodology within a reference-standardized framework</atitle><jtitle>PLoS genetics</jtitle><addtitle>PLoS 
Genet</addtitle><date>2021-05-04</date><risdate>2021</risdate><volume>17</volume><issue>5</issue><spage>e1009021</spage><pages>e1009021-</pages><issn>1553-7404</issn><issn>1553-7390</issn><eissn>1553-7404</eissn><abstract>The predictive utility of polygenic scores is increasing, and many polygenic scoring methods are available, but it is unclear which method performs best. This study evaluates the predictive utility of polygenic scoring methods within a reference-standardized framework, which uses a common set of variants and reference-based estimates of linkage disequilibrium and allele frequencies to construct scores. Eight polygenic score methods were tested: p-value thresholding and clumping (pT+clump), SBLUP, lassosum, LDpred1, LDpred2, PRScs, DBSLMM and SBayesR, evaluating their performance to predict outcomes in UK Biobank and the Twins Early Development Study (TEDS). Strategies to identify optimal p-value thresholds and shrinkage parameters were compared, including 10-fold cross validation, pseudovalidation and infinitesimal models (with no validation sample), and multi-polygenic score elastic net models. LDpred2, lassosum and PRScs performed strongly using 10-fold cross-validation to identify the most predictive p-value threshold or shrinkage parameter, giving a relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values. Using pseudovalidation, the best methods were PRScs, DBSLMM and SBayesR. PRScs pseudovalidation was only 3% worse than the best polygenic score identified by 10-fold cross validation. Elastic net models containing polygenic scores based on a range of parameters consistently improved prediction over any single polygenic score. Within a reference-standardized framework, the best polygenic prediction was achieved using LDpred2, lassosum and PRScs, modeling multiple polygenic scores derived using multiple parameters. 
This study will help researchers performing polygenic score studies to select the most powerful and predictive analysis methods.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>33945532</pmid><doi>10.1371/journal.pgen.1009021</doi><orcidid>https://orcid.org/0000-0001-8321-9435</orcidid><orcidid>https://orcid.org/0000-0002-8249-8476</orcidid><orcidid>https://orcid.org/0000-0001-5680-3281</orcidid><orcidid>https://orcid.org/0000-0003-0708-9530</orcidid><orcidid>https://orcid.org/0000-0003-4985-8174</orcidid><orcidid>https://orcid.org/0000-0002-0756-3629</orcidid><orcidid>https://orcid.org/0000-0001-9697-8596</orcidid><orcidid>https://orcid.org/0000-0001-6590-4957</orcidid><orcidid>https://orcid.org/0000-0002-5540-2707</orcidid><orcidid>https://orcid.org/0000-0002-6759-0944</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7404
ispartof PLoS genetics, 2021-05, Vol.17 (5), p.e1009021
issn 1553-7404
1553-7390
1553-7404
language eng
recordid cdi_plos_journals_2541858057
source MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; PubMed Central; Public Library of Science (PLoS)
subjects Biobanks
Biology and Life Sciences
Breast cancer
Cardiovascular disease
Computer Simulation
Consortia
Coronary artery
Datasets as Topic
Diabetes mellitus
Estimates
Ethics
Genetic diversity
Genetic research
Genetic screening
Genetic variation
Genome-Wide Association Study
Genotype
Health risk assessment
Heart diseases
Humans
Inflammatory bowel diseases
Linkage disequilibrium
Medicine and Health Sciences
Methods
Models, Genetic
Multifactorial Inheritance - genetics
Multiple sclerosis
Physical Sciences
Polymorphism, Single Nucleotide - genetics
Precision Medicine
Prostate cancer
Reproducibility of Results
Research and Analysis Methods
Rheumatoid arthritis
Sample size
Statistics
Twin Studies as Topic
Twins - genetics
United Kingdom
title Evaluation of polygenic prediction methodology within a reference-standardized framework
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-02T00%3A56%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Evaluation%20of%20polygenic%20prediction%20methodology%20within%20a%20reference-standardized%20framework&rft.jtitle=PLoS%20genetics&rft.au=Pain,%20Oliver&rft.date=2021-05-04&rft.volume=17&rft.issue=5&rft.spage=e1009021&rft.pages=e1009021-&rft.issn=1553-7404&rft.eissn=1553-7404&rft_id=info:doi/10.1371/journal.pgen.1009021&rft_dat=%3Cgale_plos_%3EA663948816%3C/gale_plos_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2541858057&rft_id=info:pmid/33945532&rft_galeid=A663948816&rft_doaj_id=oai_doaj_org_article_16d89b298f4a47c48c8b77552d412603&rfr_iscdi=true
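
The description field above reports a "relative improvement of 16-18% over pT+clump in the correlation between observed and predicted outcome values". As a minimal sketch of that arithmetic, the Python below computes a Pearson correlation for two sets of predictions and the relative gain of one over the other. It assumes that relative improvement means (r_method - r_baseline) / r_baseline, which the record does not spell out; the function names and all numbers are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def relative_improvement(r_method, r_baseline):
    """Gain of r_method over r_baseline as a fraction of r_baseline.

    Assumed definition; the catalog record does not give the formula.
    """
    return (r_method - r_baseline) / r_baseline


# Made-up toy values, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_baseline = [1.0, 3.0, 2.0, 5.0, 4.0]  # stands in for a baseline score
pred_method = [1.0, 2.0, 4.0, 3.0, 5.0]    # stands in for a competing score

r_baseline = pearson_r(observed, pred_baseline)
r_method = pearson_r(observed, pred_method)
gain = relative_improvement(r_method, r_baseline)

print(f"baseline r = {r_baseline:.2f}, method r = {r_method:.2f}, "
      f"relative improvement = {gain:.1%}")
```

With these toy values the two correlations are 0.8 and 0.9, so the relative gain is 12.5%. The percentages reported in the description come from the study's own analyses in UK Biobank and TEDS and cannot be reproduced from this sketch.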