Optimal control of false discovery criteria in the two‐group model

The highly influential two‐group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), i...

Bibliographic details
Published in:	Journal of the Royal Statistical Society. Series B, Statistical methodology, 2021-02, Vol.83 (1), p.133-155
Main authors:	Heller, Ruth; Rosset, Saharon
Format:	Article
Language:	eng
Subjects:	Algorithms; Criteria; Discovery; false discovery rate; Gene expression; Hypotheses; infinite linear programming; large‐scale inference; Model testing; multiple testing; Optimal control; Optimization; Policies; positive FDR; Regression analysis; Statistical analysis; Statistical methods; Statistical tests; Statistics
Online access:	Full text

container_end_page	155
container_issue	1
container_start_page	133
container_title	Journal of the Royal Statistical Society. Series B, Statistical methodology
container_volume	83
creator	Heller, Ruth ; Rosset, Saharon
description	The highly influential two‐group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), the probability of the hypothesis being null given the set of test statistics, with a fixed threshold. We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) in the general two‐group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation. We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. We illustrate these procedures on gene expression studies.
doi_str_mv	10.1111/rssb.12403
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2488847454</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2488847454</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3343-480e7d3649dd5ab0fce773658310c45a6efe810510cdf3b6894d3338c3b48bb63</originalsourceid><addsrcrecordid>eNp9kE1OwzAQhS0EEqWw4QSW2CGl2B3HdpZQfqVKlSisrcR2IFUaBzuh6o4jcEZOgtuw5m1mRvrejOYhdE7JhEZd-RCKCZ0yAgdoRBkXSSa5PIw98CwRjE6P0UkIKxLFBYzQ7aLtqnVeY-2azrsauxKXeR0sNlXQ7tP6Lda-6qyvclw1uHu3uNu4n6_vN-_6Fq-dsfUpOtp7zv7qGL3e373MHpP54uFpdj1PNACDhElihQHOMmPSvCCltkIATyVQolmac1taSUkaJ1NCwWXGDABIDQWTRcFhjC6Gva13H70NnVq53jfxpJoyKSUTLGWRuhwo7V0I3paq9fFFv1WUqF1KapeS2qcUYTrAm6q2239I9bxc3gyeX0uxaoY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2488847454</pqid></control><display><type>article</type><title>Optimal control of false discovery criteria in the two‐group model</title><source>Business Source Complete</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>Wiley Online Library All Journals</source><creator>Heller, Ruth ; Rosset, Saharon</creator><creatorcontrib>Heller, Ruth ; Rosset, Saharon</creatorcontrib><description>The highly influential two‐group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), the probability of the hypothesis being null given the set of test statistics, with a fixed threshold. 
We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) in the general two‐group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation. We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. We illustrate these procedures on gene expression studies.</description><identifier>ISSN: 1369-7412</identifier><identifier>EISSN: 1467-9868</identifier><identifier>DOI: 10.1111/rssb.12403</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Criteria ; Discovery ; false discovery rate ; Gene expression ; Hypotheses ; infinite linear programming ; large‐scale inference ; Model testing ; multiple testing ; Optimal control ; Optimization ; Policies ; positive FDR ; Regression analysis ; Statistical analysis ; Statistical methods ; Statistical tests ; Statistics</subject><ispartof>Journal of the Royal Statistical Society. 
Series B, Statistical methodology, 2021-02, Vol.83 (1), p.133-155</ispartof><rights>2020 Royal Statistical Society</rights><rights>Copyright © 2021 The Royal Statistical Society and Blackwell Publishing Ltd</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3343-480e7d3649dd5ab0fce773658310c45a6efe810510cdf3b6894d3338c3b48bb63</citedby><cites>FETCH-LOGICAL-c3343-480e7d3649dd5ab0fce773658310c45a6efe810510cdf3b6894d3338c3b48bb63</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1111%2Frssb.12403$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1111%2Frssb.12403$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1417,27924,27925,45574,45575</link.rule.ids></links><search><creatorcontrib>Heller, Ruth</creatorcontrib><creatorcontrib>Rosset, Saharon</creatorcontrib><title>Optimal control of false discovery criteria in the two‐group model</title><title>Journal of the Royal Statistical Society. Series B, Statistical methodology</title><description>The highly influential two‐group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), the probability of the hypothesis being null given the set of test statistics, with a fixed threshold. 
We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) in the general two‐group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation. We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. We illustrate these procedures on gene expression studies.</description><subject>Algorithms</subject><subject>Criteria</subject><subject>Discovery</subject><subject>false discovery rate</subject><subject>Gene expression</subject><subject>Hypotheses</subject><subject>infinite linear programming</subject><subject>large‐scale inference</subject><subject>Model testing</subject><subject>multiple testing</subject><subject>Optimal control</subject><subject>Optimization</subject><subject>Policies</subject><subject>positive FDR</subject><subject>Regression analysis</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><subject>Statistical tests</subject><subject>Statistics</subject><issn>1369-7412</issn><issn>1467-9868</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1OwzAQhS0EEqWw4QSW2CGl2B3HdpZQfqVKlSisrcR2IFUaBzuh6o4jcEZOgtuw5m1mRvrejOYhdE7JhEZd-RCKCZ0yAgdoRBkXSSa5PIw98CwRjE6P0UkIKxLFBYzQ7aLtqnVeY-2azrsauxKXeR0sNlXQ7tP6Lda-6qyvclw1uHu3uNu4n6_vN-_6Fq-dsfUpOtp7zv7qGL3e373MHpP54uFpdj1PNACDhElihQHOMmPSvCCltkIATyVQolmac1taSUkaJ1NCwWXGDABIDQWTRcFhjC6Gva13H70NnVq53jfxpJoyKSUTLGWRuhwo7V0I3paq9fFFv1WUqF1KapeS2qcUYTrAm6q2239I9bxc3gyeX0uxaoY</recordid><startdate>202102</startdate><enddate>202102</enddate><creator>Heller, Ruth</creator><creator>Rosset, 
Saharon</creator><general>Oxford University Press</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8BJ</scope><scope>8FD</scope><scope>FQK</scope><scope>JBE</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>202102</creationdate><title>Optimal control of false discovery criteria in the two‐group model</title><author>Heller, Ruth ; Rosset, Saharon</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3343-480e7d3649dd5ab0fce773658310c45a6efe810510cdf3b6894d3338c3b48bb63</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>Criteria</topic><topic>Discovery</topic><topic>false discovery rate</topic><topic>Gene expression</topic><topic>Hypotheses</topic><topic>infinite linear programming</topic><topic>large‐scale inference</topic><topic>Model testing</topic><topic>multiple testing</topic><topic>Optimal control</topic><topic>Optimization</topic><topic>Policies</topic><topic>positive FDR</topic><topic>Regression analysis</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><topic>Statistical tests</topic><topic>Statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Heller, Ruth</creatorcontrib><creatorcontrib>Rosset, Saharon</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>Technology Research Database</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems 
Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of the Royal Statistical Society. Series B, Statistical methodology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Heller, Ruth</au><au>Rosset, Saharon</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimal control of false discovery criteria in the two‐group model</atitle><jtitle>Journal of the Royal Statistical Society. Series B, Statistical methodology</jtitle><date>2021-02</date><risdate>2021</risdate><volume>83</volume><issue>1</issue><spage>133</spage><epage>155</epage><pages>133-155</pages><issn>1369-7412</issn><eissn>1467-9868</eissn><abstract>The highly influential two‐group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal control of the marginal false discovery rate (mFDR), in the sense that it provides maximal power (expected true discoveries) subject to mFDR control, is known to be achieved by thresholding the local false discovery rate (locFDR), the probability of the hypothesis being null given the set of test statistics, with a fixed threshold. We address the challenge of controlling optimally the popular false discovery rate (FDR) or positive FDR (pFDR) in the general two‐group model, which also allows for dependence between the test statistics. These criteria are less conservative than the mFDR criterion, so they make more rejections in expectation. We derive their optimal multiple testing (OMT) policies, which turn out to be thresholding the locFDR with a threshold that is a function of the entire set of statistics. We develop an efficient algorithm for finding these policies, and use it for problems with thousands of hypotheses. 
We illustrate these procedures on gene expression studies.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><doi>10.1111/rssb.12403</doi><tpages>23</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1369-7412
ispartof	Journal of the Royal Statistical Society. Series B, Statistical methodology, 2021-02, Vol.83 (1), p.133-155
issn	1369-7412 1467-9868
language	eng
recordid	cdi_proquest_journals_2488847454
source	Business Source Complete; Oxford University Press Journals All Titles (1996-Current); Wiley Online Library All Journals
subjects	Algorithms ; Criteria ; Discovery ; false discovery rate ; Gene expression ; Hypotheses ; infinite linear programming ; large‐scale inference ; Model testing ; multiple testing ; Optimal control ; Optimization ; Policies ; positive FDR ; Regression analysis ; Statistical analysis ; Statistical methods ; Statistical tests ; Statistics
title	Optimal control of false discovery criteria in the two‐group model
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T04%3A10%3A51IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimal%20control%20of%20false%20discovery%20criteria%20in%20the%20two%E2%80%90group%20model&rft.jtitle=Journal%20of%20the%20Royal%20Statistical%20Society.%20Series%20B,%20Statistical%20methodology&rft.au=Heller,%20Ruth&rft.date=2021-02&rft.volume=83&rft.issue=1&rft.spage=133&rft.epage=155&rft.pages=133-155&rft.issn=1369-7412&rft.eissn=1467-9868&rft_id=info:doi/10.1111/rssb.12403&rft_dat=%3Cproquest_cross%3E2488847454%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2488847454&rft_id=info:pmid/&rfr_iscdi=true
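
The abstract's baseline rule, thresholding the local false discovery rate (locFDR) with a fixed threshold, can be illustrated with a minimal sketch. It is not the authors' algorithm and does not implement the OMT policies for FDR or pFDR, whose threshold depends on the entire set of statistics. Everything specific here is an assumption made for illustration: independent test statistics, a standard normal null density, a unit-variance normal alternative with mean `alt_mean`, a null probability `pi0`, and the threshold value. Under independence, the locFDR of a hypothesis depends only on its own statistic, which is what the code relies on.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def loc_fdr(z, pi0=0.9, alt_mean=3.0):
    """Posterior probability that the hypothesis is null given z.

    Assumed two-group model (hypothetical parameters):
    z ~ pi0 * N(0, 1) + (1 - pi0) * N(alt_mean, 1), independently.
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, mean=alt_mean)
    return null_part / (null_part + alt_part)


def reject_fixed_threshold(z_values, threshold=0.2, pi0=0.9, alt_mean=3.0):
    """Reject each hypothesis whose locFDR is at most a fixed threshold."""
    return [loc_fdr(z, pi0, alt_mean) <= threshold for z in z_values]


# Hypothetical test statistics, chosen only to show both outcomes.
z_values = [-0.5, 0.3, 1.2, 2.9, 4.1]
decisions = reject_fixed_threshold(z_values)
for z, rejected in zip(z_values, decisions):
    print(f"z={z:5.2f}  locFDR={loc_fdr(z):.4f}  reject={rejected}")
```

In this sketch the threshold is a constant chosen in advance. The policies described in the abstract differ precisely in that the threshold is a function of all the observed statistics.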