TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA

This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic i...

Bibliographic details
Published in: The Annals of statistics 2020-10, Vol.48 (5), p.2622-2645
Main authors: Fang, Ethan X., Ning, Yang, Li, Runze
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 2645
container_issue 5
container_start_page 2622
container_title The Annals of statistics
container_volume 48
creator Fang, Ethan X.
Ning, Yang
Li, Runze
description This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inference function approach which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying the Storey (J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 (2002) 479–498) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. We also apply the proposed procedure to a real data example.
doi_str_mv 10.1214/19-aos1900
format Article
fullrecord <record><control><sourceid>jstor_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8277154</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>27028716</jstor_id><sourcerecordid>27028716</sourcerecordid><originalsourceid>FETCH-LOGICAL-c494t-af2ddc3d046c4038b9a112dc4fa04f37f02bac59eaa4d8913dce34ad679a313f3</originalsourceid><addsrcrecordid>eNpdkU1Lw0AQhhdRtFYv3pWAFxGisx_ZzSIIoW3aQEzApudlmw9taZuaTQX_vVtai3oahvfhYYYXoSsMD5hg9oilq2uDJcAR6hDMfdeXnB-jDoAE16OcnaFzY-YA4ElGT9EZZYQLBqKDnrLBOHPS0BlHwyQKo16Q9AZOmL46o2g4cvvRyyAZR2kSxE6cJsMom_Sj7dIPsuACnVR6YcrL_eyiSTjIeiM3TodWFLs5k6x1dUWKIqcFMJ4zoP5UaoxJkbNKA6uoqIBMde7JUmtW-BLTIi8p0wUXUlNMK9pFzzvvejNdljZdtY1eqHUzW-rmS9V6pv4mq9m7eqs_lU-EwB6zgru9oKk_NqVp1XJm8nKx0Kuy3hhFPI9IC_tg0dt_6LzeNCv7niLMw1gA-NxS9zsqb2pjmrI6HINBbTtRWKogHW87sfDN7_MP6E8JFrjeAXPT1s0hJwKILzCn3990itw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2451170086</pqid></control><display><type>article</type><title>TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA</title><source>JSTOR Mathematics &amp; Statistics</source><source>JSTOR Archive Collection A-Z Listing</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Project Euclid Complete</source><creator>Fang, Ethan X. ; Ning, Yang ; Li, Runze</creator><creatorcontrib>Fang, Ethan X. ; Ning, Yang ; Li, Runze</creatorcontrib><description>This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. 
To deal with the challenge, we propose a new quadratic decorrelated inference function approach which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying the Storey (J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 (2002) 479–498) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. 
We also apply the proposed procedure to a real data example.</description><identifier>ISSN: 0090-5364</identifier><identifier>EISSN: 2168-8966</identifier><identifier>DOI: 10.1214/19-aos1900</identifier><identifier>PMID: 34267407</identifier><language>eng</language><publisher>United States: Institute of Mathematical Statistics</publisher><subject>Asymptotic methods ; Asymptotic properties ; Confidence intervals ; Estimating techniques ; Nuisance ; Parameter estimation ; Polynomials ; Statistical analysis ; Statistical inference ; Statistical tests</subject><ispartof>The Annals of statistics, 2020-10, Vol.48 (5), p.2622-2645</ispartof><rights>Institute of Mathematical Statistics, 2020</rights><rights>Copyright Institute of Mathematical Statistics Oct 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c494t-af2ddc3d046c4038b9a112dc4fa04f37f02bac59eaa4d8913dce34ad679a313f3</citedby><cites>FETCH-LOGICAL-c494t-af2ddc3d046c4038b9a112dc4fa04f37f02bac59eaa4d8913dce34ad679a313f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/27028716$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/27028716$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,780,784,803,832,885,27924,27925,58017,58021,58250,58254</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/34267407$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fang, Ethan X.</creatorcontrib><creatorcontrib>Ning, Yang</creatorcontrib><creatorcontrib>Li, Runze</creatorcontrib><title>TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA</title><title>The Annals of statistics</title><addtitle>Ann Stat</addtitle><description>This paper concerns statistical inference for 
longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inference function approach which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying the Storey (J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 (2002) 479–498) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. 
We also apply the proposed procedure to a real data example.</description><subject>Asymptotic methods</subject><subject>Asymptotic properties</subject><subject>Confidence intervals</subject><subject>Estimating techniques</subject><subject>Nuisance</subject><subject>Parameter estimation</subject><subject>Polynomials</subject><subject>Statistical analysis</subject><subject>Statistical inference</subject><subject>Statistical tests</subject><issn>0090-5364</issn><issn>2168-8966</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNpdkU1Lw0AQhhdRtFYv3pWAFxGisx_ZzSIIoW3aQEzApudlmw9taZuaTQX_vVtai3oahvfhYYYXoSsMD5hg9oilq2uDJcAR6hDMfdeXnB-jDoAE16OcnaFzY-YA4ElGT9EZZYQLBqKDnrLBOHPS0BlHwyQKo16Q9AZOmL46o2g4cvvRyyAZR2kSxE6cJsMom_Sj7dIPsuACnVR6YcrL_eyiSTjIeiM3TodWFLs5k6x1dUWKIqcFMJ4zoP5UaoxJkbNKA6uoqIBMde7JUmtW-BLTIi8p0wUXUlNMK9pFzzvvejNdljZdtY1eqHUzW-rmS9V6pv4mq9m7eqs_lU-EwB6zgru9oKk_NqVp1XJm8nKx0Kuy3hhFPI9IC_tg0dt_6LzeNCv7niLMw1gA-NxS9zsqb2pjmrI6HINBbTtRWKogHW87sfDN7_MP6E8JFrjeAXPT1s0hJwKILzCn3990itw</recordid><startdate>20201001</startdate><enddate>20201001</enddate><creator>Fang, Ethan X.</creator><creator>Ning, Yang</creator><creator>Li, Runze</creator><general>Institute of Mathematical Statistics</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20201001</creationdate><title>TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA</title><author>Fang, Ethan X. 
; Ning, Yang ; Li, Runze</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c494t-af2ddc3d046c4038b9a112dc4fa04f37f02bac59eaa4d8913dce34ad679a313f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Asymptotic methods</topic><topic>Asymptotic properties</topic><topic>Confidence intervals</topic><topic>Estimating techniques</topic><topic>Nuisance</topic><topic>Parameter estimation</topic><topic>Polynomials</topic><topic>Statistical analysis</topic><topic>Statistical inference</topic><topic>Statistical tests</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fang, Ethan X.</creatorcontrib><creatorcontrib>Ning, Yang</creatorcontrib><creatorcontrib>Li, Runze</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>The Annals of statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fang, Ethan X.</au><au>Ning, Yang</au><au>Li, Runze</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA</atitle><jtitle>The Annals of statistics</jtitle><addtitle>Ann Stat</addtitle><date>2020-10-01</date><risdate>2020</risdate><volume>48</volume><issue>5</issue><spage>2622</spage><epage>2645</epage><pages>2622-2645</pages><issn>0090-5364</issn><eissn>2168-8966</eissn><abstract>This paper concerns statistical inference for longitudinal data with ultrahigh dimensional covariates. We first study the problem of constructing confidence intervals and hypothesis tests for a low-dimensional parameter of interest. 
The major challenge is how to construct a powerful test statistic in the presence of high-dimensional nuisance parameters and sophisticated within-subject correlation of longitudinal data. To deal with the challenge, we propose a new quadratic decorrelated inference function approach which simultaneously removes the impact of nuisance parameters and incorporates the correlation to enhance the efficiency of the estimation procedure. When the parameter of interest is of fixed dimension, we prove that the proposed estimator is asymptotically normal and attains the semiparametric information bound, based on which we can construct an optimal Wald test statistic. We further extend this result and establish the limiting distribution of the estimator under the setting with the dimension of the parameter of interest growing with the sample size at a polynomial rate. Finally, we study how to control the false discovery rate (FDR) when a vector of high-dimensional regression parameters is of interest. We prove that applying the Storey (J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 (2002) 479–498) procedure to the proposed test statistics for each regression parameter controls FDR asymptotically in longitudinal data. We conduct simulation studies to assess the finite sample performance of the proposed procedures. Our simulation results imply that the newly proposed procedure can control both Type I error for testing a low dimensional parameter of interest and the FDR in the multiple testing problem. We also apply the proposed procedure to a real data example.</abstract><cop>United States</cop><pub>Institute of Mathematical Statistics</pub><pmid>34267407</pmid><doi>10.1214/19-aos1900</doi><tpages>24</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0090-5364
ispartof The Annals of statistics, 2020-10, Vol.48 (5), p.2622-2645
issn 0090-5364
2168-8966
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8277154
source JSTOR Mathematics & Statistics; JSTOR Archive Collection A-Z Listing; EZB-FREE-00999 freely available EZB journals; Project Euclid Complete
subjects Asymptotic methods
Asymptotic properties
Confidence intervals
Estimating techniques
Nuisance
Parameter estimation
Polynomials
Statistical analysis
Statistical inference
Statistical tests
title TEST OF SIGNIFICANCE FOR HIGH-DIMENSIONAL LONGITUDINAL DATA
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T20%3A54%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=TEST%20OF%20SIGNIFICANCE%20FOR%20HIGH-DIMENSIONAL%20LONGITUDINAL%20DATA&rft.jtitle=The%20Annals%20of%20statistics&rft.au=Fang,%20Ethan%20X.&rft.date=2020-10-01&rft.volume=48&rft.issue=5&rft.spage=2622&rft.epage=2645&rft.pages=2622-2645&rft.issn=0090-5364&rft.eissn=2168-8966&rft_id=info:doi/10.1214/19-aos1900&rft_dat=%3Cjstor_pubme%3E27028716%3C/jstor_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2451170086&rft_id=info:pmid/34267407&rft_jstor_id=27028716&rfr_iscdi=true
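
The multiple-testing step named in the description (applying the Storey procedure to per-parameter test statistics) can be illustrated with a minimal Python sketch. It assumes each regression parameter already has a test statistic that is asymptotically standard normal under the null; the function names, the tuning value `lam=0.5`, and the target level `alpha=0.1` are hypothetical choices for illustration and are not taken from the paper. The quadratic decorrelated inference function that would produce the statistics is not implemented here.

```python
import math


def two_sided_p_values(z_scores):
    """Two-sided normal p-values: p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))."""
    return [math.erfc(abs(z) / math.sqrt(2.0)) for z in z_scores]


def storey_reject(p_values, alpha=0.1, lam=0.5):
    """Storey-type FDR thresholding (illustrative sketch, assumed parameters).

    Returns (reject_flags, pi0_hat), where pi0_hat estimates the
    proportion of true null hypotheses.
    """
    m = len(p_values)
    # Estimate the null proportion from the p-values above lam:
    # pi0_hat = #{p_i > lam} / ((1 - lam) * m), capped at 1.
    pi0_hat = min(1.0, sum(p > lam for p in p_values) / ((1.0 - lam) * m))

    # Scan the sorted p-values and keep the largest rank k whose
    # estimated FDR, pi0_hat * m * p_(k) / k, stays at or below alpha.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pi0_hat * m * p_values[i] / rank <= alpha:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = set(order[:k])
    return [i in rejected for i in range(m)], pi0_hat
```

A call such as `storey_reject(two_sided_p_values(z), alpha=0.1)` returns one rejection flag per parameter. Restricting the threshold to observed p-values makes this the familiar step-up rule with the null proportion plugged in; other choices of `lam` give other estimates of that proportion.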