Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction

Nonparametric estimation of the conditional expectation of an outcome Y given a covariate vector U is of primary importance in many statistical applications such as prediction and personalized medicine. In some problems, there is an additional auxiliary variable Z in the training dataset used to con...

Detailed description

Bibliographic details
Published in: Journal of the American Statistical Association 2021-07, Vol.116 (535), p.1346-1357
Main authors: Xie, Bingying; Shao, Jun
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1357
container_issue 535
container_start_page 1346
container_title Journal of the American Statistical Association
container_volume 116
creator Xie, Bingying
Shao, Jun
description Nonparametric estimation of the conditional expectation of an outcome Y given a covariate vector U is of primary importance in many statistical applications such as prediction and personalized medicine. In some problems, there is an additional auxiliary variable Z in the training dataset used to construct estimators, but Z is not available for future prediction or selecting patient treatment in personalized medicine. For example, in the training dataset longitudinal outcomes are observed, but only the last outcome Y is concerned in the future prediction or analysis. The longitudinal outcomes other than the last point is then the variable Z that is observed and related with both Y and U. Previous work on how to make use of Z in the estimation of E(Y|U) mainly focused on using Z in the construction of a linear function of U to reduce covariate dimension for better estimation. Using E(Y|U) = E{E(Y|U,Z)|U}, we propose a two-step estimation of inner and outer expectations, respectively, with sufficient dimension reduction for kernel estimation in both steps. The information from Z is utilized not only in dimension reduction, but also directly in the estimation. Because of the existence of different ways for dimension reduction, we construct two estimators that may improve the estimator without using Z. The improvements are shown in the convergence rate of estimators as the sample size increases to infinity as well as in the finite sample simulation performance. A real data analysis about the selection of mammography intervention is presented for illustration.
doi_str_mv 10.1080/01621459.2020.1713793
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1080_01621459_2020_1713793</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2770817821</sourcerecordid><originalsourceid>FETCH-LOGICAL-c366t-3f68bccd828f9dc20478f0cc8518cfe5e5e6d53db079680174b99aaac2a12cba3</originalsourceid><addsrcrecordid>eNp9UE1Lw0AQXUTBWv0JQsBz6n4k2c3NUqsWioIoeFs2-4Fb0mzc3dDm35uQenXmMMzMe4-ZB8AtggsEGbyHqMAoy8sFhngYUURoSc7ADOWEpphmX-dgNmLSEXQJrkLYwSEoYzNgXl3TCi_2Onork3WIdi-idU3iTLJyjbJjI-pkfWy1jNPqYON3suyOtrbC98mmMc6fWKJRyaPd6yaM3btWnRzn1-DCiDrom1Odg8-n9cfqJd2-PW9Wy20qSVHElJiCVVIqhpkplcQwo8xAKVmOmDQ6H7JQOVEVpGXBIKJZVZZCCIkFwrISZA7uJt3Wu59Oh8h3rvPD_YFjSiFDlGE0oPIJJb0LwWvDWz-87XuOIB8t5X-W8tFSfrJ04D1MPDt9fHC-VjyKvnbeeNFIGzj5X-IXYhV_vA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2770817821</pqid></control><display><type>article</type><title>Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction</title><source>Taylor &amp; Francis Journals Complete</source><creator>Xie, Bingying ; Shao, Jun</creator><creatorcontrib>Xie, Bingying ; Shao, Jun</creatorcontrib><description>Nonparametric estimation of the conditional expectation of an outcome Y given a covariate vector U is of primary importance in many statistical applications such as prediction and personalized medicine. In some problems, there is an additional auxiliary variable Z in the training dataset used to construct estimators, but Z is not available for future prediction or selecting patient treatment in personalized medicine. For example, in the training dataset longitudinal outcomes are observed, but only the last outcome Y is concerned in the future prediction or analysis. The longitudinal outcomes other than the last point is then the variable Z that is observed and related with both Y and U. 
Previous work on how to make use of Z in the estimation of E(Y|U) mainly focused on using Z in the construction of a linear function of U to reduce covariate dimension for better estimation. Using E(Y|U) = E{E(Y|U,Z)|U}, we propose a two-step estimation of inner and outer expectations, respectively, with sufficient dimension reduction for kernel estimation in both steps. The information from Z is utilized not only in dimension reduction, but also directly in the estimation. Because of the existence of different ways for dimension reduction, we construct two estimators that may improve the estimator without using Z. The improvements are shown in the convergence rate of estimators as the sample size increases to infinity as well as in the finite sample simulation performance. A real data analysis about the selection of mammography intervention is presented for illustration.</description><identifier>ISSN: 0162-1459</identifier><identifier>EISSN: 1537-274X</identifier><identifier>DOI: 10.1080/01621459.2020.1713793</identifier><language>eng</language><publisher>Alexandria: Taylor &amp; Francis</publisher><subject>Auxiliary information ; Convergence ; Convergence rate ; Customization ; Data analysis ; Datasets ; Estimators ; Kernel estimation ; Linear functions ; Mammography ; Nonparametric statistics ; Precision medicine ; Reduction ; Regression analysis ; Simulation ; Statistical methods ; Statistics ; Sufficient dimension reduction ; Training ; Two-step regression</subject><ispartof>Journal of the American Statistical Association, 2021-07, Vol.116 (535), p.1346-1357</ispartof><rights>2020 American Statistical Association 2020</rights><rights>2020 American Statistical 
Association</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c366t-3f68bccd828f9dc20478f0cc8518cfe5e5e6d53db079680174b99aaac2a12cba3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.tandfonline.com/doi/pdf/10.1080/01621459.2020.1713793$$EPDF$$P50$$Ginformaworld$$H</linktopdf><linktohtml>$$Uhttps://www.tandfonline.com/doi/full/10.1080/01621459.2020.1713793$$EHTML$$P50$$Ginformaworld$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,59647,60436</link.rule.ids></links><search><creatorcontrib>Xie, Bingying</creatorcontrib><creatorcontrib>Shao, Jun</creatorcontrib><title>Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction</title><title>Journal of the American Statistical Association</title><description>Nonparametric estimation of the conditional expectation of an outcome Y given a covariate vector U is of primary importance in many statistical applications such as prediction and personalized medicine. In some problems, there is an additional auxiliary variable Z in the training dataset used to construct estimators, but Z is not available for future prediction or selecting patient treatment in personalized medicine. For example, in the training dataset longitudinal outcomes are observed, but only the last outcome Y is concerned in the future prediction or analysis. The longitudinal outcomes other than the last point is then the variable Z that is observed and related with both Y and U. Previous work on how to make use of Z in the estimation of E(Y|U) mainly focused on using Z in the construction of a linear function of U to reduce covariate dimension for better estimation. 
Using E(Y|U) = E{E(Y|U,Z)|U}, we propose a two-step estimation of inner and outer expectations, respectively, with sufficient dimension reduction for kernel estimation in both steps. The information from Z is utilized not only in dimension reduction, but also directly in the estimation. Because of the existence of different ways for dimension reduction, we construct two estimators that may improve the estimator without using Z. The improvements are shown in the convergence rate of estimators as the sample size increases to infinity as well as in the finite sample simulation performance. A real data analysis about the selection of mammography intervention is presented for illustration.</description><subject>Auxiliary information</subject><subject>Convergence</subject><subject>Convergence rate</subject><subject>Customization</subject><subject>Data analysis</subject><subject>Datasets</subject><subject>Estimators</subject><subject>Kernel estimation</subject><subject>Linear functions</subject><subject>Mammography</subject><subject>Nonparametric statistics</subject><subject>Precision medicine</subject><subject>Reduction</subject><subject>Regression analysis</subject><subject>Simulation</subject><subject>Statistical methods</subject><subject>Statistics</subject><subject>Sufficient dimension reduction</subject><subject>Training</subject><subject>Two-step 
regression</subject><issn>0162-1459</issn><issn>1537-274X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9UE1Lw0AQXUTBWv0JQsBz6n4k2c3NUqsWioIoeFs2-4Fb0mzc3dDm35uQenXmMMzMe4-ZB8AtggsEGbyHqMAoy8sFhngYUURoSc7ADOWEpphmX-dgNmLSEXQJrkLYwSEoYzNgXl3TCi_2Onork3WIdi-idU3iTLJyjbJjI-pkfWy1jNPqYON3suyOtrbC98mmMc6fWKJRyaPd6yaM3btWnRzn1-DCiDrom1Odg8-n9cfqJd2-PW9Wy20qSVHElJiCVVIqhpkplcQwo8xAKVmOmDQ6H7JQOVEVpGXBIKJZVZZCCIkFwrISZA7uJt3Wu59Oh8h3rvPD_YFjSiFDlGE0oPIJJb0LwWvDWz-87XuOIB8t5X-W8tFSfrJ04D1MPDt9fHC-VjyKvnbeeNFIGzj5X-IXYhV_vA</recordid><startdate>20210703</startdate><enddate>20210703</enddate><creator>Xie, Bingying</creator><creator>Shao, Jun</creator><general>Taylor &amp; Francis</general><general>Taylor &amp; Francis Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8BJ</scope><scope>FQK</scope><scope>JBE</scope><scope>K9.</scope></search><sort><creationdate>20210703</creationdate><title>Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction</title><author>Xie, Bingying ; Shao, Jun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c366t-3f68bccd828f9dc20478f0cc8518cfe5e5e6d53db079680174b99aaac2a12cba3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Auxiliary information</topic><topic>Convergence</topic><topic>Convergence rate</topic><topic>Customization</topic><topic>Data analysis</topic><topic>Datasets</topic><topic>Estimators</topic><topic>Kernel estimation</topic><topic>Linear functions</topic><topic>Mammography</topic><topic>Nonparametric statistics</topic><topic>Precision medicine</topic><topic>Reduction</topic><topic>Regression analysis</topic><topic>Simulation</topic><topic>Statistical methods</topic><topic>Statistics</topic><topic>Sufficient dimension 
reduction</topic><topic>Training</topic><topic>Two-step regression</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xie, Bingying</creatorcontrib><creatorcontrib>Shao, Jun</creatorcontrib><collection>CrossRef</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><jtitle>Journal of the American Statistical Association</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Xie, Bingying</au><au>Shao, Jun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction</atitle><jtitle>Journal of the American Statistical Association</jtitle><date>2021-07-03</date><risdate>2021</risdate><volume>116</volume><issue>535</issue><spage>1346</spage><epage>1357</epage><pages>1346-1357</pages><issn>0162-1459</issn><eissn>1537-274X</eissn><abstract>Nonparametric estimation of the conditional expectation of an outcome Y given a covariate vector U is of primary importance in many statistical applications such as prediction and personalized medicine. In some problems, there is an additional auxiliary variable Z in the training dataset used to construct estimators, but Z is not available for future prediction or selecting patient treatment in personalized medicine. For example, in the training dataset longitudinal outcomes are observed, but only the last outcome Y is concerned in the future prediction or analysis. The longitudinal outcomes other than the last point is then the variable Z that is observed and related with both Y and U. 
Previous work on how to make use of Z in the estimation of E(Y|U) mainly focused on using Z in the construction of a linear function of U to reduce covariate dimension for better estimation. Using E(Y|U) = E{E(Y|U,Z)|U}, we propose a two-step estimation of inner and outer expectations, respectively, with sufficient dimension reduction for kernel estimation in both steps. The information from Z is utilized not only in dimension reduction, but also directly in the estimation. Because of the existence of different ways for dimension reduction, we construct two estimators that may improve the estimator without using Z. The improvements are shown in the convergence rate of estimators as the sample size increases to infinity as well as in the finite sample simulation performance. A real data analysis about the selection of mammography intervention is presented for illustration.</abstract><cop>Alexandria</cop><pub>Taylor &amp; Francis</pub><doi>10.1080/01621459.2020.1713793</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0162-1459
ispartof Journal of the American Statistical Association, 2021-07, Vol.116 (535), p.1346-1357
issn 0162-1459
1537-274X
language eng
recordid cdi_crossref_primary_10_1080_01621459_2020_1713793
source Taylor & Francis Journals Complete
subjects Auxiliary information
Convergence
Convergence rate
Customization
Data analysis
Datasets
Estimators
Kernel estimation
Linear functions
Mammography
Nonparametric statistics
Precision medicine
Reduction
Regression analysis
Simulation
Statistical methods
Statistics
Sufficient dimension reduction
Training
Two-step regression
title Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T09%3A01%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Nonparametric%20Estimation%20of%20Conditional%20Expectation%20with%20Auxiliary%20Information%20and%20Dimension%20Reduction&rft.jtitle=Journal%20of%20the%20American%20Statistical%20Association&rft.au=Xie,%20Bingying&rft.date=2021-07-03&rft.volume=116&rft.issue=535&rft.spage=1346&rft.epage=1357&rft.pages=1346-1357&rft.issn=0162-1459&rft.eissn=1537-274X&rft_id=info:doi/10.1080/01621459.2020.1713793&rft_dat=%3Cproquest_cross%3E2770817821%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2770817821&rft_id=info:pmid/&rfr_iscdi=true