Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse

Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation is based on a parametric model for imputation and is less robust against the failure of the imputation model. Nonparametric imputation is fully robust but is not applicable when the dimension of...

Detailed description

Saved in:
Bibliographic details
Main authors: Lee, Danhyang, Kim, Jae Kwang
Format: Article
Language: eng
Subjects:
Online access: Order full text
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Lee, Danhyang
Kim, Jae Kwang
description Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation relies on a parametric imputation model and is less robust against failure of that model. Nonparametric imputation is fully robust but is not applicable when the dimension of the covariates is large, due to the curse of dimensionality. Semiparametric imputation is another robust approach, based on a flexible model in which the number of model parameters can increase with the sample size. In this paper, we propose another semiparametric imputation method based on a more flexible model assumption than the Gaussian mixture model. In the proposed mixture model, we assume a conditional Gaussian model for the study variable given the auxiliary variables, but the marginal distribution of the auxiliary variables is not necessarily Gaussian. We show that the proposed mixture model achieves a lower approximation error bound to any unknown target density than the Gaussian mixture model in terms of the Kullback-Leibler divergence. The proposed method is applicable to high-dimensional covariate problems by including a penalty function in the conditional log-likelihood function. The proposed method is applied to the 2017 Korean Household Income and Expenditure Survey conducted by Statistics Korea. Supplementary material is available online.
doi_str_mv 10.48550/arxiv.1909.06534
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_1909_06534</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1909_06534</sourcerecordid><originalsourceid>FETCH-LOGICAL-a674-fc89bf1b20d17dbb6c24d839fb4a1949ceaf75b49a50c91d54aedb3ddcf0cabb3</originalsourceid><addsrcrecordid>eNotz7FOwzAUhWEvDKjwAEz4BRLs2k7iEUVQIrUwUBaW6Nq-RpYSJ7ITVN4etTAd_cuRPkLuOCtloxR7gHQK3yXXTJesUkJek893HMMMCUZcUrC0G-d1gSVMkX7kEL9oO0UXzg0D3cGac4BID-G0rAnpYXI4ZLpGh4l2C470dYoJ8zzFjDfkysOQ8fZ_N-T4_HRsX4r9265rH_cFVLUsvG208dxsmeO1M6ayW-kaob2RwLXUFsHXykgNilnNnZKAzgjnrGcWjBEbcv93e8H1cwojpJ_-jOwvSPEL5AFQDw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse</title><source>arXiv.org</source><creator>Lee, Danhyang ; Kim, Jae Kwang</creator><creatorcontrib>Lee, Danhyang ; Kim, Jae Kwang</creatorcontrib><description>Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation is based on a parametric model for imputation and is less robust against the failure of the imputation model. Nonparametric imputation is fully robust but is not applicable when the dimension of covariates is large due to the curse of dimensionality. Semiparametric imputation is another robust imputation based on a flexible model where the number of model parameters can increase with the sample size. In this paper, we propose another semiparametric imputation based on a more flexible model assumption than the Gaussian mixture model. In the proposed mixture model, we assume a conditional Gaussian model for the study variable given the auxiliary variables, but the marginal distribution of the auxiliary variables is not necessarily Gaussian. 
We show that the proposed mixture model achieves a lower approximation error bound to any unknown target density than the Gaussian mixture model in terms of the Kullback-Leibler divergence. The proposed method is applicable to high dimensional covariate problem by including a penalty function in the conditional log-likelihood function. The proposed method is applied to 2017 Korean Household Income and Expenditure Survey conducted by Statistics Korea. Supplementary material is available online.</description><identifier>DOI: 10.48550/arxiv.1909.06534</identifier><language>eng</language><subject>Statistics - Methodology</subject><creationdate>2019-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/1909.06534$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.1909.06534$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Lee, Danhyang</creatorcontrib><creatorcontrib>Kim, Jae Kwang</creatorcontrib><title>Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse</title><description>Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation is based on a parametric model for imputation and is less robust against the failure of the imputation model. Nonparametric imputation is fully robust but is not applicable when the dimension of covariates is large due to the curse of dimensionality. 
Semiparametric imputation is another robust imputation based on a flexible model where the number of model parameters can increase with the sample size. In this paper, we propose another semiparametric imputation based on a more flexible model assumption than the Gaussian mixture model. In the proposed mixture model, we assume a conditional Gaussian model for the study variable given the auxiliary variables, but the marginal distribution of the auxiliary variables is not necessarily Gaussian. We show that the proposed mixture model achieves a lower approximation error bound to any unknown target density than the Gaussian mixture model in terms of the Kullback-Leibler divergence. The proposed method is applicable to high dimensional covariate problem by including a penalty function in the conditional log-likelihood function. The proposed method is applied to 2017 Korean Household Income and Expenditure Survey conducted by Statistics Korea. Supplementary material is available online.</description><subject>Statistics - Methodology</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz7FOwzAUhWEvDKjwAEz4BRLs2k7iEUVQIrUwUBaW6Nq-RpYSJ7ITVN4etTAd_cuRPkLuOCtloxR7gHQK3yXXTJesUkJek893HMMMCUZcUrC0G-d1gSVMkX7kEL9oO0UXzg0D3cGac4BID-G0rAnpYXI4ZLpGh4l2C470dYoJ8zzFjDfkysOQ8fZ_N-T4_HRsX4r9265rH_cFVLUsvG208dxsmeO1M6ayW-kaob2RwLXUFsHXykgNilnNnZKAzgjnrGcWjBEbcv93e8H1cwojpJ_-jOwvSPEL5AFQDw</recordid><startdate>20190914</startdate><enddate>20190914</enddate><creator>Lee, Danhyang</creator><creator>Kim, Jae Kwang</creator><scope>EPD</scope><scope>GOX</scope></search><sort><creationdate>20190914</creationdate><title>Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse</title><author>Lee, Danhyang ; Kim, Jae 
Kwang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a674-fc89bf1b20d17dbb6c24d839fb4a1949ceaf75b49a50c91d54aedb3ddcf0cabb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Statistics - Methodology</topic><toplevel>online_resources</toplevel><creatorcontrib>Lee, Danhyang</creatorcontrib><creatorcontrib>Kim, Jae Kwang</creatorcontrib><collection>arXiv Statistics</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Lee, Danhyang</au><au>Kim, Jae Kwang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse</atitle><date>2019-09-14</date><risdate>2019</risdate><abstract>Imputation is a popular technique for handling item nonresponse in survey sampling. Parametric imputation is based on a parametric model for imputation and is less robust against the failure of the imputation model. Nonparametric imputation is fully robust but is not applicable when the dimension of covariates is large due to the curse of dimensionality. Semiparametric imputation is another robust imputation based on a flexible model where the number of model parameters can increase with the sample size. In this paper, we propose another semiparametric imputation based on a more flexible model assumption than the Gaussian mixture model. In the proposed mixture model, we assume a conditional Gaussian model for the study variable given the auxiliary variables, but the marginal distribution of the auxiliary variables is not necessarily Gaussian. We show that the proposed mixture model achieves a lower approximation error bound to any unknown target density than the Gaussian mixture model in terms of the Kullback-Leibler divergence. 
The proposed method is applicable to high dimensional covariate problem by including a penalty function in the conditional log-likelihood function. The proposed method is applied to 2017 Korean Household Income and Expenditure Survey conducted by Statistics Korea. Supplementary material is available online.</abstract><doi>10.48550/arxiv.1909.06534</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1909.06534
ispartof
issn
language eng
recordid cdi_arxiv_primary_1909_06534
source arXiv.org
subjects Statistics - Methodology
title Semiparametric Imputation Using Conditional Gaussian Mixture Models under Item Nonresponse
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T06%3A39%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Semiparametric%20Imputation%20Using%20Conditional%20Gaussian%20Mixture%20Models%20under%20Item%20Nonresponse&rft.au=Lee,%20Danhyang&rft.date=2019-09-14&rft_id=info:doi/10.48550/arxiv.1909.06534&rft_dat=%3Carxiv_GOX%3E1909_06534%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true