Small-Area Estimation with Zero-Inflated Data – a Simulation Study
Many target variables in official statistics follow a semicontinuous distribution with a mixture of zeros and continuously distributed positive values. Such variables are called zero inflated. When reliable estimates for subpopulations with small sample sizes are required, model-based small-area est...
Published in: | Journal of official statistics 2016-12, Vol.32 (4), p.963-986 |
---|---|
Main authors: | Krieg, Sabine; Boonstra, Harm Jan; Smeets, Marc |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 986 |
---|---|
container_issue | 4 |
container_start_page | 963 |
container_title | Journal of official statistics |
container_volume | 32 |
creator | Krieg, Sabine; Boonstra, Harm Jan; Smeets, Marc |
description | Many target variables in official statistics follow a semicontinuous distribution with a mixture of zeros and continuously distributed positive values. Such variables are called zero inflated. When reliable estimates for subpopulations with small sample sizes are required, model-based small-area estimators can be used, which improve the accuracy of the estimates by borrowing information from other subpopulations. In this article, three small-area estimators are investigated. The first estimator is the EBLUP, which can be considered the most common small-area estimator and is based on a linear mixed model that assumes normal distributions. The EBLUP therefore rests on a misspecified model when the target variable is zero inflated. The other two small-area estimators are based on a model that takes zero inflation explicitly into account. Both the Bayesian and the frequentist approach are considered. These small-area estimators are compared with each other and with design-based estimation in a simulation study with zero-inflated target variables. Both a simulation with artificial data and a simulation with real data from the Dutch Household Budget Survey are carried out. It is found that the small-area estimators improve the accuracy compared to the design-based estimator. The amount of improvement strongly depends on the properties of the population and the subpopulations of interest. |
doi_str_mv | 10.1515/jos-2016-0051 |
format | Article |
fullrecord | Raw XML record condensed to the fields not listed elsewhere in this table. Publisher: London, England: SAGE Publications. Rights: by Sabine Krieg; Copyright Statistics Sweden (SCB) 2016; 2016, published under http://creativecommons.org/licenses/by-nc-nd/4.0. Peer reviewed: yes. Open access: free to read. Total pages: 24. Full text: https://journals.sagepub.com/doi/10.1515/jos-2016-0051 (PDF: https://journals.sagepub.com/doi/pdf/10.1515/jos-2016-0051). |
fulltext | fulltext |
identifier | ISSN: 0282-423X |
ispartof | Journal of official statistics, 2016-12, Vol.32 (4), p.963-986 |
issn | ISSN: 0282-423X, 2001-7367; EISSN: 2001-7367 |
language | eng |
recordid | cdi_proquest_journals_2236531032 |
source | De Gruyter Open Access Journals; Sage Journals GOLD Open Access 2024; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Sociological Abstracts |
subjects | Bayesian analysis; Computer simulation; Dutch Household Budget Survey; EBLUP; Estimating techniques; Estimators; Generalized linear mixed model; Inflation; Logit; Mathematical models; MCMC; Simulation; Statistics |
title | Small-Area Estimation with Zero-Inflated Data – a Simulation Study |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T18%3A18%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Small-Area%20Estimation%20with%20Zero-Inflated%20Data%20%E2%80%93%20a%20Simulation%20Study&rft.jtitle=Journal%20of%20official%20statistics&rft.au=Krieg,%20Sabine&rft.date=2016-12-01&rft.volume=32&rft.issue=4&rft.spage=963&rft.epage=986&rft.pages=963-986&rft.issn=0282-423X&rft.eissn=2001-7367&rft_id=info:doi/10.1515/jos-2016-0051&rft_dat=%3Cproquest_cross%3E4265420381%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1844704148&rft_id=info:pmid/&rft_sage_id=10.1515_jos-2016-0051&rfr_iscdi=true |
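
The abstract describes zero-inflated (semicontinuous) variables as a mixture of exact zeros and continuously distributed positive values. The following is a minimal sketch of that data structure only, not the estimators studied in the article: it draws artificial values and splits their mean into the share of positive values times the mean of the positive values. The lognormal positive part, the parameter values, and the function names `simulate_zero_inflated` and `two_part_summary` are assumptions chosen for illustration.

```python
import random


def simulate_zero_inflated(n, p_positive, mu_log, sigma_log, rng):
    """Draw n semicontinuous values: exact zeros mixed with lognormal positives.

    The lognormal positive part is an illustrative assumption.
    """
    values = []
    for _ in range(n):
        if rng.random() < p_positive:
            values.append(rng.lognormvariate(mu_log, sigma_log))
        else:
            values.append(0.0)
    return values


def two_part_summary(values):
    """Return (share of positives, mean of positives, product of the two)."""
    positives = [v for v in values if v > 0]
    share = len(positives) / len(values)
    mean_pos = sum(positives) / len(positives) if positives else 0.0
    return share, mean_pos, share * mean_pos


rng = random.Random(42)  # fixed seed so the run is reproducible
sample = simulate_zero_inflated(2000, 0.3, 1.0, 0.5, rng)
share, mean_pos, two_part_mean = two_part_summary(sample)
overall_mean = sum(sample) / len(sample)
```

In this sketch `two_part_mean` equals `overall_mean` up to rounding, because no model is fitted. A model that takes zero inflation into account would typically model the two parts separately, whereas a normal linear model treats the whole mixture as a single normal response.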