On bandwidth choice for spatial data density estimation

Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively...
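
To make the two ingredients named in the abstract concrete (a global bandwidth chosen by cross-validation, and an adaptive bandwidth that rescales it locally through a pilot density), here is a minimal Python sketch. It is not the authors' SCV procedure. It assumes one-dimensional, independent data, a Gaussian kernel, an Abramson-type square-root rule for the local factors, a normal-reference pilot bandwidth, and ordinary least-squares cross-validation with the pilot held fixed. All function names are hypothetical.

```python
import numpy as np

def gauss_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def fixed_kde(x_eval, data, h):
    """Non-adaptive kernel density estimate with one bandwidth h."""
    u = (x_eval[:, None] - data[None, :]) / h
    return gauss_kernel(u).mean(axis=1) / h

def local_bandwidths(h, pilot):
    """Assumed Abramson-type rule: h_i = h * sqrt(g / pilot_i),
    where g is the geometric mean of the pilot density values."""
    g = np.exp(np.mean(np.log(pilot)))
    return h * np.sqrt(g / pilot)

def adaptive_kde(x_eval, data, h_i):
    """Adaptive estimate: each data point carries its own bandwidth."""
    u = (x_eval[:, None] - data[None, :]) / h_i[None, :]
    return (gauss_kernel(u) / h_i[None, :]).mean(axis=1)

def lscv_score(data, h, pilot):
    """Least-squares CV criterion: int fhat^2 - 2 * mean(leave-one-out fhat)."""
    n = data.size
    h_i = local_bandwidths(h, pilot)
    diff = data[:, None] - data[None, :]
    # Closed form for Gaussian kernels: two kernels with bandwidths h_i, h_j
    # convolve to a Gaussian with bandwidth sqrt(h_i^2 + h_j^2).
    s = np.sqrt(h_i[:, None] ** 2 + h_i[None, :] ** 2)
    int_f2 = (gauss_kernel(diff / s) / s).sum() / n ** 2
    # Leave-one-out estimate at each data point (diagonal terms dropped).
    k = gauss_kernel(diff / h_i[None, :]) / h_i[None, :]
    np.fill_diagonal(k, 0.0)
    loo = k.sum(axis=1) / (n - 1)
    return int_f2 - 2.0 * loo.mean()

def select_global_bandwidth(data, grid):
    """Pick the global bandwidth on `grid` that minimises the CV score."""
    n = data.size
    # Assumption: normal-reference rule of thumb for the pilot bandwidth.
    h0 = 1.06 * data.std(ddof=1) * n ** (-0.2)
    pilot = fixed_kde(data, data, h0)
    scores = np.array([lscv_score(data, h, pilot) for h in grid])
    return grid[np.argmin(scores)], pilot
```

Setting every local factor to one (a constant pilot) reduces `lscv_score` to the usual non-adaptive criterion, which mirrors the abstract's remark that the non-adaptive choice arises as a special case. Spatial dependence on a lattice, which is the subject of the paper, is not modelled here.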

Bibliographic details
Published in: Journal of the Royal Statistical Society. Series B, Statistical methodology, 2020-07, Vol.82 (3), p.817-840
Main authors: Jiang, Zhenyu; Ling, Nengxiang; Lu, Zudi; Tjøstheim, Dag; Zhang, Qiang
Format: Article
Language: eng
Online access: Full text
container_end_page 840
container_issue 3
container_start_page 817
container_title Journal of the Royal Statistical Society. Series B, Statistical methodology
container_volume 82
creator Jiang, Zhenyu
Ling, Nengxiang
Lu, Zudi
Tjøstheim, Dag
Zhang, Qiang
description Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively conforms with local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the cross-validation idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, actually, also extending time series and independent data optimality results. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically as if the true pilot were known. Numerical simulations show that finite sample performance of the SCV adaptive bandwidth choice works quite well. It outperforms the existing R routines such as the ‘rule of thumb’ and the so-called ‘second-generation’ Sheather–Jones bandwidths for moderate and big data sets. An empirical application to a set of spatial soil data is further implemented with non-Gaussian features significantly identified.
doi_str_mv 10.1111/rssb.12367
format Article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_journals_2410525384</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26937896</jstor_id><sourcerecordid>26937896</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3927-f7729cdc339cd8c74276cf23d6f33554b1b7d90fbec5c8319aa6dd0cea8b9c333</originalsourceid><addsrcrecordid>eNp9kMtLAzEQxoMoWKsX70LAm7B1k9nN46jFFxQKVs8hmwfdUjc12VL635u66tE5zAzD75sZPoQuSTkhOW5jSs2EUGD8CI1IxXghBRPHuQcmC14ReorOUlqVORiHEeLzDje6s7vW9ktslqE1DvsQcdrovtVrbHWvsXVdavs9dqlvP_I8dOfoxOt1chc_dYzeHx_eps_FbP70Mr2bFQYk5YXnnEpjDUDOwvCKcmY8Bcs8QF1XDWm4laVvnKmNACK1ZtaWxmnRyKyCMboe9m5i-Nzm-2oVtrHLJxWtSFnTGkSVqZuBMjGkFJ1Xm5gfjXtFSnUwRh2MUd_GZJgM8K5du_0_pHpdLO5_NVeDZpX6EP80lEngQjL4At8bb0U</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2410525384</pqid></control><display><type>article</type><title>On bandwidth choice for spatial data density estimation</title><source>Business Source Complete</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>Wiley Online Library All Journals</source><creator>Jiang, Zhenyu ; Ling, Nengxiang ; Lu, Zudi ; Tjøstheim, Dag ; Zhang, Qiang</creator><creatorcontrib>Jiang, Zhenyu ; Ling, Nengxiang ; Lu, Zudi ; Tjøstheim, Dag ; Zhang, Qiang</creatorcontrib><description>Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively conforms with local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. 
The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the cross-validation idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, actually, also extending time series and independent data optimality results. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically as if the true pilot were known. Numerical simulations show that finite sample performance of the SCV adaptive bandwidth choice works quite well. It outperforms the existing R routines such as the ‘rule of thumb’ and the so-called ‘second-generation’ Sheather–Jones bandwidths for moderate and big data sets. An empirical application to a set of spatial soil data is further implemented with non-Gaussian features significantly identified.</description><identifier>ISSN: 1369-7412</identifier><identifier>EISSN: 1467-9868</identifier><identifier>DOI: 10.1111/rssb.12367</identifier><language>eng</language><publisher>Oxford: Wiley</publisher><subject>Adaptive sampling ; Asymptotic properties ; Bandwidths ; Computer simulation ; Cross‐validation ; Data smoothing ; Density ; Kernel density estimation ; Optimal bandwidth ; Original Articles ; Regression analysis ; Spatial data ; Spatial lattice data ; Spatially adaptive bandwidth choice ; Statistical methods ; Statistics ; Time series ; Validity</subject><ispartof>Journal of the Royal Statistical Society. 
Series B, Statistical methodology, 2020-07, Vol.82 (3), p.817-840</ispartof><rights>2020 Royal Statistical Society</rights><rights>Copyright © 2020 The Royal Statistical Society and Blackwell Publishing Ltd</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3927-f7729cdc339cd8c74276cf23d6f33554b1b7d90fbec5c8319aa6dd0cea8b9c333</citedby><cites>FETCH-LOGICAL-c3927-f7729cdc339cd8c74276cf23d6f33554b1b7d90fbec5c8319aa6dd0cea8b9c333</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1111%2Frssb.12367$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1111%2Frssb.12367$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1416,27923,27924,45573,45574</link.rule.ids></links><search><creatorcontrib>Jiang, Zhenyu</creatorcontrib><creatorcontrib>Ling, Nengxiang</creatorcontrib><creatorcontrib>Lu, Zudi</creatorcontrib><creatorcontrib>Tjøstheim, Dag</creatorcontrib><creatorcontrib>Zhang, Qiang</creatorcontrib><title>On bandwidth choice for spatial data density estimation</title><title>Journal of the Royal Statistical Society. Series B, Statistical methodology</title><description>Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively conforms with local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. 
The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the cross-validation idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, actually, also extending time series and independent data optimality results. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically as if the true pilot were known. Numerical simulations show that finite sample performance of the SCV adaptive bandwidth choice works quite well. It outperforms the existing R routines such as the ‘rule of thumb’ and the so-called ‘second-generation’ Sheather–Jones bandwidths for moderate and big data sets. 
An empirical application to a set of spatial soil data is further implemented with non-Gaussian features significantly identified.</description><subject>Adaptive sampling</subject><subject>Asymptotic properties</subject><subject>Bandwidths</subject><subject>Computer simulation</subject><subject>Cross‐validation</subject><subject>Data smoothing</subject><subject>Density</subject><subject>Kernel density estimation</subject><subject>Optimal bandwidth</subject><subject>Original Articles</subject><subject>Regression analysis</subject><subject>Spatial data</subject><subject>Spatial lattice data</subject><subject>Spatially adaptive bandwidth choice</subject><subject>Statistical methods</subject><subject>Statistics</subject><subject>Time series</subject><subject>Validity</subject><issn>1369-7412</issn><issn>1467-9868</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp9kMtLAzEQxoMoWKsX70LAm7B1k9nN46jFFxQKVs8hmwfdUjc12VL635u66tE5zAzD75sZPoQuSTkhOW5jSs2EUGD8CI1IxXghBRPHuQcmC14ReorOUlqVORiHEeLzDje6s7vW9ktslqE1DvsQcdrovtVrbHWvsXVdavs9dqlvP_I8dOfoxOt1chc_dYzeHx_eps_FbP70Mr2bFQYk5YXnnEpjDUDOwvCKcmY8Bcs8QF1XDWm4laVvnKmNACK1ZtaWxmnRyKyCMboe9m5i-Nzm-2oVtrHLJxWtSFnTGkSVqZuBMjGkFJ1Xm5gfjXtFSnUwRh2MUd_GZJgM8K5du_0_pHpdLO5_NVeDZpX6EP80lEngQjL4At8bb0U</recordid><startdate>202007</startdate><enddate>202007</enddate><creator>Jiang, Zhenyu</creator><creator>Ling, Nengxiang</creator><creator>Lu, Zudi</creator><creator>Tjøstheim, Dag</creator><creator>Zhang, Qiang</creator><general>Wiley</general><general>Oxford University Press</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8BJ</scope><scope>8FD</scope><scope>FQK</scope><scope>JBE</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>202007</creationdate><title>On bandwidth choice for spatial data density estimation</title><author>Jiang, Zhenyu ; Ling, Nengxiang ; Lu, Zudi ; 
Tjøstheim, Dag ; Zhang, Qiang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3927-f7729cdc339cd8c74276cf23d6f33554b1b7d90fbec5c8319aa6dd0cea8b9c333</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Adaptive sampling</topic><topic>Asymptotic properties</topic><topic>Bandwidths</topic><topic>Computer simulation</topic><topic>Cross‐validation</topic><topic>Data smoothing</topic><topic>Density</topic><topic>Kernel density estimation</topic><topic>Optimal bandwidth</topic><topic>Original Articles</topic><topic>Regression analysis</topic><topic>Spatial data</topic><topic>Spatial lattice data</topic><topic>Spatially adaptive bandwidth choice</topic><topic>Statistical methods</topic><topic>Statistics</topic><topic>Time series</topic><topic>Validity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jiang, Zhenyu</creatorcontrib><creatorcontrib>Ling, Nengxiang</creatorcontrib><creatorcontrib>Lu, Zudi</creatorcontrib><creatorcontrib>Tjøstheim, Dag</creatorcontrib><creatorcontrib>Zhang, Qiang</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>International Bibliography of the Social Sciences (IBSS)</collection><collection>Technology Research Database</collection><collection>International Bibliography of the Social Sciences</collection><collection>International Bibliography of the Social Sciences</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of the Royal Statistical Society. 
Series B, Statistical methodology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jiang, Zhenyu</au><au>Ling, Nengxiang</au><au>Lu, Zudi</au><au>Tjøstheim, Dag</au><au>Zhang, Qiang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>On bandwidth choice for spatial data density estimation</atitle><jtitle>Journal of the Royal Statistical Society. Series B, Statistical methodology</jtitle><date>2020-07</date><risdate>2020</risdate><volume>82</volume><issue>3</issue><spage>817</spage><epage>840</epage><pages>817-840</pages><issn>1369-7412</issn><eissn>1467-9868</eissn><abstract>Bandwidth choice is crucial in spatial kernel estimation in exploring non-Gaussian complex spatial data. The paper investigates the choice of adaptive and non-adaptive bandwidths for density estimation given data on a spatial lattice. An adaptive bandwidth depends on local data and hence adaptively conforms with local features of the spatial data. We propose a spatial cross-validation (SCV) choice of a global bandwidth. This is done first with a pilot density involved in the expression for the adaptive bandwidth. The optimality of the procedure is established, and it is shown that a non-adaptive bandwidth choice comes out as a special case. Although the cross-validation idea has been popular for choosing a non-adaptive bandwidth in data-driven smoothing of independent and time series data, its theory and application have not been much investigated for spatial data. For the adaptive case, there is little theory even for independent data. Conditions that ensure asymptotic optimality of the SCV-selected bandwidth are derived, actually, also extending time series and independent data optimality results. Further, for the adaptive bandwidth with an estimated pilot density, oracle properties of the resultant density estimator are obtained asymptotically as if the true pilot were known. 
Numerical simulations show that finite sample performance of the SCV adaptive bandwidth choice works quite well. It outperforms the existing R routines such as the ‘rule of thumb’ and the so-called ‘second-generation’ Sheather–Jones bandwidths for moderate and big data sets. An empirical application to a set of spatial soil data is further implemented with non-Gaussian features significantly identified.</abstract><cop>Oxford</cop><pub>Wiley</pub><doi>10.1111/rssb.12367</doi><tpages>24</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1369-7412
ispartof Journal of the Royal Statistical Society. Series B, Statistical methodology, 2020-07, Vol.82 (3), p.817-840
issn 1369-7412
1467-9868
language eng
recordid cdi_proquest_journals_2410525384
source Business Source Complete; Oxford University Press Journals All Titles (1996-Current); Wiley Online Library All Journals
subjects Adaptive sampling
Asymptotic properties
Bandwidths
Computer simulation
Cross‐validation
Data smoothing
Density
Kernel density estimation
Optimal bandwidth
Original Articles
Regression analysis
Spatial data
Spatial lattice data
Spatially adaptive bandwidth choice
Statistical methods
Statistics
Time series
Validity
title On bandwidth choice for spatial data density estimation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T12%3A53%3A31IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=On%20bandwidth%20choice%20for%20spatial%20data%20density%20estimation&rft.jtitle=Journal%20of%20the%20Royal%20Statistical%20Society.%20Series%20B,%20Statistical%20methodology&rft.au=Jiang,%20Zhenyu&rft.date=2020-07&rft.volume=82&rft.issue=3&rft.spage=817&rft.epage=840&rft.pages=817-840&rft.issn=1369-7412&rft.eissn=1467-9868&rft_id=info:doi/10.1111/rssb.12367&rft_dat=%3Cjstor_proqu%3E26937896%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2410525384&rft_id=info:pmid/&rft_jstor_id=26937896&rfr_iscdi=true