Rapid outlier detection, model selection and variable selection using penalized likelihood estimation for general spatial models

Bibliographic details
Published in: Spatial statistics, 2024-06, Vol. 61, p. 100834, Article 100834
Main authors: Song, Yunquan; Fang, Minglu; Wang, Yuanfeng; Hou, Yiming
Format: Article
Language: English
Description
Summary: Outliers in a data set can influence statistical inference and can also carry useful information about the data, so methodology for outlier detection and accommodation is an important topic in data analysis. For spatial data, outliers affect not only coefficient estimation but also model selection. Traditional methods usually carry out outlier detection, model selection and variable selection step by step, which makes data processing inefficient. To improve the efficiency and accuracy of data processing, we consider a technique, based on the general spatial model, that achieves outlier detection together with model and variable selection in one step. In the general spatial model, we add a mean shift parameter for each data point to identify outliers. Penalized likelihood estimation (PLE) is proposed to simultaneously detect outliers and select spatial models and explanatory variables for spatial data. In numerical simulation and case analysis, this method correctly identifies multiple outliers, provides a proper spatial model, and corrects coefficient estimation without removing outliers. Compared to current methods, PLE detects outliers more quickly and solves the optimization problem to select spatial models and explanatory variables. Computation is straightforward using the solnp optimization function in R.
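Illustrative note: the mean shift idea in the summary can be sketched in a much simpler setting. The Python example below is not the authors' PLE procedure and does not use solnp. It assumes an ordinary linear model with no spatial lag or spatial error terms, an L1 penalty on the shift parameters, and an alternating least-squares and soft-thresholding scheme. The names `mean_shift_fit`, `soft_threshold`, and `lam`, and the synthetic data, are hypothetical choices made only for this illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam (the L1 proximal step)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mean_shift_fit(X, y, lam, n_iter=200, tol=1e-8):
    """Minimize 0.5 * ||y - X beta - gamma||^2 + lam * ||gamma||_1.

    gamma holds one mean shift parameter per observation; a nonzero
    entry marks that observation as a candidate outlier.
    Assumption: plain linear regression, not a spatial model.
    """
    n, p = X.shape
    gamma = np.zeros(n)
    beta = np.zeros(p)
    for _ in range(n_iter):
        # Step 1: least squares on the shift-corrected response.
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        # Step 2: update the shifts by soft-thresholding the residuals.
        gamma_new = soft_threshold(y - X @ beta, lam)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    return beta, gamma

# Synthetic data (hypothetical): 100 points, three shifted by +10.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)
outlier_idx = np.array([5, 50, 95])
y[outlier_idx] += 10.0

beta_hat, gamma_hat = mean_shift_fit(X, y, lam=1.0)
flagged = np.flatnonzero(gamma_hat)  # indices with a nonzero shift
print("flagged observations:", flagged)
print("estimated coefficients:", beta_hat)
```

In this sketch the contaminated observations stay in the data: their shifts absorb the contamination, so the coefficient estimates are corrected without deleting points, which mirrors the accommodation idea described in the summary. The penalty level `lam` is fixed by hand here; how the paper tunes its penalty is not stated in this record.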
ISSN: 2211-6753
DOI: 10.1016/j.spasta.2024.100834