Phase Transition Unbiased Estimation in High Dimensional Settings
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | An important challenge in statistical analysis concerns the control of the
finite sample bias of estimators. For example, the maximum likelihood estimator
has a bias that can result in a significant inferential loss. This problem is
typically magnified in high-dimensional settings where the number of variables
$p$ is allowed to diverge with the sample size $n$. However, it is generally
difficult to establish whether an estimator is unbiased, so the asymptotic
order of the bias is commonly used (in low-dimensional settings) to
quantify its magnitude. As an alternative, we introduce a new and
stronger property, possibly for high-dimensional settings, called phase
transition unbiasedness. An estimator satisfying this property is unbiased for
all $n$ greater than a finite sample size $n^\ast$. Moreover, we propose a
phase transition unbiased estimator built upon the idea of matching an initial
estimator computed on the sample and on simulated data. It is not required for
this initial estimator to be consistent and thus it can be chosen for its
computational efficiency and/or for other desirable properties such as
robustness. This estimator can be computed using a suitable simulation-based
algorithm, namely the iterative bootstrap, which is shown to converge
exponentially fast. In addition, we demonstrate the consistency and the
limiting distribution of this estimator in high-dimensional settings. Finally,
as an illustration, we use our approach to develop new estimators for the
logistic regression model, with and without random effects, that also enjoy
other properties such as robustness to data contamination and are not
affected by the problem of separability. In a simulation exercise, the
theoretical results are confirmed in settings where the sample size is
relatively small compared to the model dimension. |
---|---|
DOI: | 10.48550/arxiv.1907.11541 |
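
The summary describes an estimator obtained by matching an initial estimator computed on the sample with the same estimator computed on simulated data, using the iterative bootstrap. The sketch below illustrates that idea on a toy problem and is not the authors' implementation. It assumes one commonly used form of the update, $\theta^{(k+1)} = \theta^{(k)} + \hat{\pi}_{\text{obs}} - \hat{\pi}_{\text{sim}}(\theta^{(k)})$, which is not stated in this record. The model (a normal variance), the initial estimator (the biased maximum likelihood estimator), and the names `mle_variance`, `iterative_bootstrap`, `n_sim`, `n_iter` and `tol` are all hypothetical choices made for illustration.

```python
import random


def mle_variance(xs):
    """Initial estimator: normal MLE of the variance (divides by n, hence biased)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def iterative_bootstrap(sample, n_sim=200, n_iter=50, tol=1e-12, seed=0):
    """Match the initial estimator on the sample with its average on simulated data.

    Assumed update (not taken from the record):
        theta <- theta + pi_obs - pi_sim(theta)
    """
    n = len(sample)
    rng = random.Random(seed)
    # Standard-normal draws are generated once and reused at every iteration,
    # so the simulated average is a deterministic function of theta.
    draws = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_sim)]

    pi_obs = mle_variance(sample)  # initial estimator on the observed sample
    theta = pi_obs                 # start the iteration from the initial estimate
    for _ in range(n_iter):
        scale = theta ** 0.5
        # Average of the initial estimator over samples simulated at theta.
        pi_sim = sum(mle_variance([scale * z for z in zs]) for zs in draws) / n_sim
        new_theta = theta + pi_obs - pi_sim
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta, pi_obs


# Usage: a small sample (n = 10) with true variance 4.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 2.0) for _ in range(10)]
theta_hat, pi_obs = iterative_bootstrap(sample)
print("initial (MLE) estimate:", pi_obs)
print("iterative bootstrap estimate:", theta_hat)
```

In this toy case the simulated average is linear in $\theta$, so each iteration shrinks the error by a roughly constant factor, which is consistent with the exponentially fast convergence mentioned in the summary. The returned value is the point at which the simulated average of the initial estimator equals its value on the observed sample.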