Detecting cellwise outliers in multivariate and high-dimensional data
Main author: | |
---|---|
Format: | Dissertation |
Language: | eng |
Online access: | Order full text |
Summary: | Standard statistical techniques such as least squares regression
are very accurate if the underlying distributional assumptions, such as
Gaussianity, are satisfied. The assumption of Gaussian errors precludes outliers,
which are observations that deviate from the fit suggested by the majority
of the data. But real data often do contain outliers, which destroy the least
squares fit. Nowadays data are often high-dimensional and in that case the
outliers are even harder to detect. Because of this, robust estimators have
been developed, which are less sensitive to outliers. As a side effect, the
outliers can be detected by their residuals from the robust fit. Unfortunately,
many robust methods currently require substantial computation time, so it
is necessary to develop faster algorithms for them. There has been some
progress in the construction of fast algorithms for robust linear regression
and for the robust estimation of multivariate scatter matrices, but there is
much room for improvement. This doctoral project aims to develop efficient
algorithms for robust regression through the origin, for scatter matrices and
principal components with given center, for sparse estimation and variable
selection in high dimension, for robust low-rank approximation of multivariate
data (a kind of singular value decomposition), and for robust estimation for
data containing cellwise outliers. |
---|
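
The summary's point that outliers can be detected by their residuals from a robust fit can be illustrated with a minimal sketch for regression through the origin. Everything below is an assumption made for illustration and is not taken from the dissertation: the toy data are hypothetical, the robust slope is simply the median of the ratios y_i / x_i (a basic stand-in, not one of the algorithms the project develops), and the 2.5 cutoff on residuals scaled by the normalized median absolute deviation is a common convention.

```python
from statistics import median


def ls_slope_origin(x, y):
    """Least squares slope for the model y = b * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def robust_slope_origin(x, y):
    """Median of the ratios y_i / x_i: a simple robust slope (illustrative choice)."""
    return median(yi / xi for xi, yi in zip(x, y) if xi != 0)


def flag_outliers(x, y, slope, cutoff=2.5):
    """Flag cases whose absolute residual exceeds cutoff times the normalized MAD."""
    res = [yi - slope * xi for xi, yi in zip(x, y)]
    center = median(res)
    # 1.4826 makes the MAD consistent with the standard deviation for Gaussian data.
    scale = 1.4826 * median(abs(r - center) for r in res)
    return [abs(r) > cutoff * scale for r in res]


# Hypothetical toy data: y is roughly 2 * x, except for the last case.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 40.0]

b_ls = ls_slope_origin(x, y)        # pulled upward by the single outlier
b_rob = robust_slope_origin(x, y)   # stays close to 2
flags = flag_outliers(x, y, b_rob)  # residuals are taken from the robust fit
```

In this toy example the single aberrant case drags the least squares slope well above 2, while the median-based slope stays near 2 and only the last case is flagged. The sketch handles casewise outliers in one predictor only; it does not address the cellwise or high-dimensional settings named in the summary.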