Retire: Robust Expectile Regression in High Dimensions

Bibliographic Details
Published in:	arXiv.org 2023-03
Main authors:	Man, Rebeka; Tan, Kean Ming; Wang, Zian; Zhou, Wen-Xin
Format:	Article
Language:	eng
Subjects:	Algorithms; Heterogeneity; Iterative methods; Mathematical analysis; Regression; Regularization; Robustness (mathematics); Signal strength; Statistical analysis
Online access:	Full text

Description
Abstract:	High-dimensional data can often display heterogeneity due to heteroscedastic variance or inhomogeneous covariate effects. Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data. The former is computationally challenging due to the non-smooth nature of the check loss, and the latter is sensitive to heavy-tailed error distributions. In this paper, we propose and study (penalized) robust expectile regression (retire), with a focus on iteratively reweighted \(\ell_1\)-penalization, which reduces the estimation bias from \(\ell_1\)-penalization and leads to oracle properties. Theoretically, we establish the statistical properties of the retire estimator under two regimes: (i) a low-dimensional regime in which \(d \ll n\); (ii) a high-dimensional regime in which \(s\ll n\ll d\), with \(s\) denoting the number of significant predictors. In the high-dimensional setting, we carefully characterize the solution path of the iteratively reweighted \(\ell_1\)-penalized retire estimation, adapted from the local linear approximation algorithm for folded-concave regularization. Under a mild minimum signal strength condition, we show that after as many as \(\log(\log d)\) iterations the final iterate enjoys the oracle convergence rate. At each iteration, the weighted \(\ell_1\)-penalized convex program can be efficiently solved by a semismooth Newton coordinate descent algorithm. Numerical studies demonstrate the competitive performance of the proposed procedure compared with either non-robust or quantile regression based alternatives.
ISSN:	2331-8422
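
The iteratively reweighted \(\ell_1\)-penalized procedure summarized in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its software: the loss is taken to be an asymmetric Huber loss (Huber loss weighted by \(\tau\) for nonnegative residuals and \(1-\tau\) for negative ones), the weights come from the SCAD penalty derivative as one possible folded-concave choice, the intercept is omitted, and each weighted \(\ell_1\) subproblem is solved by plain proximal gradient descent rather than the semismooth Newton coordinate descent algorithm the abstract mentions. The function names and the tuning values (`lam`, `gamma`, `n_outer`) are hypothetical.

```python
import numpy as np

def retire_grad(X, y, beta, tau, gamma):
    """Gradient of the average asymmetric Huber loss (assumed loss form)."""
    r = y - X @ beta
    w = np.where(r < 0, 1.0 - tau, tau)   # asymmetric expectile weights
    psi = np.clip(r, -gamma, gamma)       # Huber score: residual clipped at gamma
    return -X.T @ (w * psi) / len(y)

def soft_threshold(z, t):
    """Proximal operator of the (weighted) l1 norm; t may be a vector."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_l1_retire(X, y, lam_vec, tau, gamma, beta0, n_iter=500):
    """Proximal gradient for the weighted l1-penalized convex subproblem.

    This is a stand-in solver, not the semismooth Newton coordinate
    descent algorithm referred to in the abstract.
    """
    n = X.shape[0]
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    L = max(tau, 1.0 - tau) * np.linalg.norm(X, 2) ** 2 / n
    beta = beta0.copy()
    for _ in range(n_iter):
        grad = retire_grad(X, y, beta, tau, gamma)
        beta = soft_threshold(beta - grad / L, lam_vec / L)
    return beta

def scad_deriv(t, lam, a=3.7):
    """Derivative of the SCAD penalty at t >= 0 (one folded-concave choice)."""
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def irw_retire(X, y, lam, tau=0.5, gamma=3.0, n_outer=3):
    """Iteratively reweighted l1-penalized fit via local linear approximation."""
    d = X.shape[1]
    beta = np.zeros(d)
    lam_vec = np.full(d, lam)             # first pass is an ordinary l1 fit
    for _ in range(n_outer):
        beta = weighted_l1_retire(X, y, lam_vec, tau, gamma, beta)
        lam_vec = scad_deriv(np.abs(beta), lam)  # large coefficients get less penalty
    return beta

# Usage on simulated data with heavy-tailed noise (all settings arbitrary).
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = [2.0, -2.0, 2.0]
y = X @ beta_true + rng.standard_t(3, size=n)
beta_hat = irw_retire(X, y, lam=0.25, tau=0.5, gamma=3.0)
print(np.round(beta_hat[:5], 2))
```

In this sketch the first outer pass is an ordinary \(\ell_1\)-penalized fit; later passes shrink the penalty on coefficients that are already large, which is the mechanism the abstract credits with reducing the \(\ell_1\) estimation bias. Only a few outer iterations are used here, in the spirit of the \(\log(\log d)\) iteration count stated in the abstract.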