Inference in High-dimensional Linear Regression
| Format | Article |
|---|---|
| Language | English |
Abstract: This paper develops an approach to inference in a linear regression model when the number of potential explanatory variables is larger than the sample size. The approach treats each regression coefficient in turn as the interest parameter, the remaining coefficients being nuisance parameters, and seeks an optimal interest-respecting transformation, inducing sparsity on the relevant blocks of the notional Fisher information matrix. The induced sparsity is exploited through a marginal least squares analysis for each variable, as in a factorial experiment, thereby avoiding penalization. One parameterization of the problem is found to be particularly convenient, both computationally and mathematically. In particular, it permits an analytic solution to the optimal transformation problem, facilitating theoretical analysis and comparison to other work. In contrast to regularized regression such as the lasso and its extensions, neither adjustment for selection nor rescaling of the explanatory variables is needed, ensuring the physical interpretation of regression coefficients is retained. Recommended usage is within a broader set of inferential statements, so as to reflect uncertainty over the model as well as over the parameters. The considerations involved in extending the work to other regression models are briefly discussed.
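
The role of the induced sparsity can be made concrete with a small numerical sketch. The Python snippet below is purely illustrative and is not the paper's algorithm: it assumes an already-orthogonal design (and n > p, so a joint fit exists for comparison) rather than constructing the optimal interest-respecting transformation. It shows why a diagonal information matrix lets each coefficient be estimated by a separate marginal least-squares fit, with no penalization, in agreement with the joint fit.

```python
# Illustrative sketch only, not the method of the paper: with orthonormal
# columns the information matrix X'X is the identity, so the marginal
# least-squares estimate x_j'y / x_j'x_j for each variable coincides with
# the joint least-squares estimate. The paper's transformation aims to
# induce such sparsity on the relevant blocks when p exceeds n.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 8                                      # n > p here so the joint fit exists
X = np.linalg.qr(rng.standard_normal((n, p)))[0]   # orthonormal columns: X'X = I
beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0])
y = X @ beta + 0.1 * rng.standard_normal(n)

# Marginal least squares, one variable at a time, as in a factorial experiment
beta_marginal = X.T @ y / np.sum(X**2, axis=0)

# Joint least squares for reference
beta_joint = np.linalg.lstsq(X, y, rcond=None)[0]

print(np.max(np.abs(beta_marginal - beta_joint)))  # ~1e-15: the two agree
```

In the paper itself orthogonality is not assumed; the interest-respecting transformation is chosen, separately for each coefficient treated as the interest parameter, so that the relevant blocks of the notional Fisher information matrix become sparse and a marginal analysis of this kind is justified.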
DOI: 10.48550/arxiv.2106.12001