Optimal Rate of Kernel Regression in Large Dimensions



Bibliographic Details
Published in: arXiv.org, 2024-06
Main Authors: Lu, Weihao; Zhang, Haobo; Li, Yicheng; Xu, Manyun; Lin, Qian
Format: Article
Language: English
Online Access: Full text
Description
Abstract: We study kernel regression for large-dimensional data, where the sample size \(n\) depends polynomially on the dimension \(d\) of the samples, i.e., \(n\asymp d^{\gamma}\) for some \(\gamma>0\). We first build a general tool that characterizes the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity \(\varepsilon_{n}^{2}\) and the metric entropy \(\bar{\varepsilon}_{n}^{2}\), respectively. When the target function falls into the RKHS associated with a (general) inner product kernel defined on \(\mathbb{S}^{d}\), we use this tool to show that the minimax rate of the excess risk of kernel regression is \(n^{-1/2}\) when \(n\asymp d^{\gamma}\) for \(\gamma = 2, 4, 6, 8, \ldots\). We then determine the optimal rate of the excess risk of kernel regression for all \(\gamma>0\) and find that the curve of the optimal rate as a function of \(\gamma\) exhibits several new phenomena, including multiple-descent behavior and periodic-plateau behavior. As an application, we provide a similar explicit description of the optimal-rate curve for the neural tangent kernel (NTK); as a direct corollary, these claims hold for wide neural networks as well.
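
For intuition, here is a minimal simulation sketch (not from the paper) of the regime the abstract describes: kernel ridge regression with an inner-product kernel on the sphere, with the sample size scaled as \(n\asymp d^{\gamma}\). The kernel \(k(x,z)=\exp(\langle x,z\rangle)\), the linear target function, the noise level, and the ridge parameter are all illustrative assumptions, not the authors' exact construction.

    import numpy as np

    def sphere_sample(n, d, rng):
        # Draw n points uniformly from the unit sphere in R^d.
        x = rng.standard_normal((n, d))
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    def inner_product_kernel(X, Z):
        # One admissible inner-product kernel: k(x, z) = exp(<x, z>).
        return np.exp(X @ Z.T)

    def excess_risk(d, gamma, ridge=1e-3, n_test=2000, seed=0):
        rng = np.random.default_rng(seed)
        n = int(round(d ** gamma))          # large-dimensional regime: n ~ d^gamma
        X = sphere_sample(n, d, rng)
        target = lambda x: x @ np.ones(d) / np.sqrt(d)  # assumed target in the RKHS
        y = target(X) + 0.1 * rng.standard_normal(n)    # noisy observations
        K = inner_product_kernel(X, X)
        # Kernel ridge regression coefficients: alpha = (K + n * ridge * I)^{-1} y
        alpha = np.linalg.solve(K + n * ridge * np.eye(n), y)
        X_test = sphere_sample(n_test, d, rng)
        pred = inner_product_kernel(X_test, X) @ alpha
        return np.mean((pred - target(X_test)) ** 2)    # Monte Carlo excess risk

    if __name__ == "__main__":
        for d in (8, 16, 32):
            print(d, excess_risk(d, gamma=2.0))

At \(\gamma=2\) the paper's minimax rate is \(n^{-1/2}=d^{-1}\), so with well-tuned regularization the printed errors should shrink roughly by half each time \(d\) doubles; the fixed ridge above is a simplification, since the optimal rate is attained only with properly scaled regularization.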
ISSN: 2331-8422