Change-point testing for parallel data sets with FDR control

Bibliographic details
Published in:	Computational Statistics & Data Analysis, 2023-06, Vol. 182, p. 107705, Article 107705
Main authors:	Cui, Junfeng; Wang, Guanghui; Zou, Changliang; Wang, Zhaojun
Format:	Article
Language:	English
Subjects:	Change-point; False discovery rate; Multiple testing; Sample-splitting
Online access:	Full text

Description
Abstract:	Large parallel data sets (paired measurements of responses and covariates collected over time from numerous sources) are ubiquitous. It is of great interest to identify the data sources whose underlying regression relationship has shifted, that is, the data sets in which the regression coefficient changes to a different value at some time point. Borrowing strength from recent developments in multiple testing procedures, a residual-aggregated testing (RAT) method is proposed for recovering such data sources while controlling the false discovery rate (FDR). The proposed method can effectively incorporate the dependence structure among different data sets and is more robust than the conventional Benjamini-Hochberg method based on asymptotic p-values or numerical approximations. Under mild conditions, asymptotic validity is established for control of both the false discovery proportion and the FDR. Extensive numerical results further confirm the effectiveness and robustness of the proposed method.
ISSN:	0167-9473, 1872-7352
DOI:	10.1016/j.csda.2023.107705
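
Illustration:	The abstract compares the proposed RAT method against the conventional Benjamini-Hochberg procedure applied to per-data-set p-values. The following is a minimal sketch of that baseline only, not of the RAT method itself. The function name `benjamini_hochberg`, the default level `alpha=0.1`, and the example p-values are assumptions made for illustration and do not come from the article.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Benjamini-Hochberg step-up procedure (illustrative baseline only).

    p_values: one p-value per data set, e.g. from a change-point test
              (hypothetical inputs, not taken from the article).
    Returns a list of booleans, True where the null hypothesis
    ("no change in the regression coefficient") is rejected.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four data sources.
flags = benjamini_hochberg([0.9, 0.035, 0.02, 0.03], alpha=0.05)
```

Because the procedure is step-up, a data source can be flagged even when its own p-value exceeds its rank-specific threshold, provided a larger p-value further down the ordering meets its threshold.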