A SPLIT-AND-CONQUER APPROACH FOR ANALYSIS OF EXTRAORDINARILY LARGE DATA

Bibliographic details
Published in: Statistica Sinica 2014-10, Vol. 24 (4), p. 1655-1684
Main authors: Chen, Xueying; Xie, Min-ge
Format: Article
Language: English
Description
Summary: When a dataset is too large to fit on a single computer, or too expensive to analyze with a computationally intensive method, what should we do? We propose a split-and-conquer approach and illustrate it using several computationally intensive penalized regression methods, along with theoretical support. We show that the split-and-conquer approach can substantially reduce computing time and computer memory requirements. The proposed methodology is illustrated numerically using both simulation and data examples.
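
The general idea named in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: it uses ridge regression as a stand-in penalized method, splits the rows into equal blocks, and combines the block estimates by plain averaging. The record does not say how the article combines results, and the function names `ridge_fit` and `split_and_conquer`, the penalty `lam`, and the simulated data are all hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def split_and_conquer(X, y, n_blocks, lam):
    """Fit ridge on each block of rows, then average the coefficients.

    Plain averaging is an assumption made for this sketch; the
    catalog record does not describe the combination rule.
    """
    blocks = zip(np.array_split(X, n_blocks), np.array_split(y, n_blocks))
    estimates = [ridge_fit(Xk, yk, lam) for Xk, yk in blocks]
    return np.mean(estimates, axis=0)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 10_000, 5
beta_true = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(scale=0.1, size=n)

beta_sc = split_and_conquer(X, y, n_blocks=10, lam=1.0)  # block-wise fit
beta_full = ridge_fit(X, y, lam=1.0)                      # single full fit
```

Each block fit only needs its own rows in memory, which is where the savings in memory and computing time described in the summary would come from. In this sketch the blocks are processed one after another, but they could equally be sent to separate machines.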
ISSN: 1017-0405, 1996-8507
DOI: 10.5705/ss.2013.088