Parallel subgroup discovery on computing clusters - First results

Bibliographic details
Main authors: Trabold, Daniel; Grosskreutz, Henrik
Format: Conference paper
Language: English
Description
Summary: Data mining tasks often have very high computational costs. In this paper, we present a parallel computation approach for the local pattern mining task of subgroup discovery. Unlike earlier related approaches, we do not distribute the data to be analyzed, but instead distribute portions of the overall search space across different computing nodes. Our approach has low communication costs, sending messages only when new, exceptionally good patterns are visited. While the paper describes work in progress, we already present first experiments, which show a speedup factor of about 34 on 64 computing units.
DOI: 10.1109/BigData.2013.6691625
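
The idea in the summary (split the search space rather than the data, and communicate only when a notably good pattern turns up) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy dataset, weighted relative accuracy (WRAcc) as the quality function, a partition of the search space by each pattern's lowest-indexed selector, and "exceptionally good" read as "better than the best pattern known so far". Threads stand in for computing nodes and a lock-protected object stands in for message passing; all names (`ROWS`, `SELECTORS`, `SharedBest`, `explore`, `parallel_search`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import threading

# Toy data (hypothetical): each row is (attribute values, binary target).
ROWS = [
    ({"a": 1, "b": 0, "c": 1}, 1),
    ({"a": 1, "b": 1, "c": 0}, 1),
    ({"a": 1, "b": 1, "c": 1}, 1),
    ({"a": 0, "b": 1, "c": 1}, 0),
    ({"a": 0, "b": 0, "c": 1}, 0),
    ({"a": 0, "b": 1, "c": 0}, 1),
    ({"a": 1, "b": 0, "c": 0}, 0),
    ({"a": 0, "b": 0, "c": 0}, 0),
]
# Every worker sees the full data; only the selectors to start from differ.
SELECTORS = [(k, v) for k in ("a", "b", "c") for v in (0, 1)]


def wracc(pattern):
    """Weighted relative accuracy of a conjunction of (attribute, value) selectors."""
    covered = [t for x, t in ROWS if all(x[k] == v for k, v in pattern)]
    if not covered:
        return 0.0
    base_rate = sum(t for _, t in ROWS) / len(ROWS)
    rate = sum(covered) / len(covered)
    return len(covered) / len(ROWS) * (rate - base_rate)


class SharedBest:
    """Stand-in for the cluster-wide best pattern; each successful update models one message."""

    def __init__(self):
        self.lock = threading.Lock()
        self.quality = float("-inf")
        self.pattern = None
        self.messages = 0

    def update(self, quality, pattern):
        with self.lock:
            if quality > self.quality:
                self.quality, self.pattern = quality, pattern
                self.messages += 1


def explore(first_index, best, max_size=3):
    """Search all patterns whose lowest-indexed selector is SELECTORS[first_index]."""
    rest = SELECTORS[first_index + 1:]
    for extra in range(max_size):
        for tail in combinations(rest, extra):
            pattern = (SELECTORS[first_index],) + tail
            quality = wracc(pattern)
            # Cheap local check first; communicate only for an improvement.
            if quality > best.quality:
                best.update(quality, pattern)


def parallel_search(workers=3):
    """Assign one search-space partition per task and collect the shared best."""
    best = SharedBest()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda i: explore(i, best), range(len(SELECTORS))))
    return best


if __name__ == "__main__":
    result = parallel_search()
    print(result.pattern, result.quality, result.messages)
```

Because every pattern has exactly one lowest-indexed selector, the partitions are disjoint and together cover all patterns of up to three selectors, so the combined result matches an exhaustive sequential search. For scale, the reported speedup of about 34 on 64 computing units corresponds to a parallel efficiency of roughly 34/64 ≈ 0.53.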