Performance Evaluation and Optimization of Join Queries for Association Rule Mining

Bibliographic details
Main authors: Thomas, Shiby; Chakravarthy, Sharma
Format: Conference proceedings
Language: English
Description
Summary: The explosive growth in data collection in business organizations introduces the problem of turning these rapidly expanding data stores into nuggets of actionable knowledge. The state-of-the-art data mining tools available for this integrate loosely with data stored in DBMSs, typically through a cursor interface. In this paper, we consider several formulations of association rule mining (a typical data mining problem) using SQL-92 queries and study the performance of different join orders and join methods for executing them. We analyze the cost of the different execution plans, which provides a basis for incorporating the semantics of association rule mining into future query optimizers. Based on this analysis, we identify certain optimizations and develop the Set-oriented Apriori approach. This work is an initial step towards developing “SQL-aware” mining algorithms and exploring enhancements to current relational DBMSs to make them “mining-aware”, thereby bridging the gap between the two.
ISSN: 0302-9743, 1611-3349
DOI:10.1007/3-540-48298-9_26
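
Illustrative sketch: the summary refers to formulating association rule mining as SQL-92 join queries. The Python snippet below is one generic way such a formulation could look for 2-itemsets, run against an in-memory SQLite database. It is not taken from the paper and is not the Set-oriented Apriori approach; the table name `T`, the columns `tid` and `item`, and the function `frequent_pairs` are assumptions made for illustration only.

```python
import sqlite3


def frequent_pairs(transactions, min_support):
    """Count support of item pairs with a self-join and keep the frequent ones.

    transactions: list of item lists, one list per transaction (hypothetical input shape).
    min_support: minimum number of transactions that must contain a pair.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (transaction id, item).
    conn.execute("CREATE TABLE T (tid INTEGER, item TEXT)")
    rows = [
        (tid, item)
        for tid, items in enumerate(transactions)
        for item in set(items)  # de-duplicate items within a transaction
    ]
    conn.executemany("INSERT INTO T VALUES (?, ?)", rows)

    # Self-join on tid pairs up items from the same transaction;
    # t1.item < t2.item keeps each unordered pair exactly once.
    query = """
        SELECT t1.item, t2.item, COUNT(*) AS support
        FROM T t1, T t2
        WHERE t1.tid = t2.tid AND t1.item < t2.item
        GROUP BY t1.item, t2.item
        HAVING COUNT(*) >= ?
        ORDER BY t1.item, t2.item
    """
    result = conn.execute(query, (min_support,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    baskets = [
        ["bread", "milk"],
        ["bread", "milk", "eggs"],
        ["milk", "eggs"],
        ["bread", "eggs", "milk"],
    ]
    for pair in frequent_pairs(baskets, min_support=2):
        print(pair)
```

Larger itemsets would need one more copy of `T` in the join per additional item, which is where the choice of join order and join method mentioned in the summary starts to matter for cost.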