Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications

Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL cursor interface; encapsulation of a mining algo...

Full description

Saved in:
Bibliographic details
Published in: Data mining and knowledge discovery 2000-07, Vol.4 (2-3), p.89
Main authors: Sarawagi, Sunita; Shiby, Thomas; Agrawal, Rakesh
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 2-3
container_start_page 89
container_title Data mining and knowledge discovery
container_volume 4
creator Sarawagi, Sunita
Shiby, Thomas
Agrawal, Rakesh
description Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL cursor interface; encapsulation of a mining algorithm in a stored procedure; caching the data to a file system on-the-fly and mining; tight-coupling using primarily user-defined functions; and SQL implementations for processing in the DBMS. We comprehensively study the option of expressing the mining algorithm in the form of SQL queries, using association rule mining as a case in point. We consider four options in SQL-92 and six options in SQL enhanced with object-relational extensions (SQL-OR). Our evaluation of the different architectural alternatives shows that, from a performance perspective, the Cache option is superior, although the performance of the SQL-OR option is within a factor of two. Both the Cache and the SQL-OR approaches incur a higher storage penalty than the loose-coupling approach, which performance-wise is a factor of 3 to 4 worse than Cache. The SQL-92 implementations were too slow to qualify as a competitive option. We also compare these alternatives on the basis of qualitative factors like automatic parallelization, development ease, portability and inter-operability. As a byproduct of this study, we identify some primitives for native support in database systems for decision-support applications.
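To make two of the alternatives named in the description concrete, the following is a minimal sketch that counts frequent item pairs in two ways: loose coupling (rows are streamed through a SQL cursor and counted in the application) and a pure SQL formulation (a self-join with GROUP BY and HAVING evaluated inside the DBMS). Everything in it is an assumption made for illustration: the `sales(tid, item)` table, the toy rows, the function names, and the use of SQLite through Python's `sqlite3` module. The self-join query is a generic way to count 2-itemsets and is not claimed to be one of the four SQL-92 options evaluated in the article, whose details the abstract does not give.

```python
import sqlite3
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: (transaction id, item) rows, with no duplicate rows.
ROWS = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "milk"), (2, "eggs"),
    (3, "milk"), (3, "eggs"),
    (4, "bread"), (4, "milk"),
]
MIN_SUPPORT = 2  # minimum number of transactions that must contain a pair


def make_db():
    """Create an in-memory database holding the toy sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (tid INTEGER, item TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", ROWS)
    return conn


def pairs_loose_coupling(conn, min_support):
    """Loose coupling: fetch rows via a cursor, count pairs in the application."""
    baskets = defaultdict(set)
    for tid, item in conn.execute("SELECT tid, item FROM sales"):
        baskets[tid].add(item)
    counts = defaultdict(int)
    for items in baskets.values():
        # Sorting gives each pair a canonical (smaller, larger) order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def pairs_in_sql(conn, min_support):
    """SQL formulation: self-join on tid, then group and filter in the DBMS."""
    query = """
        SELECT a.item, b.item, COUNT(*)
        FROM sales a JOIN sales b
          ON a.tid = b.tid AND a.item < b.item
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
    """
    return {(x, y): n for x, y, n in conn.execute(query, (min_support,))}


if __name__ == "__main__":
    conn = make_db()
    print(pairs_loose_coupling(conn, MIN_SUPPORT))
    print(pairs_in_sql(conn, MIN_SUPPORT))
```

Both functions return the same pair counts on this data. The difference is where the work happens: the first moves every row across the cursor interface before counting, while the second leaves the join and aggregation to the database engine. The sketch says nothing about relative performance, which is what the article measures.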
doi_str_mv 10.1023/A:1009887712954
format Article
fullrecord publisher: New York: Springer Nature B.V; rights: Kluwer Academic Publishers 2000; publication date: 2000-07-01; EISSN: 1573-756X; peer reviewed; source record ID: 930539721
fulltext fulltext
identifier ISSN: 1384-5810
ispartof Data mining and knowledge discovery, 2000-07, Vol.4 (2-3), p.89
issn 1384-5810
1573-756X
language eng
recordid cdi_proquest_journals_230128621
source SpringerLink Journals
subjects Algorithms
Architecture
Associations
Boolean
Data mining
Data warehouses
Queries
Relational data bases
title Integrating Association Rule Mining with Relational Database Systems: Alternatives and Implications
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T09%3A08%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Integrating%20Association%20Rule%20Mining%20with%20Relational%20Database%20Systems:%20Alternatives%20and%20Implications&rft.jtitle=Data%20mining%20and%20knowledge%20discovery&rft.au=Sarawagi,%20Sunita&rft.date=2000-07-01&rft.volume=4&rft.issue=2-3&rft.spage=89&rft.pages=89-&rft.issn=1384-5810&rft.eissn=1573-756X&rft_id=info:doi/10.1023/A:1009887712954&rft_dat=%3Cproquest%3E930539721%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=230128621&rft_id=info:pmid/&rfr_iscdi=true